What does a data scientist do, anyway?

"Everything Starts Out Looking Like a Toy" (No.22)

Nov 28, 2020

This week’s toy: an emulator that lets you play classic MS-DOS games in your browser. I’ll be playing Wheel of Fortune and Jordan vs. Bird, as one did in the old days. Edition No. 22 of this newsletter is here - it’s November 28, 2020.

The Big Idea

What’s the best way to learn a new skill? I’m exploring the road to becoming a Data Scientist, along with the pitfalls of trying to pin down what that title means.

The Plan for the #NextPlay

LinkedIn’s former CEO Jeff Weiner popularized the phrase “Next Play,” a leadership idea he borrowed from longtime Duke basketball coach Mike Krzyzewski. The idea is simple: whenever possession changes in a game, focus on your next challenge. At work, this might mean building a new skill, trying a new approach, or just applying yourself to something new to see what happens.

I’ve been involved in #data in one way or another for a number of years, but it’s hard to quantify exactly what that means or how best to capitalize on the skill. Does it mean “good at building spreadsheet models”? Does it mean “able to visualize statistical models”? Does it mean “apt at answering questions that depend upon data analysis”? It means all of those things and none of them, paradoxically. So I decided to learn the basics of being a data scientist to see which parts of the job I wanted to be my #nextplay.

What does a Data Scientist Do, Anyway?

After researching the curricula of “data boot camps” and reading online descriptions of Data Science positions, a few things are clear to me:

Data Scientists use mathematical models and structured ways of handling data to answer complex questions, often with machine learning or in-depth data analysis
They use scripting languages like R and Python to write the glue that holds their models together
They use libraries like pandas and NumPy to do the calculations needed to answer their questions
They understand a lot of math concepts and use sets, sorted sets, and similar structures to analyze problems
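
To make the list above a little more concrete, here is a minimal sketch of that kind of work in Python. It is only an illustration: the sign-up numbers are invented, the summarize function is a hypothetical helper, and it sticks to Python's standard library so it runs anywhere. Libraries like pandas and NumPy offer the same kinds of operations for much larger data sets.

```python
from statistics import mean, median

# Hypothetical data: weekly newsletter sign-ups by channel.
# These numbers are made up purely for illustration.
signups = [
    {"week": 1, "channel": "twitter", "count": 12},
    {"week": 1, "channel": "email", "count": 30},
    {"week": 2, "channel": "twitter", "count": 18},
    {"week": 2, "channel": "search", "count": 7},
    {"week": 3, "channel": "email", "count": 24},
]


def summarize(rows):
    """Answer a few simple questions about the sign-up rows."""
    counts = [row["count"] for row in rows]

    # A set keeps only the unique channel names.
    channels = {row["channel"] for row in rows}

    # Add up sign-ups per channel.
    totals = {}
    for row in rows:
        totals[row["channel"]] = totals.get(row["channel"], 0) + row["count"]

    # Sort channels from most to fewest sign-ups.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

    return {
        "mean": mean(counts),
        "median": median(counts),
        "channels": sorted(channels),
        "ranked": ranked,
    }


summary = summarize(signups)
print(summary)
```

The script is the "glue," the mean and median are the calculations, and the set and the sorted ranking are the math concepts doing a small amount of real work.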

Putting aside for the moment how to pose questions for Data Science to answer and how to visualize or present the results, the items above seemed like concepts I could learn and practice on the journey to becoming a Data Scientist, or at least someone who is a bit more scientific about handling their data.

How does one get started with Data Science?

There are countless courses and bootcamps that purport to teach you data science, so I’ve started with a Udemy course on data science with Python and am looking at a few in-browser courses from DataCamp and Dataquest. All of these resources start by orienting you to Python - a scripting language used frequently in data science - and then walk you through the process of building your own projects according to a template.

Finishing courses like these definitely gets you started along the road to learning Data Science skills, but it’s a bit like learning the basics of driving a car without learning how other people drive. Learning how to script and building basic competency in math libraries is a necessary task for becoming a data science professional, but it is only the first step on the road. The true “learning” in becoming a data professional happens when you apply these new skills to solve data-driven problems in newly structured ways, and then adjust based on what you learn.

What’s the takeaway? Picking up the skills required of a data science professional is only the first step toward true fluency in using these tools to solve problems the way you imagine a “Data Scientist” would. However, there’s no better way to know whether you are learning something than to try. So we’ll see what happens 😉

We’d like to know …

Learning technical skills is a personalized task. What method works the best for you? Online classes, taking on a personal project, or doing something new at work?

Click the tweet to tell us how you’re voting.

Greg Meyer@grmeyer

When learning new things, what methods do you use? I learn new technical skills best by ...

10:12 PM · Nov 28, 2020

Links for Reading and Sharing

These are links that caught my eye.

1/ A ‘Jarring’ discovery - Mason jars are in demand. It turns out that demand for canning jars is also a bellwether for the general economy, and the past times when canning jar demand spiked give us reason to be a bit worried.

2/ Unlikely treatment for spinal injuries - Mom may have asked you to eat your asparagus because it was good for you, but she might never have imagined that the cellular structure of this vegetable could aid in spinal injury recovery. This is a great example of unexpected discovery.

3/ “Sure, I’ll walk you through my spreadsheet…” - Microsoft researchers have created a prototype that visualizes spreadsheet information in 3D. This is particularly interesting when you consider how difficult it can be to see patterns in the data in some models. Imagine instead that you could just fly around them in VR/AR and see contextual information.

On the Reading/Watching List

I’m not always a fan of sequels, but I’m excited to read Ready Player Two, Ernest Cline’s jump back into the virtual worlds he created in Ready Player One. I’m sure there will be more 1980s video game references that I can chase down by playing MS-DOS games in my browser ;)

But while we are reading sequels, I’m also looking forward to Parable of the Talents, the sequel to Octavia Butler’s dystopian novel Parable of the Sower.

What to do next

Hit reply if you’ve got links to share, data stories, or want to say hello.

I’m grateful you read this far. Thank you. If you found this useful, consider sharing with a friend.

Want more essays? Read on Data Operations or other writings at gregmeyer.com.

The next big thing always starts out being dismissed as a “toy.” - Chris Dixon
