Matt Brems @matthewbrems
This is a *great* question! "What are the skills and mindset crucial for thinking about general data science problems?" THREAD 1/
Mindset first: What mindset do you need to tackle general data science problems? You should be, in three words: curious, skeptical, realistic. 2/
Why is being curious an advantage? Being curious allows you to understand the data better. You search for things - mistakes, anomalies, surprising trends. If you're curious, you're likelier to look at the data from multiple perspectives. 3/
Curiosity (the desire to understand what things are and how they work) will give you the intrinsic benefit of motivation! You'll be less likely to throw your hands up and give up - although everyone throws their hands up out of frustration at some point. :) 4/
Why is being skeptical an advantage? Whether through human error, technological mishaps, lack of availability of data we "want," or misunderstanding the problem, data often doesn't mean what we think it means, so we make mistakes when we accept it at face value. 5/
Anything we learned through being curious, we now question because we're being skeptical. We check the source. We check for inconsistencies. That trend we "found?" Why did it happen? Is it real or did a data entry error cause something to look significant? 6/
Too frequently, we trust the data until there's a reason to not trust it. It's way better to mistrust the data until you're convinced of its veracity. Blind faith in the data is... well, blind. It's only a matter of time until this blindness negatively affects our work. 7/
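To make "mistrust the data until you're convinced" a bit more concrete, here is a minimal sketch of the kind of checks that could mean in practice. It is not from the original thread: the `audit` function, the field names, the sample records, and the 0-120 age range are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def audit(rows, key, numeric_field, low, high):
    """Return simple red flags for a list of dict records (illustrative only)."""
    # Count how often each key value appears; more than once is suspicious.
    counts = Counter(row[key] for row in rows)
    duplicate_ids = sorted(k for k, n in counts.items() if n > 1)

    # Records where the numeric field was never filled in.
    missing = [row[key] for row in rows if row.get(numeric_field) is None]

    # Values outside a plausible range often point to data entry errors.
    out_of_range = [
        row[key]
        for row in rows
        if row.get(numeric_field) is not None
        and not (low <= row[numeric_field] <= high)
    ]
    return {
        "duplicate_ids": duplicate_ids,
        "missing": missing,
        "out_of_range": out_of_range,
    }


# Hypothetical sample data, made up for this sketch.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 340},   # likely a typo for 34
    {"id": 3, "age": None},  # value never recorded
    {"id": 3, "age": 29},    # same id appears twice
]

report = audit(rows, key="id", numeric_field="age", low=0, high=120)
print(report)
```

None of these checks proves the data is right. They only surface places where it is worth asking "is this real, or is this an error?" before trusting a trend.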
Why is being realistic an advantage? We have this pipe dream of the perfect visualization that communicates our point with clarity and urgency and the perfect model that brings tears of joy to the eyes of the masses. (Maybe that's just my dream.) We rarely realize that dream. 8/
Depending on what you're trying to do, there are a million things that detract from that perfect model/viz. Dirty data. Time crunch. Scope creep. Rabbit holes. Competing projects/interests. We must be realistic about what we can do. 9/
Being too idealistic blurs our obligation to stakeholders/ourselves. Realism encourages us to complete an MVP (minimum viable product) and helps us to be confident in the work we do rather than deflated in the work we don't. This dovetails nicely into your next question. 10/
As for the skills you need to tackle data science problems - I *hate* the idea that there are certain skills needed. It's so exclusionary. If I say "statistics and programming," that seemingly excludes most people who don't have academic and work experience in this! 11/
In fact, many would remove themselves based on that distinction! "Well, I look at StackOverflow too much to be good at programming." "I haven't taken stats in years, so there's no way I know enough." There's so much self-doubt that many say they can't do data science. 12/
👏 We 👏 Need 👏 To 👏 Get 👏 Beyond 👏 This 👏 Interpretation! 👏 Let's stop thinking of data science as the intersection of statistics + programming + subject-matter expertise! Instead, let's think of data science as "using data to better inform the decisions we make." 13/
By making that small mental shift, we go from thinking "I don't know enough programming to possibly do data science" to "I'm using programming to make better decisions than I would have if I weren't using my data!" 14/
That shift is *everything*. It's empowering. It's confidence-building. It's freeing. 15/
I'm not saying that stats/programming/context for our data are unimportant within the context of data science. The opposite, rather. They're *so* important to data science that we need to stop being limited by the complexes and anxieties surrounding "Am I good enough at X?" 16/
Use the right tools/skills for *your job*, not because of some list of the "10 best technologies for data scientists." If you're working with databases, try to learn SQL. If you're not working with DBs, your time is better spent elsewhere. Repeat for literally everything. 17/
Caveat: Of course employers want certain skills. They have a set of problems and there is a set of tools needed/helpful in solving them. But for a general set of skills needed to do data science - just get data, start solving problems, and then learn tools *as needed*! 18/
When tackling a data science problem: Be curious. Be skeptical. Be realistic. Flip the narrative of data science from "I don't know X" to "I'm using X to be better than if I didn't use X!" Leverage the skills you have & acquire skills you need in order to solve the problem. /END
P.S. Thank you @S_Canchi for letting me use one of your questions more broadly! I can touch on domain/subject-matter expertise in another thread, but this one was already long enough! :)