Transitioning into #DataScience: A (long) thread—
Transitioning into a new field can be cumbersome, especially #datascience, which usually requires some level of #math, #statistics, #machinelearning, and #programming. But fret not!
This thread goes over the fundamentals you should know to become a #datascientist. First things first: there are hundreds of tools for visualization (Power BI, Tableau, Dash), cloud computing (GCP, AWS, Azure), and SQL variants (Postgres, MSSQL, MySQL).
Don’t get caught up trying to learn all of these tools, platforms, and software! You will be onboarded to these technologies, and they vary widely from company to company and sometimes even from department to department. But you should be familiar with what they do at a high level!
As someone newly transitioning to #datascience, you should figure out your tech stack. #Python or #R? #Jupyternotebook or #visualstudio? It might take some playing around to decide! But first, you need to set up your environment to start learning.
@anacondainc is the greatest gift of life! The #anaconda distribution ships with Python, and R can be added through conda, so make sure the language you choose to become proficient in is installed alongside it. Once that is done, get familiar with the IDE by playing around and clicking buttons.
Once you have a decent grasp of the IDE, it’s time to learn the fundamentals! These are both practical and conceptual. It’s best to create a study chart of what to learn for each topic and rotate between topics often to limit burnout.
For programming, it’s helpful to first learn the built-in #datastructures. For Python, these are list, tuple, dictionary, and set. They are native to Python and introduce concepts such as mutable/immutable, ordered/unordered, and indexed/not indexed.
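Here is a minimal sketch of the four built-in structures and the mutable/immutable and indexed/not indexed ideas. The variable names and sample values are made up for illustration.

```python
# Hypothetical sample values, chosen only for illustration.
scores = [88, 92, 75]             # list: ordered, indexed, mutable
point = (3, 4)                    # tuple: ordered, indexed, immutable
ages = {"ana": 31, "ben": 27}     # dict: maps keys to values, mutable
tags = {"sql", "python", "sql"}   # set: unique items, unordered, not indexed

scores.append(60)                 # lists can grow in place
ages["cy"] = 45                   # dicts can gain keys in place

try:
    point[0] = 10                 # tuples reject item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

try:
    tags[0]                       # sets do not support indexing
    set_indexable = True
except TypeError:
    set_indexable = False

print(scores, ages, sorted(tags), tuple_changed, set_indexable)
```

The duplicate "sql" is silently dropped by the set, which is a handy way to deduplicate values.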
Additionally, take a look at #pep8, the style guide for writing readable Python code! Also, there are several programming paradigms: procedural, OOP, and functional. For #datascience I suggest becoming familiar with functional programming first.
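To make "functional" a bit more concrete, here is a small sketch in that style: a pure function combined with map, filter, and reduce. The temperature readings and the function name are hypothetical.

```python
from functools import reduce


def to_celsius(fahrenheit):
    """Pure function: the result depends only on the input, with no side effects."""
    return (fahrenheit - 32) * 5 / 9


readings_f = [32.0, 212.0, 50.0]  # hypothetical sample data

# map applies the function to every element without mutating the input list
celsius = list(map(to_celsius, readings_f))

# filter keeps only the elements for which the predicate is true
above_freezing = list(filter(lambda c: c > 0, celsius))

# reduce folds the remaining values into a single number
total = reduce(lambda acc, c: acc + c, above_freezing, 0.0)

print(celsius, above_freezing, total)
```

In everyday Python, a list comprehension or sum() often reads better than map and reduce, but the idea of small functions without side effects carries over.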
Examples of non-coding Python questions asked during interviews include: name the built-in data structures, what is the difference between a list and a tuple, and is a list still mutable if it is an element within a tuple?
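The last question is easy to check for yourself. A quick sketch, using a made-up tuple named record:

```python
record = ("alise", [1, 2, 3])  # hypothetical tuple that holds a list

record[1].append(4)            # allowed: the inner list is still mutable

try:
    record[1] = [9, 9]         # not allowed: a tuple's slots cannot be reassigned
    rebound = True
except TypeError:
    rebound = False

print(record, rebound)
```

The tuple fixes which objects it holds, not whether those objects can change.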
@LeetCode and @hackerrank have some great questions! I would stick to the easy level for these and take additional online courses whenever necessary! There are so many resources available to nail down the fundamentals. As you practice, keep the efficiency and readability of your code in mind.
For #sql I would become familiar with querying relational databases. The variant you choose to learn isn’t a major decision; they are all similar. They mostly differ in syntax and CTEs (common table expressions). Learn the terminology, including schema,
foreign key, and primary key, plus the main clauses and joins and their order!!! SELECT, FROM, WHERE, GROUP BY, ORDER BY, and so on. There are also additional keywords, including AND, OR, and LIKE. Take a look at a diagram for joins (inner, left, right) and at how null values show up.
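Here is a small sketch that puts those pieces in one query, run through Python's built-in sqlite3 module so nothing needs installing. The customers and orders tables and their rows are invented for illustration, and other SQL variants may differ slightly in syntax.

```python
import sqlite3

# In-memory database with two hypothetical tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,                      -- primary key
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),  -- foreign key
    amount      REAL
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 35.0), (3, 2, 15.0);
""")

# Clauses in their written order: SELECT, FROM, JOIN, WHERE, GROUP BY, ORDER BY
query = """
SELECT c.name, COUNT(o.id) AS n_orders, SUM(o.amount) AS total
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
WHERE c.name LIKE 'A%' OR c.name = 'Cy'
GROUP BY c.name
ORDER BY n_orders DESC, c.name
"""
rows = conn.execute(query).fetchall()

# An inner join only keeps customers that have at least one order
inner_names = [r[0] for r in conn.execute(
    "SELECT DISTINCT c.name FROM customers AS c "
    "INNER JOIN orders AS o ON o.customer_id = c.id ORDER BY c.name"
)]

print(rows)
print(inner_names)
```

Cy has no orders, so the left join keeps the row with a count of 0 and a NULL total (None in Python), while the inner join drops Cy entirely.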
I would suggest taking a course or getting a book to help with the fundamentals. I used #HeadFirstSQL, but I often use resources like @wwwschools to remind myself of syntax. Once you feel comfortable, you should use @leetcode for problems!
Knowing #SQL is highly beneficial for pulling data, creating reports, and all things data! Many big-data tools around Hadoop, such as PySpark, Presto, and Hive, accept SQL or a SQL-like dialect, and Pig uses similar relational concepts. You aren’t expected to know those, but if you know SQL, your skills will transfer easily!
Onto #Machinelearning: I would focus on supervised versus unsupervised learning. For supervised learning, be familiar with classification versus regression, the algorithms used for each, the assumptions they make, and their advantages and disadvantages.
You should be familiar with the metrics used to evaluate these algorithms and the situations in which you would optimize one over another. For classification, this means precision versus recall: favor precision when false positives are costly and recall when false negatives are costly.
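A quick sketch of how precision and recall fall out of the counts, computed by hand so the definitions are visible. The labels and predictions below are made up.

```python
# Hypothetical binary labels (1 = positive) and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 1, 0, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of everything actually positive, how much was found

print(tp, fp, fn, precision, recall)
```

Here there are more false positives than false negatives, so precision comes out lower than recall.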
I would suggest Introduction to Machine Learning with Python for a landscape view, but Elements of Statistical Learning is the 🐐. This will work its way into data preprocessing and wrangling, which go hand in hand with #stats.
For statistics, the level of knowledge you need depends on the role you are going for, since some emphasize experimentation (A/B testing), experimental design, and causal inference. Most require a solid understanding of the fundamentals!
That starts with probability distributions, data types, transformations (linear and nonlinear), normalization, and standardization. You may want to throw in probability and Bayesian statistics too!
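Normalization and standardization are easy to mix up, so here is a small sketch using only the standard library. It assumes "normalization" means min-max scaling to the range 0 to 1, which is one common reading, and the values are made up.

```python
from statistics import mean, pstdev

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical feature

# Normalization (min-max scaling): squeeze values into the range [0, 1]
lo, hi = min(values), max(values)
normalized = [(v - lo) / (hi - lo) for v in values]

# Standardization (z-scores): shift to mean 0, scale to standard deviation 1
mu, sigma = mean(values), pstdev(values)
standardized = [(v - mu) / sigma for v in values]

print(normalized)
print(standardized)
```

Both are linear transformations, so they change the scale of the data but not the shape of its distribution.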
TL;DR: there is a lot to learn to transition to #datascience, but focus on the transferable skills and fundamentals, including Python/R, SQL, stats, and basic ML, to maximize ROI! You can learn everything else on the job!
Thread by pip install alise
