Are you interested in learning statistics or data analysis?
I think learning how to analyze data is tricky because it's actually 3 independent skills.
- Coding
- Applied Knowledge
- Probability Theory 🧵👇
When I first started learning data analysis, it was frustrating for me to realize that being good at one of these skills didn't mean I was good at all of the others. So, if you've ever felt that way, you're not alone. 2/8
Coding: Being good at coding allows you to implement your ideas. While it's possible to get by using software, it will limit you as a data analyst. 3/8
Applied Knowledge: For me, this means knowing what kinds of problems tend to occur in real data and knowing how algorithms tend to work (or not work) in real-world situations. Lacking this skill is like sketching with your eyes closed. Doable but not ideal. 4/8
Probability Theory: Random events are at the heart of statistics, so a strong grasp of mathematical probability helps a lot. It's possible to depend on good software and intuition, but much of the reliability of statistics comes from the correctness of the mathematics. 5/8
Honestly, I think you only need to be good at 2 out of 3 of these skills to have a solid statistics career. You can probably even get away with only having a strong grasp of one of them if you have the right collaborators. 6/8
However, in my opinion, mastering coding, applied statistical knowledge and probability theory will put you in the position to handle any data analysis problem. 7/8
One final remark. In academia, these skills are often taught in this order:
- Probability Theory
- Applied Knowledge
- Coding
But for self-study, I would recommend learning them in this order:
- Coding
- Applied Knowledge
- Probability Theory 8/8
Good luck and I hope this thread helps you structure your approach to learning statistics and data analysis.
I'll be tweeting more about practical problems in learning data science and statistics:
- common roadblocks
- how to get around them
- specific book recommendations, etc.
If you want to hear more about these topics, follow me.
I gave up on talking about race on Twitter because I was having the same argument over and over again. In this thread, let me explain THE ANATOMY OF A TWITTER RACE ARGUMENT.
Whenever someone says "X is white supremacy" on Twitter, where X is perfectionism or individualism or math worship, there is a constellation of reactions, many of them predictable.
If X is a genuine point of division, it will often be the case that most white Americans tend to do and like X while most black Americans tend to dislike and not do X. This cultural difference may or may not be problematic.
I've been in science for a while now and as far as I can tell, there are two types of people in this line of work. Those that think we should give everything to science and those that don't.
These two mindsets produce two types of work environments. I'll call them the results-first workplace and the people-first workplace.
In the people-first environment, they prioritize healthy work habits and relationships. Science is a critical piece of a whole and healthy life. In the results-first environment, all that matters is the outcome. People get the job done whatever the cost.
At the beginning of a science, the first step is always to declare the thingness of something that we want to study. This is a star. That is a cow. This is a society. That is a race. This is a mind. That first step is actually a huge step which we rarely ever talk about.
It's just kind of assumed that obviously we can just identify things as clearly being real using our senses and our intuitions and as long as our scientific conclusions seem predictive to us (using the same senses and intuition) then we assume we must be on the right track.
Social phenomena present a real challenge here because we can't perceive social reality directly with our senses and different people have different intuitions which seem to lead to different frameworks which all seem to have some predictive validity.
I've decided to pull back from talking about race on social media. There are many reasons for this but the most important one for me is it has come to feel like a pointless energy drain that doesn't seem to make a difference.
During the summer, I was inspired to use my "platform" to be a "voice" but I don't think it has been very productive. Although many commenters have accused me of talking about race out of self-interest, I actually see it as a moral duty to help. A duty and often a burden.
I'm sure it has professionally hurt me. For instance, many people have made assumptions about my competence and intellectual background that simply aren't true.
Any philosophers of science willing to vouch for the accuracy of this chart?
I don't think my view is represented here. Basically, I think scientific models start out lower down and can be moved upwards through different degrees of reality as work is done on them.
So my perspective is sort of a No-Free-Lunch or Very-Little-Free-Lunch perspective on scientific realism. Before I accept your theory, I want to characterize how much work you did and what kind. I don't want to give you "scientific reality" for free.
This is why I'm kind of a skeptic on "2+2=4" because in almost all cases one has done no real work to verify that a statement like this describes all of physical reality in practice. "All of reality" is very big you see.