Learning data & software engineering skills will help you improve as a data scientist a lot faster than only knowing how to run code in a Jupyter Notebook.
To clarify, being able to run and test in a Jupyter Notebook is great. I love notebooks.
But if you understand software engineering and how to build actual pipelines, you'll be far more valuable to an employer.
Learning how the pieces of a pipeline fit together has helped me understand much more about how things actually work in a business, and how I can improve my code, models, and ideas to fit business needs.
Before I got into sports analytics, I didn’t know how people were creating such cool graphics and visualizations.
For everyone who wonders how it all works, here’s the inside scoop on the tools used to create them, with resources for each:
1. Python 🐍
Definitely the most popular programming language. Packages such as @matplotlib and seaborn make it easy to create great visualizations.
mplsoccer (@numberstorm), @FC_Python, and my YouTube channel are great Python resources for getting started.
2. R 📐
Another programming language, one that excels at statistical analysis and data visualization. A bit of a steeper learning curve in my opinion, but ggplot2 is great for creating visualizations.
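
To make the Python route from item 1 a bit more concrete, here's a minimal matplotlib sketch of a ranked bar chart. The function name `goals_bar_chart` and the player names and goal counts are made up for illustration (not real stats), and this is just one simple way to do it, not how any particular graphic was built.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt


def goals_bar_chart(players, goals, title="Goals by player (made-up data)"):
    """Draw a horizontal bar chart with the highest value on top."""
    # Sort ascending, because barh draws the first item at the bottom
    pairs = sorted(zip(goals, players))
    sorted_goals = [g for g, _ in pairs]
    sorted_players = [p for _, p in pairs]

    fig, ax = plt.subplots(figsize=(6, 3))
    ax.barh(sorted_players, sorted_goals, color="tab:blue")
    ax.set_xlabel("Goals")
    ax.set_title(title)
    # Hide the top and right borders for a cleaner look
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    return fig, ax


# Placeholder names and numbers, purely hypothetical
fig, ax = goals_bar_chart(["Player A", "Player B", "Player C"], [12, 19, 7])
```

From there, `fig.savefig("goals_by_player.png", dpi=150)` writes the chart to an image you can share. seaborn and mplsoccer build on this same matplotlib figure/axes setup, so the basics carry over.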