Discover and read the best of Twitter Threads about #DataScientist

Most recent (24)

Anyone who knows the struggles of a #DataScientist knows where #Deutschland (Germany) is falling dramatically short. Everyone talks about the age of #KünstlicherIntelligenz (artificial intelligence), but a basic understanding of its prerequisites and a trained way of thinking are missing. 1/x @LageNation
@LageNation In Germany we have grown used to being left behind on #Digitalisierung (digitalization). But it has long been about more than the right supply of PCs and a fast internet connection. 2/x
@LageNation Nowadays more and more #Entscheidungsprozesse (decision-making processes) all around us in everyday life are based on algorithms, which we collectively refer to as #KI (AI). 3/x
Read 7 tweets
Learn Data Science in 180 days🤑📈 and start your data science career.

Bookmark this thread

A thread🧵👇
First Month 🗓️
Day 1 to 15 - Learn Python for Data Science
Day 16 to 30 - Learn Statistics for Data Science
Second Month 🗓️
Day 31 to 45 - Explore Python Packages (NumPy, Pandas, Matplotlib, Seaborn, Scikit-Learn)
Day 46 to 60 - Implement EDA on real-world datasets.
Read 8 tweets
1/ "Software is eating the world. Machine learning is eating software. Transformers are eating machine learning."

Let's understand what these Transformers are all about

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataAnalytics
2/ The #Transformers architecture follows an encoder-decoder structure.

The encoder receives an input sequence and creates an intermediate representation by applying embedding and attention mechanisms.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI
3/ Then, this intermediate representation or hidden state will pass through the decoder, and the decoder starts generating an output sequence.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics
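Editor's sketch: the attention mechanism mentioned in tweet 2/ can be illustrated with scaled dot-product attention in plain Python. The function names and the toy vectors are assumptions made for illustration; a real Transformer adds learned projections, multiple heads, positional information and a decoder on top of this.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy "embedded" input sequence of three tokens (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = attention(tokens, tokens, tokens)  # self-attention: one output vector per token
print(hidden)
```

Each output vector is a weighted mix of all input vectors, which is roughly what "intermediate representation" means in the thread.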
Read 14 tweets
But what does a p-value mean in #MachineLearning? - A thread

It tells you how likely it is that your data could have occurred under the null hypothesis

1/n

#DataScience #DeepLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat
2/n
What Is a Null Hypothesis?

A null hypothesis is a type of statistical hypothesis that proposes that no effect or relationship exists in a set of given observations.

#DataScience #MachineLearning #100DaysOfMLCode #Python #stat #Statistics #Data #AI #Math #deeplearning
3/n
A P-value is the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis

#DataScience #MachineLearning #100DaysOfMLCode #Python #DataScientist #Statistics #Data #DataAnalytics #AI #Math
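Editor's sketch: to make the definition in 3/n concrete, here is an exact two-sided p-value for a coin-flip experiment. The scenario (60 heads in 100 flips, H0: the coin is fair) and the function name are assumptions for illustration, and "at least as extreme" is taken to mean "no more likely under H0 than the observed count", which is one common convention among several.

```python
from math import comb

def binomial_p_value(heads, flips, p_null=0.5):
    """Two-sided exact p-value: total probability, under H0, of every
    outcome that is no more likely than the one observed."""
    def prob(k):
        return comb(flips, k) * p_null**k * (1 - p_null)**(flips - k)

    observed = prob(heads)
    # The tiny tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(flips + 1) if prob(k) <= observed * (1 + 1e-9))

p = binomial_p_value(heads=60, flips=100)
print(f"p-value = {p:.4f}")
```

A small p-value says the observed count would be unusual if the coin were fair; it does not give the probability that the null hypothesis is true.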
Read 11 tweets
1/ One way to test whether a time series is stationary is to perform an augmented Dickey-Fuller test - A Thread

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics #programming #ArtificialIntelligence
2/ H0: The time series is non-stationary (it has a unit root). In other words, it has some time-dependent structure, and its statistical properties, such as the variance, are not constant over time.

HA: The time series is stationary.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist
3/ If the p-value from the test is less than some significance level (e.g. α = .05), then we can reject the null hypothesis and conclude that the time series is stationary.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist
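Editor's sketch: the decision rule in 3/ can be illustrated with a simplified Dickey-Fuller regression in plain Python. This is not the full augmented test: it omits the lagged-difference terms and compares the test statistic with an approximate 5% critical value instead of computing a p-value. The function name and the simulated series are assumptions; in practice a library routine such as `adfuller` from `statsmodels.tsa.stattools` is the usual choice.

```python
import math
import random

def dickey_fuller_stat(series):
    """t-statistic of gamma in: diff(y)_t = alpha + gamma * y_(t-1) + error."""
    x = series[:-1]                                   # lagged level y_(t-1)
    dy = [b - a for a, b in zip(series, series[1:])]  # first difference
    n = len(x)
    mean_x, mean_dy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    residual_ss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return gamma / std_error

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(500)]   # stationary by construction
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + rng.gauss(0, 1))     # random walk: has a unit root

CRITICAL_5_PERCENT = -2.86  # approximate large-sample value, constant-only case
for name, series in [("noise", noise), ("walk", walk)]:
    stat = dickey_fuller_stat(series)
    verdict = "reject H0" if stat < CRITICAL_5_PERCENT else "fail to reject H0"
    print(name, round(stat, 2), verdict)
```

A strongly negative statistic is evidence against the unit root, which matches "reject the null hypothesis and conclude that the series is stationary".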
Read 8 tweets
2/ It is important to standardize variables before running cluster analysis. This is because cluster analysis techniques depend on measuring the distance between the observations we're trying to cluster.

#DataScience #MachineLearning #DeepLearning
3/ If a variable is measured on a larger scale than the other variables, then whatever distance measure we use will be overly influenced by that variable.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python #pythoncode #AI #DataScientist #DataAnalytics #Statistics
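Editor's sketch: the effect described in 2/ and 3/ can be seen with z-score standardization and Euclidean distance in plain Python. The customer data and helper names are made up for illustration.

```python
import math
import statistics

def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.mean(column)
    spread = statistics.pstdev(column)
    return [(value - mean) / spread for value in column]

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical customers: (annual income in dollars, age in years).
income = [30_000, 32_000, 90_000]
age = [25, 60, 27]

raw = list(zip(income, age))
scaled = list(zip(standardize(income), standardize(age)))

# On the raw data income dominates, so customers 0 and 1 look close
# despite a 35-year age gap; after scaling, both features count.
print(euclidean(raw[0], raw[1]), euclidean(raw[0], raw[2]))
print(euclidean(scaled[0], scaled[1]), euclidean(scaled[0], scaled[2]))
```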
Read 16 tweets
Do you know how TensorFlow can run on a single mobile device as well as on an entire data center? Read this thread

1/n

#TensorFlow #DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data
2/n
Google has designed TensorFlow such that it is capable of dividing a large model graph whenever needed.

#TensorFlow #DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat #AI
3/n
It assigns special SEND and RECV nodes whenever a graph is divided between multiple devices (CPUs or GPUs).

#TensorFlow #DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat #AI
Read 9 tweets
2/16

"roc_auc_score" is defined as the area under the ROC curve, which is the curve having False Positive Rate on the x-axis and True Positive Rate on the y-axis at all classification thresholds.

#DataScience #MachineLearning #DeepLearning #100DaysOfMLCode #Python
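Editor's sketch: the definition above can be reproduced by hand. The function below sweeps every threshold, collects (False Positive Rate, True Positive Rate) points and sums the area with the trapezoidal rule. It is an illustration of the definition, not scikit-learn's implementation, and it assumes binary labels coded 0/1 with at least one example of each class.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the trapezoidal rule over all thresholds."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sweep the threshold from high to low, one distinct score at a time.
    thresholds = sorted(set(scores), reverse=True)
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    # Sum the trapezoids between consecutive ROC points.
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a classifier that ranks every positive above every negative.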
Read 16 tweets
2/n

Alibi Detect is a Python library for detecting outliers, adversarial data, and drift. It handles tabular data, text, images, and time series, and its detectors can be used both online and offline. Both TensorFlow and PyTorch backends are supported.

#DataScience #DeepLearning
3/n

It supports a variety of outlier detection techniques, including Mahalanobis distance, Isolation Forest, and Seq2seq.

#DataScience #DeepLearning #MachineLearning #ComputerVision #100DaysOfMLCode #Python #DataScientist #Statistics #programming #Data #Math #Stat #pythoncode
Read 10 tweets
1/ Can you classify something without seeing it before - that's what Zero-Shot Learning is all about - A Thread

👉 One of the popular methods for zero-shot learning is Natural Language Inference (NLI).

#DataScience #DeepLearning #MachineLearning #100DaysOfMLCode #Pytho
3/ In zero-shot classification, we ask the model to classify a sentence into one of several classes (labels) that the model hasn't seen during training.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data #pythoncode #AI
Read 13 tweets
1/ Why do we need the bias term in ML algorithms such as linear regression and neural networks? - A thread

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data #pythoncode #AI #Stats #DeepLearning #100DaysOfCode
2/ In linear regression, without the bias term your solution has to go through the origin. That is, when all of your features are zero, your predicted value would also have to be zero.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming
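Editor's sketch: the point in 2/ can be checked with one-feature least squares in plain Python. The data (generated from y = 2x + 5) and the function names are made up for illustration.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for y = w * x (no bias term)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_bias(xs, ys):
    """Least-squares slope and intercept for y = w * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    return w, mean_y - w * mean_x

# Made-up data generated from y = 2x + 5.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [5.0, 7.0, 9.0, 11.0]

w_only = fit_through_origin(xs, ys)
w, b = fit_with_bias(xs, ys)
print("no bias:   slope", w_only, "prediction at x=0:", w_only * 0.0)  # forced to 0
print("with bias: slope", w, "prediction at x=0:", w * 0.0 + b)
```

Without the bias term the fitted line must pass through the origin, so the slope is distorted to compensate; with it, the fit recovers both the slope and the offset.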
Read 7 tweets
Google has released Imagen: a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding

#Python3 #MachineLearning #DataScience #100DaysOfCode #DataScience #DataAnalytics #100DaysOfMLCode #DataScientist #Statistics
Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation.

#Python3 #MachineLearning #DataScience #100DaysOfCode #DataScience #DataAnalytics #100DaysOfMLCode
This generator is scarily accurate with super-resolution! "A photo of a raccoon wearing an astronaut helmet, looking out of the window at night."

#Python3 #MachineLearning #DataScience #100DaysOfCode #DataScience #DataAnalytics #100DaysOfMLCode #DataScientist #Statistics
Read 4 tweets
1/ #MachineLearning #Interview question -
Why does L1 regularization cause parameter sparsity whereas L2 regularization does not?

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data #pythoncode #AI #Stats
2/ L1 & L2 regularization add constraints to the optimization problem. The curve H0 is the hypothesis. The solution to this system is the set of points where H0 meets the constraints.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming
3/ Regularization, in statistics and in machine learning, is used to include some extra information in order to solve a problem in a better way.

#DataScience #MachineLearning #100DaysOfMLCode #DataScientist #Statistics #programming #ArtificialIntelligence #Data
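Editor's sketch: one way to see the sparsity difference is the one-parameter case, where both penalized problems have closed-form solutions. This assumes each weight can be treated separately (for example, with an orthonormal design matrix); the starting weights and the penalty strength are made up.

```python
def l1_solution(a, lam):
    """Minimizer of 0.5 * (w - a)**2 + lam * abs(w): soft-thresholding."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # small weights are set exactly to zero

def l2_solution(a, lam):
    """Minimizer of 0.5 * (w - a)**2 + 0.5 * lam * w**2: proportional shrinkage."""
    return a / (1.0 + lam)

unregularized = [3.0, 0.4, -0.2, -1.5]  # made-up unpenalized weights
lam = 0.5
l1_weights = [l1_solution(a, lam) for a in unregularized]
l2_weights = [l2_solution(a, lam) for a in unregularized]
print("L1:", l1_weights)  # [2.5, 0.0, 0.0, -1.0]
print("L2:", l2_weights)
```

L1 subtracts a fixed amount and clips at zero, so small weights vanish; L2 rescales every weight by the same factor, so nonzero weights shrink but never reach zero.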
Read 7 tweets
Cases of Monkeypox by Location (Casos de Viruela del Mono por Lugar) #MonkeyPox #RStats #IDtwitter #ViruelaDelMono #VirueladelSimio #VarioleSinge #VarioleDuSinge #DataScientist #elcarteldeSINADEF #Analytics $BAVA $BAVA.CO $SIGA #AI #100DaysofCode #AWS #TensorFlow #Python🧵(1/2)
Cumulative Cases of Monkeypox per Day (Acumulado de Casos de Viruela del Mono por Día) & Statistical Trend in the Count of Cases (Tendencia Estadística en Casos) #MonkeyPox #ViruelaDelMono #VirueladelSimio #VarioleSinge #VarioleDuSinge #IDtwitter $BAVA $BAVA.CO #RStats 🧵(2/2)
Read 3 tweets
👉 Let’s have a look at the #PythonLibraries that every #DataScientist should know in 2022, to maintain and improve their #Coding journey: kdnuggets.com/2022/04/python…
1. Pandas was created by Wes McKinney in 2008 as a Python library for data manipulation and analysis. McKinney built Pandas out of the need for a powerful and flexible analysis tool.
2. NumPy is another Python library, used for mathematical functions. It is popular for processing multidimensional array objects and various derived objects (such as masked arrays and matrices), and is mostly used in machine learning computations.
Read 8 tweets
Are you looking for an affordable option to pursue an MBA or MS in Data Science from an international university? If yes, Intellipaat has got you covered!
Intellipaat has collaborated with IU Germany to offer an MBA and an MS in Data Science at affordable prices. These courses will cost you just 10% of the expenses you would have incurred had you opted for the traditional route to an MBA or MS from a foreign university.
After your enrollment in the course, our team will offer you various services such as visa assistance, German language classes, job assistance with global partners, etc. You can also opt for an 18-month post-study work visa to kick-start your professional journey in Germany.
Read 4 tweets
Tactical behavior in #Football has a spatial and a temporal component, and results from interaction with the opponent. It’s key to account for all these aspects in data-driven tactical analysis, as well as to respect the complexity of the temporal and spatial dimensions 🧵
Two years ago I published a systematic review in @EurJSportSci on using big data in #soccer for tactical performance analysis that illustrates the associated challenges and provides a data-driven scientific framework. #DataScience tinyurl.com/mrxky6ca
The most common analysis issue is the fact that spatial and/or temporal complexity is not respected. For example by aggregating data over multiple minutes, or constructing spatial features aggregating 11 player positions into a single variable.
Read 9 tweets
Preparing for a technical interview for a #DataScience position? These are some of the questions that typically allow me as an interviewer to quickly distinguish between junior and mid-level candidates, including some quick tips 🧵. #Python #pythonprogramming #DataScientist #Jobs
All questions about SQL. Not the hardest thing to learn, but many #DataScientists only start to learn the value of SQL when they actually become part of a dev team. I’m not only talking about SELECT * FROM table, but also about joins, truncates, partitions and constraints.
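Editor's sketch: a small illustration of two of the topics named above, joins and constraints, using Python's built-in `sqlite3` module. The tables and rows are hypothetical, and truncates and partitions are not covered because they depend on the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE teams (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE players (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        team_id INTEGER REFERENCES teams(id)
    );
    INSERT INTO teams (id, name) VALUES (1, 'Reds'), (2, 'Blues');
    INSERT INTO players (name, team_id) VALUES ('Ann', 1), ('Bob', 1), ('Cy', NULL);
""")

# LEFT JOIN keeps teams without players; COUNT(p.id) ignores the NULLs it produces.
rows = conn.execute("""
    SELECT t.name, COUNT(p.id) AS n_players
    FROM teams AS t
    LEFT JOIN players AS p ON p.team_id = t.id
    GROUP BY t.name
    ORDER BY t.name
""").fetchall()
print(rows)  # [('Blues', 0), ('Reds', 2)]

# The foreign-key constraint rejects a player on a team that does not exist.
try:
    conn.execute("INSERT INTO players (name, team_id) VALUES ('Dee', 99)")
except sqlite3.IntegrityError as error:
    print("rejected:", error)
```

An INNER JOIN in the same query would silently drop the team without players, which is the kind of difference an interviewer is likely to probe.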
Interacting with an API. Make sure you know your requests (GET, POST, PUT, DELETE, PATCH), as well as the #Python requests library.
Read 10 tweets
#DataScientist in a software dev team and #pythonprogramming code for production pipelines? You should think carefully about scalability and integration. One of the things to consider is datatypes, here are some helpful tips 🧵
#Python is a dynamically typed language, but that doesn't mean you shouldn't care about types. Know your dtypes, from "str" to "bool" to "int8" to "float64", and understand their memory footprint and restrictions. Especially when working with larger objects, choose wisely.
Lose the strings. Nine times out of ten, strings can be replaced by categoricals (Pandas) or, even better, by Enums (docs.python.org/3/library/enum…). This can reduce the memory footprint of large dataframes by more than 30% and improves performance.
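Editor's sketch: the idea of replacing repeated strings with small integer codes can be shown with the standard library alone. The `Position` categories and the sample labels are hypothetical; in Pandas the comparable step would be converting the column with `astype("category")`.

```python
from array import array
from enum import IntEnum

class Position(IntEnum):
    """Hypothetical category set that would otherwise be stored as strings."""
    GOALKEEPER = 0
    DEFENDER = 1
    MIDFIELDER = 2
    FORWARD = 3

labels = ["defender", "forward", "defender", "goalkeeper"] * 1000

# Encode each string as a small integer code, stored as int8 (typecode "b").
codes = array("b", (Position[label.upper()].value for label in labels))

print(codes.itemsize, "byte per value as int8")
print(array("d").itemsize, "bytes per value as float64")
print(Position(codes[1]).name)  # decode back to a readable label: FORWARD
```

The Enum also restricts values to a known set, so a misspelled label fails loudly at encoding time instead of creating a new category.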
Read 8 tweets
Yesterday I shared a small thread about getting into #DataScience. Today I’ll build on that and share a bit about my own journey into sports analytics, specifically as a #DataScientist in the #football industry. 🧵
My path began with an MSc in Sport & Movement Science @VU_FBW. It’s not computer science or anything, but it does involve quite some #Math, #Statistics and #Physics, as well as a course in programming. Mainly it taught me science, and gave me a lot of domain knowledge in sports.
I wasn’t planning to become a #DataScientist, but I wanted to work in sports. I did various stints as an embedded sports scientist, mostly internships/part-time, before joining @ZZLEIDENBASKETB. Those jobs involved data & science, but it wasn’t anything close to #DataScience.
Read 13 tweets
Many young people ask me how they can become a #DataScientist, specifically in #football. Lately I have also seen a lot of posts on how to get into #DataScience in (1)50 days or so, which is a joke imo. Here is my realistic take on it. Warning: it will be closer to 1500 days. 🧵
#DataScience is an umbrella of roles & fields that require different competencies. But they all have two things in common: you have to know #Science and you have to be able to work with #data. The first requires learning to do research, the second learning to do #programming.
Go to uni and get a master's degree that at least requires some #math skills. I’m not saying you need a #PhD and 5 publications before calling yourself a #DataScientist, nor that you can’t be one without an MSc, but it helps a lot in acquiring the right competencies.
Read 10 tweets
