Ever wondered how a Data Scientist thinks about a problem? Here are the major steps involved in tackling a data science problem.

Thread 🧵👇

#DataScience #MachineLearning #100DaysOfCode
1. Business Understanding: What is the problem that we are trying to solve?

We need clarity on the exact problem before anything else. Asking the right questions as a Data Scientist starts with understanding the goal of the business.
2. Analytical Approach: How can we use data to answer the question? We need to decide which analytical approach to follow. There are 4 types:
- Descriptive
- Statistical
- Predictive
- Prescriptive
The chosen approach indicates the data content, formats, and sources that need to be gathered.
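To make the difference between two of these approaches concrete, here is a minimal sketch in plain Python. The numbers and the names `ad_spend`, `sales`, and `predict_sales` are made up for illustration; a real project would likely use a dedicated library rather than a hand-written least-squares fit.

```python
from statistics import mean

# Made-up monthly figures, for illustration only.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

# Descriptive: summarize what has already happened.
average_sales = mean(sales)

# Predictive: fit a straight line with ordinary least squares,
# then use it to estimate a value we have not observed yet.
mean_x, mean_y = mean(ad_spend), mean(sales)
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales)
) / sum((x - mean_x) ** 2 for x in ad_spend)
intercept = mean_y - slope * mean_x


def predict_sales(spend):
    """Estimate sales for a given ad spend using the fitted line."""
    return intercept + slope * spend


print("Average sales so far:", average_sales)
print("Estimated sales at spend 60:", predict_sales(60))
```

The descriptive part only looks backward, while the predictive part needs paired inputs and outcomes, which is why the approach shapes what data has to be gathered.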
3. Data requirements: What data do we need to answer the question?
- Identify the necessary data content, formats, and sources for initial data collection. Along the way, answer questions like ‘what’, ‘where’, ‘when’, ‘why’, ‘how’ & ‘who’.
4. Data Collection: Where is the data coming from (identify sources) and how will we get it?

- In this stage, the data requirements are revised and decisions are made as to whether or not the collection requires more data.
5. Data Understanding: Is the data that we collected representative of the problem to be solved?

Data understanding comprises all the activities related to constructing the data set.
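A quick first pass at data understanding is often just a few summary checks. The sketch below uses only the Python standard library; the sample values and the names `ages` and `labels` are hypothetical.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical collected sample; None marks a missing value.
ages = [23, 35, None, 41, 29, None, 52, 38]
labels = ["churn", "stay", "stay", "stay", "churn", "stay", "stay", "stay"]

# How much of the column is missing?
known_ages = [a for a in ages if a is not None]
missing_share = 1 - len(known_ages) / len(ages)

# Basic descriptive statistics for the values we do have.
summary = {
    "min": min(known_ages),
    "max": max(known_ages),
    "mean": mean(known_ages),
    "median": median(known_ages),
}

# Is the target balanced, or does one class dominate?
label_counts = Counter(labels)

print("Share of missing ages:", missing_share)
print("Age summary:", summary)
print("Label counts:", dict(label_counts))
```

If the ranges, missing share, or class balance look nothing like the real-world population, that is a hint to go back to data collection.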
6. Data Preparation: What additional work is required to manipulate & work with the data? This phase shapes the data into a state that is easier to work with. It involves data cleansing:
- Handling missing data
- Fixing invalid values
- Removing duplicates
- Standardizing formatting
followed by feature engineering.
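The cleansing steps above can be sketched in plain Python as follows. The records, the field names, the 0-120 age range, and the `is_adult` feature are all assumptions chosen for illustration; in practice a library such as pandas is the more common tool.

```python
# Hypothetical raw records; field names and values are made up.
raw_rows = [
    {"id": 1, "age": "34", "city": " new york "},
    {"id": 2, "age": "", "city": "Boston"},        # missing age
    {"id": 3, "age": "-5", "city": "boston"},      # invalid age
    {"id": 1, "age": "34", "city": " new york "},  # duplicate of id 1
]


def clean(rows):
    """Return rows that are deduplicated, validated, and consistently formatted."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        # Remove duplicates: keep only the first row seen for each id.
        if row["id"] in seen_ids:
            continue
        seen_ids.add(row["id"])

        # Missing data: an empty string becomes None rather than a guess.
        age_text = row["age"].strip()
        age = int(age_text) if age_text else None

        # Invalid values: ages outside an assumed 0-120 range count as missing.
        if age is not None and not 0 <= age <= 120:
            age = None

        # Formatting: trim whitespace and normalize capitalization.
        city = row["city"].strip().title()

        cleaned.append({"id": row["id"], "age": age, "city": city})
    return cleaned


cleaned_rows = clean(raw_rows)

# Feature engineering: derive a new field from an existing one.
for row in cleaned_rows:
    row["is_adult"] = None if row["age"] is None else row["age"] >= 18

for row in cleaned_rows:
    print(row)
```

Treating invalid values as missing is just one option; depending on the problem, dropping the row or imputing a value may be more appropriate.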
7. Modeling - Answers two key questions:
- What is the purpose of data modeling?
- What are the characteristics of this process?

It focuses on developing models that are either descriptive or predictive. The choice of model is based on the approach chosen in step 2.
8. Evaluation: Does the model answer the initial question, or does it need to be adjusted?

It has two parts:
- Diagnostic measures: check that the model works as intended and where modifications are required
- Statistical significance: check that the data is handled and interpreted properly
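As one example of a diagnostic measure, here is a minimal sketch that compares a classifier's predictions with held-out labels and with a naive baseline. The labels and predictions are invented, and accuracy, precision, and recall are only a few of many possible measures; significance testing is not shown.

```python
# Hypothetical held-out labels and model predictions (1 = positive class).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(a == 1 and p == 1 for a, p in pairs)  # true positives
tn = sum(a == 0 and p == 0 for a, p in pairs)  # true negatives
fp = sum(a == 0 and p == 1 for a, p in pairs)  # false positives
fn = sum(a == 1 and p == 0 for a, p in pairs)  # false negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Baseline: always predict the most common class in the held-out labels.
majority_count = max(actual.count(0), actual.count(1))
baseline_accuracy = majority_count / len(actual)

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("Baseline accuracy:", baseline_accuracy)
```

A model that cannot beat the baseline is a sign that it needs to be adjusted, or that an earlier step needs revisiting.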
9. Deployment: Can we put the model into practice?

Once the model has been evaluated satisfactorily, it is made ready for deployment in the business. The deployment phase shows how well the model holds up in a real-world environment and how it performs compared to alternatives.
10. Feedback:
Feedback helps in refining the model and assessing its performance and impact. The steps involved are: define the review process, track the record, measure effectiveness, and review and refine.
That's it for the thread 👋

A retweet for the first one would really mean a lot 🙏

Follow @PiyalBanik for more threads on #DataScience #MachineLearning #ArtificialIntelligence #Python .

Feel free to DM.
