SINGLE BAR CHART
This is basically used to visualize a single categorical variable (univariate analysis)
On the horizontal axis, we have the variable… on the vertical axis, we have the frequency.
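A quick sketch of what sits behind a single bar chart: the height of each bar is just the frequency of its category. The `regions` list below is made up for illustration, and the `#` rows only stand in for real bars.

```python
from collections import Counter

# Hypothetical categorical variable: the region of each order.
regions = ["West", "East", "West", "South", "East", "West", "Central"]

# Each category's frequency is the height of its bar.
freq = Counter(regions)

for category, count in freq.most_common():
    # A row of '#' characters stands in for the bar.
    print(f"{category:<8} {'#' * count} ({count})")
```

Any plotting tool can draw the real chart from the same category/count pairs.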
MULTIPLE / GROUPED BAR CHART.
This is a bar chart that we can use to visualize two categorical variables (bivariate analysis)
We plot the bars next to each other.
COMPONENT/STACKED BAR CHART
Just like the multiple bar chart… also used to compare two categorical variables.
The bars are stacked on each other.
The multiple bar chart (MBC) and the component bar chart (CBC) show the same counts, just arranged differently… choosing between them mostly boils down to preference
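Here is a minimal sketch of the numbers behind both charts, assuming a made-up list of `(region, category)` orders. The grouped and the stacked chart are drawn from the same table of counts.

```python
from collections import Counter

# Hypothetical records: (region, category) for each order.
orders = [
    ("West", "Furniture"), ("West", "Technology"), ("West", "Furniture"),
    ("East", "Technology"), ("East", "Technology"), ("East", "Furniture"),
]

# Count every (region, category) pair: one count per bar segment.
counts = Counter(orders)
regions = sorted({region for region, _ in orders})
categories = sorted({category for _, category in orders})

totals = {}
for region in regions:
    per_category = [counts[(region, c)] for c in categories]
    # Grouped chart: one bar per number, drawn side by side.
    # Stacked chart: the same numbers piled into one bar of height sum(...).
    totals[region] = sum(per_category)
    print(region, dict(zip(categories, per_category)), "total:", totals[region])
```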
PIE CHART / DONUT CHART
This is just like the bar chart: it is used to visualize a single categorical variable.
Bar charts make use of bars, pie charts use sectors.
HISTOGRAM
This is one powerful tool that can be used to visualize the distribution of a single variable.
By knowing the distribution of your data, you can tell a lot about it.
Compare the histogram with the bar chart and quote or reply with their similarities and differences 👀
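A small sketch of how a histogram is built, assuming made-up order values and a bin width of 10 that I picked for the example. Unlike a bar chart, the bars here are ranges of a numerical variable.

```python
# Hypothetical numerical variable: order values.
values = [12, 15, 21, 22, 25, 31, 33, 34, 38, 47]

bin_width = 10  # assumed bin width
low = 10        # left edge of the first bin (assumed; all values must be >= low)
n_bins = 4      # covers [10, 50), enough for the values above

# counts[i] is the number of values in [low + i*width, low + (i+1)*width).
counts = [0] * n_bins
for v in values:
    index = (v - low) // bin_width
    counts[index] += 1

for i, count in enumerate(counts):
    left = low + i * bin_width
    print(f"[{left}, {left + bin_width}) {'#' * count}")
```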
SINGLE LINE CHART/ GRAPH
This is basically used to visualize the change of a variable over TIME.
MULTIPLE LINE CHART
If you want to compare 2 variables and how they change with time, then we use a multiple line chart.
COMPOUND LINE CHART.
This is used to visualize layers of data and also the proportion that each layer contributes to the total.
You can combine 2 charts to form a single one. This is known as a COMBINATION CHART (combo chart for short)
For example, you can combine a single line chart and a bar chart
This way you see how the variable changes over a period of time and also how it is distributed at the same time
BOX PLOT.
This is one special and powerful tool.
Special in the sense that it is structured a bit differently from the others, but powerful cos it can visualize the spread of the data.
It can also help you detect outliers (it only flags them… whether to remove them is a separate decision)
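A sketch of the numbers a box plot draws, on made-up data. I am assuming the common 1.5 × IQR rule for the whiskers; other rules exist, and the quartile method used here (Python's default in `statistics.quantiles`) is only one of several conventions.

```python
from statistics import quantiles

# Hypothetical data with one suspiciously large value.
data = [10, 12, 13, 14, 15, 16, 18, 19, 21, 60]

# Quartiles (default 'exclusive' method of statistics.quantiles).
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Assumed rule: points beyond 1.5 * IQR from the box are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("min:", min(data), "Q1:", q1, "median:", q2, "Q3:", q3, "max:", max(data))
print("flagged as outliers:", outliers)
```

Note that the code only flags the point; nothing is removed from `data`.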
SCATTER PLOT.
This is basically used for bivariate analysis when we want to assess the strength of association between 2 numerical variables.
There are a couple of other data visualization tools out there, but these are the popular ones that we see in everyday dashboards.
Thanks for making it to the end of this thread.
Please retweet, like and follow me for more threads like this 🥰
I also share resources for data analysis and data science and have a YouTube channel where I teach the statistics needed for DA and DS… you can check out the link below
I can also perform both statistical and data analysis for your project and research.
My DMs are open for work 🎉.
See you in the next one and have a nice day ahead 😇
During the analysis of the SUPERSTORE DATASET, I realized that the profit from CALIFORNIA is slightly greater than that of NEW YORK.
On the surface… I should conclude California is a better state than New York…. But I did not 😏
Why 🧐…. STATISTICAL SIGNIFICANCE
Let me explain STATISTICAL SIGNIFICANCE to you like a 5 year old.
Retweet because….. IT’S A THREAD 🧵
You see, the whole of INFERENTIAL STATISTICS is all about decision making.
You extract a sample (or a couple of samples) from one or more populations, compare them to see if there is a difference or relationship between them, and then draw a final conclusion.
That conclusion is stated in terms of two statements: the NULL and ALTERNATIVE HYPOTHESES.
Let me start with 2 samples, and in this case I will be using the profit from the states of California and New York in the Superstore dataset.
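To make the idea concrete, here is a minimal sketch of a permutation test on two samples. This is my own choice of test for illustration, not necessarily the one used in the analysis, and the profit numbers are made up (they are NOT from the Superstore dataset).

```python
import random
from statistics import mean

# Hypothetical profit samples (NOT taken from the Superstore dataset).
california = [120.0, 95.5, 130.2, 88.0, 140.7, 101.3, 115.9, 99.4]
new_york = [118.4, 92.1, 127.5, 90.3, 133.8, 104.6, 110.2, 97.7]

observed = mean(california) - mean(new_york)

# Null hypothesis: the state label does not matter. If that is true,
# shuffling the labels should often give a difference at least this large.
rng = random.Random(0)  # fixed seed so the run is repeatable
pooled = california + new_york
n_ca = len(california)
n_perm = 5000
extreme = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = mean(pooled[:n_ca]) - mean(pooled[n_ca:])
    if abs(diff) >= abs(observed):
        extreme += 1

# Share of shuffles at least as extreme as what we observed.
p_value = extreme / n_perm
print(f"observed difference: {observed:.2f}, p-value: {p_value:.3f}")
```

A small p-value (commonly below 0.05) would suggest the difference is unlikely to be chance alone; a large one means a "slightly greater" profit is not enough to call one state better.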
This is my PHASE 2 of the superstore dataset analysis.
In this thread, I will be talking about the categories and sub-categories of goods and how they relate to PROFIT, SALES, and DISCOUNT, all based on REGION 🗺️
Kindly retweet, like and follow for more.
It’s a THREAD 🧵
In PHASE 1 of the analysis of the superstore dataset, I talked about profit, discount and sales by state.
The conclusion was that NEW YORK is the most profitable state to pay attention to 🤗
Below is the link to that thread… you might want to read that before getting to this ⬇️
I will also be linking the dataset right here…. This way anyone can use their own approach to draw insights from this dataset.
Please do tag me when you make a thread 🙏🏻
STATISTICAL ANALYSIS is the use of statistical concepts and techniques to summarize a data set and draw conclusions from it.
DATA ANALYSTS use STATISTICAL ANALYSIS to explore data and uncover patterns, while DATA SCIENTISTS use it to build models
We have 7 types of STATISTICAL ANALYSIS… this is a THREAD about them.
Kindly retweet, like and follow for more 🙏🏻
DESCRIPTIVE ANALYSIS
This is the combination of GRAPHS and NUMBERS to summarize the data…… emphasis on the word “summarize”.
The use of GRAPHS is known as “DATA VISUALIZATION”. The numbers are either measures of central tendency (mean, median and mode) or measures of dispersion (mean absolute deviation (MAD), variance, range, etc.)
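A short sketch of the "numbers" side of descriptive analysis, using Python's built-in `statistics` module on a made-up data set.

```python
from statistics import mean, median, mode, variance

data = [4, 8, 8, 5, 3, 8, 6]  # hypothetical observations

# Measures of central tendency
center = mean(data)
middle = median(data)
most_common = mode(data)

# Measures of dispersion
data_range = max(data) - min(data)
mad = mean([abs(x - center) for x in data])  # mean absolute deviation
var = variance(data)                         # sample variance (divides by n - 1)

print("mean:", center, "median:", middle, "mode:", most_common)
print("range:", data_range, "MAD:", round(mad, 3), "variance:", round(var, 3))
```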
INFERENTIAL ANALYSIS
If descriptive analysis helps us to summarize the data set, inferential analysis helps us to draw conclusions from the data as a whole using these summaries 💯
This type of analysis draws conclusions about a POPULATION by making inferences from a SAMPLE
Inferential statistics includes the likes of
-Hypothesis test (Z test, T test, ANOVA, etc)
-Confidence interval
-Sampling and sampling distributions
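As one example from the list, here is a rough sketch of a 95% confidence interval for a mean. The sample is made up, and I am assuming the normal (z) approximation; for a small sample like this one, a t-based interval would be a bit wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample drawn from some larger population.
sample = [52.1, 48.3, 50.7, 49.9, 51.4, 47.8, 50.2, 49.5, 51.0, 48.9]

n = len(sample)
m = mean(sample)
standard_error = stdev(sample) / sqrt(n)

# z value that leaves 2.5% in each tail (assumed normal approximation).
z = NormalDist().inv_cdf(0.975)

ci = (m - z * standard_error, m + z * standard_error)
print(f"sample mean: {m:.2f}")
print(f"95% CI for the population mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The inference step is that we compute everything from the SAMPLE, but the interval is a statement about the POPULATION mean.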
MULTIPLE LINEAR REGRESSION is one powerful concept in statistics.
It is one of the building blocks of SUPERVISED LEARNING.
But some conditions must be satisfied before we can use this technique
This is a thread of the assumptions for MULTIPLE LINEAR REGRESSION
Retweet cos…. It’s a THREAD 🧵
Before we start, just want to let you know that I have a YOUTUBE CHANNEL where I teach the needed statistics for DATA ANALYSIS AND DATA SCIENCE… you can check it out below ⬇️
1) LINEARITY
The dependent variable (y) must have a linear relationship with each of the independent variables (x1, x2, x3….. xn).
The linear relationship can either be positive (both changing in the same direction) or negative (both changing in opposite directions)
What are they??
And is there any form of relationship between these 3?
Well let’s find out in 3 minutes ☺️
Retweet …… cos it’s a THREAD 🧵
CORRELATION
Correlation is used to measure the strength and direction of association between 2 variables
If the association causes the variables to change in the SAME direction, we have a POSITIVE CORRELATION.
If the change is in OPPOSITE direction, we have NEGATIVE correlation
If we can’t see any form of change between the two variables… then we have a ZERO or NO CORRELATION.
A scatter plot is a graph that we can use to visualize correlation.
A correlation coefficient is a number (between -1 and +1) that quantifies correlation
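A minimal sketch of one such coefficient, Pearson's r, written out by hand. The function name `pearson_r` and the three small data sets are made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # How the two variables move together around their means.
    co_movement = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_movement / (spread_x * spread_y)

hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]  # rises as hours rise
errors = [9, 8, 6, 5, 2]      # falls as hours rise

print("hours vs score :", round(pearson_r(hours, score), 3))   # positive
print("hours vs errors:", round(pearson_r(hours, errors), 3))  # negative
```

A value near +1 or -1 means a strong linear association; a value near 0 means little or no linear association.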
SPSS is one of the best tools out there when it comes to STATISTICAL ANALYSIS for research and projects.
It can also be used for DATA ANALYSIS
So I’m putting up this thread of steps on how to download and install it for free 😊
Retweet cos, it’s a THREAD 🧵
First I need y’all to know that I can perform statistical and data analysis for your projects, research and academics with SPSS, Minitab, Stata and Excel.
My DMs are open for business 😊
And if you are looking to start DATA ANALYSIS or DATA SCIENCE… I have a YouTube channel where I teach the needed statistics from the basics.
Check my pinned tweet for the syllabus and link below to my statistics playlist ⬇️