Pandas can draw a simple box plot with the DataFrame .boxplot() method.
You only need to specify the column that holds your numeric values and the category column you are grouping by.
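As a minimal sketch of that call, the snippet below builds a small synthetic DataFrame and groups a numeric column by a category. The column names `position` and `skill`, the group labels, and the random values are assumptions made up for illustration; they are not the data set used in this article.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (hypothetical columns and values, for illustration only).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "position": np.repeat(["Goalkeeper", "Defender", "Forward"], 50),
    "skill": np.concatenate([
        rng.normal(20, 5, 50),
        rng.normal(55, 10, 50),
        rng.normal(70, 10, 50),
    ]),
})

fig, ax = plt.subplots()

# column = numeric values, by = category to group on.
df.boxplot(column="skill", by="position", ax=ax)

# pandas adds its own figure-level title when `by` is used; clear it and set ours.
fig.suptitle("")
ax.set_title("Skill by position")
ax.set_ylabel("skill")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. Passing an explicit `ax` is optional; it is used here only to make the title and labels easy to adjust.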
Advantage of box plots: Outliers are clearly visible because they are drawn as separate points.
Disadvantage of box plots: We cannot see how densely the values are distributed along the y-axis.
To address this, we will use violin plots.
2. Violin plots
Violin plots show the density of the values along the y-axis.
The density curve is drawn on one side of a central axis and mirrored on the other, and the resulting shape is filled in, creating an image that resembles a violin.
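To make the mirroring concrete, here is a rough sketch that draws a violin shape by hand. It assumes a Gaussian kernel density estimate with Scott's rule-of-thumb bandwidth, which is one common choice; plotting libraries may compute the density differently. The data are synthetic.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic values, for illustration only.
rng = np.random.default_rng(0)
values = rng.normal(20, 5, 200)

# Gaussian kernel density estimate evaluated on a grid of y positions.
# Bandwidth: Scott's rule of thumb (an assumption, not the only option).
grid = np.linspace(values.min() - 5, values.max() + 5, 200)
bandwidth = values.std(ddof=1) * len(values) ** (-1 / 5)
z = (grid[:, None] - values[None, :]) / bandwidth
density = np.exp(-0.5 * z**2).sum(axis=1) / (
    len(values) * bandwidth * np.sqrt(2 * np.pi)
)

fig, ax = plt.subplots()

# Mirror the density around x = 0 and fill the area between the two curves.
ax.fill_betweenx(grid, -density, density, alpha=0.6)
ax.set_xticks([])
ax.set_ylabel("value")
ax.set_title("Mirrored density")
```

The width of the shape at any height is proportional to the estimated density there, so the widest part marks where the values are most concentrated.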
For this exercise, we use seaborn's violinplot() function.
We need to pass it the data and the columns for the x- and y-axis, and then set the title.
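A minimal sketch of such a call could look like the following. As before, the `position` and `skill` columns and the random values are hypothetical stand-ins, not the article's actual data.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (hypothetical columns and values, for illustration only).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "position": np.repeat(["Goalkeeper", "Defender", "Forward"], 50),
    "skill": np.concatenate([
        rng.normal(20, 5, 50),
        rng.normal(55, 10, 50),
        rng.normal(70, 10, 50),
    ]),
})

fig, ax = plt.subplots()

# data = the DataFrame, x = category column, y = numeric column.
sns.violinplot(data=df, x="position", y="skill", ax=ax)
ax.set_title("Skill by position")
```

Seaborn labels the axes with the column names automatically, so only the title has to be set by hand.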
This is what we got:
Advantage of violin plots: The density of the data is clearly visible.
Disadvantage of violin plots: Outliers are hard to see because they are absorbed into the shape of the violin.
Combining the two methods gives us a fuller picture of our data: the box plot exposes the outliers and the violin plot exposes the density.
It turned out that skill is not so important for goalkeepers: the data is densest at around 20, but there are some outliers with higher skill.