If you want to create great data visualizations, you need to understand color palettes.

Here are a few quick tips:

[1/n]

#datascience #datavisualization #Python #rstats
[2/n]

For data that has a sequential ordering (i.e., low to high), you should use sequential color scales.

matplotlib.org/stable/tutoria…

#Python #matplotlib Image
[3/n]

Sequential color scales incrementally change saturation or lightness.

For example, this is a red-sequential color palette: Image
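
A minimal sketch of that idea, using only Python's standard library: the hue is held at red while lightness is stepped from light to dark. The function name `red_sequential` and the lightness range (0.95 down to 0.30) are illustrative assumptions, not values taken from Matplotlib.

```python
import colorsys


def red_sequential(n):
    """Return n hex colors running from light red to dark red.

    Hue is fixed at 0.0 (red) and saturation at 1.0; only lightness changes.
    The 0.95 -> 0.30 lightness range is an arbitrary choice for illustration.
    """
    colors = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0
        lightness = 0.95 - 0.65 * t
        r, g, b = colorsys.hls_to_rgb(0.0, lightness, 1.0)
        colors.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colors


palette = red_sequential(5)
print(palette)
```

Low data values would map to the first (lightest) color and high values to the last (darkest) one.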
[4/n]

Having said that, there are a variety of sequential color palettes in Matplotlib (and a similar set in R).

#datascience #datavisualization #Python #rstats Image
[5/n]

There are also "perceptually uniform" sequential palettes.

In a perceptually uniform palette, equal steps in data are perceived as equal steps in color space.

#datascience #datavisualization Image
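
One rough way to see why this matters is to measure perceived lightness for a naive ramp. The sketch below converts evenly spaced sRGB greys to CIE L* (a standard lightness measure) and prints the size of each step. It only looks at lightness, which is one part of perceptual uniformity, and the helper names are assumptions made for this example.

```python
def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def lightness(rgb):
    """Approximate CIE L* (0 = black, 100 = white) for an sRGB color."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance
    if y > (6 / 29) ** 3:
        return 116 * y ** (1 / 3) - 16
    return (29 / 3) ** 3 * y


# Five greys that are equally spaced in sRGB values.
greys = [(v, v, v) for v in (0.0, 0.25, 0.5, 0.75, 1.0)]
levels = [lightness(c) for c in greys]
steps = [b - a for a, b in zip(levels, levels[1:])]

# Equal steps in RGB do not come out as exactly equal steps in L*.
print([round(s, 1) for s in steps])
```

A perceptually uniform palette is built so that these steps come out close to equal.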
[6/n]

The end result is that visualizations with perceptually uniform color palettes are typically much easier to interpret.

They're also typically very beautiful (I created these in R using the "viridis" palette).

#datascience #datavisualization #rstats ImageImage
[7/n]

Now, let's talk about a different type of color palette: diverging palettes.

#data #datascience #DataVisualization
[8/n]

Diverging palettes typically have three main colors:

– 2 anchor colors (one at either end of the palette)
– 1 midpoint color (somewhere in the middle of the palette)

Between the anchor colors, the palette changes and the colors "meet" in the middle at a central color. Image
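
A small sketch of that structure in plain Python: two anchor colors, one midpoint color, and linear interpolation in RGB between them. The function name `diverging_palette` and the specific RGB values are illustrative assumptions; real libraries often interpolate in other color spaces.

```python
def lerp_color(a, b, t):
    """Linearly interpolate between two RGB tuples (0-255) at fraction t."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def diverging_palette(low, mid, high, n):
    """Return n RGB colors running low -> mid -> high.

    n must be odd and at least 3 so the middle entry is exactly `mid`.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number >= 3")
    half = n // 2
    colors = []
    for i in range(n):
        if i <= half:
            colors.append(lerp_color(low, mid, i / half))
        else:
            colors.append(lerp_color(mid, high, (i - half) / half))
    return colors


# Illustrative anchors: a blue and a red, meeting at a near-white midpoint.
blue, near_white, red = (33, 102, 172), (247, 247, 247), (178, 24, 43)
palette = diverging_palette(blue, near_white, red, 7)
print(palette)
```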
[9/n]

Here are a few examples of diverging color palettes:

seaborn.pydata.org/tutorial/color…

#data #datascience #DataVisualization Image
[10/n]

Notice in the examples above that the two anchor colors are more "colorful."

And the central color is typically black, white, or grey.
[11/n]

In more technical terms, the two end colors are "saturated" colors.

And the midpoint color is a "desaturated" color.
[12/n]

That's typically how a diverging palette works:

The two end colors are "saturated," "colorful" hues (like red, blue, purple, etc.) ...
[13/n]

And the palette gradually decreases in saturation until it reaches the midpoint, where there's typically a desaturated color like white, black, or grey.

#data #datascience #DataVisualization Image
[14/n]

There are also diverging palettes that change *lightness* instead of or along with saturation.

Common examples of this are diverging palettes with yellow at the midpoint.

#data #datascience #DataVisualization Image
[15/n]

So when do we use diverging palettes?

As you might guess, the best use of diverging palettes is when there is a natural midpoint.

#data #datascience #DataVisualization
[16/n]

So for example, you can use a diverging palette where the data have a bad/OK/good ranking.

Here's one such visualization I made of the "Best States For Business": Image
[17/n]

Diverging palettes are perhaps most commonly used in US political maps, where we show % Republican vs. % Democrat.

Here's a beautiful example of an election map by @htmldon that uses a Red-Grey-Blue diverging palette. Image
[18/n]

Note:

You can find my full analysis of that visualization in this thread:

[19/n]

Diverging color palettes can be a little harder to use, because you sometimes need to wrangle your data so the right colors map to the right data values.
[20/n]

This is particularly true for setting the midpoint color.

Sometimes the midpoint color is set to 0, and sometimes it's set to something else like a mean or median value.
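
Here is a rough sketch of that wrangling step in plain Python. It maps a data value onto the 0-1 range of a palette so that a chosen center value always lands at 0.5, where the midpoint color sits. Matplotlib's `TwoSlopeNorm` serves a similar purpose. The function name `two_slope_normalize`, the clipping of out-of-range values, and the sample data are assumptions made for this example.

```python
import statistics


def two_slope_normalize(value, vmin, vcenter, vmax):
    """Map value to [0, 1] so that vcenter lands exactly at 0.5.

    Each side of the center is scaled separately, so the midpoint color
    lines up with vcenter even when the data range is lopsided.
    Values outside [vmin, vmax] are clipped (a choice made for this sketch).
    """
    if not vmin < vcenter < vmax:
        raise ValueError("need vmin < vcenter < vmax")
    value = min(max(value, vmin), vmax)
    if value <= vcenter:
        return 0.5 * (value - vmin) / (vcenter - vmin)
    return 0.5 + 0.5 * (value - vcenter) / (vmax - vcenter)


data = [-2.0, -1.0, 0.0, 5.0, 10.0]  # made-up values for illustration

# Center on zero: 0.0 gets the midpoint color.
centered_on_zero = [two_slope_normalize(v, min(data), 0.0, max(data)) for v in data]

# Center on the median instead, when zero is not the natural midpoint.
median = statistics.median(data)
print(centered_on_zero, median)
```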
[21/n]

So diverging color palettes can be a little harder to use.
[22/n]

But if you use them properly ...

Diverging palettes are extremely valuable for telling certain types of stories with your data.
[23/n]

Again: In particular, you often need diverging palettes for any data where there are natural endpoints, along with a natural midpoint.

#data #datascience #DataVisualization
Source, Matplotlib:

matplotlib.org/stable/tutoria…
