🤫 I'm going to let you in on a secret... I find picking colours really tricky! Thankfully, I've found a few ways round that.
My top tip is to let others help you! But first, a broad principle...
When picking colours for storytelling, I try to make the colours as intuitive as possible.
Here's the adventure I took the Palmer Penguins on in a recent talk involving the #GreatPenguinBakeOff. See if you can guess the details. (The next tweet should give you a few clues!)
It's not about making your plots into a guessing game. It's about reducing cognitive load by making it easy to remember what's what.
And this allows me to illustrate one way to let others help you: photos! All of the colours in the previous plots were taken from these photos.
Take a look at this plot. I've plotted "Yumminess" along the Y axis.
Without even thinking about it, you can tell that:
- unripe bananas do not make for tasty banana bread
- no amount of tinkering with banana quality will help
It's a silly example - but it works!
And now for our first #rstats coding tip: use a named vector to apply your colours! It avoids the colours jumping around if you reorder a factor somewhere along the way.
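Here's a minimal sketch of the named-vector tip. The bakes, scores and hex codes are all made up for illustration (they are not the colours from the photos), and it assumes {ggplot2} is installed.

```r
library(ggplot2)

# Hypothetical bakes and hex codes, invented for this sketch
bake_colours <- c(
  "Banana bread" = "#E1AD01",
  "Carrot cake"  = "#ED7014",
  "Brownie"      = "#5C4033"
)

# Made-up scores, one row per bake
bakes <- data.frame(
  bake      = factor(names(bake_colours)),
  yumminess = c(7, 8, 9)
)

# Reordering the factor changes the order of the bars...
bakes$bake <- factor(bakes$bake, levels = rev(levels(bakes$bake)))

# ...but colours are matched by name, so each bake keeps its own colour
p <- ggplot(bakes, aes(x = bake, y = yumminess, fill = bake)) +
  geom_col() +
  scale_fill_manual(values = bake_colours)
```

With an unnamed vector, the colours would be assigned in level order instead, so the reorder above would have swapped them around.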
The project I have in mind for this week would require blending together two similar colours, so we may need to think of something else! But even if the spectrum isn't diverse enough, this can be a nice way of finding an "anchor" colour for your theme (more on that tomorrow!)
If you don't know how many colours you need, colorRampPalette() is your friend. Set some anchor colours and let it do the rest. Here's an example using colours from this painting by Leon Morrocco (Untitled, Jean Resting).
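A quick sketch of how that works in base R. The anchor colours here are placeholders I've invented, not values sampled from the painting.

```r
# Hypothetical anchor colours (not taken from the painting)
anchors <- c("#1B4965", "#F4D35E", "#EE6C4D")

# colorRampPalette() returns a function; call it with the number of colours you need
make_palette <- colorRampPalette(anchors)

five  <- make_palette(5)  # 5 colours interpolated between the anchors
eight <- make_palette(8)  # same anchors, 8 colours
```

The first and last colours returned are always the first and last anchors; everything in between is interpolated.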
Can we combine the two to have a "generic" branded colour scheme and a highlight colour for some storytelling in a changing context? Yes!
Here's how!
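One possible way to do it, sketched in base R. The helper name `highlight_palette()`, the brand anchors and the highlight colour are all hypothetical, so treat this as an illustration of the idea and not necessarily the exact approach from the talk.

```r
# Hypothetical brand anchors (neutral greys) and highlight colour
brand_anchors <- c("#D9D9D9", "#7A7A7A")
highlight     <- "#E4572E"

# Hypothetical helper: generic brand shades for every group,
# with the highlight colour swapped in for the group in focus
highlight_palette <- function(groups, focus) {
  stopifnot(focus %in% groups)
  cols <- colorRampPalette(brand_anchors)(length(groups))
  names(cols) <- groups
  cols[focus] <- highlight
  cols
}

pal <- highlight_palette(c("Adelie", "Chinstrap", "Gentoo"), focus = "Gentoo")
```

Because the result is a named vector, it can go straight into `scale_fill_manual(values = pal)` or `scale_colour_manual(values = pal)`, and changing `focus` moves the highlight as the story changes.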
For consistency, you can reuse your anchor vector across plots, or even add it to your own package!
It's important in all this to remember accessibility. Make sure your colours work regardless of colour perception & when printed in black and white!
Thankfully, we have:
- 📦 {colorblindr} to see what your ggplot looks like under simulated colour-vision deficiencies
- vis4.net/palettes to check the palette
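For the black-and-white part, here's a rough base R check you could run before reaching for the tools above. It's a sketch, not a substitute for them: the palette is hypothetical, `to_grey()` is a helper I've made up, and the brightness weights are the common Rec. 601 luma approximation.

```r
# Hypothetical palette to check
pal <- c(dark = "#1B4965", mid = "#EE6C4D", light = "#F4D35E")

# Approximate brightness (0-255) of each colour using Rec. 601 luma weights
to_grey <- function(cols) {
  rgb_vals <- col2rgb(cols)                   # 3 x n matrix of red, green, blue
  drop(c(0.299, 0.587, 0.114) %*% rgb_vals)
}

grey_levels <- to_grey(pal)

# The palette as it would roughly appear in greyscale
grey_pal <- grey(grey_levels / 255)

# Smallest gap between neighbouring grey levels:
# a small gap suggests two colours that may blur together in print
min_gap <- min(diff(sort(grey_levels)))
```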
Finally, here are some extra resources I've found really helpful in exploring colour in #dataviz - let others help you!
What are your sources of inspiration for dataviz colour palettes? Would be lovely to see a few images of how you've applied these principles in your own work!
2) Fonts. Picking fonts can be really tricky, but there are some really great resources out there (see below, where we're back to our "let others help you" mantra!).
Here, I've simply applied the fonts from my own website, changing the family element of element_text().
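A minimal sketch of changing the family, assuming {ggplot2} is installed. I'm using the generic "serif" and "sans" families here because they work without installing anything; the fonts from my website would need to be registered with your graphics device first.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel economy by weight")

p_fonts <- p +
  theme_minimal() +
  theme(
    # Family for all text in the plot
    text = element_text(family = "serif"),
    # A different family for the title only
    plot.title = element_text(family = "sans", face = "bold")
  )
```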
3) Text size. You can manipulate text size within theme() either by setting absolute sizes (e.g. size = 16), or relative sizes (e.g. size = rel(1.2)).
The relative size is a good idea if you're going to reuse this theme: change the base size as needed and everything follows!
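To make the difference concrete, here's a small sketch. The theme name `theme_sized()` and the specific sizes are hypothetical choices for illustration.

```r
library(ggplot2)

# Hypothetical reusable theme: relative sizes follow base_size, absolute ones don't
theme_sized <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title   = element_text(size = rel(1.6), face = "bold"), # relative
      axis.title   = element_text(size = rel(1.1)),                # relative
      plot.caption = element_text(size = 8)                        # absolute: always 8pt
    )
}

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel economy by weight", caption = "Data: mtcars")

p_report <- p + theme_sized(base_size = 11)  # e.g. for a document
p_slides <- p + theme_sized(base_size = 18)  # e.g. for a talk
```

In `p_slides` the title and axis titles grow with the base size, while the caption stays at 8pt.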
At this point, we've done most of the work, but we can still make our data story easier to take in by giving everything a bit more space to breathe.
First, let's move the legend to reduce unnecessary eye movements, fade the grid, and remove an unnecessary axis title.
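Those three tweaks might look something like this. It's a sketch on a built-in dataset: where the legend goes, how faint the grid is and which axis title gets dropped are my assumptions for the example.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  labs(
    title  = "Fuel economy by weight (1000 lbs)",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  )

p_tidy <- p +
  theme_minimal() +
  theme(
    legend.position  = "top",                          # legend next to the title
    panel.grid.major = element_line(colour = "grey92"),# fade the grid
    panel.grid.minor = element_blank(),
    axis.title.x     = element_blank()                 # already covered by the title
  )
```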
Here are three reasons why I think you should do this:
- Help orient your readers with text hierarchy
- Give everything some space to breathe
- Achieve effortless consistency with one extra line of code
Sound good? Let's dig in!
My starting point for creating a custom theme is typically theme_minimal(). It has sensible defaults such as relative text size and margins that we can build on, by just replacing some elements.
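As a rough sketch of that starting point: wrap theme_minimal() in a function and replace only the elements you care about. The name `theme_brand()` and the margin values are hypothetical.

```r
library(ggplot2)

# Hypothetical custom theme built on theme_minimal()
theme_brand <- function(base_size = 12) {
  theme_minimal(base_size = base_size) +
    theme(
      # Bold title with extra space underneath
      plot.title  = element_text(face = "bold", margin = margin(b = 10)),
      # More room around the whole plot (top, right, bottom, left, in pt)
      plot.margin = margin(20, 20, 20, 20)
    )
}

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel economy by weight")

# The "one extra line of code" on any plot
p_branded <- p + theme_brand()
```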
Hi folks! I'm Cara, an Edinburgh-based freelance data consultant, and I'm excited to be bringing you a week of #rstats content based around the things I enjoy creating the most: data visualisations and "enhanced" reproducible outputs.
Over the course of the week, we'll be exploring the different components that go into making a nice "branded" parameterised report, each day building on the days before until we get to our finished product: a parameterised celebration of RLadies within the NHS.
Here's the menu:
- Day 1: Setting up a colour scheme
- Day 2: Building a custom ggplot theme
- Day 3: Writing functions to create parameterised graphs
- Day 4: Manipulating text with R
- Day 5: A worked example of designing a pdf template (and a case for PDFs)
- Day 6: Our finished product!
Today I'd like to share how I debug R code, highlighting some neat RStudio tools and tricks. It's not a linear process so this thread isn't exactly in order. 1/n
It may sound silly, but first, I take a breath & *fully* read the error/warning message. The breath is because it may not be the first error of the day & I might be getting frustrated. The careful reading is because I want all available information before moving ahead. 2/n
I think about whether I've seen the error before. My most common errors are easily fixed & include
object 'X' not found -> I forgot to load X or misspelled it
unexpected ')' -> I have an extra ) somewhere. Also for ] and }
3/n
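If you'd like to see these two messages safely, here's a small base R sketch that triggers each error on purpose and captures the message text. The object name is made up, and the exact wording of the messages depends on your R version and language settings.

```r
# "object not found": using a name that was never created (hypothetical name)
msg_missing <- tryCatch(
  print(penguin_scores_typo),
  error = function(e) conditionMessage(e)
)

# "unexpected ')'": one closing bracket too many, caught when the code is parsed
msg_paren <- tryCatch(
  parse(text = "mean(c(1, 2, 3)))"),
  error = function(e) conditionMessage(e)
)

# Read both messages in full
print(msg_missing)
print(msg_paren)
```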
This morning, I'll share some tips for my favorite thing to do in R: make plots! I'm firmly in the ggplot world, so many tips are about ggplot and related packages.
So not surprisingly, my first tip is to check out ggplot! I think its most powerful aspect is its legends, as they are created automatically and change as you change the plot. No fear of an incorrect legend!
ggplot is in the tidyverse & thus, works seamlessly with dplyr, tidyr, etc. If I need to transform data *just* for a plot, I use pipes (%>%) to avoid saving the data and cluttering up my environment. For example:
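A sketch of the pattern using a built-in dataset (the original example isn't reproduced here, so the data and summary are stand-ins), assuming {dplyr} and {ggplot2} are installed:

```r
library(dplyr)
library(ggplot2)

# Transform the data just for this plot and pipe it straight into ggplot(),
# so no intermediate summary object is saved in the environment
p <- mtcars %>%
  mutate(cyl = factor(cyl)) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), .groups = "drop") %>%
  ggplot(aes(x = cyl, y = mean_mpg, fill = cyl)) +
  geom_col()
```

Note the switch from `%>%` to `+` once you're inside ggplot.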
This afternoon I'd like to highlight a recent project that I'm very proud of! My team developed a data analysis R package called **kimma** or Kinship In Mixed Model Analysis. Yes, it's also a play on my name and the well-known package limma 1/n
kimma provides flexible modeling of RNAseq data where you have many responses (genes) tested against the same model. You can run simple linear or mixed effects models with covariates, weights, random, & (new with this package) covariance effects like genetic kinship 2/n
kimma also provides model fit metrics like AIC, BIC, & R-squared so you can determine the best model without the biases associated with picking based on how many significant results you get. 3/n
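To show the general idea of choosing a model by fit metrics, here's a sketch in plain base R with lm() on a built-in dataset. This is not kimma's interface, just the same principle applied to a single response.

```r
# Two candidate models for the same response
fits <- list(
  simple  = lm(mpg ~ wt, data = mtcars),
  with_hp = lm(mpg ~ wt + hp, data = mtcars)
)

# Collect fit metrics for each model (lower AIC/BIC is better)
metrics <- data.frame(
  model  = names(fits),
  AIC    = sapply(fits, AIC),
  BIC    = sapply(fits, BIC),
  adj_r2 = sapply(fits, function(f) summary(f)$adj.r.squared)
)

# Pick the model by fit, not by how many significant terms it has
best <- metrics$model[which.min(metrics$AIC)]
```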