Albert Rapp
🎓 Math PhD student & freelancer 👨‍🏫 Bite-sized insights on dataviz, web dev & data science with R at https://t.co/M34b5BzHTD
Sep 9, 2023 7 tweets 3 min read
Three steps to use color in your title instead of wasting space on a huge legend.

1 // Wrap your subtitle into <span> tags

These span-tags are HTML notation for inline text. So in principle, adding them should change nothing.

But as you can see, it does have an impact.
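
Here's a rough sketch of where this trick usually ends up. It assumes the {ggtext} package does the rendering (the excerpt above doesn't name it). The data set, colors and labels are made up for illustration and are not the plot from the thread.

```r
library(ggplot2)
library(ggtext)  # assumption: not named in the excerpt above

# Arbitrary illustration colors, one per group
cols <- c(automatic = "#0072B2", manual = "#D55E00")

# mtcars codes transmission as am: 0 = automatic, 1 = manual
car_data <- transform(mtcars, gear_type = ifelse(am == 1, "manual", "automatic"))

# The span tags carry the color information that a legend would otherwise show
title_text <- paste0(
  "Weight vs. mileage for ",
  "<span style='color:", cols[["automatic"]], "'>automatic</span>",
  " and <span style='color:", cols[["manual"]], "'>manual</span> cars"
)

p <- ggplot(car_data, aes(wt, mpg, colour = gear_type)) +
  geom_point(size = 3) +
  scale_colour_manual(values = cols) +
  labs(title = title_text, x = "Weight (1000 lbs)", y = "Miles per gallon") +
  theme_minimal(base_size = 16) +
  theme(
    plot.title = element_markdown(),  # render the span tags instead of printing them
    legend.position = "none"          # the colored title replaces the legend
  )
```

Print `p` to see the colored words in the title. Without `element_markdown()`, the tags would show up as plain text.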
Aug 26, 2023 15 tweets 6 min read
Paired bar charts suck at comparing values. The only reason they're used all the time is because they are easy to create.

But there are better alternatives that are just as easy.

Here's how to create 4 better alternatives with #rstats.

0 // Where's the code?

The code for all plots can be found at albert-rapp.de/posts/ggplot2-…

This thread walks you through the code quickly.
Aug 19, 2023 9 tweets 4 min read
R makes it dead-simple to use some of the most effective dataviz principles.

Here are six principles that are so easy that any ggplot beginner’s course should teach them.

1 // Make sure your labels are legible

Too many plots use waaaay too small text.
With ggplot, it just takes one line to fix this.

Img 1: Way too small fonts & unclear labels
Img 2: Fixed with labs() and theme_gray(base_size = 20)
Img 3: Full code
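
Since the images don't survive here, a minimal sketch of the fix might look like this. It uses the built-in mtcars data, and the label texts are my own placeholders, not the ones from the thread.

```r
library(ggplot2)

# Default plot: small fonts and raw column names as labels
base_plot <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()

# Fix: readable labels via labs() and bigger text via base_size
fixed_plot <- base_plot +
  labs(
    title = "Heavier cars get fewer miles per gallon",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  ) +
  theme_gray(base_size = 20)
```

Print `base_plot` and `fixed_plot` side by side to compare them.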

Jun 17, 2023 4 tweets 2 min read
Need to extract days, months, years or more from time data?

Don't compute them all manually with {lubridate}. That's way too tedious.

The {timetk} package has a nice function that does all the heavy lifting for you.

LEFT: {lubridate} workflow
RIGHT: {timetk} workflow
#rstats

BONUS: Maybe you don't want to use all of the stuff that {timetk} computes for you.

Here's a simple function that extracts only the parts you want.

All of the code can be found on GitHub at gist.github.com/AlbertRapp/2c9…
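
The screenshots are missing here, so below is a small sketch of both workflows. I'm assuming the {timetk} function in question is `tk_augment_timeseries_signature()`. The helper `extract_parts()` is a hypothetical stand-in for the bonus function, not the code from the gist, and the dates are made up.

```r
library(dplyr)
library(lubridate)
library(timetk)

dates <- tibble(date = as.Date(c("2023-01-15", "2023-06-17", "2023-12-31")))

# {lubridate} workflow: one call per part you need
manual <- dates |>
  mutate(
    year  = year(date),
    month = month(date),
    day   = day(date)
  )

# {timetk} workflow: one call adds many date parts at once
signature <- dates |>
  tk_augment_timeseries_signature(.date_var = date)

# Hypothetical helper: keep the original columns plus only the requested parts
extract_parts <- function(data, date_col, parts) {
  data |>
    tk_augment_timeseries_signature(.date_var = {{ date_col }}) |>
    select(all_of(names(data)), all_of(parts))
}

selected <- extract_parts(dates, date, c("year", "wday.lbl"))
```

`names(signature)` shows everything {timetk} computes, which is handy for picking the `parts` you pass to the helper.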
Jun 10, 2023 11 tweets 5 min read
Everybody loves colors but only a few know how to use them well.

With the right guidelines, using colors becomes super easy.

Let me show you how to implement these guidelines with ggplot 🧵
#rstats

Anyone can create a stacked bar chart with ggplot.

But that can end up in a colorful & messy plot.

Let's implement a couple of guidelines from this datawrapper blog post to level up our color game: blog.datawrapper.de/10-ways-to-use…
Jun 7, 2023 11 tweets 5 min read
Sometimes people ask me if I can do one-on-one R tutoring.

Sure I can. But then my hourly rate applies. And there are many amazing *free* resources. Want to try them first?

Here are a few that I recommend. #rstats

1 // Yet Again: R + Data Science

Find it at yards.albert-rapp.de

I'll start with one of my own bc I assume that you like my style (otherwise why ask me?)

Beware though: YARDS is a graduate-level course that I taught for math students w/ a bit of programming experience.
May 31, 2023 8 tweets 4 min read
Data cleaning is tedious.

But it's much easier with the {janitor} package. Especially if you work with Excel files.

Here are 5 underrated features from {janitor}. #rstats

1 // Create clean names

This is absolutely the best function. It transforms column names such that they are easier to use for programming.

Left: Bad for programming
Right: Good for programming
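
In case the screenshots don't load, here's a tiny sketch. I'm assuming the function meant here is `clean_names()`, and the messy column names are invented for illustration.

```r
library(janitor)

# Column names as they often arrive from Excel (made-up example)
messy <- data.frame(
  `First Name`  = c("Ada", "Alan"),
  `Total Sales` = c(10, 20),
  `Customer ID` = c(1, 2),
  check.names = FALSE  # keep the spaces so there is something to clean
)

# Convert all column names to snake_case
clean <- clean_names(messy)
names(clean)
```

After cleaning you can type `clean$first_name` instead of wrapping the name in backticks.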
May 19, 2023 26 tweets 10 min read
Ever heard of logistic regression? Or Poisson regression? Both are generalized linear models (GLMs).

They're versatile statistical models. And by now, they've probably been reframed as super hot #MachineLearning.

Brush up on their math with this thread. #rstats

Let's start with logistic regression. Assume you want to classify a penguin as male or female based on its

* weight,
* species and
* bill length

Better yet, let's make this specific. Here's a dataviz for this exact scenario. It is based on the {palmerpenguins} data set.
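
If you want to play with this scenario yourself, here's a minimal sketch using base R's `glm()`. The 0.5 cutoff and dropping the rows with missing values are my own simplifications, not steps from the thread.

```r
library(palmerpenguins)

# Keep only the variables from the scenario and drop incomplete rows
penguins_complete <- na.omit(
  penguins[, c("sex", "body_mass_g", "species", "bill_length_mm")]
)

# Logistic regression = GLM with binomial family (logit link by default).
# sex has levels female/male, so the model estimates P(male).
fit <- glm(
  sex ~ body_mass_g + species + bill_length_mm,
  data = penguins_complete,
  family = binomial()
)

# Predicted probabilities of "male" for each penguin
probs <- predict(fit, type = "response")

# Classify with an (assumed) cutoff of 0.5 and check in-sample accuracy
predicted <- ifelse(probs > 0.5, "male", "female")
accuracy <- mean(predicted == penguins_complete$sex)
```

`summary(fit)` shows the estimated coefficients on the log-odds scale.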
May 12, 2023 24 tweets 11 min read
The best way to learn data analysis is to actually practice it.

Each week, the #tidyTuesday challenge gives you plenty of opportunity for this.

Don't know how to get started with the challenge? In case you missed it, I put together an #rstats guide in January.

First, get the data.

Head over to the tidyTuesday GitHub repo at github.com/rfordatascienc…

Just copy the code from the "Get the data" section.
May 5, 2023 16 tweets 6 min read
I used to think tables are boring.

But they can be beautiful & engaging.

Here's a nice example from @infobeautiful.

It uses many eye-catching elements but you don't need them to create a great table.

Just stick to these guidelines 🧵 #dataviz

Let's start with a not-so-great table and improve it.

Here's a table I would have created just a few months ago.

Not so sexy, right? Let's clean that up.
Apr 28, 2023 11 tweets 5 min read
I hate code duplication. It's just a sure way to bloat code and make copy-and-paste mistakes 🙈

In Shiny, modules help me to avoid that.

BONUS: They move the app's logic to separate + reusable functions for cleaner code.

Here's how modules work. #rstats

Let's build an app that displays a scatterplot of two variables of a given data set.

Let's imagine that each data set needs its own page in our app. Here's how that could look.
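
A minimal sketch of such an app with modules could look like this. The names `scatter_ui()` and `scatter_server()`, the choice of data sets and the base-R plot are my own placeholders, not the code from the thread.

```r
library(shiny)

# Module UI: all input/output IDs are wrapped in ns() so pages don't clash
scatter_ui <- function(id, data) {
  ns <- NS(id)
  tagList(
    selectInput(ns("x"), "x variable", choices = names(data), selected = names(data)[1]),
    selectInput(ns("y"), "y variable", choices = names(data), selected = names(data)[2]),
    plotOutput(ns("plot"))
  )
}

# Module server: the same logic is reused for every data set
scatter_server <- function(id, data) {
  moduleServer(id, function(input, output, session) {
    output$plot <- renderPlot({
      plot(data[[input$x]], data[[input$y]], xlab = input$x, ylab = input$y)
    })
  })
}

# One page per data set, each built from the same module
ui <- navbarPage(
  "Scatterplots",
  tabPanel("mtcars", scatter_ui("mtcars", mtcars)),
  tabPanel("iris", scatter_ui("iris", iris[, 1:4]))
)

server <- function(input, output, session) {
  scatter_server("mtcars", mtcars)
  scatter_server("iris", iris[, 1:4])
}

app <- shinyApp(ui, server)
# Start it in an interactive session with: runApp(app)
```

Adding a third data set now takes two lines (one `tabPanel()` and one `scatter_server()` call) instead of another copy of the UI and server code.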
Apr 4, 2023 15 tweets 7 min read
Paired bar charts suck at comparing values. The only reason they're used all the time is because they are easy to create.

But there are better alternatives that are just as easy.

Here's how to create 4 better alternatives with #rstats.

0 // Where's the code?

The code for all plots can be found at albert-rapp.de/posts/ggplot2-…

This thread walks you through the code quickly.
Mar 31, 2023 19 tweets 7 min read
Tired of lackluster visualizations that don't tell you anything?

Discover how storytelling and nuanced color use can
- transform your bar charts
- inform readers about key insights & actions

Here's a step-by-step guide (with full code at the end). #rstats

Here's our starting point.

Note that this tutorial is a ggplot2 recreation of

(And once you've mastered the technique you can enhance this visual with advanced stats beyond comparing error rates to average.)
Mar 29, 2023 9 tweets 5 min read
Data visualization doesn't have to be complicated. 🤯

In fact, ggplot makes it dead-simple to implement some of the most effective dataviz principles.

Here are six dataviz principles that are so easy that any beginner’s course should teach them. #rstats

1 // Make sure your labels are legible

This one is super easy to fix. Any beginner can do it.

Img 1: Way too small fonts & unclear labels
Img 2: Fixed with labs() and theme_gray(base_size = 20)
Img 3: Full code
Mar 25, 2023 13 tweets 7 min read
Manually sifting through mountains of data is annoying. 🥱

But with the point & click interface of analytics dashboards, data exploration is more fun.

And building a dashboard is simple too, especially with R & Shiny. Here's how to get started now. #rstats

1 // Data

First, you need data.

It's always fun to work with your personal data, so I will use the last three months of my Twitter analytics data. You can download yours at analytics.twitter.com
Mar 11, 2023 14 tweets 6 min read
Ever found yourself stuck trying to visualize data that's only available as PDF?

I faced this exact issue when recreating an interactive plot on the democracy index.

Here's how I circumvented that issue with #rstats. (plus code at the end)

0 // Find the data

The data in question can be downloaded at eiu.com/n/campaigns/de…

This will give you access to a PDF that contains multiple pages of data.
Mar 4, 2023 9 tweets 4 min read
Text manipulation is an essential data cleaning skill. Often, this is step 1 before you can get any work done.

But with the right functions you can speed up that process. Here's how. #rstats
(With full code at the end)

0 // Get the data

First, we need an example data set. Here's one from TidyTuesday.

Take a look at the company names. They contain words like "Inc." and "Corporation". That's not something I'd use in a dataviz (too much clutter).

So, let us fix the names.
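
As a rough sketch of what such a fix could look like, here's a version with {stringr}. The package choice is my assumption, since the excerpt doesn't name the functions. The company names are made up and are not the TidyTuesday data.

```r
library(stringr)

# Made-up company names standing in for the real data
companies <- c("Acme Inc.", "Globex Corporation", "Initech, Inc.")

# Remove the legal suffix plus an optional leading comma and trailing dot.
# \\b keeps us from cutting into words that merely start with "Inc".
suffix_pattern <- regex(",?\\s*\\b(Inc|Corporation)\\b\\.?", ignore_case = TRUE)

cleaned <- companies |>
  str_remove_all(suffix_pattern) |>
  str_squish()  # trim and collapse leftover whitespace
```

Extend the alternatives inside the parentheses (e.g. with "Ltd") if your data contains other suffixes.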
Feb 26, 2023 26 tweets 13 min read
Over the past year, I've shared hundreds of dataviz tricks.

This thread compiles the best tricks and will help you to

A) build clear visualizations
B) choose better chart types
C) use colors much more efficiently
D) create a dynamic experience with interactive elements
#rstats

A1 // Two things every {ggplot2} course should teach:

1️⃣ Use proper labels and create a title with labs()
2️⃣ Increase the text size with theme_grey(base_size = ...)

Just two lines of code. But a considerable amount of respect for your audience (which has to read your graph).
Feb 9, 2023 11 tweets 5 min read
Sometimes people ask me if I can do one-on-one R tutoring.

Sure I can. But then my hourly rate applies. And there are many amazing *free* resources. Want to try them first?

Here are a few that I recommend. #rstats

1 // Yet Again: R + Data Science

Find it at yards.albert-rapp.de

I'll start with one of my own bc I assume that you like my style (otherwise why ask me?)

Beware though: YARDS is a graduate-level course that I taught for math students w/ a bit of programming experience.
Feb 4, 2023 10 tweets 5 min read
Many people don't actually need a complex dashboard.

An interactive document with a few toggles and sliders is easier to build and can give the same dynamic experience.

With Quarto and Shiny you can do that in just a few steps. Here's how. #rstats

1 // Enable Shiny

First, you need to enable Shiny in the YAML header of your document.

Just set "server: shiny"
Jan 15, 2023 25 tweets 11 min read
The best way to learn data analysis is to actually practice it.

Each week, the #tidyTuesday challenge gives you plenty of opportunity for this.

Don't know how to get started with the challenge? Here's an #rstats guide using this week's data.

First, get the data.

Head over to the tidyTuesday GitHub repo at github.com/rfordatascienc…

Just copy the code from the "Get the data" section.