As the #rstats course material is not public (yet?) or available as online training (yet?), I thought I'd share some slides from the deck.

The course covers all steps of the #DataScience workflow as featured in @hadleywickham's fantastic #R4DS 📕 r4ds.had.co.nz/index.html
[Slides: workshop title slide; three avatars (customised versions of Open Peeps); chapter slide "What is This Course About?"; conceptual representation of the data science workflow]
Let's start with session 1:
"Introduction to #rstats and #rstudio" ®️
[Slides: title slide of the first session; what the R programming language is; comparison of R and RStudio, taken from ModernDive; screenshot of RStudio with the default panes]
The fundamentals of R include:

* values
* assignments and objects
* functions
* data types
* unknown values
* vectors
* factors
* packages
* tabular data
* data generation
* data import

[Slides: how functions work and what they return; objects and assignments; vectors]
We also covered, among other topics, naming conventions, coercion, name conflicts, ...
[Slides: syntactic object names; type coercion; name conflicts between packages; schematic explanation of namespaces]
... tibbles as a modern implementation of data frames, retrieving basic summaries of data sets, potential problems, and discussed resources to find help.
[Slides: data.frame vs. tibble; basic R functions to inspect data; potential mistakes when working with R; overview of resources that provide help with R]
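A minimal base R sketch of two of the topics above, coercion and name conflicts. It is not taken from the slides, and the object names are made up for illustration.

```r
# Implicit coercion: mixing types in c() promotes everything to the most
# flexible type (logical -> integer -> double -> character).
lgl_int <- c(TRUE, 2L)   # logical + integer
int_dbl <- c(1L, 2.5)    # integer + double
dbl_chr <- c(1.5, "a")   # double + character

class(lgl_int)
class(int_dbl)
class(dbl_chr)

# Logicals count as 0/1 in arithmetic, which is handy for counting matches.
n_true <- sum(c(TRUE, FALSE, TRUE))

# Name conflicts: pkg::fun() states explicitly which package a function
# comes from, no matter what else is attached.
med <- stats::median(c(1, 3, 10))
```

Writing `pkg::fun()` is also a cheap way to make scripts robust when two attached packages export the same function name.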
Ay, just realized the crappy image resolution... Sorry, going to prepare better ones for the other sessions.
Time for session 2:
"Data Wrangling with the {tidyverse}*"

This time with slides in better quality.

* I know it's a bit too broad, but as we use multiple packages such as dplyr, tidyr, forcats, and stringr (and strictly speaking tibble as well), I went for this session name.
[Slides: title slide of the 2nd session; overview of the data science workflow; overview of the tidyverse, an opinionated collection of R packages; schematic overview of the core and other tidyverse packages]
Some analysis and #dataviz might be possible without (re)shaping and/or summarizing your data (especially thanks to #ggplot2's powerful stat functionality), but often we need to prepare our data for the next steps. You can do it in #Excel but we, of course, use #rstats
[Slides: description of the main data set used for this session; visualization of counts per sex and decade; short excursus on "Integer Division"; visualization of the number of unique baby names]
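A small sketch of how integer division can turn years into decades. This is an assumption about how the excursus connects to the per-decade plot, and the years are made up.

```r
# Hypothetical years, not the course data
years <- c(1880, 1987, 2015)

# %/% is integer division: it drops the remainder,
# so multiplying by 10 again gives the start of the decade.
decade <- years %/% 10 * 10

# %% returns the remainder, here the year within the decade.
year_in_decade <- years %% 10

data.frame(years, decade, year_in_decade)
```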
Of course, we start with THE main package for data wrangling in the #tidyverse collection: the #dplyr 📦 and its main verbs

(Credit to @allison_horst for her lovely illustrations that are featured across all sessions 🙌)
[Slides: chapter title for the dplyr section; Allison Horst's data wrangling illustration; general syntax of dplyr functions; overview of the main dplyr verbs]
I always share the equivalent #baseR code (not everyone loves the #tidyverse 😱) and show the basic and a bit more advanced usage of the main verbs, and of course group_by and how it gives you SUPERPOWERS!! 🦹‍♀️🦸‍♂️
[Slides: picking rows with matching criteria with filter; two approaches to select (or unselect) columns; a more complex summarize call using across(); superhero illustration for group_by]
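To give an idea of the side-by-side approach, here is a minimal sketch with a made-up toy data set (not the course data). The base R lines are one possible equivalent, not necessarily the ones on the slides.

```r
library(dplyr)

# Hypothetical toy data
babies <- data.frame(
  name   = c("Anna", "Ben", "Anna", "Cleo", "Ben", "Dora"),
  sex    = c("F", "M", "F", "F", "M", "F"),
  year   = c(1990, 1990, 2000, 2000, 2010, 2010),
  births = c(120, 80, 95, 40, 60, 15)
)

# filter(): keep rows that match a condition
girls_dplyr <- filter(babies, sex == "F")
girls_base  <- babies[babies$sex == "F", ]

# select(): keep columns by name
cols_dplyr <- select(babies, name, births)
cols_base  <- babies[, c("name", "births")]

# group_by() + summarize(): one summary row per group
per_sex <- summarize(
  group_by(babies, sex),
  total = sum(births),
  names = n_distinct(name)
)
per_sex
```

Without `group_by()`, the same `summarize()` call would collapse the whole table into a single row.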
How to bring it all together? Pipe it!
[Slides: chapter slide for the "pipe" section; line-by-line data wrangling example with intermediate assignments; "real-life" example of how the pipe increases readability; code-based example of the pipe in use]
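A sketch of the same wrangling steps written nested and piped, using a made-up toy data set. It uses the `%>%` pipe that dplyr re-exports; whether the slides use `%>%` or the native `|>` is not stated here.

```r
library(dplyr)

# Hypothetical toy data
babies <- data.frame(
  name   = c("Anna", "Ben", "Anna", "Cleo", "Ben", "Dora"),
  year   = c(1990, 1990, 2000, 2000, 2010, 2010),
  births = c(120, 80, 95, 40, 60, 15)
)

# Nested calls: have to be read from the inside out
nested <- arrange(
  summarize(
    group_by(filter(babies, year >= 2000), name),
    total = sum(births)
  ),
  desc(total)
)

# Piped: reads top to bottom, one step per line
piped <- babies %>%
  filter(year >= 2000) %>%
  group_by(name) %>%
  summarize(total = sum(births)) %>%
  arrange(desc(total))
```

Both versions return the same table; the pipe only changes how the steps are written down.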
In the following, a few more functions (and #tidyverse packages) that help when cleaning data (feel free to share your favorites, these are the ones I use regularly)
[Slides: slice_max() to pick the n top values; count() compared with its longer equivalent; full join of two data sets]
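A quick sketch of the three helpers mentioned on these slides. Both data frames are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data
babies <- data.frame(
  name   = c("Anna", "Ben", "Anna", "Cleo", "Ben", "Dora"),
  sex    = c("F", "M", "F", "F", "M", "F"),
  births = c(120, 80, 95, 40, 60, 15)
)

# slice_max(): rows with the n largest values of a column
top2 <- slice_max(babies, births, n = 2)

# count(): shorthand for group_by() + summarize(n = n())
by_sex_short <- count(babies, sex)
by_sex_long  <- summarize(group_by(babies, sex), n = n())

# full_join(): keep all rows from both tables, fill the gaps with NA
extra  <- data.frame(name = c("Anna", "Emil"), label = c("x", "y"))
joined <- full_join(babies, extra, by = "name")
joined
```

"Emil" only exists in `extra`, so the joined table gains one row with `NA` in the columns that come from `babies`.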
#tidyr: pivoting is tough but so important and powerful
[Slides: chapter slide for the tidyr package; comparison of long and wide data; pivot_wider example turning long (tidy) data into wide; pivot_longer example turning wide into long]
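A minimal round trip between wide and long format, sketched with a made-up table. The column names `y1990` and `y2000` are placeholders.

```r
library(tidyr)

# Hypothetical wide table: one column per year
wide <- data.frame(
  name  = c("Anna", "Ben"),
  y1990 = c(120, 80),
  y2000 = c(95, 60)
)

# Wide -> long: one row per name-year combination
long <- pivot_longer(
  wide,
  cols      = c(y1990, y2000),
  names_to  = "year",
  values_to = "births"
)

# Long -> wide: spread the years back into columns
back <- pivot_wider(long, names_from = year, values_from = births)
```

The long format is usually the one #ggplot2 wants, as each aesthetic maps to a single column.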
#forcats: suddenly working with factors became one of my favorite tasks in R! 🤯

And it's so important in combination with #ggplot2 as well:
[Slides: chapter slide for the forcats package; examples of some of the fct_* functions; a sorted bar graph thanks to fct_rev(fct_infreq(value)); a bar graph with "lumped" categories]
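A small sketch of the factor helpers behind such bar graphs, using a made-up factor. `fct_lump_n()` is one of several lumping functions in forcats; which one the slides use is not stated.

```r
library(forcats)

# Hypothetical categories: "b" occurs 3x, "a" 2x, "c" 1x
value <- factor(c("b", "b", "b", "a", "a", "c"))

levels(value)                         # alphabetical by default

by_freq  <- fct_infreq(value)         # most frequent level first
reversed <- fct_rev(by_freq)          # flip the order
lumped   <- fct_lump_n(value, n = 1)  # keep the top level, lump the rest

levels(by_freq)
levels(reversed)
table(lumped)
```

As the level order drives the order of bars, reordering the factor is all it takes to sort the plot.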
#stringr: well, working with strings. Consistent and simple (well, except nasty #regex formulas)
[Slides: chapter slide for the stringr package; overview of some stringr functions; str_replace and str_replace_all; a bar graph sorted thanks to the forcats functions]
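A few stringr calls on made-up strings, as a sketch of the consistent `str_*` interface (string first, pattern second).

```r
library(stringr)

# Hypothetical strings
fruit_names <- c("banana", "apple pie", "kiwi")

# str_replace() changes only the first match per string,
# str_replace_all() changes every match.
first_only <- str_replace(fruit_names, "a", "o")
all_hits   <- str_replace_all(fruit_names, "a", "o")

# str_detect() returns TRUE/FALSE; "^[aeiou]" is a small regex
# meaning "starts with a lowercase vowel".
starts_vowel <- str_detect(fruit_names, "^[aeiou]")

upper <- str_to_upper(fruit_names)
```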
#lubridate: working with dates became so simple as well. I've never been a fan of POSIXct/lt and Co.

Plus #hms for working with timestamps.
[Slides: chapter slide for lubridate and hms; the lubridate functions year and month; yday (to return the Julian day); month and wday with full or abbreviated labels]
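A short sketch of the date helpers named on these slides, with an arbitrary example date. `hms::as_hms()` is used here as one way to create a time of day; the slides may do it differently.

```r
library(lubridate)

# Arbitrary example date
d <- ymd("2022-03-15")

year(d)    # calendar year
month(d)   # month as a number
yday(d)    # day of the year

# label = TRUE returns names instead of numbers,
# abbr = FALSE spells them out (the names depend on your locale).
month(d, label = TRUE, abbr = FALSE)
wday(d, label = TRUE)

# hms stores a time of day as seconds since midnight.
# Called with hms:: because lubridate has its own hms() function.
tod <- hms::as_hms("12:34:56")
as.numeric(tod)
```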


Thread by Cédric Scherer 🐦➡️🦣 @CedScherer@vis.social
