This week for #TidyTuesday, I explored K-12 school-run radio stations. I was interested in where these stations are located and what types of music they play. Shout out to my favorite station @c895radio with their dance music in Seattle!
- Explore the data & determine what questions I can address with a plot
- Filter data for school-associated stations & get geo data
- Plot the US & put dots for each station
- Color stations by simplified music genre groups
- Customize colors, background, etc
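The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the live session: the `stations` data frame, its column names, the coordinates, and the genre groups are all invented for the example, and `borders()` needs the maps package installed.

```r
library(ggplot2)  # borders() additionally requires the 'maps' package

# Hypothetical example data: station names, coordinates, and genre groups
# are made up for illustration and do not come from the TidyTuesday dataset.
stations <- data.frame(
  station     = c("station_a", "station_b", "station_c"),
  lon         = c(-122.3, -87.6, -71.1),
  lat         = c(47.6, 41.9, 42.4),
  genre_group = c("Dance", "Rock", "Variety")
)

station_map <- ggplot(stations, aes(x = lon, y = lat)) +
  # Outline of the lower 48 states as the base layer
  borders("state", colour = "grey60", fill = "grey95") +
  # One dot per station, colored by simplified genre group
  geom_point(aes(colour = genre_group), size = 3) +
  # Customize colors and background
  scale_colour_brewer(palette = "Dark2", name = "Genre group") +
  coord_quickmap() +
  theme_void() +
  theme(plot.background = element_rect(fill = "white", colour = NA))

station_map
```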
You can see my entire process including live coding, errors, and future directions at:
Questions I still have & welcome help with!!
- How to add Alaska & Hawaii in ggplot2::borders()
- How to better (ideally automatically) determine music genre groups
- Other ways to show a variable with many groups besides color
This morning, I'll share some tips for my favorite thing to do in R: make plots! I'm firmly in the ggplot world, so many tips concern ggplot and related packages.
So not surprisingly, my first tip is to check out ggplot! I think its most powerful aspect is its legends, as they are created automatically and change as you change the plot. No fear of an incorrect legend!
ggplot is in the tidyverse & thus, works seamlessly with dplyr, tidyr, etc. If I need to transform data *just* for a plot, I use pipes (%>%) to avoid saving the data and cluttering up my environment. For example:
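The example originally attached to this tweet is not preserved here, so this is a stand-in sketch rather than the original. It uses the built-in `mtcars` data, and the summary column name `mean_mpg` is just an illustrative choice. The summarised data goes straight into `ggplot()` and is never saved as its own object.

```r
library(dplyr)
library(ggplot2)

# Summarise inside the pipe; only the finished plot is kept,
# so no intermediate data frame clutters the environment.
mpg_plot <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  ggplot(aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Mean miles per gallon")

mpg_plot
```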
This afternoon I'd like to highlight a recent project that I'm very proud of! My team developed a data analysis R package called **kimma** or Kinship In Mixed Model Analysis. Yes, it's also a play on my name and the well-known package limma 1/n
kimma provides flexible modeling of RNAseq data where you have many responses (genes) tested against the same model. You can run simple linear or mixed effects models with covariates, weights, random, & (new with this package) covariance effects like genetic kinship 2/n
kimma also provides model fit metrics like AIC, BIC, & R-squared so you can determine the best model without the biases associated with picking based on how many significant results you get. 3/n
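To illustrate the idea of choosing a model by fit metrics, here is a small sketch using base R's `lm()`, `AIC()`, and `BIC()` on the built-in `mtcars` data. It is not kimma's interface and involves no RNAseq data; the two model formulas are arbitrary examples.

```r
# Two candidate models for the same response, fit with base R (not kimma).
fit_simple <- lm(mpg ~ wt, data = mtcars)
fit_covar  <- lm(mpg ~ wt + hp, data = mtcars)

# Collect fit metrics side by side; lower AIC/BIC indicates better support.
fit_metrics <- data.frame(
  model  = c("mpg ~ wt", "mpg ~ wt + hp"),
  AIC    = c(AIC(fit_simple), AIC(fit_covar)),
  BIC    = c(BIC(fit_simple), BIC(fit_covar)),
  adj_r2 = c(summary(fit_simple)$adj.r.squared,
             summary(fit_covar)$adj.r.squared)
)

fit_metrics
```

With many genes, the same comparison would be repeated per gene and summarised, rather than counting how many genes come out significant under each model.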
It's tip time again! Here are my tips for working reproducibly in R.
Rprojects: This is a self-contained directory with all data, scripts, etc. necessary for an analysis. Within a project, paths are relative to the project directory, so you don't need absolute file paths; thus, you can move the project & others can reproduce it.
Rmarkdown: This allows you to make reproducible reports with varying levels of detail from all the code to none of the code. I love that I can use the same document to make what I show to other bioinformaticians & to my non-coding bosses.
Functions: If you do the same thing repeatedly, it's helpful to turn these steps into functions. I house my functions in one GitHub repo that all my projects can access github.com/kdillmcfarland…. Thus, I know all projects are using the same code for a given analysis, plot, etc.
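As a tiny sketch of the idea, here is a repeated step wrapped in a function. The helper `standardize()` is hypothetical and is not taken from the repo mentioned above.

```r
# Hypothetical helper: z-score a numeric vector (center to mean 0, scale to SD 1).
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# The same code now serves every variable instead of copy-pasted steps.
mpg_z <- standardize(mtcars$mpg)
wt_z  <- standardize(mtcars$wt)
```

Keeping such a function in one shared file and loading it with `source()` means a fix made once reaches every project that uses it.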
I love using and teaching R. For a while I was an adjunct faculty member, and I taught MFA students about data viz and R. It was so much fun! I always learn new things when I teach.
Based on my experience learning and teaching R, here are my five tips for learning R. (1/8)
1. Make mistakes. Mistakes help you learn. Even though I’ve been coding and using R for many years, I consistently run into errors and make mistakes when I code, and I always learn new things. (2/8)
2. Experiment and try new things. There is so much to learn and do in R, and it can feel both exciting and a bit overwhelming. But the more you experiment, the more you’ll learn. (3/8)
Happy Tuesday! Today I'm going to share a little more about my R journey.
I first learned R through a statistics class and book in college. Then I learned more about it in my second full-time job as a statistician in market research by reading and using another person's code.
I got really into doing data viz in R after I joined my local R-Ladies chapter (@RLadiesTucson) in 2020 and started participating in weekly #TidyTuesday sessions.
By practicing my data analysis and visualization weekly in R through #TidyTuesday, I've learned new functions and improved my #RStats skills. It's also super inspiring to see other people's work and explore their code. I really appreciate the openness of the R community!
Hi everyone! I'm Jenn, and I'm super excited to be curating #RLadies this week! I've worked in data/data science for about 10 years, and I love R! I mainly use R for data viz and analysis. (1/5)
This week I'll be talking about #DataViz and learning #RStats. I learned R first in grad school and then on the job when I worked as a statistician. (2/5)
Now, I use R in my work as a research analyst and for data viz projects I do for fun, like participating in #TidyTuesday and visualizing the books I read in 2021. (3/5)