I'm gonna start a thread on what I hope will be helpful R tips to wrangle the huge NFL Big Data Bowl dataset. If you're an advanced R programmer, this is probably not for you, but feel free to correct me if I made a mistake or to offer better alternatives.
#1

dplyr::slice_sample() if you want to quickly preview what your result might look like using a random sample of rows from your data
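
Here's a rough sketch of what that can look like. The `tracking` data frame and its columns are made up for illustration, they're not the real Big Data Bowl schema.

```r
library(dplyr)

# Toy stand-in for one week of tracking data (hypothetical columns)
tracking <- tibble(
  game_id  = rep(1:2, each = 50),
  frame_id = rep(1:50, times = 2),
  x        = runif(100, 0, 120)
)

set.seed(42)  # so the random preview is reproducible

# Preview 5 random rows instead of always looking at the first 5
preview <- tracking %>% slice_sample(n = 5)

# Or sample a proportion of rows, here 10% within each game
preview_by_game <- tracking %>%
  group_by(game_id) %>%
  slice_sample(prop = 0.1) %>%
  ungroup()
```

Sampling inside `group_by()` is handy when you want every game represented in the preview.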
#2

janitor::clean_names() if variable names with random capitalization, spaces and other undesired characters make you sick

with the defaults you can turn gameTimeEastern (😒) into game_time_eastern (😙👌)
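
A small sketch, assuming a data frame with messy names. Only `gameTimeEastern` comes from the data mentioned above, the other columns are invented for the example.

```r
library(janitor)

# check.names = FALSE keeps the messy names exactly as typed
raw <- data.frame(
  gameId            = 1L,
  gameTimeEastern   = "13:00:00",
  `Home Team Abbr`  = "KC",
  check.names = FALSE
)

# Default behaviour converts every name to snake_case
cleaned <- clean_names(raw)
names(cleaned)
```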
#3

lubridate::mdy() to convert a variable into a Date

data %>% mutate(game_date = mdy(game_date))
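
A self-contained version of the same idea, assuming `game_date` is stored as month/day/year text. The `games` tibble is a toy example.

```r
library(dplyr)
library(lubridate)

# Toy games table with dates stored as month/day/year strings
games <- tibble(game_date = c("09/10/2020", "9/13/2020"))

# mdy() reads the strings as month-day-year and returns a Date column
games <- games %>% mutate(game_date = mdy(game_date))

class(games$game_date)
```

Strings that don't fit the month/day/year pattern come back as NA with a warning, so it's worth checking for NAs afterwards.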
#4

lubridate::parse_date_time() for inconsistent date formats

players %>%
  mutate(birth_date = lubridate::parse_date_time(birth_date,
                                                 orders = c("y-m-d", "m/d/y")))
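
A minimal sketch with a toy `players` table that mixes both formats. The names and birth dates are invented. Note that parse_date_time() returns a date-time (POSIXct), so the sketch adds as_date() in case you want a plain Date.

```r
library(dplyr)
library(lubridate)

# Toy players table with two different date formats in one column
players <- tibble(
  display_name = c("Player A", "Player B"),
  birth_date   = c("1990-03-15", "07/04/1995")
)

players <- players %>%
  mutate(
    # Try year-month-day and month/day/year orders
    birth_date = parse_date_time(birth_date, orders = c("ymd", "mdy")),
    # Drop the time component to get a Date column
    birth_date = as_date(birth_date)
  )
```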
#5

dplyr::case_when() to recode a variable from several conditions without nesting ifelse() calls
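
case_when() lives in dplyr (which loads with the tidyverse). A quick sketch of one way to use it, with a made-up `plays` table and arbitrary cutoffs:

```r
library(dplyr)

# Toy plays table (hypothetical columns and values)
plays <- tibble(
  down        = c(1, 3, 4, 2),
  yards_to_go = c(10, 2, 1, 15)
)

plays <- plays %>%
  mutate(
    # Conditions are checked top to bottom; the first match wins
    distance_bucket = case_when(
      yards_to_go <= 3 ~ "short",
      yards_to_go <= 7 ~ "medium",
      TRUE             ~ "long"   # everything else
    )
  )
```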
#6

janitor::get_dupes() to list the rows that share the same value in one or more columns

(watch out with your joins, there are 5 players with the same name)
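
A sketch of checking for repeated names before a join. The `players` table, the `nfl_id` column and the names are placeholders, not the actual data.

```r
library(dplyr)
library(janitor)

# Toy players table where two different players share a name
players <- tibble(
  nfl_id       = c(101, 102, 103, 104),
  display_name = c("Player A", "Player A", "Player B", "Player C")
)

# Returns only the duplicated rows, plus a dupe_count column
dupes <- players %>% get_dupes(display_name)
dupes
```

If this returns any rows, joining on the name alone will mix those players up, so a unique ID column is a safer join key when one is available.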
#7

if you're going to bind all 17 weeks of data into one dataset, save it to disk as a parquet file via {arrow}. from my very unscientific testing with different file formats (rda, fst, feather, rds, tsv.gz), parquet was the fastest to read

More on {arrow}: arrow.apache.org/docs/r/
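
A rough sketch of the bind-then-save workflow, using three tiny fake weeks instead of the real 17 and a temp file as the output path. Read speed will depend on your machine and data, so treat it as a starting point.

```r
library(dplyr)
library(arrow)

# Toy stand-ins for the weekly files (hypothetical columns)
weeks <- lapply(1:3, function(w) {
  tibble(week = w, frame_id = 1:4, x = runif(4))
})

# Stack all weeks into one data frame
all_weeks <- bind_rows(weeks)

# Write once to parquet, then read it back in later sessions
path <- file.path(tempdir(), "all_weeks.parquet")
write_parquet(all_weeks, path)

all_weeks_again <- read_parquet(path)
```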
#8

tidyr::separate() to split one column into several
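
separate() comes from tidyr. One possible use, assuming a height column stored as feet-inches text (the `players` table here is invented):

```r
library(dplyr)
library(tidyr)

# Toy players table with height stored as "feet-inches" text
players <- tibble(
  display_name = c("Player A", "Player B"),
  height       = c("6-2", "5-11")
)

players <- players %>%
  # Split on the dash; convert = TRUE turns the pieces into integers
  separate(height, into = c("height_ft", "height_in"),
           sep = "-", convert = TRUE) %>%
  mutate(height_total_in = height_ft * 12 + height_in)
```

In tidyr 1.3.0 and later, separate() is marked as superseded by separate_wider_delim() and friends, but it still works.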
#9

Another use of tidyr::separate()

Thread by Asmae Toumi
