Discover and read the best of Twitter Threads about #RStats

Most recent (24)

If you’re an academic you need a website so that people can easily find info about your research and publications. Here’s how to make your own website for free in around an hour using the blogdown package in #Rstats [UPDATED 2019 THREAD]
I posted a similar thread about this time last year, but have now updated info reflecting recent software package changes, so if you're thinking you may have seen this thread before, that's why...
Ok, so why use blogdown? Sure, there are several free options available to start your own blog (e.g., Medium). However, you generally can’t list your publications or other information easily on these services. Also, who knows where these services will be in a few years?
Read 26 tweets
Hi #rstats what publication options exist for highly specific functions that may still be quite useful to someone else doing the same task? I've been looking at MethodsX and JOSS for example but have by no means decided...
in the course of doing an analysis for an ecology project I've invested considerable effort into two functions for quite different tasks. I have Rmd appendices to my manuscript that detail how I've used these functions but I know the supplementary materials are rarely reviewed...
...this is what got me thinking about MethodsX, because if I instead created two packages they would be very niche, single-function packages (with a bunch of dependencies 😱😉). On the other hand, making MethodsX papers OA = feeding Elsie even more research $
Read 4 tweets
My R package vtable is now updated to version 0.5.0 on CRAN, with two big improvements! Improved speed, and a new function labeltable() #Rstats (thread)
First, if you've ever fed a big data set with lots of labeled values to vtable, you know it can get very slow if the labels aren't perfectly aligned. I've sped that up a LOT, and also it now defaults to a simpler and faster still variant.
Second, labeltable! This function creates an easily viewable table (like vtable) that shows, for each value of a given variable, its labels. Or, the associated values of other variables.
Read 7 tweets
My previous ten random useful things about #rstats was so popular that I decided to add another ten. #datascience #analytics #r4ds
towardsdatascience.com/ten-more-rando…
@kearneymw you get a mention in this one :)
@emilykuehler I link to your bar chart race blog post in this.
Read 3 tweets
#SDSS2019 @gdequeiroz do people need to be on Twitter to be a part of the #datascience community? How do we include people who are not on Twitter?
#SDSS2019 @AmeliaMN joined Twitter within a day of first opening #rstats. She encourages people to at least open a Twitter account and follow people.
#SDSS2019 @dataandme ecosystem of spaces approach: different fora that exist are good for different types of interactions -- think @StackOverflow @StackStats; @RStudio created an online community; there are people who actively contribute to @GitHub
Read 36 tweets
Today is the day! So excited to talk data science in the classroom! 9:30 in Regency Ballroom EF 👩‍🏫 💻 📊
Slides from today's workshop: bit.ly/SDSS_2019
Examples used in today's workshop (requires rstudio.cloud login): bit.ly/sdss_workshop

#SDSS2019
Introducing data science, the basics of #rstats programming, and chatting data science education are three of my favorite things to do. Getting to do that with other instructors was such a treat! Thank you to @SDSSconf for the opportunity and to all who attended the workshop!
Read 5 tweets
Hey everyone! Here’s a #tweetorial on our new paper on why we often can’t make #causalinference using #distancetocare as an exposure or instrument! cc: @epiellie
There are 3 problems with #distancetocare as an exposure for estimating causal effects. Let’s walk through them.
The first problem that probably comes to mind is #confounding 🙀Choices about where to live and where to locate care facilities are complicated and depend on a lot of things we might not be able to measure (e.g. socioeconomic status).
Read 10 tweets
Our new paper on using distance to care as an exposure or instrument and why it often violates all of the causal inference assumptions is now online at @AmJEpi! Check it out and stay tuned for a tweetorial!
cc @elliecaniglia
academic.oup.com/aje/advance-ar…
Ping @iwashyna @bnallamo ... 👆🏼here’s the paper we talked about last week!
One of the most fun parts of working on this paper was making an #rstats shiny app to help readers play around with the different sensitivity analyses we discuss.

Check it out here... but maybe not all at once because then the server might crash 😆
emurray.shinyapps.io/distanceApp/
Read 4 tweets
#rstats for YOU: Ever encountered an R error message like "Couldn't create memory segment of size 3.2G"? Also, more subtly, ever had some code run slowly in spite of little apparent reason? And ever wonder why data.table is so blindingly fast? This post will be on MEMORY. 1/n
So here goes Memory 101A. Memory (meaning RAM) is broken down into "words," typically 8 bytes long. One R numeric quantity will occupy one word. So, e.g. 1 gig of memory will hold a numeric vector of length only 125 million. 2/n
Each word has an "address," an ID number. If your installation of R includes the tracemem() function, you can use it to determine where in memory your R object is, e.g.

> x <- 1:500000
> tracemem(x)
[1] "<0x7fb1eab6e010>"

3/n
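The arithmetic in 2/n can be checked directly in R. This is only a rough sketch: it assumes a decimal gigabyte (1e9 bytes), and the variable names are made up for illustration. `object.size()` reports the vector's payload plus a small fixed header, so the per-element figure comes out slightly above 8 bytes.

```r
# Back-of-the-envelope check of the "one numeric = one 8-byte word" rule.
bytes_per_numeric <- 8
one_gig <- 1e9                               # decimal gigabyte (assumption)

max_length <- one_gig / bytes_per_numeric    # 125 million numerics per gig

# Measure it empirically: a numeric vector costs about 8 bytes per element,
# plus a small fixed header.
n <- 1e6
x <- numeric(n)
size_bytes <- as.numeric(utils::object.size(x))
bytes_per_element <- size_bytes / n

# Integer vectors store 4 bytes per element, so they take about half the space.
y <- integer(n)
ratio <- as.numeric(utils::object.size(y)) / size_bytes

print(max_length)
print(bytes_per_element)
print(ratio)
```

Note that `1:500000` in the tracemem() example is an integer vector, so it would need roughly half the memory of a numeric vector of the same length.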
Read 17 tweets
Hey #rstats community! I'm currently the only r user at my org, but I now have the chance to introduce interested staff to r. Most rely predom on Excel w some SQL/SAS/SPSS. What's the one thing someone showed you early on that really made R seem worth learning?
THANK YOU r community!!! Everything in this thread has been amazing. Sent a quick pitch to the teams emphasizing reproducibility, dataviz, and the awesome r community, with lots of your tidbits sprinkled throughout; we'll see if I get any bites!
Update: So far I've persuaded ~85% of both teams to join me for a demo/quick intro! And for those who are curious what I assembled, I added the rmd to my github: github.com/phillynerd/Cul…

Now it's on to compiling some basic organization-specific teaching materials...
Read 3 tweets
Because it's Friday, I decided to thread-ify my "fun stuff in #rstats" day from class yesterday. It's the end of the semester, so I wanted to give my students a little taste of other things that are possible in R, as well as more of the community culture.
We started off with looking at @MilesMcBain's plot of magical mentions of R, and talked through the packages we've seen in class ({tidyverse}, {ggplot2}, {dplyr}, {here}, {httr}, {shiny}, {rmarkdown}) and those we hadn't ({purrr}, {data.table}, {anytime}, {Rcpp}, {later}).
We looked at the README for @tylermorganwall's {rayshader} github.com/tylermorganwal…, and the twitter chatter, and I recommended Tyler's rstudio::conf talk, resources.rstudio.com/rstudio-conf-2…
Read 16 tweets
Let's chat about building an R community within your office. I was the only R user in the office, and I wanted people I could go to for help and as thought partners for solving problems. So, here are some of my lessons learned from building an R culture in the office!
👇👇
📌Lesson 1: Don't tell or lecture about the benefits of #rstats, instead show them a demonstration!

People generally are motivated when they see how easy it is to start something small and with a specific example!

my ex: being able to work with multiple dataframes concurrently.
📌Lesson 2: Not every training has to be carefully planned!

Sometimes the spontaneous help you offer to a colleague is still training. This training was timely, concise and helped them cross the roadblock.

Ex: I wrote down a quick #STATA to #R translation on a post-it note.
Read 11 tweets
Just to return to this great tweet thread for a second -- the other problem with retractions is that articles will often keep being cited, even after all of this work to get them retracted in the first place (1/7)
A recent article in Scientometrics, for example, found that not only do retracted articles keep being cited, but they are often cited positively, despite clear notices of retraction (2/7) link.springer.com/article/10.100…
I actually don't think this is always authors' fault! Everyone should know not to cite Wakefield et al., of course, but journals don't really go out of their way to publicise retractions. (3/7)
Read 7 tweets
[1/n] I'm happy, and I'll tell you why:
As some of you know, I have been writing a package (sort of an extension, an add-on) for R for a while. After a lot of struggle, I believe I have finished the first phase of this first project of mine, owdbr (Open Welfare Data BR)...
[2/n] What it does is make it easier to request data (at the municipal level) on the federal government's social programs (Bolsa Familia, Seguro Defeso, Prog. Err. Trab. Escravo Infantil).
[3/n] The idea is to make these data easier to access so that these public policies can be analyzed. You can request the data without it, of course, but that takes tedious, repetitive work with the federal government's API.
Read 7 tweets
🎨 A short thread of some cool ways to visualize @spacy_io's output in your terminal!
@spacy_io Explacy by @tylerneylon gives you a very nicely formatted dependency tree and a table including part-of-speech tags and lemmas.

github.com/tylerneylon/ex…
This one by @weihuang is for #rstats and super new – it uses @quantedainit's spacyr package under the hood.

github.com/weihuangwong/d…
Read 5 tweets
Brief (well, maybe...) non-technical thread attempting to explain “alpha-spending” at interim analyses for the layperson
Not-uncommon question: why we can’t look at the data a bunch of times during a trial and simply stop whenever p<0.05? After all, the evidence supporting an effect is now “statistically significant” right?
Bayesians, please, hold comments until the end. One day, perhaps, there will be more published trials using Bayesian interim-monitoring approaches, but for now, I work with students / trainees / faculty that need help reading / understanding more conventional frequentist trials
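The "why can't we just stop whenever p<0.05?" question can be explored with a small simulation. The sketch below is not from the thread: the number of looks, the sample sizes, and all variable names are assumptions chosen for illustration. Every simulated trial has no true treatment effect, so any "significant" result is a false positive.

```r
set.seed(2019)

n_trials  <- 2000                    # simulated trials, all with NO true effect
n_per_arm <- 100                     # final sample size per arm
looks     <- c(20, 40, 60, 80, 100)  # patients per arm at each look (assumed)

results <- replicate(n_trials, {
  treat   <- rnorm(n_per_arm)
  control <- rnorm(n_per_arm)
  # Unadjusted t-test p-value at every look, reusing the accumulating data
  p_values <- sapply(looks, function(n) t.test(treat[1:n], control[1:n])$p.value)
  c(any_look   = any(p_values < 0.05),               # stop at first p < 0.05
    final_only = p_values[length(looks)] < 0.05)     # test once, at the end
})

false_positive_any_look   <- mean(results["any_look", ])
false_positive_final_only <- mean(results["final_only", ])

print(false_positive_any_look)
print(false_positive_final_only)
```

Testing once at the end should give a false-positive rate near the nominal 5%, while declaring success at the first look with p<0.05 gives a noticeably higher rate. Alpha-spending approaches address this by using stricter thresholds at the interim looks.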
Read 26 tweets
I'm deeply disappointed in @DataCamp leadership, and cannot, in good conscience, recommend it to my students. Questionable ethics at tech companies are sadly common, but it hits especially close to home when it happens at an education company that I worked with & promoted. (1/5)
This means:
1. My @Coursera course no longer offers the option for #rstats labs via @DataCamp. This years-long collaboration represents hard work by @DukeLearning + DataCamp staff to provide browser-based R access for thousands of learners; the fruits of this effort will now go unused. (2/5)
2. I'm working w/ @jo_hardin47 + @BaumerBen + Andrew Bray to make our course content available elsewhere. Meanwhile, don’t take these courses on DataCamp. As @noamross put it so well, “We can't change behavior without incentives, and for companies those incentives are financial.” (3/5)
Read 5 tweets
So I heard about a really bad take today, that things that are available free online aren't good enough to be worth money.

To counteract that rubbish call, here I tender a list of free, online #rstats books that I recommend above paid ones #OER
1/many
This list (of course) begins with @djnavarro's Learning statistics with R. If you want a brilliant, approachable entry to both statistics and #rstats with no prior background, I can recommend nothing else.
learningstatisticswithr.com/book/
If you want less of a psych focus and are coming from a #DataScience angle, you must read @hadleywickham & @StatGarrett's #R4DS. It has its own supportive community (@R4DScommunity) that you should also check out.
r4ds.had.co.nz #rstats
Read 11 tweets
#RYouWithMe showcase Day 4:

Unit 2 (Clean It Up) Lesson 1: Cleaning Up Columns
This lesson comes in 3⃣ parts! See thread 👇
Find the link to the lesson in the video descriptions!

#rstats #rladies #dumpdatacamp #rstudio #learntocode
Part 1: How to clean up and rename variable names! Video:
Part 2: How to select subsets of data and change the order of variables! Video:
Read 4 tweets
You might be looking for R learning resources, especially if you’ve dropped @DataCamp. We’d like to point you to #RYouWithMe!

rladiessydney.org/ryouwithme

Over the coming days, we'll be showcasing the lessons. Stay tuned!

#rstats #rladies #dumpdatacamp
Read 14 tweets
I took a crack at reworking @statsepi’s RCT adjustment post (bit.ly/2Pd1JkH) w/ added simulation & visualization. To pique interest, here is a redacted version of the final image. Stick around for a walkthrough of how to get there & what it means. #epitwitter #rstats 1/n
Let's say we are studying the effect of a treatment on a continuous outcome (Y). In this field of study, 20 continuous patient variables are commonly reported (i.e., are found in the typical table 1). 2/n
Next, let's say that 10 of these variables are causally unrelated to Y (and thus are non-predictors of Y; call these N) and 10 have true causal effects on Y (and thus are predictors of Y; call these C). 3/n
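The setup in 2/n and 3/n can be sketched in a few lines of R. This is not the thread's own simulation: the sample size, the effect sizes, the independence of the covariates, and all variable names are assumptions made up for illustration.

```r
set.seed(1)

n        <- 500   # patients (assumed)
n_noise  <- 10    # N: variables causally unrelated to Y
n_causal <- 10    # C: variables with a true causal effect on Y

# 20 independent standard-normal baseline variables (independence is assumed)
X <- matrix(rnorm(n * (n_noise + n_causal)), nrow = n)
colnames(X) <- c(paste0("N", 1:n_noise), paste0("C", 1:n_causal))

treatment <- rbinom(n, 1, 0.5)                   # 1:1 randomization
effect    <- 0.5                                 # hypothetical treatment effect
beta      <- c(rep(0, n_noise), rep(0.3, n_causal))  # hypothetical covariate effects

Y <- effect * treatment + as.vector(X %*% beta) + rnorm(n)

# Compare an unadjusted model with one adjusted for the true predictors (C)
unadjusted <- lm(Y ~ treatment)
adjusted   <- lm(Y ~ treatment + X[, paste0("C", 1:n_causal)])

se_unadjusted <- summary(unadjusted)$coefficients["treatment", "Std. Error"]
se_adjusted   <- summary(adjusted)$coefficients["treatment", "Std. Error"]

print(c(unadjusted = se_unadjusted, adjusted = se_adjusted))
```

Under these assumptions, adjusting for the C variables removes outcome variance they explain, so the standard error of the treatment estimate should shrink relative to the unadjusted model.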
Read 32 tweets
Update: based on feedback from this thread, we're doing a trial of Mode Analytics, and liking it so far.

I'm going to add some specifics about our use case, and pros and cons of Mode so far, below in case anyone is trying to make this decision for their org. 👇
In our current state, we have important data living in at least three databases hosted on different servers. This means we need a solution that will allow us to combine data from multiple DBs, and Mode lets us do that using either #rstats or #pydata, which is a big plus. ✅
That said, Mode provides some tools for building visualizations on top of queries (without code!) that we were hoping our business users could utilize if analysts did the work of querying and combining data sources. This isn't fully possible currently, but could be in the future.
Read 6 tweets
*clears throat* So, LEGO + #rstats enjoys a rich and storied past. I am very excited to highlight some of the keystones of this lore.
.@JennyBryan's slides are a great place to start exploring the #rstats + LEGO legacy
Next, do yourself a favor and check out @seankross's LEGO package:
Read 7 tweets
This story is out in the open now, so I guess I might as well say that the unnamed employee in the DataCamp post was me.
I haven’t talked publicly about this experience before. That’s been a complicated decision, but for me it’s been the best way to try to move on with my life.
I didn’t know they were going to publish this post, but now that it's out I appreciate everyone who has responded to say that what happened is not okay.
Read 8 tweets
