Since various SW benchmarks are going around today... A short thread on why I use #rstats.

Put simply, it offers by far the fastest & most efficient tools for the work I do (i.e. mostly data wrangling & applied econometrics).
(Disclaimer: This thread is *not* trying to get you to change from your preferred SW. You should use whatever you feel comfortable with. But I will try to highlight some objective facts that matter to me.)
For data wrangling, nothing comes close to consistently matching the performance of #rdatatable. Benchmarks here: h2oai.github.io/db-benchmark/
(The tidyverse obviously provides another extremely rich data wrangling framework in R & comes w/ its own set of awesome features: SQL, Spark, Arrow etc. integration.)
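For a flavour of the data.table syntax mentioned above, here is a minimal sketch of a grouped aggregation. The data and column names (`g`, `x`) are made up for illustration and have nothing to do with the linked benchmarks.

```r
library(data.table)

# Simulated data (hypothetical): 100,000 rows spread over three groups
set.seed(1)
dt = data.table(
  g = sample(letters[1:3], 1e5, replace = TRUE),
  x = rnorm(1e5)
)

# Grouped aggregation: mean of x and row count per group.
# keyby = g groups by g and sorts the result by g.
res = dt[, .(mean_x = mean(x), n = .N), keyby = g]
print(res)
```

The general form is `dt[i, j, by]`: filter rows in `i`, compute in `j`, grouped by `by`.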
If you're wondering about Stata (not incl. in the above benchmarks), see
github.com/matthieugomez/…, or
grantmcdermott.com/2020/06/30/res…

Bottom line: even if I grant you gtools (which you should install), an MP license ($$), and constrain the no. of cores that R uses, R is consistently faster.
For fixed-effect regressions, {fixest} is insanely quick... as much as 100x faster than lfe and reghdfe (both great packages in their own right). github.com/lrberge/fixest/

And... there’s more! It also supports non-linear models (logit, etc.)
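To make that concrete, here is a minimal sketch of the {fixest} syntax on simulated panel data. The data and variable names (`id`, `year`, `x`, `y`, `d`) are assumptions for illustration only, and this is not a benchmark.

```r
library(fixest)

# Simulated panel (hypothetical): 50 units observed over 20 periods
set.seed(42)
n_units = 50
n_periods = 20
df = data.frame(
  id   = rep(1:n_units, each = n_periods),
  year = rep(1:n_periods, times = n_units)
)
df$x = rnorm(nrow(df))
unit_fe = rnorm(n_units)                          # unit-level intercepts
df$y = 2 * df$x + unit_fe[df$id] + rnorm(nrow(df))
df$d = as.integer(df$y > 0)                       # binary outcome for the logit

# Linear model: fixed effects go after the "|"
est_lin = feols(y ~ x | id + year, data = df)

# Logit with the same fixed effects
est_logit = feglm(d ~ x | id + year, data = df, family = binomial())

# Side-by-side regression table
etable(est_lin, est_logit)
```

The same formula carries over from the linear to the non-linear case; only the estimation function and the `family` argument change.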
Or, maybe you’re interested in LASSO. To the best of my knowledge, the {biglasso} package is easily the fastest and most memory efficient implementation. github.com/YaohuiZeng/big…
A quasi-related issue is code concision/syntax. This is veering off the “objective” path (I don’t have detailed stats) but I can only smile at claims that R requires more lines of code than, say, Stata. The opposite is almost always true IME.
Fwiw, compare the following bits of code. This is literally the most recent bit of Stata code that I rewrote in R. [Two images: the Stata code and its R rewrite]
Again, though: concision isn’t necessarily a goal unto itself. Good code is code that you (and your collaborators) find easy to write and understand. There’s nothing wrong with writing more verbose code that achieves these goals. Code shaming is despicable IMO.
In summary, I use #rstats because it offers the best tools for *my* needs. The awesome community and zero price tag don’t hurt either ;-)

Your needs and tolerance to learn a new SW language may differ. But you should know that performance loss is *not* a reason to avoid it. /fin
