Nick HK

@nickchk

Mar 24, 2019 • 10 tweets • 4 min read • Read on X

Very excited to say that my 📦#Rstats package📦 vtable is now available on CRAN, installable with install.packages('vtable')
vtable is a *variable browser* for R that helps you look at your data WHILE you're working on it. (thread/)

One thing I sorely miss in R, coming from Stata, is the ability to look through your variables while coding, without having to open the data up directly or repeatedly call head(), str(), etc. None of those options works great anyway when you have lots of vars. Show me information!

So vtable(yourdataset) generates an informative table with information about all your variables (ranges, factor levels, classes...), and opens it up in RStudio Viewer (if in RStudio) or a browser window (if not). Heck, or save it to file as a piece of the data documentation.
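A minimal sketch of that basic call, assuming the current CRAN version of vtable. The built-in mtcars data and the temp-file path are just stand-ins, and the out= and file= argument names come from the package documentation rather than from this thread:

```r
# install.packages("vtable")  # one-time install from CRAN
library(vtable)

# Built-in example data, so nothing beyond base R and vtable is needed
data(mtcars)

# Interactive use: opens the table in the RStudio Viewer or a browser
# vtable(mtcars)

# Non-interactive use: get the variable table back as a data frame,
# and also save the HTML version to a file for data documentation
doc_file <- file.path(tempdir(), "mtcars_vtable.html")  # hypothetical path
vt <- vtable(mtcars, out = "return", file = doc_file)

print(head(vt))
```

out = "return" is handy in scripts because nothing pops open. Drop it to get the Viewer/browser behavior described above.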

Comes with plenty of bells n whistles. Add summary statistics with summ=, or have it tell you whether the variable has missing obs with missing=TRUE. Data titling and description (data.title, desc) for if you're building doc files.
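Roughly what those options look like together. This is a sketch on the built-in iris data; the title and description strings are made up for illustration, and summ= is assumed to take a character vector of 'function(x)' strings as in the package documentation:

```r
library(vtable)
data(iris)

vt <- vtable(
  iris,
  out = "return",                    # return a data frame instead of opening a window
  missing = TRUE,                    # add a column flagging variables with missing obs
  summ = c("mean(x)", "sd(x)"),      # summary statistics for numeric variables
  data.title = "Iris measurements",  # hypothetical title, shown in the HTML output
  desc = "Sepal and petal dimensions for three iris species."  # hypothetical description
)

print(vt)
```

data.title and desc only matter for the HTML version, so they earn their keep when you save the table with file= or open it in the Viewer.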

Big ol' bonus: VARIABLE/VALUE LABELS. Notice last tweet it auto displayed the variable/value labels from efc? It will do that with labels coming from sjlabelled, haven, or Hmisc. This labeled Stata file was imported via haven. Plus, apply your own labels in one of 3 easy formats.
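A sketch of the three label formats, assuming they work as the package documentation describes them for the labels= argument. The toy data frame and label text are invented for illustration:

```r
library(vtable)

# Toy data, purely for illustration
df <- data.frame(
  age    = c(23, 35, 41),
  income = c(40, 52, 61),
  kids   = c(0, 2, 1)
)
labs <- c("Age in years", "Income (thousands)", "Number of children")

# Format 1: a vector of labels in the same order as the variables
vt1 <- vtable(df, out = "return", labels = labs)

# Format 2: a two-column data frame of variable names and labels (any order)
lab2 <- data.frame(var = names(df), label = labs, stringsAsFactors = FALSE)
vt2 <- vtable(df, out = "return", labels = lab2)

# Format 3: a one-row data frame whose column names are the variable names
lab3 <- data.frame(age = labs[1], income = labs[2], kids = labs[3],
                   stringsAsFactors = FALSE)
vt3 <- vtable(df, out = "return", labels = lab3)

print(vt1)
```

Labels passed this way are documented to override any labels already embedded in the data.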

Not to mention, way easier to SEARCH variables. Want to find a given label, or a given factor level? You could try to remember which sjlabelled sub-function lets you do that & spend time on proper syntax, or just use vtable to open up a browser window and do Ctrl/cmd-F 🤷‍♀️

Very glad it's on CRAN now, and thanks to the reviewers for dealing with first-timer me. More information on options and the methods of using your own variable labels in the documentation or here: nickchk.com/vtable.html

Also, an intro video (done before it was on CRAN, so slightly outdated)
Enjoy!

One last thing: comes with the helper function dftoHTML which is really just intended to do some HTML preprocessing for vtable. But as a bonus, also allows you to have a copy of your data open without having to flip back and forth to the View tab.

Can be real handy if you find yourself needing to look back at the data a lot. And you can always subset/select before calling it (as with vtable too, of course) if you just want to look at a few variables. Ok now that's it, /fin.
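A quick sketch of the subset-then-view idea with dftoHTML, assuming the out = "htmlreturn" option from the package documentation. The column choice on mtcars is arbitrary:

```r
library(vtable)
data(mtcars)

# Subset first: a few columns and the first ten rows
small <- head(mtcars[, c("mpg", "cyl", "hp")], 10)

# Interactive use: opens the data in the RStudio Viewer or a browser
# dftoHTML(small)

# Non-interactive use: get the HTML back as text instead
html <- dftoHTML(small, out = "htmlreturn")
```

The same subsetting works before vtable() when you only care about a few variables.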
