Stas Kolenikov

10 Aug, 6 tweets, 5 min read

#JSM2021 @jameswagner254 "Using Machine Learning and Statistical Models to Predict Survey Costs": a presentation on several attempts to integrate cost models into responsive design systems.

#JSM2021 @jameswagner254 Responsive designs operate on indicators of errors and costs. Error indicators: R-indicator, balance indicators, fraction of missing information (FMI), sensitivity to ignorability assumptions (papers by @bradytwest, @Rodjlittle, and Andridge).
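
To make one of these error indicators concrete: the R-indicator is commonly defined as R = 1 - 2 * S(rho), where S(rho) is the standard deviation of the estimated response propensities. The sketch below is an unweighted illustration only; the propensity values are made up, and a production version would use design weights and propensities estimated from a response model.

```python
from statistics import stdev


def r_indicator(propensities):
    """Unweighted R-indicator: 1 - 2 * (sample standard deviation of propensities).

    R = 1 means every case has the same response propensity; lower values
    mean response is less balanced across cases.
    """
    return 1.0 - 2.0 * stdev(propensities)


# Hypothetical estimated response propensities for four sample cases.
balanced = [0.5, 0.5, 0.5, 0.5]
unbalanced = [0.2, 0.4, 0.6, 0.8]

print(r_indicator(balanced))
print(r_indicator(unbalanced))
```

With equal propensities the indicator is exactly 1; the more the propensities spread out, the further it drops below 1.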

#JSM2021 @jameswagner254 Cost indicators are more difficult; a common proxy is the number of attempts (Groves & Heeringa 2006).

Some decisions are made at the sample level (launch a new replicate, switch to a new phase of the follow-up protocol); others are made at the case level (change the incentive amount, change the mode).

#JSM2021 @jameswagner254 Analysis of costs in NSFG (National Survey of Family Growth) data: multilevel models with random intercepts for interviewers, and BART (Bayesian additive regression trees), where interviewers can enter as random intercepts or fixed effects. The random-intercept BART came closest to the actual costs.

#JSM2021 @jameswagner254 Responsive design systems need to assign a probability to each specific outcome on each attempt; we can then fold in the cost models to get a sense of the cost of design and implementation decisions.
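
A minimal sketch of what "folding in" costs could look like, assuming the system already supplies outcome probabilities for each attempt. Everything here is hypothetical and not from the talk: the outcome names, the probabilities, the per-outcome costs, and the rule that an interview or a refusal finalizes the case while a noncontact leaves it active.

```python
# Outcomes that close the case; anything else leaves it active for another attempt.
FINAL_OUTCOMES = {"interview", "refusal"}


def expected_case_cost(attempts, outcome_cost):
    """Expected total cost of working one case through a sequence of attempts.

    attempts: list of dicts mapping outcome -> probability, one dict per
        attempt, conditional on the case still being active at that attempt.
    outcome_cost: dict mapping outcome -> cost of an attempt ending that way.
    """
    p_active = 1.0  # probability the case is still being worked
    total = 0.0
    for probs in attempts:
        if abs(sum(probs.values()) - 1.0) > 1e-9:
            raise ValueError("outcome probabilities must sum to 1")
        attempt_cost = sum(p * outcome_cost[o] for o, p in probs.items())
        total += p_active * attempt_cost
        # Only non-final outcomes carry the case into the next attempt.
        p_active *= sum(p for o, p in probs.items() if o not in FINAL_OUTCOMES)
    return total


# Hypothetical two-attempt protocol and costs (arbitrary units).
attempts = [
    {"interview": 0.2, "refusal": 0.1, "noncontact": 0.7},
    {"interview": 0.3, "refusal": 0.1, "noncontact": 0.6},
]
outcome_cost = {"interview": 50.0, "refusal": 10.0, "noncontact": 5.0}

print(expected_case_cost(attempts, outcome_cost))
```

Swapping in a different protocol (an extra attempt, a different mode with different probabilities and costs) and comparing the expected costs is the kind of design comparison described above.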

#JSM2021 @jameswagner254 Survey cost prediction is a tractable problem, albeit a hard one. Coarsening the outcomes (e.g. into quartile groups of costs) may make it a little easier. We need to identify appropriate methods for the decisions, although case-level predictions are difficult.
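
As an illustration of the coarsening idea, here is a sketch that turns continuous case costs into quartile groups, so the prediction target becomes a four-level category instead of a dollar amount. The cost values are invented, and the cut points use the default method of Python's `statistics.quantiles`, which is just one of several reasonable conventions.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_groups(costs):
    """Map each cost to a quartile group 1..4 based on the sample quartiles."""
    cuts = quantiles(costs, n=4)  # three cut points: Q1, median, Q3
    # A cost equal to a cut point falls in the lower group.
    return [bisect_left(cuts, c) + 1 for c in costs]


# Hypothetical per-case costs (arbitrary units).
case_costs = [12, 18, 25, 31, 40, 55, 72, 90]
groups = quartile_groups(case_costs)
print(groups)
```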

More from @StatStas

Stas Kolenikov

@StatStas

12 Aug

#JSM2021 panel led by @minebocek on upskilling for statisticians: how do you learn?

#JSM2021 @hglanz There is no shortage of things to learn. First identify what you don't know; that comes from modern media (blogs, Twitter, podcasts), groups and communities (@RLadiesGlobal or local chapters), and professional organizations (@AmstatNews).

#JSM2021 @hglanz What do job postings require these days? (This is how the content for the @CalPoly stat/data science program was developed.)

Stas Kolenikov

@StatStas

12 Aug

#JSM2021 an exceptionally rare case of ACTUAL out-of-sample prediction in #MachineLearning #ML #AI: two rounds of the same health data collection by @CDCgov

#JSM2021 Yulei He @CDCgov RANDS 1 (fall 2015) + RANDS 2 (spring 2016): build models on RANDS 1 and compare predictions for RANDS 2.

Methods: ridge, lasso, elastic net, PLS, KNN, bagging, RF, GBM, XGBoost, SVM, deep learning

#JSM2021 Yulei He R-squared is about 30%; random forests and gradient boosting reduce the prediction error by about 4%, shrinking towards the mean; standard errors are way too small (about 50% smaller than they should be).

Stas Kolenikov

@StatStas

11 Aug

I have two general questions:

1. When will survey statisticians in the U.S. move from weird variance estimation methods (grouped jackknife) to a simple and straightforward one (bootstrap)?

and

2. When will they move from weird imputation methods with limited dimensionality and limited ability to assess the implicit model fit (hot deck) to those where you explicitly model and understand which variables matter for this particular outcome (ICE)?

Oh, and somebody reminded me of

3. When will we move from PROC STEPWISE to lasso, as the rest of the statistics world has?

Stas Kolenikov

@StatStas

10 Aug

Now let's see how @olson_km is going to live-tweet while giving her own #JSM2021 talk.

#JSM2021 @olson_km Decisions in survey design: questions of survey errors and questions of survey costs. Cost studies are hard: it is difficult to offer experimental variation of design features, with the possible exception of incentives. Observational examinations are more typical.

#JSM2021 @olson_km When you have one (repeated) survey at a time, you can better study the impacts of design features that vary (but this can't provide a basis for the features that do not vary).

Stas Kolenikov

@StatStas

10 Aug

#JSM2021 virtual vs. in-person: IMO there are exactly two activities at an average JSM that dictate in-person presence: cheering at the award ceremonies and browsing the new books. Confidential coffee (job search, editorial boards) can be done with burner phones.

Committee meetings should be (must be) Zoom calls; nobody is going back to in-person on that one. Having the presentations and files in advance or right after the event is a level of awesomeness never achieved by the conferences of yesteryear.

Found yourself in a session that’s a poor match? Just click “All agenda” and find something else.

Stas Kolenikov

@StatStas

26 Feb

https://twitter.com/kareem_carr/status/1364963288610652161

Responses indicate that even statistical professionals have zero clue as to what it takes to field a survey of 1000 randomly selected Americans every week. Proposals to have 50,000 every week would put the sample sizes on par with the American Community Survey ($250M / year).

I'll expand on this a little bit.

1. The sample size: The rate of new cases in the U.S. right now is about 20 new cases per day per 100K. Thus a sample of n=1000 would capture cases at the Poisson rate of (20 cases / 100K pop * 7 days * 1000 in sample) = 1.4 per week. The prediction interval around that is...
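
The interval itself is cut off in this preview, so the following is only a sketch of how one could work it out, not the thread's own calculation. It evaluates the product as written (20 / 100,000 * 7 * 1000 = 1.4 expected cases per weekly sample) and reads off equal-tailed Poisson quantiles. The 95% level is an assumption, and the sketch ignores design effects, nonresponse, and test accuracy.

```python
from math import exp


def poisson_quantile(mean, q):
    """Smallest count k such that P(X <= k) >= q for X ~ Poisson(mean)."""
    k = 0
    pmf = exp(-mean)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mean / k  # recurrence: P(X = k) = P(X = k - 1) * mean / k
        cdf += pmf
    return k


daily_rate_per_100k = 20  # new cases per day per 100K, as stated in the tweet
days = 7
sample_size = 1000

expected_cases = daily_rate_per_100k / 100_000 * days * sample_size
low = poisson_quantile(expected_cases, 0.025)
high = poisson_quantile(expected_cases, 0.975)
print(expected_cases, low, high)
```

Under these assumptions a weekly sample of 1000 would be expected to contain anywhere from 0 to 4 new cases, which is far too noisy to track week-to-week changes in incidence.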

