More from @StatStas

Stas Kolenikov at #JSM2021

@StatStas

12 Aug

#JSM2021 panel led by @minebocek on upskilling for a statistician -- how to learn??

@minebocek #JSM2021 @hglanz no shortage of stuff to learn. First identify what you don't know -- that comes from modern media (blogs, twitter, podcasts; groups, communities -- @RLadiesGlobal or local chapters; professional organizations -- @AmstatNews ).

@minebocek @hglanz @RLadiesGlobal @AmstatNews #JSM2021 @hglanz What do the job postings require these days? (This is how the content for the @CalPoly stat/data science program was developed.)

Read 63 tweets

Stas Kolenikov at #JSM2021

@StatStas

12 Aug

#JSM2021 an exceptionally rare case of ACTUAL out of sample prediction in #MachineLearning #ML #AI: two rounds of the same health data collection by @CDCgov

@CDCgov Yulei He @CDCgov #JSM2021 RANDS 1 (fall 2015) + 2 (spring 2016): Build models on RANDS1 and compare predictions for RANDS2

ridge, lasso, elastic net, PLS, KNN, bagging, RF, GBM, XGBoost, SVM, deep learning

#JSM2021 Yulei He R-square about 30%; random forests and gradient boosting reduce the prediction error by about 4%, shrinking towards the mean; standard errors are way too small (about 50% smaller than they should be)

Read 4 tweets

Stas Kolenikov at #JSM2021

@StatStas

10 Aug

#JSM2021 @jameswagner254 Using Machine Learning and Statistical Models to Predict Survey Costs -- presentation on the several attempts to integrate cost models into responsive design systems

#JSM2021 @jameswagner254 Responsive designs operate on indicators of errors and costs. Error indicators: R-indicator, balance indicators, FMI, sensitivity to ignorability assumptions (@bradytwest @Rodjlittle Andridge papers).

@jameswagner254 #JSM2021 @jameswagner254 Cost indicators? more difficult; proxies: # of attempts (Groves & Heeringa 2006)

Some decisions are made at the sample level (launch a new replicate, switch to a new phase of the follow-up protocol), others at the case level (change incentive amount, change mode)

Read 6 tweets

Stas Kolenikov at #JSM2021

@StatStas

10 Aug

Now let's see how @olson_km is going to live tweet while giving her own #JSM2021 talk

@olson_km #JSM2021 @olson_km Decisions in survey design: questions of survey errors and questions of survey costs. Cost studies are hard: difficult to offer experimental variation of design features, with a possible exception of incentives. Observational examinations are more typical.

#JSM2021 @olson_km When you have one (repeated) survey at a time, you can better study the impacts of variable design features (but can't provide the basis for the features that do not vary.)

Read 12 tweets

Stas Kolenikov at #JSM2021

@StatStas

10 Aug

#JSM2021 virtual vs. in-person: IMO there are exactly two activities at an average JSM that dictate in-person presence: cheering at the award ceremonies and browsing the new books. Confidential coffee (job search, editorial boards) can be done with burner phones.

Committee meetings should be/must be Zoom calls; nobody is going back to in-person on that one. Having the presentations/files in advance/right after the event is a level of awesomeness never achieved by the conferences of yesteryear.

Found yourself in a session that’s a poor match? Just click “All agenda” and find something else.

Read 4 tweets

Stas Kolenikov at #JSM2021

@StatStas

26 Feb

https://twitter.com/kareem_carr/status/1364963288610652161

Responses indicate that even statistical professionals have zero clue as to what it takes to have a survey of 1000 randomly selected Americans every week. Proposals to have 50,000 every week would put the sample sizes on par with American Community Survey ($250M / year).

I'll expand on this a little bit.

1. The sample size: The rate of new cases in the U.S. right now is about 20 new cases per day per 100K. Thus a sample of n=1000 would capture cases at the Poisson rate of (20 cases / 100K pop * 7 days * 1000 in sample) = 1.4 per week. The prediction interval around that is...
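
The arithmetic in point 1 can be checked with a short sketch. Multiplying the stated figures (20 cases per day per 100K, 7 days, n=1000) gives an expected weekly count of about 1.4 cases in the sample. The interval computed below is an illustration only: the 95% level, the central-quantile construction, and the idealized setup (simple random sample, every case detected) are assumptions, not something stated in the thread.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


def poisson_quantile(p, lam):
    """Smallest integer k with P(X <= k) >= p."""
    k = 0
    while poisson_cdf(k, lam) < p:
        k += 1
    return k


# Figures quoted in the thread.
daily_rate_per_100k = 20
days = 7
sample_size = 1000

# Expected number of new cases observed in the weekly sample.
expected_cases = daily_rate_per_100k / 100_000 * days * sample_size

# Assumed 95% central prediction interval for the weekly count.
interval_95 = (
    poisson_quantile(0.025, expected_cases),
    poisson_quantile(0.975, expected_cases),
)

print(f"expected weekly cases in sample: {expected_cases:.2f}")
print(f"95% prediction interval: {interval_95}")
```

With an expected count this small, the interval spans a handful of cases at most and includes zero, which is the point about n=1000 being too small to track weekly incidence.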

Read 35 tweets
