Hello #epitwitter and #statstwitter,
There was a recent discussion about the Hosmer-Lemeshow goodness of fit test. I thought it would be interesting to talk to Dr. Lemeshow (who is not on Twitter) about his thoughts on the test. This thread has some highlights from our chat. 1/n
In the late 1970s, Hosmer and Lemeshow were struggling with the question, “How do you know that the probabilities produced by logistic regression models reflect reality?” This was the motivation for developing the Hosmer-Lemeshow goodness of fit test. 2/n
As with linear regression, we can’t judge fit from the model’s estimates alone. For linear regression we can look at a plot (e.g., a residual plot) to assess model fit. 3/n
At the time there was nothing comparable for logistic regression, and no one had made any attempt to determine whether the models were accurate. 4/n
The test was developed to compare the number of observed events in categories (defined by deciles of estimated probabilities) with the number of expected events based on the model. 5/n
Similar observed and expected counts suggest that the probabilities produced by the model reflect the true outcome experience in the data. If the counts are not similar, the model is not producing accurate probabilities. 6/n
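The comparison described in the last two tweets can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `hosmer_lemeshow`, the equal-size grouping rule, and the chi-square reference distribution with `groups - 2` degrees of freedom (the conventional choice) are assumptions of the sketch, and the closed-form tail probability it uses only holds when the degrees of freedom are even.

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees_of_freedom, p_value).

    y: binary outcomes (0 or 1); p: model-estimated probabilities.
    Subjects are sorted by p and split into `groups` roughly equal bins.
    """
    if groups < 4 or groups % 2:
        raise ValueError("this sketch needs an even number of groups >= 4")
    pairs = sorted(zip(p, y))  # order subjects by estimated probability
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one subject per group")
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)  # observed events in the bin
        expected = sum(pi for pi, _ in chunk)  # expected events under the model
        # Combined contribution of events and non-events in this bin
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    df = groups - 2
    # Chi-square upper tail; this closed form is valid for even df only
    half = stat / 2
    p_value = math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )
    return stat, df, p_value
```

A small statistic (large p-value) means observed and expected counts agree across the bins; a large statistic means they do not. In practice a statistics package would handle ties, empty bins, and odd group counts, which this sketch does not.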
The first publication was in 1980, titled ‘Goodness of fit tests for the multiple logistic regression model’ 7/n
…dfonline-com.proxy.lib.ohio-state.edu/doi/pdf/10.108…
This was followed by ‘A review of goodness of fit statistics for use in the development of logistic regression models’, in the January 1982 issue of the American Journal of Epidemiology. 8/n doi.org/10.1093/oxford…
The H-L test need not be run every time a logistic regression is run. The test does not indicate whether an estimated odds ratio is biased or accurate. It does provide useful insight if you are interested in estimating the probability of an event from a logistic regression. 9/n
The H-L test is also not a good choice in a case-control study, as we already know the probability of an event based on how we designed the study. The test should be used in a cohort or cross-sectional study when trying to estimate the probability of the event. 10/n
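A quick way to see the case-control point: when cases and controls are sampled at different rates, the event probability among sampled subjects is set by the design. The sketch below uses a standard result for case-control sampling that the thread itself does not spell out (the log-odds shift by the log of the ratio of sampling fractions); the function names and the numbers are made up for illustration.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(z):
    return 1 / (1 + math.exp(-z))


def sampled_probability(true_prob, case_fraction, control_fraction):
    """Event probability among sampled subjects.

    case_fraction / control_fraction: share of cases / controls in the
    source population that end up in the study (hypothetical inputs).
    """
    shift = math.log(case_fraction / control_fraction)
    return expit(logit(true_prob) + shift)


# Hypothetical design: 1% risk in the population, every case is sampled,
# and controls are sampled so that the study has one control per case.
in_study = sampled_probability(0.01, 1.0, 0.01 / 0.99)
print(in_study)
```

Under these assumptions the probability a model would estimate in the study sample reflects the 1:1 design rather than the 1% population risk, which is why checking calibration of those probabilities is not informative there.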
Dr. Lemeshow warns that the test is overly powerful for large data sets: it tends to reject the hypothesis of good fit too frequently, suggesting that the model is poor when, in fact, the probabilities are reasonably accurate. 11/n
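The large-sample behavior can be illustrated with simple arithmetic. The sketch below is idealized and all of its inputs are made up: it assumes ten equal-size groups, that the observed event proportion in every group sits exactly 0.01 above the model's mean probability, and it compares the statistic with 15.507, the 0.95 quantile of a chi-square distribution with 8 degrees of freedom.

```python
def hl_statistic_fixed_gap(n_per_group, mean_probs, gap):
    """H-L style statistic when every group's observed proportion
    exceeds the model's mean probability by the same fixed gap."""
    stat = 0.0
    for p in mean_probs:
        expected = n_per_group * p          # expected events under the model
        observed = n_per_group * (p + gap)  # idealized observed events
        stat += (observed - expected) ** 2 / (n_per_group * p * (1 - p))
    return stat


# Hypothetical mean probabilities for ten groups: 0.05, 0.15, ..., 0.95
MEAN_PROBS = [0.05 + 0.10 * k for k in range(10)]
CRITICAL_8DF = 15.507  # 0.95 quantile of chi-square with 8 df

small = hl_statistic_fixed_gap(100, MEAN_PROBS, 0.01)     # 1,000 subjects
large = hl_statistic_fixed_gap(10_000, MEAN_PROBS, 0.01)  # 100,000 subjects
print(small < CRITICAL_8DF, large > CRITICAL_8DF)
```

With the same one-percentage-point gap, the statistic grows in proportion to the sample size, so a discrepancy that goes unnoticed at 1,000 subjects leads to rejection at 100,000.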
There is work being done to improve this. Look for an upcoming article in Biometrics by Nattino, Pennell and Lemeshow. This paper focuses on calibration belts, which offer an alternative way to assess model calibration without having to create groups. 12/n
A routine is available in Stata that makes implementing this procedure (following the fit of a logistic regression model) extremely easy. 13/n
stata-journal.com/article.html?a…
On the topic of reviewers insisting that authors report the H-L test every time a logistic regression model is fit, Dr. Lemeshow suggests pushing back. If you aren’t interested in the probability of events from the model, then the H-L test isn’t necessary. 14/n
I hope this was an interesting thread! /fin
Thread by Sara Conroy, PhD