We recently published our first paper sharing validity evidence for the development of neurological emergency simulations for assessment. Are you familiar with forms of validity evidence? If you are an educator, you should be! A thread… rdcu.be/ceMm3
Steven Downing wrote a fantastic review on validity as it pertains to assessment in medical education. Let’s review the highlights! pubmed.ncbi.nlm.nih.gov/14506816/
As Downing states, validity is the sine qua non of assessment. It is approached as a hypothesis. No assessment is “valid” or “invalid” -> assessments have scores with more or less validity evidence to support interpretations.
Assessment data can be more or less valid for any specific purpose, at any specific time, for any specific population. For instance, measuring IQ by asking what’s missing from a picture of a tennis match would be less valid for examinees unfamiliar with the game.
Validity requires multiple sources of evidence. In our paper we used Messick’s framework of validity, which includes 5 sources: content, response process, internal structure, relationship to other variables, and consequences.
Content evidence: the relationship between test questions and the course objectives/scientific domains that are to be assessed. Do question items adhere to evidence-based principles? Are the item-writers content experts? Are there sufficient questions to adequately sample the domain? Etc.
We had board-certified experts with subspecialty training develop cases and checklists. We based content choices on the Neurocritical Care Society’s Emergency Neurological Life Support course, cross-referenced with other relevant guidelines.
Response process: data integrity, such that sources of error associated with test administration are controlled or eliminated as far as possible. Includes documentation of quality-control procedures, key validation, and rationale for scoring methods.
We pre-briefed all participants, provided sim operator training, piloted the cases, and utilized a nurse confederate to clarify orders and prompt ddx. Rating was completed using checklists and global rating scales with attention to interrater reliability (see Internal structure)
Internal structure: the statistical or psychometric characteristics of the questions or performance prompts. Includes item analysis (computes difficulty of each item, discrimination of each question, etc.), reliability testing, and evaluation for bias.
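To make item analysis concrete, here is a minimal sketch of two classic statistics: item difficulty (the proportion of examinees who got the item right, so higher means easier) and an upper-lower discrimination index. The score matrix, the function names, and the half-split grouping are assumptions for illustration, not data or methods from our paper.

```python
# Hypothetical toy data: rows are examinees, columns are items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]


def item_difficulty(responses, item):
    """Proportion of examinees answering the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def discrimination_index(responses, item):
    """Upper-lower index: p(correct) in the top-scoring half minus
    p(correct) in the bottom-scoring half (split chosen for simplicity)."""
    ranked = sorted(responses, key=sum, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    p_upper = sum(row[item] for row in upper) / half
    p_lower = sum(row[item] for row in lower) / half
    return p_upper - p_lower


for item in range(len(responses[0])):
    print(item, item_difficulty(responses, item), discrimination_index(responses, item))
```

An item that strong examinees get right and weak examinees miss has a discrimination index near 1. An index near 0 (or negative) flags an item worth reviewing.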
Although we did not do an in-depth analysis of internal structure (coming soon!), we did show, in a subset of 50 cases, 82% agreement between raters on 1073 critical action checklist items (kappa = 0.64). Global rating scale ratings were strongly correlated (Pearson correlation = 0.70).
Reliability is such an important source of validity evidence that it deserves a deeper dive. We can’t draw large conclusions from assessments without being sure that our scores are reliable and reproducible. pubmed.ncbi.nlm.nih.gov/15327684/
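Why report kappa alongside raw agreement? Kappa discounts the agreement two raters would reach by chance alone. A minimal sketch of Cohen's kappa for two raters follows. The ratings are made-up toy values, not our study data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical checklist ratings from two raters (1 = action done, 0 = not done).
rater_a = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
rater_b = [1, 1, 0, 0, 1, 0, 1, 1, 1, 1]


def percent_agreement(r1, r2):
    """Fraction of items on which both raters gave the same rating."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement).
    Undefined when chance agreement is exactly 1 (both raters constant)."""
    n = len(r1)
    p_observed = percent_agreement(r1, r2)
    counts_1, counts_2 = Counter(r1), Counter(r2)
    categories = set(counts_1) | set(counts_2)
    p_chance = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


print(percent_agreement(rater_a, rater_b))
print(round(cohens_kappa(rater_a, rater_b), 3))
```

In this toy example the raters agree on 80% of items, yet kappa is only about 0.52, because both raters mark most actions as done and so would often agree by chance.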
Relationship to other variables: How does our assessment’s score correlate with an existing, accepted measure? Vascular neurologist scores on AIS and ICH sims should correlate, but scores on AIS and TBI sims may not.
Consequences: the impact of the assessment on examinees. High-stakes exams (USMLE Step 1, for instance) have tremendous impact on futures. Passing rates and the appropriateness thereof (including the process used to determine cutoffs) are examples of consequential validity evidence.
In our manuscript on the development of neurological emergency simulations for assessment we described content and response process evidence (with a little bit of internal structure via interrater reliability). We hope to publish evidence for other sources of validity soon!
Remember: tests have scores with more or less evidence to support interpretations that are unique to a specific purpose, time, and population. Messick’s 5 sources of validity evidence: content, response process, internal structure, relationship to other variables, and consequences.
