Andrew Ho
Oct 22 · 18 tweets · 9 min read
Let's talk "misNAEPery": common misuses of #NAEP results. Here are 3 types of misNAEPery to look out for on Monday's "NAEP Day":
1) correlation-is-causation (@EduGlaze's original definition)
2) psychometric misNAEPery
3) one-true-outcome misNAEPery.
🧵 1/
For each of these misNAEPeries, I try to distinguish between "high crimes" and "misdemeanors."
I used to get a little too gleeful in pointing out misNAEPery.
I now try to ask, "does it really matter?" or "what's the end goal?" before calling someone out for something "wrong." 2/
Type 1 misNAEPery: My leadership or policy caused these NAEP results. @EduGlaze coined misNAEPery in 2013 to refer to this common, predictable tendency among leaders, reporters, and commentators. ggwash.org/view/31061/bad… 3/
So is NAEP useless because we can't conclude anything about remote learning or charter schools or political leadership? No. My stance is closer to @MichaelPetrilli's here. NAEP is essential. fordhaminstitute.org/national/comme… 4/
For Type 1 misNAEPery, what's a "high crime"? Taking credit for high NAEP scores (with known correlations with wealth in our @seda_data).
But speculating on the basis of positive *trends* and declining inequality reduces this to a "misdemeanor," to me. 5/ edopportunity.org/explorer/#/cha…
To avoid Type 1 misNAEPery, wait for researchers to estimate policy effects after eliminating alternative explanations for results, like @ProfTDee and @BrianJacobProf did with NAEP, and @CEDR_US and coauthors did with NWEA data. 6/
onlinelibrary.wiley.com/doi/abs/10.100…
caldercenter.org/sites/default/…
Type 2 misNAEPery is psychometric misNAEPery: misuse of different NAEP scores. This includes comparing proficiency %s, collapsing results across subjects and grades, neglecting statistical significance, and translating effect sizes to other metrics like "months of learning." 7/
Monday, comparisons of proficiency %s will be everywhere. Are they wrong? Unfortunately, often, yes. 8/ journals.sagepub.com/doi/10.3102/00…
A rule of thumb for psychometric misNAEPery is, "beware the differences in differences." Any one score metric is probably fine. A difference in metrics is a "misdemeanor." And a difference in differences becomes a "high crime," if you don't cross-check with means, first. 9/
Unfortunately, differences-in-differences are often important.
Comparing pandemic trends across states? Diff-in-diff.
Comparing subgroup score inequality over time? Diff-in-diff.
These require cross-checking with means or reporting means outright. 10/ gse.harvard.edu/news/uk/15/12/…
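To see why cross-checking with means matters, here is a minimal simulation (hypothetical cohorts and cut score, not actual NAEP data): two cohorts get the exact same 5-point mean gain, yet their "% proficient" changes diverge sharply, because one cohort has far more students sitting near the cut score.

```python
import random

random.seed(0)

def proficiency_rate(scores, cut=250):
    """Fraction of students at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)

# Cohort A is centered just below the hypothetical cut (250);
# Cohort B is centered well above it. Same spread, same mean gain.
cohort_a = [random.gauss(248, 30) for _ in range(100_000)]
cohort_b = [random.gauss(310, 30) for _ in range(100_000)]

gain = 5  # identical mean gain applied to every student in both cohorts
a_before, a_after = proficiency_rate(cohort_a), proficiency_rate([s + gain for s in cohort_a])
b_before, b_after = proficiency_rate(cohort_b), proficiency_rate([s + gain for s in cohort_b])

print(f"Cohort A: % proficient {a_before:.1%} -> {a_after:.1%} (change {a_after - a_before:+.1%})")
print(f"Cohort B: % proficient {b_before:.1%} -> {b_after:.1%} (change {b_after - b_before:+.1%})")
# Mean gains are identical by construction, but Cohort A's proficiency
# rate jumps several times more, because more of its students sit near
# the cut. A naive diff-in-diff on proficiency %s would call these
# cohorts very different when their mean trends are the same.
```

This is the bias that a cross-check with means catches: the proficiency metric confounds real change with where the distribution happens to sit relative to the cut score.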
Problems with proficiency are why I consider "months of learning" metrics to be "misdemeanors," and the lesser of two evils.
e.g., my "Rule of 27" is a transparent calculation that doesn't suffer from the bias that proficiency rate comparisons have. 11/
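The thread doesn't spell out the "Rule of 27" calculation itself, so here is only the general shape of a transparent scale-score-to-months conversion. The constants below (a 10-point hypothetical annual gain over a 9-month school year) are purely illustrative, not Ho's actual rule:

```python
def months_of_learning(score_diff, annual_gain=10.0, months_per_year=9.0):
    """Convert a scale-score difference into 'months of learning'.

    annual_gain is a HYPOTHETICAL average one-year gain in scale points;
    a real conversion depends on grade, subject, and the chosen baseline.
    The point is transparency: the whole calculation is one visible ratio.
    """
    return score_diff / annual_gain * months_per_year

print(months_of_learning(5))  # a 5-point difference -> 4.5 "months"
```

Unlike proficiency-rate comparisons, a conversion like this applies the same linear transformation everywhere on the scale, so it doesn't depend on where students sit relative to a cut score.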
Similarly, I consider aggregating sensibly across subjects and grades to be a "misdemeanor," as we do in our "pooling model" for @seda_data, a simple index of academic educational opportunity. I hope you'll let us off with a warning. 12/
edopportunity.org/explorer/#/map…
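SEDA's actual pooling model is more sophisticated (a multilevel model across subjects, grades, and years), but the core aggregation idea can be sketched simply, using hypothetical district scores: standardize each subject-grade cell across districts, then average the z-scores into one index.

```python
import statistics

# Hypothetical district scores by subject and grade (NAEP-like scale points).
scores = {
    "district_1": {"math_g4": 242, "math_g8": 284, "read_g4": 221, "read_g8": 263},
    "district_2": {"math_g4": 236, "math_g8": 278, "read_g4": 215, "read_g8": 258},
    "district_3": {"math_g4": 249, "math_g8": 290, "read_g4": 228, "read_g8": 269},
}

cells = list(next(iter(scores.values())))  # the subject-grade combinations

def standardize(cell):
    """z-score one subject-grade cell across districts."""
    vals = [scores[d][cell] for d in scores]
    mu, sd = statistics.mean(vals), statistics.pstdev(vals)
    return {d: (scores[d][cell] - mu) / sd for d in scores}

z = {cell: standardize(cell) for cell in cells}

# Pooled index: each district's average z-score across subjects and grades.
pooled = {d: statistics.mean(z[c][d] for c in cells) for d in scores}
for d, idx in sorted(pooled.items(), key=lambda kv: -kv[1]):
    print(f"{d}: pooled index {idx:+.2f}")
```

Standardizing first puts every subject-grade combination on a common footing, so no single cell dominates the index just because its scale runs higher.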
Type 3 misNAEPery is "one-true-outcome" misNAEPery, forgetting that NAEP scores and the Reading and Mathematics skills they measure are among many outcomes we desire for our children.
There is lots of middle ground between "NAEP is everything" and "NAEP is nothing." 13/
To me, it's a "high crime" to decontextualize NAEP entirely from other outcomes like physical health, social/emotional health, and other academic subjects. But it's also a "high crime" to dismiss NAEP entirely. It is important for kids to read and reason quantitatively. 14/
But I think it's fine, more of a "misdemeanor," to generalize from NAEP scores to "academic outcomes" after establishing this context. I've used a "tip of the iceberg" metaphor previously to remember what we're not measuring. 15/
news.harvard.edu/gazette/story/…
My favorite metaphor for NAEP is that NAEP is like "the North Star." It's just one star in the sky, but its consistency and dependability help us to navigate. I credit my fellow @GovBoard alumnus (and kamaʻāina) Frank Fernandes with the metaphor. 16/
So, on Monday, beware misNAEPery of all 3 types. Distinguish between "high crimes" and "misdemeanors." (Save your outrage for the "high crimes.") And then, as I've said before, let's keep increasing our support of education. 17/
news.harvard.edu/gazette/story/…
And, in case you missed my thread yesterday on "Why is NAEP Monday important?", see here: 18/18

