Roger Pielke Jr. @RogerPielkeJr
OK, this starts a substantial thread on the science & politics of the IAAF testosterone regulations, formally titled: ELIGIBILITY REGULATIONS FOR THE FEMALE CLASSIFICATION (ATHLETES WITH DIFFERENCES OF SEX DEVELOPMENT), put forward last April.
These regulations are fatally flawed.
First a bit of background.
Here is the 800m final at the 2009 IAAF World Championships in Berlin
This is when Caster Semenya first came to broad international recognition:
Controversy over Semenya was immediate and poorly handled by IAAF, details of which are well-covered. Here is a contemporary TV news story:
In May, 2011 the IAAF implemented new regulations on testosterone, which I'll call T1 regulations as they were the first attempt by IAAF to regulate (and as we will see, a failed attempt).…
During the summer of 2014 an Indian athlete named Dutee Chand was suspended under the IAAF T1 regulations. She immediately contested the regulations at the Court of Arbitration for Sport. Here is an excellent 5 min video on Chand:
Chand won her case at the CAS against the IAAF, which resulted in a suspension of the IAAF T1 regulations. This set the stage for IAAF to put forward a second incarnation of the T regulations, which are currently being debated.
Chand CAS decision:…
Arbitration under CAS is not precedential (no stare decisis); however, the current challenge to the IAAF T2 regulations brought by Semenya & Athletics South Africa follows directly from the Chand case.
On stare decisis & CAS see:…
I'll explain how the CAS Chand judgment set the stage for our identification of the fatal flaw underpinning the current IAAF T2 regulations. Details matter.
CAS pointed to the Olympic Charter, which sets a high bar for discriminating against athletes.…
In the Chand judgment, CAS concluded that (a) discrimination by IAAF against certain individuals is allowable, but (b) if and only if such discrimination is "clearly established to be a necessary and proportionate means of achieving fair competition."
CAS interpreted the IAAF T1 regulations to be premised on the belief that, due solely to elevated T, some female athletes have a performance advantage over other female athletes "of comparable significance" to the performance advantage males have over females.
CAS concluded that a legally recognized female "must be permitted" to compete with females "unless" her T level conferred a performance advantage over other females comparable to male performance advantage over females.
CAS asked: Well, does T confer such a performance advantage?
CAS was careful to note this issue is not about drawing a line between male/female (i.e., gender classification).
It is about creating a new categorization within the female classification. CAS wonders if IAAF can demonstrate a correlation between T & performance among females.
CAS observed that there was in fact "no available evidence" on the postulated performance advantage that high T women enjoy over other women. CAS noted that if such an advantage were 10-12%, it would be a different matter "rather than, say 1%".
CAS: "The numbers therefore matter"
The critical role of numbers (on the purported performance advantage of high T females over other females) is what sets the stage for the promulgation of IAAF T2 regulations.
CAS explained that it is not just the existence of a demonstrated performance advantage of high T females over others that is important here: "the degree or magnitude of the advantage is therefore critical".
CAS ruled in favor of Chand because IAAF offered no evidence showing a performance advantage of high T females over other females "of the same order as that of a male athlete".
Rather than simply end the arbitration with a judgment in Chand's favor, CAS decided to give IAAF another bite of the apple, carefully explaining what evidence would be necessary and sufficient to uphold testosterone regulations.
CAS found in Chand's favor & gave IAAF 2 years to come up with the right data.
It is highly irregular to see an organization seeking regulations that to date lack an evidence base, tasked with coming up with evidence in support of those regulations.
What could possibly go wrong?
So IAAF got to work producing new evidence.
The new analysis was published by @BJSM_BMJ less than 2 weeks before the CAS deadline.
Bermon & Garnier 2017, hereafter BG17…
How do we know that this analysis is the evidence IAAF produced in response to CAS Chand decision?
Easy: Because it is the only data on differences in performance between high T females and others cited in the IAAF T2 regulations…
I reviewed BG17 when it came out:…
I was struck by the fact that of the 24 female athletes it identified as having high T, almost 40% (9) were known dopers.
For a paper purporting to identify the effects of naturally occurring high T ... this seemed... odd.
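As a quick check on the "almost 40%" figure, here is a minimal sketch of the arithmetic in Python. It uses only the two counts quoted above (24 high T athletes, 9 known dopers); the function and variable names are my own, chosen for illustration.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as quoted in the thread; nothing else is assumed.
high_t_athletes = 24
known_dopers = 9

doper_share = share(known_dopers, high_t_athletes)
print(f"{doper_share:.1f}% of the high T group were known dopers")
```

This prints 37.5%, consistent with "almost 40%".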
In April 2018 IAAF released its T2 regulations & it was clear that BG17 was the entire empirical basis.
We could not make sense of the BG17 summary stats so we requested their performance data.
After getting no reply & involving the BJSM editor, we went public:…
After we persisted, Dr. Bermon sent us some data (25%) in July 2018, one day before BJSM published a letter from Bermon et al. (BHKE18) with a "do over" of BG17.

As you will see, BG17 was based on deeply flawed data, so bad that IAAF had no choice but to perform a "do over" analysis.
With BG17 data, we sought to reproduce results & recreate the dataset based on publicly available info.
What we found was shocking:
In 4 events that we recreated (those events being regulated) we identified 17%-33% bad data.
It was obvious that BG17 is a fatally flawed paper.
Our shock had just begun.
We notified Dr. Bermon & @BJSM_BMJ expecting the paper would be retracted. Errors happen & science is strong because it is self-correcting.
Oddly, BJSM refused to retract the flawed paper or to even require Bermon to share further data.
The Bermon et al "do over" letter (BHKE18…) admitted to 230 flawed data points in the BG17 female data (~20%) across every event.
Actually, if you count, BHKE18 dropped only 220 data points (not 230), but who is counting? (Well, we are.)
Based on our re-creation it seems clear that BHKE18 did not identify all the bad data in BG17, and some of it has made its way into BHKE18.
Stranger still, Sebastian Coe, Dr. Bermon & @BJSM_BMJ editor Karim Khan have each claimed the results of BHKE18 are identical to BG17 (suggesting, I guess, that the data errors don't matter).
This is just wrong, as we have shown.
Differences between BG17 & BHKE18 results below
Our analysis (…)
has been widely shared, for instance via the NYT:…

Neither IAAF nor @BJSM_BMJ has questioned the scientific merit of our analysis.

Yet, IAAF is pushing ahead with its regulations.
A side note: even if we were to accept the IAAF "do-over" as solid science (it's not, but play along), the performance advantage of high T females over low T females is only 1.6% across the 4 regulated events, smaller than the low T advantage over high T in sprints (2.0%)!
Bottom line:
The dispute over IAAF T2 regulations is not about testosterone or gender.
It is now about scientific integrity.
In the Chand arbitration CAS forcefully argued that evidence matters.
We will find out in the Semenya/ASA arbitration if it still matters.
It should.