Quick thread about re-randomization in randomized trials, illustrating the following two simple points:
*) re-randomization for balance is sometimes fine and doesn't affect significance thresholds
*) re-randomization for balance can sometimes cause big artifacts.

1/14
First scenario: consider a clinical trial where we randomize individuals independently (by individual coin flip, say) to treatment or control groups.

What if we re-randomize until the two groups have exactly the same size?
2/14
This is the same as choosing an equal split uniformly at random, and standard statistical approaches still apply without problems.

The situation is similar if one re-randomizes until the group sizes are within some threshold of each other, or until this holds within subgroups (sex, race, etc.).
3/14
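A minimal sketch of why re-randomizing until the groups are the same size is the same as choosing an equal split uniformly at random. It assumes a toy trial of 6 individuals (a number picked only for illustration) and enumerates every coin-flip outcome exactly rather than simulating.

```python
from fractions import Fraction
from itertools import product

N = 6  # toy number of individuals (an assumption for illustration)

# Every sequence of N fair coin flips is equally likely: probability (1/2)^N.
flip_prob = Fraction(1, 2 ** N)
all_assignments = list(product([0, 1], repeat=N))  # 1 = treatment, 0 = control

# Re-randomizing until the arms are equal is conditioning on this event.
accepted = [a for a in all_assignments if sum(a) == N // 2]
accept_prob = flip_prob * len(accepted)

# Conditional probability of each accepted assignment.
conditional = {a: flip_prob / accept_prob for a in accepted}

print(len(accepted))              # how many equal splits exist
print(set(conditional.values()))  # a single value: all equal splits are equally likely
```

Because every accepted assignment started with the same probability, conditioning cannot favor one equal split over another.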
Re-randomization can be reformulated as conditioning on a certain event (the condition of acceptance), and in these particular cases above, the corresponding event gives rise to a conditional probability space with well-understood and definable structure.
4/14
In the case of cluster trials, I might imagine re-randomizing to balance the number of "big" or "small" villages in each trial arm. This would be equivalent to the similar examples above.

It is tempting, but incorrect, to think that balancing summary statistics is always OK too.
5/14
Example: Suppose I conduct a cluster randomized trial in 20 villages, whose populations are
1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,11
(nine have pop 1, ten have pop 2, and one has pop 11).

If I define a notion of a "big village" and re-randomize to balance big villages: 👍
6/14
If I run my trial and find that all the treatment villages do better than all the control villages, this will be a highly significant finding at something like p=.000001, and this subgroup balancing will not have a big effect on that.

7/14
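A rough check of the order of magnitude, assuming a plain 10:10 randomization, a one-sided rank-based permutation test, and no ties in village outcomes (the thread does not name a specific test, so these are assumptions):

```python
from math import comb

n_villages = 20
n_treated = 10

# Number of equally likely ways to choose which 10 of the 20 villages are treated.
n_splits = comb(n_villages, n_treated)

# Under the null (treatment does nothing), "every treated village outranks every
# control village" happens for exactly one of those splits.
p_one_sided = 1 / n_splits

print(n_splits)
print(p_one_sided)
```

Under these assumptions the p-value is a few in a million, in the same ballpark as the figure quoted above.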
Suppose now, instead, that I choose a random 10:10 split of villages but re-randomize to balance the mean or total population of the villages in the two groups, rather than balance in predefined subgroups like "big" or "small". Seems okay, right?

8/14
But this situation is a disaster.

If I re-randomize a *lot* of times, I will essentially always find one of two splits:
11,1,1,1,1,1,1,1,1,1 vs 2,2,2,2,2,2,2,2,2,2, or
2,2,2,2,2,2,2,2,2,2 vs 11,1,1,1,1,1,1,1,1,1

In particular, all the randomness in my trial is now 1 coin flip.
9/14
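The claim that only these two assignments survive can be checked by brute force. This sketch enumerates every 10:10 split of the 20 villages and keeps the ones whose arms have exactly equal total population (the variable names are illustrative):

```python
from itertools import combinations

populations = [1] * 9 + [2] * 10 + [11]  # the 20 villages from the example
total = sum(populations)                 # 40 people overall

balanced = []
for treated in combinations(range(len(populations)), 10):
    treated_total = sum(populations[i] for i in treated)
    if 2 * treated_total == total:       # both arms hold 20 people
        balanced.append(sorted(populations[i] for i in treated))

# Each entry is the treatment arm of one perfectly balanced assignment.
for arm in balanced:
    print(arm)
```

Out of 184,756 possible assignments, only two are perfectly balanced, and they are the same split with the arm labels swapped.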
If, for example, I am studying an infectious disease that is present in every village with more than 1 person and absent from every village with exactly 1 person, I have a 50% chance of finding 9/10 of my treatment villages healthy while 10/10 of the controls are sick.

10/14
With standard tests this would appear to be extremely significant at something like p=.0001 but really it is a 50% chance event.

To conclude anything better, I have to make tacit assumptions about the generative distribution of village sizes, or its effect on my study outcome.
11/14
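As a sanity check on the p=.0001 figure, here is a sketch that treats Fisher's exact test as the "standard test" (an assumption, since the thread does not name one) and applies it to the table of 9 healthy and 1 sick treatment villages versus 0 healthy and 10 sick control villages:

```python
from fractions import Fraction
from math import comb

healthy_total, sick_total = 9, 11  # column margins of the 2x2 table
arm_size = 10                      # villages per arm
observed_healthy_treated = 9

def table_prob(k):
    """Hypergeometric probability that k of the healthy villages are treated."""
    return Fraction(
        comb(healthy_total, k) * comb(sick_total, arm_size - k),
        comb(healthy_total + sick_total, arm_size),
    )

p_observed = table_prob(observed_healthy_treated)

# One-sided: tables with at least as many healthy treated villages.
p_one_sided = sum(table_prob(k) for k in range(observed_healthy_treated, healthy_total + 1))

# Two-sided: all tables no more likely than the observed one.
p_two_sided = sum(
    table_prob(k) for k in range(healthy_total + 1) if table_prob(k) <= p_observed
)

print(float(p_one_sided))
print(float(p_two_sided))
```

The test assumes every 10:10 assignment was equally likely, which is exactly what the re-randomization broke.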
Even if I don't re-randomize a lot of times, it is still true that the more times I re-randomize, the more the pop-1 villages will cluster with each other (and with the pop-11 village) and, conversely, the more the pop-2 villages will cluster together.

12/14
If I re-randomize 100 times, my split will essentially be a 1-in-100 outlier with respect to how many pop-1 villages cluster on one arm. In the scenario above, where villages with only 1 person are all free of disease, I can easily get a p=.01 outcome just from the re-randomization.
13/14
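A small simulation sketch of this clustering effect. It assumes "re-randomizing k times" means drawing k uniformly random 10:10 splits and keeping the one with the smallest difference in total population; the function names, trial count, and seed are illustrative choices.

```python
import random

POPULATIONS = [1] * 9 + [2] * 10 + [11]  # village 19 is the pop-11 village

def random_split(rng):
    """Return the indices of a uniformly random 10-village treatment arm."""
    return set(rng.sample(range(20), 10))

def imbalance(arm):
    """Absolute difference in total population between the two arms."""
    arm_total = sum(POPULATIONS[i] for i in arm)
    return abs(2 * arm_total - sum(POPULATIONS))

def best_of(k, rng):
    """Draw k random splits and keep the one with the most balanced totals."""
    return min((random_split(rng) for _ in range(k)), key=imbalance)

def ones_with_big_village(arm):
    """Count pop-1 villages in whichever arm holds the pop-11 village."""
    if 19 not in arm:
        arm = set(range(20)) - arm
    return sum(1 for i in arm if POPULATIONS[i] == 1)

def average_clustering(k, trials=300, seed=0):
    """Average number of pop-1 villages sharing an arm with the pop-11 village."""
    rng = random.Random(seed)
    counts = [ones_with_big_village(best_of(k, rng)) for _ in range(trials)]
    return sum(counts) / trials

for k in (1, 10, 100):
    print(k, average_clustering(k))
```

With a single draw, the pop-11 village shares its arm with 81/19 ≈ 4.3 of the nine pop-1 villages on average; the simulated average rises as k grows, which is the clustering described above.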
All this is to say that while re-randomization can be fine and have advantages, it is worth remembering that it is not universally valid for all conditions we might intuitively think of as "balance", and one should think carefully when employing it at the outset of a trial.
14/14

Thread by Wes Pegden (@WesPegden)