balajis.com
12 Dec, 15 tweets, 4 min read
Imagine if we optimized for number of independent replications over number of citations.
A citation typically assumes the cited study is true. In a paper that cites 50 other papers, most of them are not being tested.

This leads to chains of unsupported inference, as in the replication crisis.

(Note: I have not replicated this graph!)
nature.com/news/1-500-sci…
Think about the reproducibility problem in software first. This is much easier than (say) biological science because it is usually cheap and fast to independently check whether software works on any given computer. It’s still not easy to make code work reliably everywhere.
In biomedicine, running a replication is time-consuming and expensive.

There are interesting companies like Transcriptic and Emerald Cloud Lab that have been working on an “AWS for biomedicine” to speed this up.
One idea is to design studies for replication.

In software this means unit & integration tests, and TDD.

For bio perhaps it means (a) reproducible research software and (b) a series of escalating unit tests and controls for each study, like protocols. cell.com/trends/biotech…
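To make "escalating unit tests and controls" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `run_controls`, the `min_effect` threshold, and the three specific checks are hypothetical, not taken from any real protocol. The idea is only that cheap sanity checks run first, and the study stops at the first one that fails.

```python
from statistics import mean


def run_controls(negative_control, positive_control, treatment, min_effect=1.0):
    """Run a series of escalating checks and stop at the first failure.

    Returns (True, None) if every check passes, otherwise
    (False, name_of_failed_check). All thresholds are illustrative.
    """
    checks = [
        # Cheapest check first: the negative control should show no effect.
        ("negative control near zero", abs(mean(negative_control)) < min_effect),
        # Next: the positive control should show the expected effect.
        ("positive control shows effect", mean(positive_control) >= min_effect),
        # Only then look at the treatment arm itself.
        ("treatment has enough samples", len(treatment) >= 3),
    ]
    for name, passed in checks:
        if not passed:
            return (False, name)
    return (True, None)


ok, failed = run_controls([0.1, -0.2, 0.0], [2.0, 2.5, 1.8], [1.4, 1.9, 2.2])
print("controls passed" if ok else f"stopped at: {failed}")
```

A replicating lab could run the same function on its own measurements, which is roughly what a test suite does for software.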
Ideally:

1) A study’s PDF should be generated from code & data, as per reproducible research

2) All of that (code, data, PDF) should be not just open-access but *on-chain* to give a set of true permalinks, and to ensure data/code are timestamped
coursera.org/learn/reproduc…
3) Now subsequent on-chain studies can do more than cite previous studies: they can *import* them, much like DeFi composability but for scientific results.

4) Each author then serves as a (potentially trustable) oracle, posting code and data on-chain with a digital signature.
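A rough sketch of what points 1 to 4 could look like, using only content hashes from the Python standard library. The names `digest`, `make_manifest`, and `verify` and the manifest layout are hypothetical. Actually posting to a chain, timestamping, and the author's digital signature are deliberately left out here; the sketch only shows how a study can be pinned to its exact code and data, and how a later study can "import" an earlier one by its identifier.

```python
import hashlib
import json


def digest(content: bytes) -> str:
    """SHA-256 hex digest of raw bytes."""
    return hashlib.sha256(content).hexdigest()


def make_manifest(artifacts: dict, imports: list) -> dict:
    """Build a manifest for one study.

    artifacts: file name -> file bytes (code, data, PDF).
    imports:   study_id values of earlier studies this one builds on.
    The study_id is the hash of the canonical manifest body, so any
    change to code, data, or imports yields a different id.
    """
    body = {
        "artifacts": {name: digest(data) for name, data in sorted(artifacts.items())},
        "imports": sorted(imports),
    }
    canonical = json.dumps(body, sort_keys=True).encode("utf-8")
    return {**body, "study_id": digest(canonical)}


def verify(manifest: dict, artifacts: dict) -> bool:
    """Check that the supplied files match the hashes in the manifest."""
    return all(
        name in artifacts and digest(artifacts[name]) == expected
        for name, expected in manifest["artifacts"].items()
    )


# An earlier study, then a follow-up that imports it by id.
base = make_manifest({"data.csv": b"x,y\n1,2\n"}, imports=[])
follow_up = make_manifest({"analysis.py": b"print(1)"}, imports=[base["study_id"]])
print(follow_up["imports"] == [base["study_id"]])
```

Because the identifier is derived from the content, a reader who fetches the files can check them against the manifest without having to trust whoever is hosting them.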
5) Obvious caveat: data itself could be wrong. This is true, and more on this shortly.

But note that even full access to all code/data reported by each study in your information supply chain would be valuable.

In software, we can view source for all open-source dependencies.
6) So, what if data is wrong? As per the earlier part of this thread, methods sections of papers can be designed for the most mechanical possible independent replication of the dataset.

Like a blog post estimates “time to read”, we can estimate a study’s “cost & time to replicate data”.
7) For example, given a mechanical protocol for reproducing the dataset underpinning a study:

- Spend X to replicate at CRO
- Spend Y to replicate at core facility
- Spend Z to replicate by re-running surveys on MTurk

And so on. Package studies for independent reproducibility.
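The "cost & time to replicate" label could be as simple as a small record attached to each study. The sketch below is hypothetical throughout: the class `ReplicationOption`, the helpers `cheapest` and `label`, and every dollar and day figure are made-up placeholders standing in for X, Y, and Z, not real quotes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicationOption:
    venue: str       # where the replication would be run
    cost_usd: float  # estimated cost (placeholder values below)
    days: float      # estimated turnaround time


def cheapest(options, max_days: Optional[float] = None):
    """Return the lowest-cost option that fits the deadline, or None."""
    eligible = [o for o in options if max_days is None or o.days <= max_days]
    return min(eligible, key=lambda o: o.cost_usd, default=None)


def label(option: ReplicationOption) -> str:
    """Render a 'time to read' style label for a replication option."""
    return f"Replicate at {option.venue}: ~${option.cost_usd:,.0f}, ~{option.days:g} days"


# Placeholder numbers, purely for illustration.
options = [
    ReplicationOption("CRO", 20000, 30),
    ReplicationOption("core facility", 8000, 60),
    ReplicationOption("MTurk survey", 500, 7),
]
best = cheapest(options, max_days=45)
print(label(best))
```

A funder or reader could then sort studies by how cheaply their datasets can be independently reproduced.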
8) Again, in computer science, we already do this. How much does it cost to replicate GPT-3 on AWS? These folks estimated it was ~$4.6M about six months ago, a cost that has likely dropped.
lambdalabs.com/blog/demystify…
9) Why is replication important?

Because science isn’t about prestigious people at prestigious universities publishing in prestigious journals echoed by prestigious outlets.

That’s how we get “masks don’t work, now they do, because science”.

Science is independent replication.
10) Some of the people at those prestigious institutions are legitimately very intelligent and hard working, capable of discovering new things and building functional products.

But on the whole, the *substitution* of prestige for independent replication isn’t serving us well.
11) Civilization has a ripcord, a break-glass approach people use when centralized institutions get too ossified. It’s called decentralization.

Luther used it to argue for the “personal relationship with God”, disintermediating the Church. Washington used it. So did Satoshi.
12) Given how many hunches have been marketed to us this year as science (“travel bans don’t work, the virus is contained”), the emphasis on independent replication as the core of science is just such a ripcord.

Only trust as scientific truth what can be independently verified.
