Think about the reproducibility problem in software first. This is much easier than (say) biological science because it is usually cheap and fast to independently check whether software works on any given computer. It’s still not easy to make code work reliably everywhere.
In biomedicine, running a replication is time-consuming and expensive.
There are interesting companies, like Transcriptic and Emerald Cloud Labs, that have been building an "AWS for biomedicine" to speed this up.
One idea is to design studies for replication.
In software this means unit & integration tests, and TDD.
For bio perhaps it means (a) reproducible research software and (b) a series of escalating unit tests and controls for each study, like protocols. cell.com/trends/biotech…
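To make "escalating unit tests and controls" concrete, here is a minimal sketch in Python. All names and thresholds are hypothetical; the point is only the structure: cheap checks run first (does the data have the right shape?), costlier checks later (did the controls behave?), and the suite stops at the first failing level, like a software test suite.

```python
# Hypothetical sketch of "escalating unit tests" for a study's pipeline.
# Field names and levels are illustrative, not a real protocol.

def check_schema(rows):
    """Level 1 (cheap): every record has the fields the protocol requires."""
    required = {"sample_id", "treatment", "readout"}
    return all(required <= set(r) for r in rows)

def check_controls(rows):
    """Level 2 (costlier): positive controls respond, negatives don't."""
    pos = [r["readout"] for r in rows if r["treatment"] == "positive_control"]
    neg = [r["readout"] for r in rows if r["treatment"] == "negative_control"]
    return min(pos) > max(neg)

def run_checks(rows):
    """Escalate through the levels; report the first one that fails."""
    for name, check in [("schema", check_schema), ("controls", check_controls)]:
        if not check(rows):
            return name  # the level that failed
    return "ok"

data = [
    {"sample_id": 1, "treatment": "positive_control", "readout": 9.1},
    {"sample_id": 2, "treatment": "negative_control", "readout": 0.4},
]
print(run_checks(data))  # → ok
```

A study packaged this way fails fast and tells a replicator *where* it failed, rather than just "didn't replicate."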
Ideally:
1) A study’s PDF should be generated from code & data, as per reproducible research
2) All of that (code, data, PDF) should be not just open-access but *on-chain*, to give a set of true permalinks and to ensure data/code are timestamped coursera.org/learn/reproduc…
3) Now subsequent on-chain studies can not only cite previous studies, they can *import* them, much like DeFi composability but for scientific results.
4) Each author then serves as a (potentially trustable) oracle, posting code and data on-chain with a digital signature.
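A sketch of what an author-as-oracle might actually post. This uses only content hashing (Python's stdlib has no public-key signing, so the digital-signature step is left as a comment); the record structure and field names are illustrative, not any real chain's format.

```python
import hashlib
import json

def digest(blob: bytes) -> str:
    """Content address: the hash an on-chain record would pin and timestamp."""
    return hashlib.sha256(blob).hexdigest()

def study_record(code: bytes, data: bytes) -> dict:
    """Bundle hashes of a study's code & data into one record.
    Posting this on-chain (signed with the author's key, e.g. Ed25519,
    omitted here) timestamps the exact artifacts behind the PDF."""
    return {
        "code": digest(code),
        "data": digest(data),
        # hash-of-hashes: one ID that later studies could *import* by
        "study_id": digest(digest(code).encode() + digest(data).encode()),
    }

rec = study_record(b"analysis.py contents", b"dataset.csv contents")
print(json.dumps(rec, indent=2))
```

Any later study that imports `study_id` is pinned to exactly these bytes: change one byte of code or data and every downstream hash changes.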
5) Obvious caveat: data itself could be wrong. This is true, and more on this shortly.
But note that even full access to all code/data reported by each study in your information supply chain would be valuable.
In software, we can view source for all open-source dependencies.
6) So, what if the data is wrong? As per the earlier part of the thread, the methods sections of papers can be designed for the most mechanical possible independent replication of the dataset.
Just as a blog post shows an estimated "time to read", we can estimate a study's "cost & time to replicate the data".
7) For example, given a mechanical protocol for reproducing the dataset underpinning a study:
- Spend X to replicate at CRO
- Spend Y to replicate at core facility
- Spend Z to replicate by re-running surveys at Mturk
And so on. Package studies for independent reproducibility.
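The replication routes above can be made machine-readable and attached to the study itself. A minimal sketch (all dollar figures and durations are made up for illustration):

```python
# Hypothetical "cost & time to replicate" metadata for one study,
# analogous to a blog post's estimated "time to read".
routes = {
    "CRO":           {"cost_usd": 50_000, "days": 90},
    "core_facility": {"cost_usd": 20_000, "days": 120},
    "mturk_survey":  {"cost_usd": 2_000,  "days": 14},
}

def cheapest(routes):
    """Route that minimizes dollar cost."""
    return min(routes, key=lambda r: routes[r]["cost_usd"])

def fastest(routes):
    """Route that minimizes calendar time."""
    return min(routes, key=lambda r: routes[r]["days"])

print(cheapest(routes), fastest(routes))  # → mturk_survey mturk_survey
```

With this metadata on-chain next to the code and data, "how hard is this to check?" becomes a query, not a guess.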
8) Again, in computer science, we already do this. How much does it cost to replicate GPT-3 on AWS? These folks estimated it was ~$4.6M about six months ago, a cost that has likely dropped. lambdalabs.com/blog/demystify…
9) Why is replication important?
Because science isn’t about prestigious people at prestigious universities publishing in prestigious journals echoed by prestigious outlets
That’s how we get “masks don’t work, now they do, because science”
Science is independent replication.
10) Some of the people at those prestigious institutions are legitimately very intelligent and hard working, capable of discovering new things and building functional products.
But on the whole, the *substitution* of prestige for independent replication isn’t serving us well.
11) Civilization has a ripcord, a break-glass mechanism people use when centralized institutions get too ossified. It's called decentralization.
Luther used it to argue for the “personal relationship with God”, disintermediating the Church. Washington used it. So did Satoshi.
12) Given how many hunches have been marketed to us this year as science (“travel bans don’t work, the virus is contained”), the emphasis on independent replication as the core of science is just such a ripcord.
Only trust as scientific truth what can be independently verified.
Focus on abolishing old regulations over subsidies or tax breaks. Here's why:
- legalizing something new is a true 0-to-1 step
- it gives instant global advantage vs other jurisdictions
- it's free & costs the jurisdiction nothing
Legalize innovation and the talent will follow.
It's hard to specifically attract the *digital* part of Silicon Valley, because it can be funded from anywhere and scaled online.
Instead, unlock innovation in the *physical* world by creating special innovation zones for particular technologies like self-driving cars.
Did we learn that? Or did we learn that the huge regulation of 2002, namely Sarbox (Sarbanes-Oxley), killed IPOs for a decade while doing nothing to prevent the financial crisis of 2008?
Just to pre-empt a likely distortion, the answer isn't "no regulation" either. It is to have many jurisdictions with different alternatives, such that people can weigh pros & cons of each.
We see this with marijuana laws at state level, stem cells at national level, etc.
The proposed new anti-crypto regulation by @stevenmnuchin1 is a form of financial disenfranchisement. It harms people who lack ID, further expands the surveillance regime, and sets up more honeypots for hackers.
Let's remember that OPM was hacked. And the State of Texas, and the Department of Veterans Affairs, and NARA, and even the US Voter Database.
Forcing companies to surveil you is bad enough. Then the databases get hacked and blasted all over the internet. digitalguardian.com/blog/top-10-bi…
The new regulation is financial disenfranchisement because it takes away a right people already have: the right to use crypto without being surveilled, tracked, and recorded in some US government database. coincenter.org/we-must-protec…
Thesis: America is transitioning from the coolest country in the world to the holiest country in the world. From blue jeans & counterculture to blue checks & cancel culture.
America's soft power remains, but it's of a very different type, and more vulnerable in the long run. Converting all these soft power institutions like Hollywood into instruments for propagating a very American form of secular gospel may open up space for competitors.
The rise of TikTok is instructive. Twenty years ago, the idea that a Chinese app could outcool Hollywood & SV to win millions of US teenagers would have been laughable.
Is it a national security threat? Maybe, but the soft power issue may be a bigger deal than even the data collection.
We really should be in the middle of a golden age of productivity. Within living memory, computers did not exist. Photocopiers did not exist. *Backspace* did not exist. You had to type it all by hand.
It wasn't that long ago that you couldn't search all your documents. Sort them. Back them up. Look things up. Copy/paste things. Email things. Change fonts of things. Undo things.
Instead, you had to type it all on a typewriter!
If you're doing information work, relative to your ancestors who worked with papyrus, paper, or typewriter, you are a golden god surfing on a sea of electrons. You can make things happen in seconds that would have taken them weeks, if they could do them at all.