Thread by @andrewjkoh on Thread Reader App

New paper on data as a driver of automation and growth: nber.org/papers/w35320

One of my favorite lines from the Nicomachean Ethics goes ‘for the things we need to learn before we do them, we learn by doing them’. The world is complex and messy… ‘men become builders by building and lyre-players by playing the lyre’. How did AI systems become taxi drivers? How will they become office workers?

One view is that they’ll gain those skills by training on high-quality data. But where will that data come from? What kinds of data will the economy accumulate? How quickly? Who gains and who loses? And if we train AI systems on tons of data so they become superhuman at every job, what will be left for us to do?

Simple macro model with three ingredients:

(1) data is task-specific: coding data is not the same as self-driving data; different tasks might also be differentially verifiable so accumulating data is harder/easier

(2) data accumulates endogenously: if the economy finds it worthwhile to do lots of coding we get lots of coding data

(3) data exhibits spillovers: data on one task might augment the productivity of another e.g., via transfer learning or via overlapping primitive skills (pic: some spillover patterns)

First, the long run: do we get full automation?

- Without spillovers: cross-task complementarity must be large enough. Why? Tasks are complements => market finds it worthwhile to produce things we’re bad at => generates data on those bottlenecks => more automation. Unintuitive implications for wages: complementarity means within each slice of time wages are higher than if tasks were substitutes; but across time the market produces the right kinds of data quickly enough to automate labor => long-run wages stagnate!

- With spillovers, automation can be contagious, propagating both through prices (general eqm.) and through links (local spillovers)
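
To make the mechanism concrete, here is a toy sketch (not the paper's model) of the three ingredients plus the complementarity story. Everything in it is an assumption chosen for illustration: the function name `simulate_data_stocks`, the power-law learning curve with exponent `beta`, the rule that each period's unit of new data is split across tasks by their CES expenditure shares, and the reading of `sigma` as the elasticity of substitution across tasks (`sigma < 1` means complements).

```python
import numpy as np

def simulate_data_stocks(spillover, sigma, steps=500, beta=0.5, initial=None):
    """Toy dynamics for task-specific data stocks (illustrative assumptions only).

    spillover[i, j] : weight of task j's data in task i's effective data.
    sigma           : elasticity of substitution across tasks (< 1 = complements).
    beta            : hypothetical learning-curve exponent.
    """
    spillover = np.asarray(spillover, dtype=float)
    n = spillover.shape[0]
    data = np.ones(n) if initial is None else np.asarray(initial, dtype=float).copy()
    for _ in range(steps):
        # (3) spillovers: a task also learns from data generated on linked tasks
        effective = spillover @ data
        # (1) task-specific productivity rises with that task's effective data
        productivity = effective ** beta
        # CES expenditure share of task i is proportional to productivity_i^(sigma - 1),
        # so under complements the low-productivity tasks absorb more spending
        share = productivity ** (sigma - 1.0)
        share /= share.sum()
        # (2) endogenous accumulation: one unit of new data per period,
        # split across tasks in proportion to the activity devoted to them
        data = data + share
    return data

# Two tasks, no spillovers, task 1 starts with ten times the data of task 0.
start = [1.0, 10.0]
print("complements:", simulate_data_stocks(np.eye(2), sigma=0.5, initial=start))
print("substitutes:", simulate_data_stocks(np.eye(2), sigma=2.0, initial=start))
```

Under these assumptions, complements steer new data toward the lagging task, so the gap between the two stocks narrows, while substitutes steer it toward the leading task, so the gap widens. Replacing `np.eye(2)` with a matrix that has off-diagonal entries lets one task borrow from the other's data, which is the "links" channel in the second bullet.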

Second, speed: without capital accumulation, automation is always slow in the long run (power law) but short-run dynamics depend richly on the network.

One possibility is core-periphery: there’s a core of easy-to-verify/collect/codify tasks that are super amenable to AI e.g., coding, spreadsheets, powerpoint. They generate data that feeds into each other – in eqm we get amazing coding agents and produce tons of code! But this can generate imbalanced automation: the core sucks up capital and slows down the accumulation of data (and automation) of messier tasks in the wider economy

Third, efficiency: does the economy accumulate data optimally? No: except for knife-edge cases, the market distorts the composition of data accumulation. Why? Individual producers don’t capture (and internalize) the value of future data. Automation can be inefficiently fast or slow…

Fourth, long-run growth: data and capital (compute) accumulation feed into each other. Data simultaneously augments task-specific productivity + gradually shrinks need for labor => learning-by-doing singularity (infinite growth in finite time!) But this happens pretty slowly vs ‘software only singularities’ (@TomDavidsonX @BasilHalperin Tom H @akorinek)
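
For readers wondering how "infinite growth in finite time" can arise at all, here is the standard textbook illustration. It is not the paper's equation: the state variable $x$, the rate $a$, and the exponent $\epsilon$ are placeholders. Whenever growth is even slightly faster than proportional to the current level, the solution blows up at a finite date:

```latex
\dot{x}(t) = a\,x(t)^{1+\epsilon}, \qquad a > 0,\ \epsilon > 0,\ x(0) = x_0 > 0
\;\;\Longrightarrow\;\;
x(t) = \bigl(x_0^{-\epsilon} - a\,\epsilon\,t\bigr)^{-1/\epsilon},
\qquad
x(t) \to \infty \ \text{ as } \ t \to t^{*} = \frac{x_0^{-\epsilon}}{a\,\epsilon}.
```

In this stylized form, a small $\epsilon$ or a small $a$ pushes the blow-up date $t^{*}$ far into the future, which is one way to read "this happens pretty slowly".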

These views aren't mutually exclusive! In a way we’re being pretty conservative – the economy is messy, but we know how to do RL so this data channel is a lower bound on how crazy things can get! Basil and I think it's worth unifying both views and taking it to data!

Fifth, what will be left for us to do? Humans are super sample efficient... maybe ~millions of times more so than AI.

If the economy changes quickly (new tasks emerge quickly) humans might have a comparative advantage in doing those new tasks when they first appear. Then over time, we’ll teach the machines how to do them, and move on to the next new thing. Labor facilitates automation – wages keep growing, labor share stays high...

...but those tasks need to be genuinely new! If they are pretty similar to tasks we already do, spillovers from existing tasks might mean AI systems can bootstrap themselves and automate new tasks (almost) as soon as they emerge – no need for labor! Is operating an asteroid mining rig similar enough to driving a car?

Lots more in the paper and I had so much fun working on this (my first macro paper!) with Maryam and Bryant.

A final note: we describe an economy in which labor doesn't own its expertise – when workers earn wages they are also training their own replacement. But why aren't they appropriately compensated? One answer is divide-and-conquer: if I don't sell my expertise, I know that someone else will and that'll put me out of a job... so I should do so while I still can + I'm willing to accept way less than what it's worth. Is there a better way? What mechanisms and institutions do we need? Or maybe it's better to automate first and redistribute later? More soon...
