If this result holds up generally, merging networks trained on different parts of the data looks feasible. Huge implications: privacy through federated learning, parallelization of learning, merging of em shards!
The basic idea is simple: you can permute hidden-layer neurons without changing what the network computes, so there are actually far fewer distinct internal models than it looks. Training gets to one, up to permutation, with linear mode connectivity. Hence one can interpolate between differently trained networks if one is careful.
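The permutation symmetry is easy to check numerically. A minimal sketch, assuming a toy one-hidden-layer tanh network with made-up sizes and random weights (none of this is from the result being discussed): reordering the hidden neurons, with their incoming weights, biases and outgoing weights moved together, leaves the output unchanged.

```python
import math
import random

def mlp(x, W1, b1, W2):
    """One-hidden-layer network: tanh hidden units, linear scalar output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden))

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden neurons: rows of W1 and entries of b1, W2 move together."""
    return ([W1[i] for i in perm], [b1[i] for i in perm], [W2[i] for i in perm])

# Hypothetical toy sizes and random weights, for illustration only.
rng = random.Random(0)
n_in, n_hidden = 3, 5
W1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
W2 = [rng.gauss(0, 1) for _ in range(n_hidden)]

perm = list(range(n_hidden))
rng.shuffle(perm)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

x = [0.3, -1.2, 0.7]
y_original = mlp(x, W1, b1, W2)
y_permuted = mlp(x, W1p, b1p, W2p)
print(abs(y_original - y_permuted))  # tiny: only floating-point summation order differs
```

The two weight sets are different points in parameter space but the same function, which is why naive averaging of unaligned networks can fail even when both are good.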
If each layer has N neurons and each weight can take K values, there are N! permutations and K^(N^2) weight matrices. By Stirling's approximation, ln(K^(N^2)/N!) = N^2 ln K - N ln N + N + O(ln N). So wide networks have room for more possible models, but SGD consistently seems to find only one basin of attraction.
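To get a feel for the numbers, here is a small sketch that evaluates ln(K^(N^2)/N!) exactly (via `math.lgamma`) and compares it with the leading Stirling terms N^2 ln K - N ln N + N. The choice K = 256 (8-bit weights) and the values of N are illustrative assumptions, not figures from the thread.

```python
import math

def log_distinct_models(n, k):
    """Exact ln(K^(N^2) / N!), using lgamma(n + 1) = ln(n!)."""
    return n * n * math.log(k) - math.lgamma(n + 1)

def log_distinct_models_stirling(n, k):
    """Leading Stirling terms only: N^2 ln K - N ln N + N."""
    return n * n * math.log(k) - n * math.log(n) + n

K = 256  # assumed: 8-bit weights, purely for illustration
for n in (10, 100, 1000):
    exact = log_distinct_models(n, K)
    approx = log_distinct_models_stirling(n, K)
    # The gap is the O(ln N) remainder, roughly 0.5 * ln(2 * pi * N).
    print(n, exact, approx, approx - exact)
```

The N^2 ln K term dominates quickly, so dividing out the N! relabelings barely dents the count of distinct models for wide layers.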
To me the surprise was that the merging was relatively cheap; I had expected it to be hard. But greedy approximation algorithms win anyway.
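As a rough illustration of why a greedy approach can be cheap, here is a sketch that pairs hidden neurons of two networks by greedily taking the most similar incoming weight vectors, then interpolates after alignment. This is a simplified stand-in under my own assumptions (dot-product similarity, a single hidden layer, hypothetical helper names), not the matching procedure of the result itself.

```python
import random

def greedy_match(WA, WB):
    """Greedily pair hidden neurons of A and B by incoming-weight similarity.

    Returns perm such that neuron perm[i] of B is matched to neuron i of A.
    """
    n = len(WA)
    scores = [(sum(a * b for a, b in zip(WA[i], WB[j])), i, j)
              for i in range(n) for j in range(n)]
    scores.sort(reverse=True)  # best-scoring candidate pairs first
    perm = [None] * n
    used_b = set()
    for _, i, j in scores:
        if perm[i] is None and j not in used_b:
            perm[i] = j
            used_b.add(j)
    return perm

def align_and_interpolate(WA, vA, WB, vB, t=0.5):
    """Permute B's hidden neurons to line up with A's, then blend weights."""
    perm = greedy_match(WA, WB)
    WB_aligned = [WB[j] for j in perm]
    vB_aligned = [vB[j] for j in perm]
    W = [[(1 - t) * a + t * b for a, b in zip(ra, rb)]
         for ra, rb in zip(WA, WB_aligned)]
    v = [(1 - t) * a + t * b for a, b in zip(vA, vB_aligned)]
    return W, v, perm

# Toy check: B is A with its hidden neurons shuffled, so alignment should undo it.
rng = random.Random(1)
n_in, n_hidden = 4, 6
WA = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
vA = [rng.gauss(0, 1) for _ in range(n_hidden)]
shuffle = list(range(n_hidden))
rng.shuffle(shuffle)
WB = [WA[s] for s in shuffle]
vB = [vA[s] for s in shuffle]

W_merged, v_merged, perm = align_and_interpolate(WA, vA, WB, vB)
```

The sort over all N^2 candidate pairs costs O(N^2 log N) per layer, which is small next to training. For real, independently trained networks the greedy pairing is only an approximation to the best assignment.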
Looking at my own @AIObjectives project, this suggests that markets may get robustness and convergence properties because companies are permutable (company fungibility?) and the market process, like SGD, converges on a global, easy-to-find optimum. We'll see.
