Here's is an unusual arxiv of mine: "Non asymptotic bounds in asynchronous sum-weight gossip protocols", arxiv.org/abs/2111.10248
This is a summary of unpublished work with Jérôme Fellus and Stéphane Garnier from way back in 2016 on decentralized machine learning.
1/5
The context is you have N nodes each with a fraction of the data and you want to learn a global predictor without exchanging data, having nodes waiting for others, and without fixing the communication topology (which nodes are neighbors).
That's essentially Jérôme's PhD.
2/5
We wanted to have an error bound w.r.t. the number of message exchanged, because it gives you an idea of when your predictor is usable.
Turns out, it's tough to get non-asymptotic results, but we got something not that bad for fully connected graphs. 3/5
After month of Stirling numbers of the second lking nonsense, one of the core ideas for the proof was proposed by Stéphane shortly before he passed away in mid 2016 from sudden illness. 😭
He was a super nice colleague. Sorely missed.
4/5
I don't know if these results will be useful to anyone, but they've been waiting on my computer for too long and I don't have the time or the energy to extend the work in a full fledged paper, so arxiv it is!
5/5
• • •
Missing some Tweet in this thread? You can try to
force a refresh
But in the real life, B does not exist and only ~200 papers would have made it!
So on average, among the accepted paper (as decided by A), 1/2 (99/(99+94)) got there because they're "universally good", 1/2 (94/(99+94)) because of luck. And about 1/2 (105/(99+94)) were unlucky.
Extend that to the full conference: if we assume 25% acceptance rate, then 13% of all submissions are accepted because they're really good. 13% are accepted because they're lucky, 13% are rejected because they're unlucky and 60% are rejected because they're not good enough.
The problem with DNNs is they are trained on carefully curated datasets that are not representative of the diversity we find in the real world.
That's especially true for road datasets.
In the real world, we have to face "unknown unkowns", ie, unexpected objects with no label.
How to detect such situation?
We propose a combination of 2 principles that lead to very good results:
1_ Disentangle the task (classification, segmentation, ...) from the Out-of-distribution detection.
2_ Train the detector using generated adversarial samples as proxy for OoD.
"torch.manual_seed(3407) is all you need"!
draft 📜: davidpicard.github.io/pdf/lucky_seed…
Sorry for the title. I promise it's not (entirely) just for trolling. It's my little spare time project of this summer to investigate unaccounted randomness in #ComputerVision and #DeepLearning.
🧵👇 1/n
The idea is simple: after years of reviewing deep learning stuff, I am frustrated of never seeing a paragraph that shows how robust the results are w.r.t the randomness (initial weights, batch composition, etc). 2/n
After seeing several videos by @skdh about how experimental physics claims tend to disappear through repetition, I got the idea of gauging the influence of randomness by scanning a large amount of seeds. 3/n