Machine learning researcher @MetaAI. Previously @Criteo and @Inria. I tweet on math, ML, and lots of random stuff. Tweets are mine. Blog: https://t.co/XAsL4l6Q7
Oct 10 • 5 tweets • 4 min read
1/n Introducing our new preprint: Strong Model Collapse arxiv.org/abs/2410.04840, in which we show that within the "scaling laws" paradigm, even 1% bad / synthetic data in the training corpus can lead to model collapse: an eventual flattening or even degradation of model performance as the training dataset is scaled up. Joint work with @KempeLab, @feeelix_feng, and @arjunsubgraph.
2/n Going further, we ask two important questions.
Q1: Is model collapse inevitable or can it be fixed by strategically mixing synthetic and real data?
Q2: Are larger models more prone to model collapse than smaller ones?
On Q1: "naive" mixing of real and synthetic data cannot fix model collapse. However, iterative mixing can fix it and recover the ideal scaling laws. Unfortunately, this comes at the cost of a greatly increased real-data and training budget, which may not be feasible in practice. A toy sketch of the 1% effect is below.
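To make the headline claim concrete, here is a minimal toy sketch (my own illustration, not the paper's setup): ridge regression where a fixed fraction of the labels comes from a biased "synthetic" generator. All names and constants (d, w_synth, the 0.5 bias scale, the noise level) are made up for illustration. The clean run's test error keeps shrinking as n grows, while the 1%-synthetic run plateaus at a floor set by the generator's bias, i.e. the scaling law flattens no matter how much data is added.

```python
# Toy sketch of model collapse under a 1% synthetic-data fraction.
# Hypothetical setup: ridge regression, labels from a true regressor w_real,
# with a slice of the data relabeled by a biased generator w_synth.
import numpy as np

rng = np.random.default_rng(0)
d = 50                                            # input dimension (illustrative)
w_real = rng.standard_normal(d)                   # true regressor
w_synth = w_real + 0.5 * rng.standard_normal(d)   # biased "synthetic" generator

def test_error(n, frac_synth, ridge=1e-3, n_test=5000):
    """Fit ridge regression on n samples, a frac_synth fraction of them synthetic."""
    n_s = int(frac_synth * n)
    X = rng.standard_normal((n, d))
    y = X @ w_real + 0.1 * rng.standard_normal(n)
    # Corrupt a slice of the corpus with synthetic labels.
    y[:n_s] = X[:n_s] @ w_synth + 0.1 * rng.standard_normal(n_s)
    w_hat = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)
    X_te = rng.standard_normal((n_test, d))
    return np.mean((X_te @ (w_hat - w_real)) ** 2)

for n in [200, 1_000, 5_000, 25_000, 125_000]:
    print(f"n={n:>6}  clean: {test_error(n, 0.0):.2e}   1% synthetic: {test_error(n, 0.01):.2e}")
```

In this toy model the fitted regressor converges to a mixture (1 - p) w_real + p w_synth, so the excess error plateaus at roughly p^2 ||w_synth - w_real||^2: here about 1e-3 even as the clean error drops below 1e-5, which is the flattening described above.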
Oct 10, 2018 • 5 tweets • 3 min read
1/ Strong No Free Lunch Theorem for adversarial robustness arxiv.org/pdf/1810.04065…. As the size eps of the perturbations is increased, a point eps_* is eventually reached beyond which accuracy decreases exponentially fast. A few typos in there. Please RT.
2/ Moreover, eps_* is comparable to the natural noise level of the problem, with a modulation factor that varies only logarithmically with the particular classifier ==> adversarial robustness / non-robustness is more a property of the data than of some magical classifier.
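A minimal sketch of the qualitative picture (my own illustration, not the paper's general theorem): binary classification of 1-D Gaussians N(+mu, sigma^2) vs N(-mu, sigma^2) with the Bayes-optimal threshold at 0. An eps-bounded adversary shifts each point toward the boundary, so robust accuracy is Phi((mu - eps)/sigma). Once eps passes a critical scale set by the noise level sigma, accuracy falls off at a Gaussian tail rate, i.e. exponentially fast in eps. The values mu = 1.0 and sigma = 0.5 are made up for illustration.

```python
# Toy illustration: robust accuracy of the optimal 1-D Gaussian classifier
# under an eps-bounded adversary is Phi((mu - eps) / sigma).
from math import erf, sqrt

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

mu, sigma = 1.0, 0.5  # class separation and natural noise level (illustrative)
for eps in [0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0]:
    print(f"eps={eps:4.2f}  robust accuracy={Phi((mu - eps) / sigma):.4f}")
```

Running this shows accuracy near 0.98 at eps = 0, crossing 0.5 around eps = mu, then collapsing toward 0 at a Gaussian rate: a threshold on the order of the noise level, independent of any cleverness in the classifier.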