My unwavering opinion on current (auto-regressive) LLMs:
1. They are useful as writing aids.
2. They are "reactive" and don't plan or reason.
3. They make stuff up or retrieve stuff approximately.
4. That can be mitigated but not fixed by human feedback.
5. Better systems will come.
6. Current LLMs should be used as writing aids, not much more.
7. Marrying them with tools such as search engines is highly non-trivial.
8. There *will* be better systems that are factual, non-toxic, and controllable. They just won't be auto-regressive LLMs.
I have been consistent while:
9. Defending Galactica as a scientific writing aid.
10. Warning folks that AR-LLMs make stuff up and should not be used to get factual advice.
11. Warning that only a small, superficial portion of human knowledge can ever be captured by LLMs.
12. Being clear that better systems will appear, but they will be based on different principles.
They will not be auto-regressive LLMs.
13. Why do LLMs appear much better at generating code than generating general text?
Because, unlike the real world, the universe that a program manipulates (the state of the variables) is limited, discrete, deterministic, and fully observable.
The real world is none of that.
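One way to see the point: a program comes with an execution oracle, so a generated candidate can be run and checked against a spec, while free-form text has no comparable test. Below is a minimal sketch of that asymmetry; the function `passes_spec`, the candidate source, and the toy spec are illustrative assumptions, not anything from the thread.

```python
# A program's universe (its variable state) is discrete, deterministic,
# and fully observable, so a generated candidate can be *executed* and
# verified. Free-form claims about the real world have no such oracle.

candidate_src = """
def add(a, b):
    return a + b
"""

def passes_spec(src: str) -> bool:
    """Run the candidate in a scratch namespace and test it directly."""
    namespace = {}
    try:
        exec(src, namespace)                # state is fully observable
        return namespace["add"](2, 3) == 5  # and deterministic to check
    except Exception:
        return False                        # any failure is unambiguous

print(passes_spec(candidate_src))  # -> True
```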
14. Unlike what the most acerbic critics of Galactica have claimed:
- LLMs *are* being used as writing aids.
- They *will not* destroy the fabric of society by causing the mindless masses to believe their made-up nonsense.
- People will use them for what they are helpful with.
@babgi ChatGPT is not particularly innovative.
It uses techniques originally developed at Google and Meta (FAIR), which have similar systems in their labs.
But those companies are less motivated than OpenAI to deploy public demos.
1/
@babgi The best experts in France on these methods are at FAIR-Paris.
FAIR-Paris contributes *enormously* to the French AI research ecosystem.
One can regret that some French public institutions see FAIR as an enemy rather than a partner.
2/
@babgi All of this in the name of a somewhat outdated conception of sovereignty.
Technological sovereignty and local mastery of new technologies are desirable and admirable goals.
3/
By telling scientists they must publish, you get:
1. Higher-quality research, more reliable results, less self-delusion.
2. Better scientists whose reputation will flourish.
3. Easier external collaborations.
4. Better research evaluation.
5. Better internal impact.
6. Prestige.
That's why at FAIR, we not only tell scientists to publish papers and open-source their code, we also use their publications as one component of their periodic evaluation.
To be clear, my original tweet was about scientists in *industry*.
Few companies promote publishing, some tolerate it, many forbid it.
The role of publishing in academia is well established and not in question.
Medieval obscurantism from the EcoInfo group at CNRS:
"We will not be able to control the energy consumption and environmental impacts of mobile networks without imposing some form of limitation on usage."
What? 1/
1. The environmental impact of networks (mobile or otherwise) is roughly negligible and fairly stable.
2. Improvements in communication technologies *reduce* the need for travel and *improve* the efficiency of the economy.
2/
3. Charging by usage to reduce consumption would strangle the benefits of networks at birth. That is what the big telecom companies wanted before the Internet.
4. Regulating usage according to the "frivolity" of its content is impossible without imposing a kind of dictatorship.
It's a running joke of mine: every generation complains that the younger generation's activities, interests, and favorite technologies are {pointless, useless, a waste of time, immoral, culturally inferior, artistically worthless} & will destroy the fabric of society.
OK, debates about the necessity of "priors" (or lack thereof) in learning systems are pointless.
Here are some basic facts that all ML theorists and most ML practitioners understand, but a number of folks-with-an-agenda don't seem to grasp.
Thread. 1/
The no-free-lunch theorems tell us that, among all possible functions, the proportion that is learnable with a "reasonable" number of training samples is tiny.
Learning theory says that the more functions your model can represent, the more samples it needs to learn anything.
2/
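A textbook way to make this precise (a standard VC-type uniform-convergence bound, not something stated in the thread): for a hypothesis class F of VC dimension d trained on n samples, with probability at least 1 - δ,

```latex
% Notation assumed here: R(f) = true risk, \hat{R}_n(f) = empirical risk
% on n samples, d = VC dimension of \mathcal{F}, C = a universal constant.
\sup_{f \in \mathcal{F}} \bigl| R(f) - \hat{R}_n(f) \bigr|
  \;\le\; C \sqrt{\frac{d \log(n/d) + \log(1/\delta)}{n}}
```

So the sample size n needed for a given accuracy grows with d: the richer the class of representable functions, the more data it demands.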
Consequence: the more priors you put in, the fewer samples you require.
But: the more priors you put in, the greater the chance that the functions you need to learn are not realizable (or hard to learn) by your model.
3/
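The tradeoff in 2/ and 3/ is the usual approximation/estimation decomposition (standard learning-theory notation, not from the thread): writing f* for the best possible predictor and \hat{f} for the predictor learned within the prior-restricted class \mathcal{F},

```latex
% Excess risk splits into two terms that priors push in opposite directions:
R(\hat{f}) - R(f^\ast) =
  \underbrace{\Bigl(\inf_{f \in \mathcal{F}} R(f) - R(f^\ast)\Bigr)}
             _{\text{approximation error: grows with stronger priors}}
  + \underbrace{\Bigl(R(\hat{f}) - \inf_{f \in \mathcal{F}} R(f)\Bigr)}
             _{\text{estimation error: shrinks with stronger priors}}
```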
The paper distills much of my thinking over the last 5 or 10 years about promising directions in AI.
It is basically what I'm planning to work on, and what I'm hoping to inspire others to work on, over the next decade. 2/N
Most people don't talk publicly about their research plans.
But I'm going beyond the spirit of Open Research by publishing ideas *before* the corresponding research is completed. 3/N