I finally read @boazbaraktcs’s blog on DL vs Stats.
A great mind-clearing read! 👍
“Yes, that was how we thought about NNs losing out due to bias/variance in ~2000”
“Yes, pre-trained models really are different to classical stats, even if math is the same” windowsontheory.org/2022/06/20/the…
A bit more nuance could be added to the 2nd para on Supervised Learning. Initial breakthroughs _were_ made in #NLProc via unsupervised learning prior to AlexNet: the word vectors of Collobert & Weston (2008/2011), along with related work on RBMs, the Google cat, etc. jmlr.org/papers/volume1…
But, for a few years, the siren song of the effectiveness of end-to-end deep learning on large supervised datasets was irresistible and very successful, probably partly because of how, in the over-parameterized regime, it does do representation learning as this post argues.
Dear @emilymbender—and @Abebab—you need to keep “reminding” people of your viewpoint because it is not an argument that is convincing to all or a self-evident truth. It is a particular academic position, which lots of people support but a good number of others disagree with. 1/8
The position of Bender & @alkoller is that of 20th century formal semantics, based on Anglo-American philosophy of language & formal grammar, the approach we were both brought up on. However, it had already been rejected by Ludwig Wittgenstein in 1953: topologicalmedialab.net/xinwei/classes… 2/
“You say: the point isn't the word, but its meaning, and you think of the meaning as a thing of the same kind as the word, though also different from the word. Here the word, there the meaning. The money, and the cow that you can buy with it. But contrast: money, and its use.” 3/
I’m happy to share the published version of our ConVIRT algorithm, appearing in #MLHC2022 (PMLR 182). In 2020, this was a pioneering work in contrastive learning of perception by using naturally occurring paired text. Unfortunately, things took a winding path from there. 🧵👇
The paper (Contrastive Learning of Medical Visual Representations from Paired Images and Text, by @yuhaozhangx, @hjian42, Yasuhide Miura, @chrmanning & @curtlanglotz) shows much better unsupervised visual representation learning using paired text than using vision alone (SimCLR, MoCo v2)
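To make the idea concrete, here is a minimal sketch of a bidirectional image–text contrastive (InfoNCE-style) objective in the spirit of ConVIRT. The shapes, temperature, and loss weighting below are illustrative assumptions, not the paper's exact configuration:

```python
# Hedged sketch: symmetric contrastive loss over paired image/text
# embeddings. Temperature and the lam weighting are assumed values.
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def contrastive_loss(img_emb, txt_emb, temperature=0.1, lam=0.75):
    """Symmetric InfoNCE over a batch of paired embeddings.

    img_emb, txt_emb: (N, d) arrays; row i of each is a true pair.
    lam weights the image->text direction vs. text->image.
    """
    v = l2_normalize(img_emb)
    u = l2_normalize(txt_emb)
    logits = v @ u.T / temperature      # (N, N) cosine similarities
    labels = np.arange(len(v))          # positives sit on the diagonal

    def xent(lg):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Cross-entropy in both directions, convexly combined.
    return lam * xent(logits) + (1 - lam) * xent(logits.T)
```

Each image is pulled toward its own report's text embedding and pushed away from every other report in the batch, and vice versa; that is what lets the paired text supervise the visual representation.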
However, sometimes you don’t get lucky with conference reviewing—even when at a highly privileged institution. We couldn’t interest reviewers at ICLR 2020 or ICCV 2021. I think the fact that we showed gains in radiology (x-rays), not general vision, seemed to dampen interest….
I would suggest that this thread errs by over-representing the proportion of the time in which human “reasoning” is actually anything akin to mathematical reasoning, such as the example of solving SAT instances. 1/
To start with a Go analogy: A moderately skilled player can exhaustively read out a 6–10 move life-and-death problem or end game sequence—or usually work it out more quickly using pattern-based shortcuts! But for longer, more complex things such as fuseki (opening) sequences, 2/
they “reason” about moves—“it would be better for me to approach here than to extend on the other side, because then they would jump and I could then approach, strengthening my corner while attacking”—but really this is pattern matching! It’s not like solving a SAT problem. 3/
It’s great getting to read my colleagues @robreich, @mehran_sahami & Jeremy Weinstein’s book, System Error. Building a broad understanding of problems with big tech and techno utopianism is such an important topic for this decade. harpercollins.com/products/syste…
Some thoughts below. 🧵👇
The authors rightly stress many key problems that have emerged: the deficiencies of simplistic metrics, the dangers of tech monopolies, the tension between innovation and regulating tech as a public utility, and what should become of privacy and free speech in a world of corporate-owned public squares.
However, they end up questioning an “optimization mindset” in general, and I’m not sure that’s right. There are many ways that optimization can go wrong, which they discuss: people can adopt simplistic uni-dimensional metrics (e.g., “connecting more people is good for all”) or …
@yoavgo @ChrisGPotts I take primary blame for advocating the anonymity period. It was an honest attempt at a compromise middle ground. With the passage of time, I admit that it seems a bit flawed, as more people aim for “the anonymity period deadline”—but the real question is: what would be better?
@yoavgo @ChrisGPotts Your suggestion, @yoavgo, is to move the dial all the way to the left or to the right, but such extreme positions are seldom optimal in a complex and varied world. We could survey again, but I suspect the situation is similar to where it was 3 years ago: one large group, ...
@yoavgo @ChrisGPotts which seemed to center on white, male Americans, was all for preprints (mostly killing anonymous reviewing), while another large group (mainly people outside the above group) believed strongly in preserving anonymous reviewing. Should we just fully go with one group? ...