Sebastian Ruder (@seb_ruder), 16 tweets
1/ People (mostly people working with Computer Vision) say that CV is ahead of other ML application domains by at least 6 months to a year. I would like to explore why this is, whether it is something to be concerned about, and what it might take to catch up.
2/ I can’t speak about other application areas, so I will mostly compare CV vs. NLP. This is just a braindump, so feel free to criticize, correct, and disagree.
3/ First, is that really true? For many specialized applications that require task- or domain-specific tools, such as core NLP tasks (parsing, POS tagging, NER), comparing to another discipline is not meaningful.
4/ However, for tasks where more general ML techniques are applicable, the knowledge transfer seems to be CV->NLP rather than the other way around.
5/ Techniques such as residual or dense connections, GANs, adversarial examples, domain-adversarial loss, few-shot and meta-learning methods all originate in CV and have been adopted in NLP in recent years.
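To make one of the techniques above concrete: a residual connection simply adds a layer's input back to its output, giving gradients an identity path through deep networks. A minimal NumPy sketch (the layer and its weights are illustrative, not from any particular paper):

```python
import numpy as np

def layer(x, w):
    # a simple dense layer with ReLU activation (illustrative stand-in
    # for any transformation f inside a residual block)
    return np.maximum(0, x @ w)

def residual_block(x, w):
    # residual connection: output = x + f(x)
    return x + layer(x, w)

x = np.ones((1, 4))
w = np.eye(4)
print(residual_block(x, w))  # identity weights, all-ones input: output is 2 * x
```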
6/ In contrast, NLP methods that made their way to CV seem to be a lot rarer. Word2vec, attention and the recent Transformer model are the only ones that come to mind.
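Of those NLP-origin methods, attention is the easiest to show in a few lines: each output is a weighted average of the values, where the weights come from query-key similarity. A minimal sketch of scaled dot-product attention (single query/key/value matrices, no batching or masking):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: weight each value row by the
    # similarity of its key to the query, scaled by sqrt(d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v
```

With identical keys the weights are uniform, so the output is just the mean of the value rows.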
7/ CV is also ahead in terms of commercialization: Current CV methods enable applications such as self-flying drones, satellite image-based prediction of e.g. crop yields, facial recognition, plant identification, emotion recognition, etc.
8/ Second, why does CV seem to be ahead? I can think of four reasons: 1) Deep Learning, the main driver of current ML, had its start and first success in CV with AlexNet, which gave CV an early lead.
9/ 2) The CV community is larger and seemingly more competitive, making the development of novel techniques more likely. 3) Images are continuous, making them an easier testbed for neural approaches. 4) The availability of large datasets for benchmarking.
10/ Apart from 1), these seem like things we can’t do anything about, and the NLP community has made progress on 4). Do we then need to worry at all? After all, NLP is not just an application area for ML.
11/ Perhaps NLP should mostly focus on itself, e.g. on developing new methods that incorporate linguistic structure. However, I believe that fostering an environment where we do not just take inspiration but are inspired to develop ideas with a wide appeal is valuable in itself.
12/ What does it take to catch up? Large publicly available datasets and competitions seem to have enabled a large number of recent advances. In both respects, the NLP community seems to be doing fine.
13/ Novel datasets are developed at a regular pace, and shared tasks and competitions are frequently hosted. One factor might be that NLP does not have a recurring competition that catalyzes progress and brings both industry and academia together the way the ImageNet competition does.
14/ The recent Kaggle Jigsaw competition (kaggle.com/c/jigsaw-toxic…) saw participation from 4,551 teams, but was largely eschewed by academia as far as I’m aware.
15/ Another thing is to make it easier for people to discover datasets and existing baselines for their task. Many historic NLP datasets are licensed, which makes it harder for outsiders to break in. In CV, many papers evaluate either on MNIST, CIFAR-10, or (low-res) ImageNet.
16/ Anyway, these were just my thoughts. I’m looking forward to hearing what you think.