.@OpenAI's ImageGPT is one of the first transformer architectures applied to computer vision.👇
In language, unsupervised learning algorithms that rely on word prediction (like GPT-2 and BERT) are extremely successful.

One possible reason for this success is that instances of downstream language tasks appear naturally in the text.
2/4
In contrast, sequences of pixels do not clearly contain labels for the images they belong to.

However, OpenAI believes that sufficiently large transformer models:
- can be applied to 2D image analysis
- can learn strong representations of a dataset
3/4
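The trick is to flatten an image into a 1D sequence of pixel values and train on next-pixel prediction, exactly as GPT-2 is trained on next-word prediction. Below is a minimal sketch of that objective, assuming PyTorch and a toy 256-value grayscale vocabulary (the real ImageGPT uses a 512-entry color palette and is vastly larger; this is illustrative, not OpenAI's implementation):

```python
# A toy next-pixel-prediction model illustrating the ImageGPT objective.
# Sizes are illustrative, not OpenAI's actual configuration.
import torch
import torch.nn as nn

class TinyPixelGPT(nn.Module):
    def __init__(self, vocab=256, dim=128, heads=4, layers=2, seq_len=64):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)        # one embedding per pixel value
        self.pos = nn.Embedding(seq_len, dim)      # learned positions
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(dim, vocab)          # logits over the next pixel value

    def forward(self, pixels):                     # pixels: (B, T) integer values
        T = pixels.size(1)
        pos = torch.arange(T, device=pixels.device)
        x = self.tok(pixels) + self.pos(pos)
        # Causal mask so position t only attends to pixels 0..t.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        return self.head(self.encoder(x, mask=mask))

# Train exactly like a language model: predict pixel t+1 from pixels 0..t.
model = TinyPixelGPT()
imgs = torch.randint(0, 256, (8, 64))              # 8 flattened 8x8 "images"
logits = model(imgs[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 256),
                                   imgs[:, 1:].reshape(-1))
```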
Learn more about ImageGPT here: openai.com/blog/image-gpt/

Thanks for learning ML and AI with us! This is a thread from Edge#117 – our series about transformers.
4/4

More from @TheSequenceAI

14 Sep
Master Neural Architecture Search (NAS) to automate the creation of neural networks.

4 topics you need to cover⬇️
1) The concept of NAS

1. Read one of the fundamental papers, "A Survey on Neural Architecture Search" @IBMResearch
2. Explore our dedicated Edge#4 for free thesequence.substack.com/p/thesequence-…
2) NAS algorithms

1. Differentiable Architecture Search
2. Differentiable ArchiTecture Approximation
3. eXperts Neural Architecture Search
4. Petridish (Read "Efficient Forward Architecture Search")
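To make the first algorithm on this list concrete: DARTS relaxes the discrete choice between candidate operations into a softmax-weighted mixture, so the architecture choice itself becomes differentiable. A minimal PyTorch sketch of that idea (illustrative; not the official DARTS implementation, which also alternates weight and architecture updates on separate data splits):

```python
# One "edge" of a DARTS-style search cell: a softmax-weighted sum of
# candidate operations, with architecture logits learned by gradient descent.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # candidate: 3x3 conv
            nn.Conv2d(channels, channels, 5, padding=2),  # candidate: 5x5 conv
            nn.Identity(),                                # candidate: skip connection
        ])
        # Architecture parameters: one logit per candidate op.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# After the search, the op with the largest alpha is kept; the rest are discarded.
edge = MixedOp(channels=16)
out = edge(torch.randn(2, 16, 32, 32))
```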
4 Sep
NAS is one of the most promising areas of deep learning.

But it remains super difficult to use.

Archai = an open-source framework that enables the execution of state-of-the-art NAS methods in PyTorch.⬇️
Archai enables the execution of modern NAS methods from a simple command-line interface.

Archai developers are striving to rapidly update the list of algorithms.

Currently supported:
- PC-DARTS
- Geometric NAS
- ProxyLess NAS
- SNAS
- DATA
- RandNAS
2/5
Benefits for the adopters of NAS techniques:
- Declarative Approach and Reproducibility
- Search-Space Abstractions
- Mix-and-Match Techniques
- & more!

You can find more details here: microsoft.github.io/archai/feature…
3/5
3 Sep
Start with a series of baseline models.

Then use forward-search NAS techniques to automatically generate neural networks from them.

Interesting?⬇️
Project Petridish = a new type of NAS algorithm that can produce neural networks for a given problem.

It was inspired by feature selection and gradient boosting techniques.
2/5
When exploring the NAS space, there are two types of techniques: backward-search and forward-search.

Backward-search methods have been the most common approach. But they require human domain knowledge.

Forward-search methods do not require a finite search space to be specified up front.
3/5
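In code, forward search looks roughly like this: repeatedly propose candidate layers, score each grown network, and keep the best one. The PyTorch sketch below illustrates greedy forward search in general; it is not Microsoft's actual Petridish implementation (the real algorithm uses gradient-boosting-inspired candidate selection and warm-starting), and the candidate blocks here are made up:

```python
# Greedy forward-search sketch: grow a baseline network layer by layer,
# keeping whichever candidate most improves validation loss.
import copy
import torch
import torch.nn as nn

def val_loss(model, x, y):
    with torch.no_grad():
        return nn.functional.mse_loss(model(x), y).item()

x, y = torch.randn(64, 8), torch.randn(64, 1)
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 1))   # baseline network

def candidates():
    # Hypothetical candidate blocks; Petridish's real candidates come
    # from its own search space.
    return [nn.Sequential(nn.Linear(8, 8), nn.ReLU()),
            nn.Sequential(nn.Linear(8, 8), nn.Tanh())]

for step in range(3):                               # grow the network 3 times
    best, best_loss = None, val_loss(model, x, y)
    for layer in candidates():
        mods = list(copy.deepcopy(model).children())
        trial = nn.Sequential(*mods[:-1], layer, mods[-1])  # insert before the head
        # (A real search would briefly train `trial` before scoring it.)
        loss = val_loss(trial, x, y)
        if loss < best_loss:
            best, best_loss = trial, loss
    if best is None:                                # no candidate helped: stop
        break
    model = best
```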
2 Sep
There are many challenges teams encounter while performing data labeling.

That's why we decided to discuss 3 real-world use cases.

Find one that fits your project⬇️
1) Object detection and image classification

1. Select the Object Detection with Bounding Boxes template
2. Modify it to include image classification options to suit your case

It is straightforward to customize the labeling interface using XML-like tags in Label Studio.
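As a sketch, a combined detection + classification config might look like this (the View/Image/RectangleLabels/Choices tags follow Label Studio's template format; the specific label values are made up for illustration):

```xml
<View>
  <Image name="image" value="$image"/>
  <!-- Bounding boxes for object detection -->
  <RectangleLabels name="boxes" toName="image">
    <Label value="Car"/>
    <Label value="Pedestrian"/>
  </RectangleLabels>
  <!-- Added image-level classification choices -->
  <Choices name="scene" toName="image">
    <Choice value="Day"/>
    <Choice value="Night"/>
  </Choices>
</View>
```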
2) Correct predictions while labeling

Using Label Studio, you can:
- Display predictions in the labeling interface
- Let annotators focus on validating or correcting the lowest-confidence predictions
26 Aug
🔥2 New Super Models to Handle Any Type of Dataset

We build models optimized for a specific type of dataset like:
- text
- audio
- computer vision
- etc.

Is it possible to create a general model? @DeepMind unveils the answer⬇️
Recently, DeepMind published two papers about general-purpose architectures that can process different types of input datasets.

1) Perceiver supports any kind of input.
2) Perceiver IO supports any kind of output.

More⬇️
Perceivers can handle new types of data with only minimal modifications.

They process inputs using domain-agnostic Transformer-style attention.

Perceiver IO matches a Transformer-based BERT baseline on the GLUE language benchmark.
3/7
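The core move is a small set of learned latent vectors that cross-attend to the raw, flattened input, so compute no longer scales quadratically with input length. A minimal PyTorch sketch of that pattern (illustrative only; DeepMind's models add iterative attention, positional encodings, and, in Perceiver IO, output queries):

```python
# Perceiver-style block: learned latents query an arbitrarily long input.
import torch
import torch.nn as nn

class TinyPerceiverBlock(nn.Module):
    def __init__(self, dim=128, n_latents=32, heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, dim))  # learned queries
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, inputs):                  # inputs: (B, N, dim), any N
        B = inputs.size(0)
        lat = self.latents.unsqueeze(0).expand(B, -1, -1)
        # Latents query the raw inputs: cost scales with N * n_latents, not N^2.
        lat, _ = self.cross(lat, inputs, inputs)
        # Latent self-attention is cheap and independent of input size.
        lat, _ = self.self_attn(lat, lat, lat)
        return lat

block = TinyPerceiverBlock()
out = block(torch.randn(2, 4096, 128))          # 4096 input elements -> 32 latents
```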
24 Aug
The 🤗@huggingface Transformers library is one of the few frameworks that include transformer models for computer vision.⬇️
Hugging Face Transformers provides implementations of hundreds of state-of-the-art transformer models for both PyTorch and TensorFlow 2.0.

Here is a thread about it:
2/4
Models that can be readily applied to computer vision tasks:

- Vision Transformer created by @GoogleAI
- VisualBERT created by @UCLA
- DeiT created by @facebookai
3/4
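For example, loading the Vision Transformer for image classification takes only a few lines. A sketch assuming the `transformers` and `Pillow` packages and the public google/vit-base-patch16-224 checkpoint (class names may vary slightly across library versions):

```python
# Classify an image with a pretrained Vision Transformer via Hugging Face.
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

image = Image.open("cat.jpg").convert("RGB")      # any local image file
inputs = processor(images=image, return_tensors="pt")
logits = model(**inputs).logits                   # (1, 1000) ImageNet classes
print(model.config.id2label[logits.argmax(-1).item()])
```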
