Pre-trained language models have been one of the most important breakthroughs in deep learning in recent years.
What models are used in super large-scale language tasks?
Thread👇
Pre-trained language models are trained on massive text datasets.
Thanks to transformer architectures, pre-trained language models can be adapted to specific tasks such as question answering or language modeling.
2/⬇️
Transformers opened the door to a new era of innovation in NLU. And the attention mechanism used in transformers = one of the most impactful developments in ML in recent years.
3/⬇️
Researchers from @MSFTResearch went further. They introduced one of the first generative models that could be used in super large-scale language tasks. The team: @ChunyuanLi, @icaruszyz, @JianfengGao0217, Xiang Gao, Yuan Li, Xiujun Li, Baolin Peng.
4/⬇️
They called their model Optimus.
Optimus combines large pre-trained language models (language understanding) with generation tasks in a very clever architecture built on generative models.
5/⬇️
The Optimus architecture includes a BERT-based encoder and a GPT-2-based decoder.
From that perspective, Optimus = a variational auto-encoder (VAE) architecture that connects the encoder and decoder through a latent space.
6/⬇️
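The encoder-latent-decoder idea can be sketched with a toy numpy VAE. This is not the actual Optimus code: BERT and GPT-2 are stood in for by random linear maps, and all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_enc, d_lat, d_dec = 8, 4, 8  # toy dims standing in for BERT/GPT-2 hidden sizes

# "Encoder" (BERT stand-in): sentence embedding -> mean and log-variance of latent z
W_mu, W_logvar = rng.normal(size=(d_lat, d_enc)), rng.normal(size=(d_lat, d_enc))
# "Decoder" (GPT-2 stand-in): latent z -> embedding that conditions generation
W_dec = rng.normal(size=(d_dec, d_lat))

def encode(h):
    return W_mu @ h, W_logvar @ h

def reparameterize(mu, logvar):
    # sample z = mu + sigma * eps so gradients can flow through the sampling step
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    # KL(q(z|x) || N(0, I)): the VAE regularizer that organizes the latent space
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

h = rng.normal(size=d_enc)      # pretend this is a BERT sentence embedding
mu, logvar = encode(h)
z = reparameterize(mu, logvar)  # latent code for the sentence
h_dec = W_dec @ z               # fed to the GPT-2 stand-in to condition generation
print(z.shape, h_dec.shape)
```

The key design point the sketch captures: all of language understanding is compressed into the low-dimensional latent `z`, and generation is conditioned only on that code.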
The full paper about Optimus is called "Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space".
If you'd like a bite-sized, highly concentrated overview of this paper, click the link below. It'll take you to TheSequence Edge#7, our educational newsletter. thesequence.substack.com/p/edge7
8/8
The Adversarial Robustness Toolbox (ART) = framework to defend deep learning models against adversarial security attacks and evaluate their robustness
Thread⬇️
Adversarial examples = inputs crafted to fool ML models.
Two attack settings:
+White Box Attacks: The adversary has access to the training environment and full knowledge of the model, including the training algorithm and parameters
+Black Box Attacks: The adversary has no knowledge of the model internals and can only query it
2/⬇️
The goal of ART = to provide a framework to evaluate the robustness of a neural network.
The current version of ART focuses on four types of adversarial attacks:
+evasion
+inference
+extraction
+poisoning
3/⬇️
🤖@Uber Ludwig = Open Source Framework for Creating ML Models Without Writing Any Code.
To use Ludwig, all you need is a data file with the input attributes and the desired outputs; Ludwig does the rest.
Thread🧵👇
The main innovation behind Ludwig = data-type-specific encoders and decoders. Ludwig uses a specific encoder and decoder for each supported data type.
2/6⬇️
Ludwig is based on a series of principles:
+No Coding Required
+Generality
+Flexibility
+Extensibility
+Interpretability
3/6⬇️
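The "No Coding Required" principle in practice: you describe inputs and outputs in a declarative config and Ludwig assembles the model. A hypothetical config (the feature names `review` and `sentiment` are made up for illustration):

```yaml
input_features:
  - name: review
    type: text
output_features:
  - name: sentiment
    type: category
```

Training is then a single command against your data file, along the lines of `ludwig train --dataset reviews.csv --config config.yaml` (flag names have varied across Ludwig versions, so check the docs for yours).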
The centralized nature of AI makes it difficult for startups to compete with the large tech incumbents that have access to:
+massive datasets
+virtually unlimited computing resources
+world-class research talent
Decentralized AI is the key
Thread⬇️
Research in decentralized ML is nothing new and can be traced back to the late 1970s
But the space has caught new momentum w/ blockchains and distributed ledger technologies
2/⬇️
However, blockchains are not the only technology trend influencing decentralized ML
Decentralized ML has benefited from:
+Blockchains
+Federated Learning
+Private ML
3/⬇️
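Of these, federated learning is the most directly about decentralization: clients train on their own data and share only model updates. A minimal FedAvg-style sketch in numpy (not any specific framework's API; all names are illustrative):

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    # One gradient step of linear regression on a client's private data.
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fedavg(updates, sizes):
    # Server aggregates: average client weights, weighted by local dataset size.
    total = sum(sizes)
    return sum(n / total * w for w, n in zip(updates, sizes))

rng = np.random.default_rng(0)
w_global = np.zeros(2)
# Each client holds its own data; the raw data never leaves the client.
clients = [(rng.normal(size=(20, 2)), rng.normal(size=20)) for _ in range(3)]

for _round in range(5):
    updates = [local_step(w_global.copy(), X, y) for X, y in clients]
    w_global = fedavg(updates, [len(y) for _, y in clients])

print(w_global)  # the server only ever sees weights, not data
```

This is why federated learning fits the decentralized-AI thesis: the datasets stay with their owners, so no single party needs the "massive datasets" advantage.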