Josep Ferrer
Data Scientist & Tech Writer @KDnuggets @DataCamp @Medium | Outstand using data | Join 8k data professionals on https://t.co/VdsUvb9SKu 🧩

Jul 28, 2024, 10 tweets

The Transformer architecture clearly explained 👇🏻

Today I'm starting a new series of threads to simplify the concept of Transformers and explain what's behind the natural language abilities of LLMs.

Let's start with the basics of the Transformer architecture:

The encoder/decoder concept. 🧠✨

1️⃣ WHAT IS A TRANSFORMER?
A Transformer is a neural network that excels at understanding the context of sequential data and generating new data from it.

It was the first sequence model to rely entirely on attention, without using recurrence (RNNs) or convolutions.
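To make self-attention a bit more concrete, here is a minimal sketch in plain Python. It assumes toy 2-dimensional token vectors and skips the learned query/key/value projections a real Transformer uses, so the names `attention_weights` and `self_attention` are illustrative only, not any library's API.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    # Scaled dot product between one token (query) and every token (keys).
    dim = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys]
    return softmax(scores)

def self_attention(x):
    # Each output vector is a weighted average of ALL input vectors,
    # which is how a token picks up context from the rest of the sequence.
    out = []
    for query in x:
        weights = attention_weights(query, x)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(len(query))])
    return out

# Assumption: three made-up token embeddings standing in for a short sentence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_weights(tokens[0], tokens)
mixed = self_attention(tokens)
```

The weights for each token sum to 1, so every output is a blend of the whole sequence, with more similar tokens contributing more.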

2️⃣ TRANSFORMER AS A BLACK BOX
Imagine a Transformer for language translation as a BLACK BOX. 🎩
• Input: A sentence in one language.
• Output: Its translation.

But what happens inside this black box? Let's find out! 🔍

3️⃣ ENCODER/DECODER ARCHITECTURE
• Input: The Spanish sentence "¿De quién es?"
• Encoder: Transforms it into a numerical representation that captures its meaning and context.
• Decoder: Receives this encoded data and generates the translation.
• Output: The translated sentence: "Whose is it?"
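The flow above can be sketched as a small loop: the encoder runs once over the input, then the decoder is called repeatedly, each time seeing the encoded input plus the tokens produced so far. Everything here is a hypothetical stand-in (`toy_encode`, `toy_next_token`, and the lookup table replace what would be learned layers in a real model), so treat it as an illustration of the data flow rather than an actual translator.

```python
BOS, EOS = "<bos>", "<eos>"  # assumed start/end markers

def translate(source_tokens, encode, next_token, max_len=20):
    memory = encode(source_tokens)        # encoder: one pass over the whole input
    output = [BOS]
    # max_len caps the output length, counting the start marker.
    while len(output) < max_len:
        tok = next_token(memory, output)  # decoder: uses memory + tokens so far
        if tok == EOS:
            break
        output.append(tok)
    return output[1:]                     # drop the start marker

# Toy stand-ins (hypothetical): a real model computes these with neural layers.
def toy_encode(tokens):
    return tuple(tokens)

TOY_TABLE = {("¿De", "quién", "es?"): ["Whose", "is", "it?"]}

def toy_next_token(memory, generated):
    target = TOY_TABLE[memory]
    step = len(generated) - 1             # tokens emitted so far, excluding BOS
    return target[step] if step < len(target) else EOS

result = translate(["¿De", "quién", "es?"], toy_encode, toy_next_token)
print(result)
```

Note that the decoder never sees the raw Spanish tokens directly, only whatever the encoder returned.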

4️⃣ THE ARCHITECTURE BEHIND TRANSFORMERS
Both the encoder and the decoder are made up of stacked layers. Here's how they work:
• Encoders: Pass the whole input sequence through the stack, one layer after another.
• Decoders: Take the encoded data and generate the output one token at a time.

Both use self-attention and feed-forward neural networks, and the decoder additionally attends to the encoder's output, which is what enables it to generate natural language.
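As a rough sketch of what "stacked layers" means for the encoder, the snippet below chains two sub-layers (self-attention, then a feed-forward step) and repeats them. It is a simplification under stated assumptions: no learned weights, no layer normalization, no multi-head attention, and a fixed ReLU standing in for the feed-forward network. The function names are made up for illustration.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    # Every token attends to every token in the sequence at once.
    dim = len(x[0])
    out = []
    for query in x:
        weights = softmax(
            [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in x]
        )
        out.append([sum(w * value[j] for w, value in zip(weights, x)) for j in range(dim)])
    return out

def feed_forward(vec):
    # Toy position-wise step: a fixed ReLU instead of learned weights (assumption).
    return [max(0.0, v) for v in vec]

def add(a, b):
    # Element-wise sum of two sequences of vectors (residual connection).
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def encoder_layer(x):
    x = add(x, self_attention(x))                    # sub-layer 1 + residual
    return add(x, [feed_forward(row) for row in x])  # sub-layer 2 + residual

def encoder(x, num_layers=2):
    # The output of each layer is the input of the next.
    for _ in range(num_layers):
        x = encoder_layer(x)
    return x

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up embeddings
encoded = encoder(tokens)
```

The shape never changes: three vectors go in and three come out, each one now carrying information from the whole sequence.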

Tomorrow we will break down both core components of the Transformer: the encoder and the decoder.

Do you want to understand the Transformer architecture?
Then go check out my latest article about Transformers 👇🏻

aigents.co/data-science-b…

If you are interested in...
• Python 🐍
• SQL 💾
• ML/MLOps 🛠
• LLMs & NLP 🗣
• DataViz 🗣
• AI Engineering ⚙️

Then follow me → @rfeers

Did you like this post?

Then join my newly launched DataBites newsletter to get all my content straight to your inbox every week! 🧩

👉🏻 databites.tech
