ARE YOU PAYING ATTENTION?
6th of December 2017... over FIVE years ago, researchers explored
🔥 TRANSFORMERS 🔥 (the "T" in GPT)
arxiv.org/pdf/1706.03762…
What even ARE they??
👇🧵
#100DaysOfChatGPT
Explore Emerging Tech at @AtmanAcademy
@AtmanAcademy The truth is, I'm still trying to wrap my head around it...
But got a little bit further last night with the help of #ChatGPT.
I've heard someone describe it like a Rubik's Cube (?!) so I knew there was a 3D sort of component to it conceptually.
Let's hop on in!
👇🧵
@AtmanAcademy LAYERS
HEADS
TOKENS
Remember these three... that's about all I could take away from all these WORDS
It was a bit more than I was ready to try and digest.
👇🧵
@AtmanAcademy "Explain this like I'm 15"... not like I'm 5, but not like I'm a college graduate either.
THAT's more like it - I can actually follow these!
(Can you?)
Again - say it with me:
LAYERS
HEADS
TOKENS
👇🧵
@AtmanAcademy So it's all starting to come together (or come apart?):
More (and larger) layers,
more attention heads, and a
bigger context window (more input tokens)
give the newer models superior performance. (Rough sketch of what those heads compute below.)
👇🧵
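Here's a minimal NumPy sketch of what those heads actually compute; the sizes and random weights are placeholders for illustration, not anything from the paper:

```python
# Minimal multi-head self-attention sketch in plain NumPy.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: the softmax scores are the
    # per-token "weighting" the thread keeps coming back to.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
n_tokens, d_model, n_heads = 8, 64, 4  # a tiny 8-token context window, 4 heads
d_head = d_model // n_heads

x = rng.normal(size=(n_tokens, d_model))  # token embeddings entering one layer

# Each head has its own projections (random stand-ins for learned weights).
heads = []
for _ in range(n_heads):
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
    heads.append(attention(x @ Wq, x @ Wk, x @ Wv))

out = np.concatenate(heads, axis=-1)  # back to (n_tokens, d_model)
print(out.shape)
```

Stack dozens of blocks like this (plus feed-forward sublayers) and that's the "layers" count.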
@AtmanAcademy That's probably enough to chew on for tonight.
To (try and) summarise:
Layers - enable abstraction over the context
Heads - give the "weighting" (or attention) to the words/tokens
Tokens - words and sub-words broken up for processing (see the little tokeniser demo below)
So yeah - bigger is better. 🤔
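If you want to see tokens first-hand, here's a quick sketch using OpenAI's tiktoken library (assuming it's installed; the sentence is just my own example):

```python
# Inspecting tokens with tiktoken (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by ChatGPT-era models
ids = enc.encode("Transformers pay attention to tokens")
print(ids)                             # integer token IDs the model actually sees
print([enc.decode([i]) for i in ids])  # the word/sub-word pieces behind each ID
```

The "context window" from the previous tweet is just a cap on how many of these IDs the model can take in at once.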
👇🧵
@AtmanAcademy Day 93 of #100DaysOfChatGPT; exploring, experimenting and growing through interactions with ChatGPT.
If this content inspires you, please: Comment/Like/Retweet/Follow.
GRAB THE CHEAT SHEET from the ChatGPT workshop here:
AtmanAcademy.io
Hop on in!
👇🧵