Akshay πŸš€
May 8 β€’ 10 tweets β€’ 3 min read
How LLMs work, clearly explained:
Before diving into LLMs, we must understand conditional probability.

Let's consider a population of 14 individuals:

- Some of them like Tennis 🎾
- Some like Football ⚽️
- A few like both 🎾 ⚽️
- And a few like neither

Here's how it looks πŸ‘‡ Image
So what is conditional probability ⁉️

It's a measure of the probability of an event given that another event has occurred.

If the events are A and B, we denote this as P(A|B).

This reads as "probability of A given B"

Check this illustration πŸ‘‡ Image
For instance, if we're predicting whether it will rain today (event A), knowing that it's cloudy (event B) might impact our prediction.

As it's more likely to rain when it's cloudy, we'd say the conditional probability P(A|B) is high.

That's conditional probability for you! πŸŽ‰
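To make this concrete, here's a tiny Python sketch. The exact counts live in the image above, so the numbers below are assumptions chosen to fit the 14-person setup:

```python
# Conditional probability from raw counts.
# NOTE: the real counts are in the thread's image; these are
# assumed values that sum consistently to 14 people.
total = 14
football = 7   # like football (assumed)
both = 3       # like both tennis and football (assumed)

p_football = football / total   # P(B)
p_both = both / total           # P(A ∩ B)

# P(A|B) = P(A ∩ B) / P(B)
p_tennis_given_football = p_both / p_football
print(f"P(Tennis | Football) = {p_tennis_given_football:.2f}")  # 3/7 β‰ˆ 0.43
```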
Now, how does this apply to LLMs like GPT-4❓

These models are tasked with predicting the next word in a sequence.

This is a question of conditional probability: given the words that have come before, what is the most likely next word? Image
To predict the next word, the model calculates the conditional probability for each possible next word, given the previous words (context).

The word with the highest conditional probability is chosen as the prediction. Image
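Here's a toy sketch of that prediction step in Python. The bigram table is made up for illustration; a real LLM estimates these probabilities from billions of examples:

```python
# A toy "language model": next-word counts observed after a context.
bigram_counts = {
    "it is": {"raining": 4, "cloudy": 3, "sunny": 2, "a": 1},
}

def predict_next(context: str) -> str:
    counts = bigram_counts[context]
    total = sum(counts.values())
    # P(word | context) for every candidate next word
    probs = {word: c / total for word, c in counts.items()}
    # greedy decoding: pick the highest conditional probability
    return max(probs, key=probs.get)

print(predict_next("it is"))  # raining (P = 4/10)
```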
The LLM learns a high-dimensional probability distribution over sequences of words.

And the parameters of this distribution are the trained weights!

The training, or rather pre-training, is supervised in a special sense: the label at each step is simply the next word in the text, which is why it's usually called self-supervised learning.

I'll talk about the different training steps next time!

Check this πŸ‘‡ Image
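In symbols, that distribution factorises via the chain rule of probability (standard notation, not taken from the thread's images):

```latex
P(w_1, w_2, \dots, w_n) = \prod_{t=1}^{n} P(w_t \mid w_1, \dots, w_{t-1})
```

Each factor is exactly the next-word conditional probability the model learns to predict.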
Hopefully, this thread has demystified a bit of the magic behind LLMs and the concept of conditional probability.

Here's the gist of what we learned today: Image
Working with LLMs is going to be a high-leverage skill!

@LightningAI provides state-of-the-art tutorials on LLMs & LLMOps!

An integrated AI developer platform with access to FREE GPUs & VSCode right in your browser!

Check this: lightning.ai/lightning-ai/h…
If you're interested in:

- Python 🐍
- Machine Learning πŸ€–
- AI Engineering βš™οΈ

Find me β†’ @akshay_pachaar βœ”οΈ
My weekly newsletter on AI Engineering: join 9k+ readers at @ML_Spring

Cheers! πŸ₯‚

More from @akshay_pachaar

May 10
10 great Python packages for Data Science that few people know about:
1️⃣ Cleanlab

You're missing out on a lot if you haven't started using Cleanlab yet!

Cleanlab helps you clean data and labels by automatically detecting issues in an ML dataset.

It's like a magic wand! πŸͺ„βœ¨

Check this out πŸ‘‡
github.com/cleanlab/clean…
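As a taste, here's a minimal sketch assuming cleanlab 2.x's `find_label_issues`; the labels and predicted probabilities below are hypothetical:

```python
import numpy as np
from cleanlab.filter import find_label_issues

labels = np.array([0, 1, 0, 1])   # given (possibly noisy) labels
pred_probs = np.array([           # out-of-sample predicted probabilities
    [0.9, 0.1],
    [0.2, 0.8],
    [0.1, 0.9],                   # labeled 0, but the model says 1
    [0.3, 0.7],
])

# Returns indices of likely label errors, worst first
issues = find_label_issues(labels=labels, pred_probs=pred_probs,
                           return_indices_ranked_by="self_confidence")
print(issues)
```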
2️⃣ LazyPredict

A Python library that enables you to train, test, and evaluate multiple ML models at once using just a few lines of code.

Supports both regression & classification! ✨

Check this out πŸ‘‡
pypi.org/project/lazypr…
Image
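A minimal sketch of the documented `LazyClassifier` workflow, using a scikit-learn demo dataset:

```python
from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Fits dozens of classifiers and returns a leaderboard DataFrame
clf = LazyClassifier(verbose=0, ignore_warnings=True)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)
print(models.head())  # accuracy, F1, fit time, etc. per model
```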
May 7
Let's build a 100% local RAG app, featuring ⌘R, a self-hosted vector database, a fast embedding library & a reranker:
Before we begin, take a look at what we're building today!

And here's what you'll learn:

- @Ollama for serving ⌘R
- @Streamlit for building the UI
- @Llama_Index for orchestration
- @qdrant_engine to self-host a vector db
- @LightningAI for development & hosting

Let's go! πŸš€
The architecture diagram presented below illustrates some of the key components & how they interact with each other!

What's new❓

- Self-hosting a vector db
- Faster embedding
- Using a reranker

It will be followed by detailed descriptions & code for each component: Image
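As a preview, here's a minimal sketch of the wiring, assuming the llama-index 0.10+ package layout with the companion packages llama-index-llms-ollama, llama-index-embeddings-fastembed and llama-index-vector-stores-qdrant installed (module paths may differ across versions, and the reranker is omitted to keep it short):

```python
import qdrant_client
from llama_index.core import (Settings, SimpleDirectoryReader,
                              StorageContext, VectorStoreIndex)
from llama_index.embeddings.fastembed import FastEmbedEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.qdrant import QdrantVectorStore

# ⌘R served locally by Ollama; local embeddings via FastEmbed
Settings.llm = Ollama(model="command-r")
Settings.embed_model = FastEmbedEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Self-hosted Qdrant as the vector db
client = qdrant_client.QdrantClient(host="localhost", port=6333)
vector_store = QdrantVectorStore(client=client, collection_name="docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# "./data" is a hypothetical folder holding your documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

print(index.as_query_engine().query("What is this corpus about?"))
```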
May 6
Python *args & **kwargs clearly explained:
*args allows you to pass a variable number of non-keyword arguments to a function.

It collects all non-keyword arguments passed to the function and stores them as a tuple.

Consider the following example: Image
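The original example is in the image; a comparable snippet:

```python
def add_all(*args):
    # args is a tuple of all positional arguments passed in
    print(type(args), args)   # <class 'tuple'> (1, 2, 3)
    return sum(args)

print(add_all(1, 2, 3))  # 6
```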
Similarly, **kwargs allows you to pass a variable number of keyword arguments to a function.

It collects all keyword arguments passed to the function and stores them as a dictionary.

Consider the following example: Image
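Again, a snippet comparable to the one in the image:

```python
def describe(**kwargs):
    # kwargs is a dict of all keyword arguments passed in
    print(type(kwargs), kwargs)  # <class 'dict'> {'name': 'Ada', 'age': 36}
    for key, value in kwargs.items():
        print(f"{key} = {value}")

describe(name="Ada", age=36)
```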
May 4
I have been coding in Python for 8 years now. ⏳

If I were to start over today, here's a roadmap...πŸ‘‡ Image
1️⃣ freeCodeCamp

A 4-hour Python bootcamp!!

What you'll learn:
- Installing Python
- Setting up an IDE
- Basic syntax
- Variables & Datatypes
- Looping in Python
- Exception handling
- Modules & pip
- Mini hands-on projects πŸ”₯

Check this out πŸ‘‡
2️⃣ CS50p: Harvard University

There isn't a better place to learn #Python than @davidjmalan's CS50p.

Beautiful explanations and great projects.
It's a complete package.

Highly recommended!!

Check this out πŸ‘‡
cs50.harvard.edu/python/2022/
May 2
Lambda functions in Python clearly explained:
What are lambda functions?

Simply put, they are small anonymous functions, i.e. functions defined without a name.

Check out the syntax πŸ‘‡ Image
Lambda functions can have any number of arguments, but they can only have one expression.

The expression is executed and the result is returned.

Here is an example of a lambda function that adds two numbers πŸ‘‡ Image
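The image shows the thread's version; here's a comparable snippet, plus the kind of throwaway key function where lambdas shine:

```python
# One expression; its result is returned automatically
add = lambda x, y: x + y
print(add(2, 3))  # 5

# Handy as an inline key function
words = ["banana", "fig", "apple"]
print(sorted(words, key=lambda w: len(w)))  # ['fig', 'apple', 'banana']
```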
Apr 29
Self-attention in transformers clearly explained:
Before we start, a quick primer on tokenization!

Raw text β†’ Tokenization β†’ Embedding β†’ Model

Embedding is a meaningful representation of each token (roughly a word) using a bunch of numbers.

This embedding is what we provide as an input to our language models.

Check this πŸ‘‡ Image
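A toy version of that pipeline in Python; the vocabulary and embedding values are made up, since real models learn the embedding matrix during training:

```python
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2}            # token β†’ id
embedding_matrix = np.random.rand(len(vocab), 4)  # one 4-dim vector per token

tokens = "the cat sat".split()              # crude whitespace tokenizer
token_ids = [vocab[t] for t in tokens]      # [0, 1, 2]
embeddings = embedding_matrix[token_ids]    # shape (3, 4): the model's input
print(embeddings.shape)
```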
The core idea of language modelling is to understand the structure and patterns within language.

By modeling the relationships between words (tokens) in a sentence, we can capture the context and meaning of the text. Image
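And here's a minimal numpy sketch of scaled dot-product self-attention, the operation this thread goes on to explain (toy shapes and random weights, for illustration only):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # how strongly each token attends to others
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row softmax
    return weights @ V                        # context-aware mix of values

X = np.random.rand(3, 4)                      # 3 token embeddings of dim 4
Wq, Wk, Wv = (np.random.rand(4, 4) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (3, 4)
```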