I just read "Foundations of LLMs 2025" cover to cover.
It explained large language models so clearly that I can finally say: I get it.
Here’s the plain-English breakdown I wish I had years ago:
To understand LLMs, start with pre-training.
We don’t teach them specific tasks.
We flood them with raw text and let them discover patterns on their own.
This technique is called self-supervised learning and it’s the foundation of everything.
There are 3 ways to pre-train:
→ Unsupervised: No labels at all
→ Supervised: Classic labeled data
→ Self-supervised: Model creates its own labels (e.g., “guess the missing word”)
LLMs use #3 because it scales like crazy and teaches them language from scratch.
Example of self-supervised learning:
“The early bird catches the worm.”
Mask some words:
→ “The [MASK] bird catches the [MASK]”
The model’s job? Fill in the blanks.
No human labels. The text is the supervision.
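Here's a toy Python sketch of how those training pairs get built (real models mask subword tokens with a tokenizer, but the idea is the same):

```python
import random

def mask_tokens(text, mask_prob=0.15, mask_token="[MASK]"):
    """Turn raw text into a (masked input, answer key) pair.
    No human labels: the hidden words ARE the labels."""
    words = text.split()
    masked, answers = [], {}
    for i, word in enumerate(words):
        if random.random() < mask_prob:
            answers[i] = word          # the original word becomes the label
            masked.append(mask_token)  # the model only sees [MASK]
        else:
            masked.append(word)
    return " ".join(masked), answers

random.seed(42)
print(mask_tokens("The early bird catches the worm"))
```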
This leads to 3 main model types:
→ Encoder-only (BERT): Understands text
→ Decoder-only (GPT): Generates next word
→ Encoder-decoder (T5): Translates input to output
Each has strengths. Think of them as different tools for different jobs.
Let’s break it down further.
Decoder-only (GPT-style):
Trained to guess the next word:
“The cat sat on the ___” → “mat”
This is called causal language modeling.
Loss is measured by how wrong the guesses are (cross-entropy).
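A minimal PyTorch sketch of that objective, with random numbers standing in for a real model's output. The one-position shift is the whole trick: position t predicts token t+1.

```python
import torch
import torch.nn.functional as F

vocab_size = 10
token_ids = torch.tensor([[2, 7, 1, 4, 9]])       # toy ids for "The cat sat on the"
logits = torch.randn(1, 5, vocab_size)            # stand-in for model output

# Causal LM: the prediction at position t is scored against token t+1.
pred = logits[:, :-1, :].reshape(-1, vocab_size)  # predictions for steps 0..3
target = token_ids[:, 1:].reshape(-1)             # the actual next tokens

loss = F.cross_entropy(pred, target)              # lower = better guesses
print(loss.item())
```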
Encoder-only (BERT-style):
Takes the whole sentence.
Randomly hides some words and tries to reconstruct them.
This is masked language modeling: it uses both left and right context.
Great for understanding, not generation.
Example:
Original:
→ “The early bird catches the worm”
Masked:
→ “The [MASK] bird catches the [MASK]”
The model predicts “early” and “worm” by understanding the whole sentence.
It’s learning language by solving puzzles.
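You can watch this happen with the Hugging Face transformers library (fill-mask is a real pipeline; the checkpoint here is just one public BERT model, and its top guesses may vary):

```python
from transformers import pipeline  # pip install transformers torch

fill = pipeline("fill-mask", model="bert-base-uncased")

# One mask at a time keeps the output simple.
for pred in fill("The [MASK] bird catches the worm.")[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```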
Encoder-decoder (T5, BART):
Treats everything as a text-to-text task.
Examples:
“Translate English to German: hello” → “hallo”
“Sentiment: I hate this” → “negative”
This setup lets one model do it all: QA, summarization, translation, etc.
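Same library, same idea, sketched for T5. The task prefixes come from T5's original multitask training; exact outputs depend on the checkpoint:

```python
from transformers import pipeline  # pip install transformers torch sentencepiece

t5 = pipeline("text2text-generation", model="t5-small")

print(t5("translate English to German: hello")[0]["generated_text"])
print(t5("sst2 sentence: I hate this")[0]["generated_text"])  # sentiment task
```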
Once pre-trained, we have two options:
→ Fine-tune it on a labeled dataset
→ Prompt it cleverly to do new tasks
Fine-tuning adjusts weights.
Prompting? Just tweaks the input text.
Let’s dive into the magic of prompts.
Prompting = carefully phrasing input so the model does what you want.
Example:
“I love this movie. Sentiment:”
It’ll likely respond: “positive”
Add a few examples before it? That's in-context learning: no fine-tuning needed.
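A toy sketch of what "a few examples before it" actually looks like. No gradients, no training, just string formatting:

```python
examples = [
    ("I love this movie.", "positive"),
    ("This was a waste of two hours.", "negative"),
]

def few_shot_prompt(new_review: str) -> str:
    # The "training data" lives inside the prompt itself.
    shots = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
    return f"{shots}\nReview: {new_review}\nSentiment:"

print(few_shot_prompt("The acting was brilliant."))
```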
Prompting gets deep.
Advanced strategies:
• Chain of thought → “Let’s think step by step...” (see the sketch after this list)
• Decomposition → Break complex tasks into parts
• Self-refinement → Ask the model to critique itself
• RAG → Let it fetch real-time data from external sources
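A rough sketch of the first and third patterns. `generate` is a hypothetical stub standing in for whatever LLM API you actually call:

```python
def generate(prompt: str) -> str:
    """Hypothetical stub: swap in a real LLM API call."""
    return f"<model output for: {prompt[:30]}...>"

def chain_of_thought(question: str) -> str:
    # The classic cue that elicits intermediate reasoning steps.
    return generate(f"{question}\nLet's think step by step.")

def self_refine(task: str) -> str:
    draft = generate(task)                                  # first attempt
    critique = generate(f"Critique this answer:\n{draft}")  # model critiques itself
    return generate(f"Task: {task}\nDraft: {draft}\n"
                    f"Critique: {critique}\nWrite an improved answer.")

print(chain_of_thought("What is 17 * 24?"))
```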
This is all possible because of the way these models are trained: predict the next word over and over until they internalize language structure, reasoning patterns, and world knowledge.
It's not magic. It's scale.
But raw intelligence isn’t enough.
We need models to align with human goals.
That’s where alignment comes in.
It happens in two major phases after pre-training 👇
Supervised Fine-Tuning (SFT)
Feed the model good human responses. Let it learn how we want it to reply.
RLHF (Reinforcement Learning from Human Feedback)
Train a reward model to prefer helpful answers. Use it to steer the LLM.
This is how ChatGPT was aligned.
RLHF is powerful but tricky.
Newer methods like Direct Preference Optimization (DPO) are rising fast.
Why?
They skip the separate reward model and the unstable RL loop, optimizing directly on human preference pairs.
More stable. More scalable.
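For the curious, a minimal sketch of the DPO objective in PyTorch. Toy numbers, not a training loop; the log-probs would come from scoring each response with the policy and a frozen reference model:

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit rewards: how far the policy has moved from the reference.
    chosen = beta * (logp_chosen - ref_logp_chosen)
    rejected = beta * (logp_rejected - ref_logp_rejected)
    # Push the preferred answer's implicit reward above the rejected one's.
    return -F.logsigmoid(chosen - rejected).mean()

print(dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
               torch.tensor([-13.0]), torch.tensor([-14.0])))
```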
One final point: inference (how the model runs) is just as important as training.