Random UUIDs are killing your database performance
You switched from integer IDs (1, 2, 3…) to UUIDs (a1b2-3c4d-…) for security or distributed generation.
Then your database writes get slower, sometimes much slower.
Here’s why:
Index Fragmentation.
Most database indexes are B-Trees (balanced, sorted trees). The physical location of your data matters.
1./ 𝐒𝐞𝐪𝐮𝐞𝐧𝐭𝐢𝐚𝐥 𝐈𝐃𝐬
When you insert sequential integers (1, 2, 3), new data always goes to the rightmost leaf page of the index.
Writes are predictable and sequential.
Cache hits are maximized.
Pages stay 100% full.
This is the speed limit of your database.
2./ 𝐑𝐚𝐧𝐝𝐨𝐦 𝐔𝐔𝐈𝐃𝐯4
UUIDv4 values are uniformly random. This means a new insert can land anywhere in the tree structure.
Because the inserts are scattered:
- The database must constantly load random pages from disk to memory (Random I/O).
- Page Splitting => When a target page is full, the database has to split it in half to make room, leaving you with two half-empty pages.
- 'Swiss Cheese' Effect => Your index becomes larger and full of holes, wasting RAM and disk space.
This can degrade write throughput by 20–90% once your index size exceeds your available RAM.
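You can see the scatter effect without a database. Here's a toy sketch using a sorted Python list as a stand-in for B-Tree leaf order — not a real index, just the insert-position math:

```python
# Toy model: a sorted list standing in for B-Tree leaf ordering.
import bisect
import uuid

# Sequential keys: every new key lands at the far right of the ordering.
seq_keys = []
for i in range(1000):
    pos = bisect.bisect(seq_keys, i)
    assert pos == len(seq_keys)  # always the rightmost slot
    seq_keys.append(i)

# Random UUIDv4 keys: new keys can land anywhere in the ordering.
rand_keys = []
scattered = 0
for _ in range(1000):
    k = uuid.uuid4().int
    pos = bisect.bisect(rand_keys, k)
    if pos != len(rand_keys):  # not the rightmost slot
        scattered += 1
    rand_keys.insert(pos, k)

print(f"{scattered} of 1000 random inserts landed mid-structure")
```

In a real B-Tree, every one of those mid-structure landings is a page the database may have to fetch from disk and possibly split.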
3./ 𝐔𝐔𝐈𝐃𝐯7
Stop using UUIDv4 for primary keys. Use UUIDv7 (standardized in RFC 9562).
UUIDv7 embeds a timestamp at the start of the ID, making it sortable.
This gives you the best of both worlds:
- Distributed generation => (No central counter needed).
- Monotonic inserts => They behave like sequential integers in a B-Tree, eliminating fragmentation.
- Security => Prevents trivial ID enumeration (attackers can't guess that user 101 follows user 100), though note that it does reveal the record's creation time.
You get the utility of UUIDs without the performance penalty.
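If your stack doesn't ship UUIDv7 yet (Python's stdlib is slated to add `uuid.uuid7()` in 3.14), the RFC 9562 layout is simple enough to sketch. This is an illustration of the bit layout, not production code — use a maintained library for real systems:

```python
# Minimal UUIDv7 sketch following the RFC 9562 field layout (illustrative only).
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    unix_ts_ms = time.time_ns() // 1_000_000                # 48-bit ms timestamp
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF  # 12 random bits
    rand_b = int.from_bytes(os.urandom(8), "big") & 0x3FFF_FFFF_FFFF_FFFF  # 62 random bits
    value = (unix_ts_ms & 0xFFFF_FFFF_FFFF) << 80           # timestamp first => sortable
    value |= 0x7 << 76                                      # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                                     # RFC 9562 variant
    value |= rand_b
    return uuid.UUID(int=value)

ids = []
for _ in range(5):
    ids.append(uuid7())
    time.sleep(0.002)  # same-millisecond ordering needs a counter (RFC 9562 §6.2)

assert all(u.version == 7 for u in ids)
assert ids == sorted(ids)  # timestamp prefix keeps them time-ordered
```

Note the caveat in the comment: IDs generated within the same millisecond are not ordered relative to each other unless you add the optional monotonic counter the RFC describes.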
While others scroll, you could understand how Flash Attention achieves 3x speedup, how LoRA cuts fine-tuning costs by 90%, and how MoE makes models efficient.
➕ What's covered:
➡️ Lecture 1: Transformer Fundamentals
→ Tokenization and word representation
→ Self-attention mechanism explained
→ Complete transformer architecture
→ Detailed implementation example
➡️ Lecture 2: Advanced Transformer Techniques
→ Position embeddings (RoPE, ALiBi, T5 bias)
→ Layer normalization and sparse attention
→ BERT deep dive and finetuning
→ Extensions of BERT
From Stanford Online:
Rigorous instruction. Latest techniques. Free access.
Perfect for:
→ ML engineers building with LLMs
→ AI engineers understanding transformers
→ Researchers working on language models
→ Anyone learning beyond API calls
This weekend: learn the techniques that separate good engineers from great ones.
(I will put the playlist in the comments.)
♻️ Repost to save someone $$$ and a lot of confusion.
✔️ Follow @techNmak for more AI/ML insights.
9 core patterns for building Fault-Tolerant Applications
Fall seven times, stand up eight.
[1.] Circuit Breaker
◾ Acts like an electrical circuit breaker.
◾ When a service experiences repeated failures, the circuit breaker 'trips' & stops sending requests to that service for a period of time.
◾ This allows the failing service to recover without being overwhelmed.
The main circuit breaker states -
◾ Closed: Requests are allowed to pass through; failures are counted.
◾ Open: Requests are immediately rejected with an error, giving the downstream service time to recover.
◾ Half-Open: After a timeout, a limited number of trial requests are let through; success closes the circuit, failure re-opens it.
◾ Effective for protecting against cascading failures & isolating problematic services.
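A bare-bones sketch of the state machine (the `CircuitBreaker` class and its thresholds here are made up for illustration, not from any library):

```python
# Minimal circuit breaker sketch: closed -> open -> half-open -> closed.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None => circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: request rejected")
            self.opened_at = None  # half-open: let one trial request through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

Usage looks like `breaker.call(fetch_profile, user_id)` — once the wrapped call fails enough times in a row, further calls fail fast instead of piling onto the sick service.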
[2.] Retry
◾ When a request fails, the system automatically retries it a certain number of times before giving up.
◾ This can help overcome transient errors like network glitches or temporary unavailability.
◾ Improves system availability and can mask transient errors.
◾ Be mindful of retry storms (where excessive retries overload the system) and implement exponential backoff (increasing the time between retries).
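A minimal retry-with-exponential-backoff sketch (the `retry` helper and its parameters are illustrative, not a real library API):

```python
# Retry with exponential backoff + jitter to avoid synchronized retry storms.
import random
import time

def retry(fn, attempts=4, base_delay=0.1):
    """Retry fn on exception, doubling the wait each time, with jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            delay = base_delay * (2 ** attempt)        # 0.1s, 0.2s, 0.4s, ...
            time.sleep(delay * random.uniform(0.5, 1.5))  # jitter spreads out retries

# A flaky callable that succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient glitch")
    return "ok"

assert retry(flaky) == "ok"
assert calls["n"] == 3
```

The jitter is the part people skip — without it, a fleet of clients that failed together will retry together and knock the service over again.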
Are you preparing for a system design interview? 👇
Remember,
'Think Tradeoffs, Not Just Tech'
[1.] CAP Theorem
Consistency vs. Availability vs. Partition Tolerance
Partition tolerance isn't optional in a distributed system, so when a partition happens the real choice is between C & A: reject requests to stay consistent, or keep answering with possibly stale data.
[2.] Latency vs. Throughput
Fast response times vs. high data processing volume
[3.] ACID vs. BASE
Strict transaction guarantees vs. flexible consistency models
[4.] Monolithic vs. Microservices
Single, unified application vs. distributed, independent services
[5.] SQL vs. NoSQL
Structured data and complex queries vs. flexible schemas and scalability
[6.] Push vs. Pull
Data delivery initiated by server vs. requested by client
[7.] Caching Strategies
Tradeoffs between different cache eviction policies (LRU, LFU, etc.)
Balancing faster data access with potential staleness and increased complexity.
[8.] Statefulness vs. Statelessness
Maintaining session state vs. stateless interactions for scalability
[9.] Optimistic vs. Pessimistic Locking
Optimistic locking assumes no conflicts, favoring speed and concurrency. Pessimistic locking prevents conflicts by acquiring locks upfront, sacrificing performance for data integrity.
[10.] Data Locality vs. Data Distribution
Keeping data close for faster access vs. distributing for resilience and parallel processing
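The optimistic side of [9.] can be sketched as a compare-and-set on a version number — an in-memory toy standing in for a row with a `version` column (all names here are illustrative):

```python
# Optimistic locking sketch: write only if nobody changed the row since we read it.
import threading

class VersionedRecord:
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # guards only the CAS itself, not the whole workflow

    def read(self):
        return self.value, self.version

    def compare_and_set(self, expected_version, new_value):
        with self._lock:
            if self.version != expected_version:
                return False  # someone else wrote first: caller re-reads and retries
            self.value = new_value
            self.version += 1
            return True

record = VersionedRecord(100)
value, version = record.read()
record.version = 1  # simulate a concurrent writer bumping the version
assert not record.compare_and_set(version, value + 50)  # stale read rejected
value, version = record.read()
assert record.compare_and_set(version, value + 50)      # retry with a fresh read succeeds
```

No lock is held while the caller does its work — conflicts are detected at write time instead of prevented up front, which is exactly the speed-vs-safety tradeoff the pattern names.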