Akshay 🚀
Oct 1, 2023 • 10 tweets • 3 min read
Broadcasting in NumPy is widely used, yet poorly understood❗️

Today, I'll clearly explain how broadcasting works!

Same rules apply to PyTorch & TensorFlow!

Let's go! 🚀 Image
Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations.

The smaller array is “broadcast” across the larger array, so that the two have compatible shapes.

Check this out👇 Image
In the image below, scalar "b" is being stretched into an array with the same shape as "a".

But how do we generalise these things?

continue reading ... 📖 Image
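The scalar case can be sketched in a few lines. This is a minimal illustration, not the author's original image; the values of `a` and `b` are made up for the example.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # shape (3,)
b = 2.0                         # scalar

# NumPy behaves as if b were stretched to [2.0, 2.0, 2.0]
result = a * b

# Building the stretched array by hand gives the same answer
b_stretched = np.full_like(a, b)
same = np.array_equal(result, a * b_stretched)

print(result)  # [2. 4. 6.]
print(same)    # True
```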
💫 General Rules:

1) Broadcasting starts with the trailing (i.e. rightmost) dimensions and works its way left.

2) Two dimensions are compatible when they are equal or when one of them is 1.

Check out the examples 👇 Image
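The two rules can be written out as a small function. `broadcast_shape` is a hypothetical helper made up for this sketch (it is not a NumPy function); it is compared against `np.broadcast_shapes`, which needs NumPy 1.20 or newer.

```python
import numpy as np
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two rules to a pair of shapes."""
    result = []
    # Rule 1: walk both shapes from the trailing dimension leftwards.
    # A dimension missing from the shorter shape counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        # Rule 2: compatible when equal, or when one of them is 1.
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions: {dim_a} vs {dim_b}")
    return tuple(reversed(result))


print(broadcast_shape((4, 3), (3,)))             # (4, 3)
print(broadcast_shape((4, 1), (3,)))             # (4, 3)
print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))  # (8, 7, 6, 5)
```

The helper only computes the resulting shape; NumPy itself does the equivalent check before running the arithmetic.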
Whenever a one-dimensional array is involved in broadcasting, think of it as a row vector: NumPy pads its shape with 1s on the left!

Array → [1, 2, 3] ; shape → (3,)
Treated as → [[1, 2, 3]] ; shape → (1, 3)

Remember, broadcasting starts from the trailing dimension!

Check this out👇
Image
Image
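Here is a rough sketch of the row-vector behaviour, with illustrative values that are not taken from the original images.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)   # shape (4, 3)
b = np.array([1, 2, 3])           # shape (3,)

# b is treated as shape (1, 3) and reused for every row of a
total = a + b

# Reshaping b to (1, 3) by hand gives the same result
explicit = a + b.reshape(1, 3)

print(total.shape)                      # (4, 3)
print(total[0])                         # [1 3 5]
print(np.array_equal(total, explicit))  # True
```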
Let's check a scenario where broadcasting fails!

- a(4x3)
- b(4) will be treated as b(1x4)

Broadcasting starts from the trailing dimension, but (4x3) & (1x4) are not compatible there: 3 and 4 differ, and neither is 1!

Check this out👇 Image
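The failing case is easy to reproduce. The sketch below uses made-up values; the reshape at the end is one possible fix added for illustration, not something the thread itself proposes.

```python
import numpy as np

a = np.ones((4, 3))    # shape (4, 3)
b = np.arange(4)       # shape (4,), treated as (1, 4)

try:
    a + b              # trailing dimensions are 3 and 4: incompatible
    failed = False
except ValueError as err:
    failed = True
    print("broadcast failed:", err)

# Assumption for illustration: if b was meant to vary down the rows,
# turning it into a column of shape (4, 1) makes the shapes compatible.
fixed = a + b[:, np.newaxis]   # (4, 3) and (4, 1) -> (4, 3)
print(fixed.shape)             # (4, 3)
```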
Let's take one more example to make our understanding concrete!

Remember, a 1D array is treated as a row vector while broadcasting!

Check this out👇 Image
Here's how arrays of shape (4x1) & (3,) broadcast together!

Check this out👇 Image
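A small sketch of the (4x1) and (3,) case, with values chosen only to make the result easy to read:

```python
import numpy as np

col = np.array([[0], [10], [20], [30]])  # shape (4, 1)
row = np.array([1, 2, 3])                # shape (3,), treated as (1, 3)

# col is stretched across 3 columns, row across 4 rows
grid = col + row

print(grid.shape)  # (4, 3)
print(grid)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]
#  [31 32 33]]
```

Both arrays are stretched here: each one has a 1 exactly where the other has a larger dimension.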
Why use broadcasting❓

Broadcasting provides a means of vectorising array operations so that looping occurs in C instead of Python.

It does this without making needless copies of data and usually leads to efficient algorithm implementations.
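Both points can be checked with a rough sketch. The arrays are illustrative, and `np.broadcast_to` is used here only to make the "no copy" idea visible; it is not a claim about how NumPy implements arithmetic internally.

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(4, 3)
b = np.array([1.0, 2.0, 3.0])

# Explicit Python-level loop: every element goes through the interpreter
looped = np.empty_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        looped[i, j] = a[i, j] * b[j]

# Broadcasting: the same computation, with the loop running in C
vectorised = a * b
print(np.array_equal(looped, vectorised))  # True

# The stretched array is a read-only view onto b's memory, not a copy:
# the stride along the first axis is 0 bytes, so every row reuses b.
view = np.broadcast_to(b, a.shape)
print(view.strides)               # (0, 8)
print(np.shares_memory(view, b))  # True
```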
That's a wrap!

If you enjoyed reading this & are interested in

- Python 🐍
- ML/MLOps 🛠
- CV/NLP 🗣
- LLMs 🧠
- AI Engineering ⚙️

Find me → @akshay_pachaar ✔️
Every day, I share tutorials on the above topics!

Cheers! 🥂
