Awni Hannun
Jul 11, 2025
The new Kimi K2 1T model (4-bit quant) runs on two 512GB M3 Ultras with mlx-lm and mx.distributed.

1 trillion params, at a speed that's actually quite usable:
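As a back-of-the-envelope check on why the model is split across two machines (a sketch; real 4-bit quants also store per-group scales and need room for the KV cache and OS):

```python
# Rough memory math for a 1T-parameter model quantized to 4 bits per weight.
params = 1_000_000_000_000   # 1 trillion parameters
bits_per_param = 4
weight_gb = params * bits_per_param / 8 / 1e9  # bytes -> gigabytes (decimal)

print(f"weights alone: ~{weight_gb:.0f} GB")   # ~500 GB
```

Weights alone are roughly 500 GB, leaving essentially no headroom on a single 512GB machine for quantization metadata, activations, and the KV cache, hence sharding across two.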
Here's a sample command:

mlx.launch --hostfile hosts.json \
mlx-lm/mlx_lm/examples/pipeline_generate.py \
--model mlx-community/Kimi-K2-Instruct-4bit \
--prompt "Say hello world"
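The hostfile is a JSON list describing the machines to launch on. A hypothetical two-host example (the hostnames and IPs are placeholders, and the exact schema is in the docs linked below):

```json
[
    {"ssh": "m3-ultra-1.local", "ips": ["192.168.1.10"]},
    {"ssh": "m3-ultra-2.local", "ips": ["192.168.1.11"]}
]
```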

Documentation on setting up mx.distributed:
ml-explore.github.io/mlx/build/html…

More from @awnihannun

Dec 5, 2023
Just in time for the holidays, we are releasing some new software today from Apple machine learning research.

MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop!)

Code: github.com/ml-explore/mlx
Docs: ml-explore.github.io/mlx/build/html…
The video is a Llama v1 7B model implemented in MLX and running on an M2 Ultra.

More here:

* Train a Transformer LM or fine-tune with LoRA
* Text generation with Mistral
* Image generation with Stable Diffusion
* Speech recognition with Whisper

Examples: github.com/ml-explore/mlx…
MLX Data is a framework agnostic, efficient, and flexible package for data loading.

Code: github.com/ml-explore/mlx…
Docs: ml-explore.github.io/mlx-data/build…
Jul 1, 2022
I read a bit about grokking recently. Here are some takeaways:

"Grokking" is a curious neural net behavior observed ~1 year ago (arxiv.org/abs/2201.02177).

Continue optimizing a model long after perfect training accuracy and it suddenly generalizes.

Figure:
What's especially surprising is that generalization happens SO LONG after perfect accuracy on train.

The sudden generalization is interesting, but we've seen this type of rapid concept learning in NNs before.
Some rough explanations of Grokking:

After learning the training set, the model randomly walks between low-loss solutions (beren.io/2022-01-11-Gro…)

...and stays at generalizing solutions because they have slightly better training loss (alignmentforum.org/posts/zvWqPmQa…)
Jun 4, 2022
A short thread on forward and reverse mode autograd:

A great way to internalize the complexity difference between forward and reverse mode automatic differentiation is through the lens of Jacobian-vector products.
First: the Jacobian of a function is the matrix of derivatives with inputs on rows and outputs on columns.

The (i, j) entry is the derivative of the j-th output with respect to the i-th input.
Reverse mode lets you compute a Jacobian-vector product for a given vector in a single pass.

Forward mode lets you compute a (row) vector-Jacobian product for a given vector in a single pass.
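A tiny numeric sketch of this convention (pure Python, no autograd library, with a hand-written Jacobian for illustration): for f(x, y) = (x*y, x + y), build the Jacobian with inputs on rows and outputs on columns, then form both products.

```python
# f maps 2 inputs -> 2 outputs: f(x, y) = (x*y, x + y)
# Jacobian in the convention above: J[i][j] = d(output_j) / d(input_i)
def jacobian(x, y):
    return [
        [y, 1.0],   # row 0: derivatives of (x*y, x+y) w.r.t. x
        [x, 1.0],   # row 1: derivatives of (x*y, x+y) w.r.t. y
    ]

def jvp(J, v):
    """Jacobian-vector product J @ v: what one reverse pass computes
    (v weights the outputs; the result is a gradient over the inputs)."""
    return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]

def vjp(v, J):
    """Row-vector-Jacobian product v @ J: what one forward pass computes
    (v is a tangent over the inputs; the result is a tangent over the outputs)."""
    return [sum(v[i] * J[i][j] for i in range(len(v))) for j in range(len(J[0]))]

J = jacobian(3.0, 2.0)
print(jvp(J, [1.0, 0.0]))  # gradient of the first output: [2.0, 3.0]
print(vjp([1.0, 0.0], J))  # sensitivity of both outputs to x: [2.0, 1.0]
```

So a full gradient (one row of sensitivities per output) costs one reverse pass per output, while a full tangent costs one forward pass per input, which is why reverse mode wins for many-inputs-few-outputs functions like training losses.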