Diana
Sep 9, 8 tweets
Deepsilicon runs neural nets with 5x less RAM and ~20x faster. They are building SW and custom silicon for it.

What’s interesting is that they have proved it with SW, and you can even try it.

On why we funded them 1/7
2/7 They found that representing transformer models as ternary values (-1, 0, 1) eliminates the need for computationally expensive floating-point math.
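To make the ternary idea concrete, here is a minimal sketch of one way float weights could be mapped to (-1, 0, 1). It assumes an "absmean" scheme (divide by the mean absolute weight, round, clip), which is one commonly described approach for ternary models. The thread does not say which scheme Deepsilicon uses, and the function name `ternarize` and the sample weights are made up for illustration.

```python
def ternarize(weights, eps=1e-8):
    """Map float weights to {-1, 0, 1} plus a single float scale.

    Assumed scheme (not confirmed by the thread): scale by the mean
    absolute value, round to the nearest integer, clip to [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


# Hypothetical weights, for illustration only.
weights = [0.9, -0.05, -1.4, 0.3, 0.02, -0.6]
ternary, scale = ternarize(weights)
print(ternary)  # every entry is -1, 0, or 1
print(scale)    # one float kept per tensor to rescale the outputs
```

Each ternary weight needs under 2 bits of storage instead of 16 or 32, which is where the memory savings would come from under this assumption.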
3/7 So there is no need for GPUs, which are good at floating-point matrix operations but are energy- and memory-hungry.
4/7 They actually got SOTA models to run, overcoming the issues from the MSFT BitNet paper that inspired this. .microsoft.com/en-us/research…
5/7 Now you could run SOTA models, which typically need HPC GPUs like H100s for inference, on consumer or embedded GPUs like the NVIDIA Jetson.

This makes it possible for the first time to run SOTA models on embedded HW, such as in robotics, where inference needs a real-time response.
6/7 What NVIDIA is overlooking is the opportunity in specialized HW for inference, since they've been focused on the high end of the HPC cluster world.
7/7 You can try the SW version here

When they get the HW ready, the speedups and energy savings will be even greater.

More details here too github.com/deepsilicon/Si…
news.ycombinator.com/item?id=414901…
Intuitively this works because neurons in DNNs use S-shaped activation functions that effectively have 3 states. With only (-1, 0, 1) as weights, the dot product between matrices needs no multiplications: it becomes just additions and subtractions.
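The arithmetic point can be sketched in a few lines: with weights restricted to (-1, 0, 1), each dot product reduces to adding or subtracting activations, or skipping them. This is an illustrative sketch only, not Deepsilicon's kernel; the names `ternary_dot` and `ternary_matvec` and the sample values are hypothetical.

```python
def ternary_dot(ternary_weights, activations):
    """Dot product with weights in {-1, 0, 1}: no multiplications needed."""
    acc = 0.0
    for w, x in zip(ternary_weights, activations):
        if w == 1:
            acc += x      # +1 weight: add the activation
        elif w == -1:
            acc -= x      # -1 weight: subtract the activation
        # 0 weight: skip, it contributes nothing
    return acc


def ternary_matvec(rows, activations, scale=1.0):
    """Matrix-vector product; one multiply per output to apply the scale."""
    return [scale * ternary_dot(row, activations) for row in rows]


# Hypothetical 2x3 ternary weight matrix and activation vector.
rows = [[1, 0, -1],
        [0, 1, 1]]
x = [2.0, 3.0, 5.0]
print(ternary_matvec(rows, x))  # [-3.0, 8.0]
```

A real implementation would pack the weights into a few bits each and vectorize the adds, but the absence of per-weight multiplications is the core of the claim.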
