Last night I tried to figure out why $TSLA designed Dojo instead of using #Cerebras. I almost lost sleep over it and I don't have an answer yet. Any input welcome. Thread below for where I stand.👇
1 - A fan-out wafer of Dojo and a wafer-scale chip of Cerebras have a similar transistor count (~2.5 trillion). Dojo claims 9 PFLOPS BF16, Cerebras 2.5 PFLOPS half precision, which is about equivalent, I think. (A 3x+ ratio between the two is fair: it is the same as the FP32/FP64 ratio on Ampere.)
2 - Dojo plans to scale out to ~120 wafers to get to exascale. Cerebras announced 192 yesterday at Hot Chips. Both configurations are similar and deliver exascale compute. Both architectures should be able to go beyond that if needed as well.
3 - I still haven't figured out how to compare bandwidth within a wafer and between wafers. I need help. On intra-wafer latency, I suspect Cerebras is superior, but I am not sure, and I am not sure it matters as long as you can move data from one node to the next in one cycle. Help needed.
4 - Different processor architecture: Cerebras has 850k (!) super basic cores per wafer, Dojo has 9,000! With similar transistor counts, that means Dojo cores are roughly 100x larger. I need help here as well. What difference does that make?
5 - One last consideration: using these monster systems effectively is, I think, the key. Maybe the only people able to efficiently run training on a Dojo or a Cerebras cluster are the people who designed them. Maybe the answer to my question is just that?
6 - Both Cerebras and Dojo are amazing machines. They can eventually train models with complexity comparable to the human brain... My next sleepless night will be comparing Dojo and Cerebras to the TPU v4 cluster. It is an exascale system as well.
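
The back-of-the-envelope comparisons in points 1, 2 and 4 can be sketched in a few lines of Python. This is only an illustration: every figure is taken as quoted in the thread and not independently verified, the variable and function names are made up for this sketch, and peak PFLOPS at different precisions are not strictly comparable. On these peak figures alone the 192-wafer Cerebras configuration lands closer to half an exaFLOPS, so its exascale label presumably rests on a different measure; that is an assumption, not something the thread states.

```python
# Figures as quoted in the thread (assumptions, not verified specs).
DOJO_PFLOPS_PER_WAFER = 9.0       # BF16, per Dojo fan-out wafer
CEREBRAS_PFLOPS_PER_WAFER = 2.5   # half precision, per wafer-scale chip
DOJO_WAFERS = 120                 # planned scale-out
CEREBRAS_WAFERS = 192             # announced scale-out
DOJO_CORES_PER_WAFER = 9_000
CEREBRAS_CORES_PER_WAFER = 850_000


def cluster_exaflops(pflops_per_wafer: float, wafers: int) -> float:
    """Peak cluster compute in exaFLOPS (1 EFLOPS = 1000 PFLOPS)."""
    return pflops_per_wafer * wafers / 1000.0


# Point 1: per-wafer peak compute ratio (Dojo / Cerebras).
per_wafer_ratio = DOJO_PFLOPS_PER_WAFER / CEREBRAS_PFLOPS_PER_WAFER

# Point 2: peak compute of the full clusters.
dojo_cluster_eflops = cluster_exaflops(DOJO_PFLOPS_PER_WAFER, DOJO_WAFERS)
cerebras_cluster_eflops = cluster_exaflops(
    CEREBRAS_PFLOPS_PER_WAFER, CEREBRAS_WAFERS
)

# Point 4: how many Cerebras cores there are per Dojo core. With similar
# transistor counts per wafer, this is also roughly the size ratio of a
# Dojo core to a Cerebras core.
core_count_ratio = CEREBRAS_CORES_PER_WAFER / DOJO_CORES_PER_WAFER

print(f"Per-wafer peak ratio (Dojo/Cerebras): {per_wafer_ratio:.1f}x")
print(f"Dojo cluster peak:     {dojo_cluster_eflops:.2f} EFLOPS")
print(f"Cerebras cluster peak: {cerebras_cluster_eflops:.2f} EFLOPS")
print(f"Cerebras cores per Dojo core: {core_count_ratio:.0f}")
```

Swapping in official numbers for any of the constants is enough to redo the comparison.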

