I finally got to enjoy $TSLA #TESLA #FSDBeta. My 2 impressions:
#1 - Awesome! The quality of the experience is exceptional. I was stunned, and I was expecting much less. My car drove me to Mass today, and back. Alleluia! The quality of perception, the handling of complex and uncertain situations... Wow. The best A.I.-based product I know of in the world.
#2 - One may think the gap to driverless (L4-L5) is tiny. That is kind of true: it is tiny, but of abysmal depth. I continue to firmly believe the technology will never scale out effortlessly over a fleet for driverless services. Let me explain three reasons why:
-2.a: Removing supervision. I will soon be able to do most trips without disengaging, but >1,000x better is required for a whole driverless fleet. Solution: a mixed human-driven / autonomous fleet, in which autonomous ramps as a % of the total fleet over the years.
-2.b: Mission delivery. My car drives by itself but will struggle to deal with riders and get the job done: where to stop and pick up, driveways, problem-solving in tricky situations. Many trips will generate user frustration, or even abort. Solution: same as above!
-2.c: User acceptance. The most important element of the FSD experience is learning... not the model learning on Dojo, but me learning to relax while the car drives me around! Driverless-only fleets will not gain share easily. Solution: same as above!
Conclusion #1: FSD is massive. No competition on the horizon. It makes Tesla's differentiation incredibly more sustainable for the very long run. The underlying A.I. will be monetized in multiple other ways, but simply "switching on a robotaxi fleet overnight" doesn't work.
Conclusion #2: In order to capture the Robotaxi opportunity, Tesla needs to be smart about deployment trajectory, go to market, and partner with operators (e.g. Uber, UPS, etc.).
A short thread on accelerated computing: how $TSLA's #Dojo, $GOOG's #TPU, #Cerebras's #WSE-2, #Graphcore's #IPU, and others compare to $NVDA's #GPU. This is an extract of work we published a few weeks ago. 1/6
It is difficult to compare chips. Designing a chip is all about managing trade-offs across multiple dimensions: a chip has a limited budget of resources, and the architect allocates it across those dimensions. 2/6
With this framework in mind, one sees that the alternatives to the #GPU have fundamentally different architectures, favoring the flow of data across the chip over the flow of data between memory and the chip. 3/6
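That compute-vs-data-movement trade-off can be sketched with a toy roofline model. All peak-compute, bandwidth, and arithmetic-intensity numbers below are invented for illustration; they are not vendor specs:

```python
# Toy roofline model: attainable throughput is capped by whichever
# resource saturates first - raw compute, or data-delivery bandwidth.
def attainable_flops(peak_flops, mem_bw_bytes_s, arithmetic_intensity):
    """arithmetic_intensity = FLOPs performed per byte of data moved."""
    return min(peak_flops, arithmetic_intensity * mem_bw_bytes_s)

# Hypothetical GPU-like chip: very high peak compute, off-chip memory bandwidth.
gpu = attainable_flops(peak_flops=300e12, mem_bw_bytes_s=2e12,
                       arithmetic_intensity=10)
# Hypothetical dataflow chip: lower peak, but data streams across the die
# at much higher aggregate bandwidth.
dataflow = attainable_flops(peak_flops=100e12, mem_bw_bytes_s=20e12,
                            arithmetic_intensity=10)

print(f"GPU-like: {gpu / 1e12:.0f} TFLOP/s")      # bandwidth-bound: 20 TFLOP/s
print(f"Dataflow: {dataflow / 1e12:.0f} TFLOP/s") # compute-bound: 100 TFLOP/s
```

At low arithmetic intensity (memory-bound workloads) the dataflow-style design wins despite its lower peak; at high intensity, raw compute dominates and the ranking flips.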
Quick thread on $TSLA Dojo vs. $FB Meta's RSC. Fun side note, as I write a note comparing architectures: Microsoft doesn't even recognize the word "exascale" in its dictionary. They are running behind big time!
More seriously, in a single tweet: RSC will be 5x larger than Dojo in terms of computing power, but an order of magnitude behind on through-bandwidth per transistor. Details below.
This metric (which I invented tonight) is the key. Exaflops are not all equal: you need to turn them into useful exaflops by feeding them the right data at the right time. For that you need bandwidth and low latency between compute units.
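One possible sketch of such a "bandwidth per transistor" figure of merit. The thread publishes no actual figures, so both clusters and every number below are placeholders chosen only to show an order-of-magnitude gap:

```python
# Ad-hoc figure of merit: how much interconnect bandwidth a design
# provides per transistor spent. Purely illustrative numbers.
def bw_per_transistor(interconnect_bw_bytes_s, transistors):
    return interconnect_bw_bytes_s / transistors

# Hypothetical cluster A: more raw FLOPs, conventional interconnect.
a = bw_per_transistor(interconnect_bw_bytes_s=1e12, transistors=4e15)
# Hypothetical cluster B: fewer FLOPs, dense on-wafer links between compute units.
b = bw_per_transistor(interconnect_bw_bytes_s=16e12, transistors=2.5e15)

print(f"A: {a:.2e} bytes/s per transistor")
print(f"B: {b:.2e} bytes/s per transistor")
print(f"ratio: {b / a:.0f}x")  # an order-of-magnitude gap
```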
1) At $42k per car, probably no discount, and it makes sense: why would Tesla sell at a discount when they won't be able to fully meet demand for years?
2) Hertz is the most cost-conscious car buyer in the world. Hertz picking Tesla means only one thing: no other manufacturer comes close on total cost of ownership.
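The TCO argument can be made concrete with a back-of-envelope model. Apart from the $42k purchase price quoted above, every number here is hypothetical, chosen only to illustrate the mechanics; none are Hertz's or any manufacturer's actual figures:

```python
# Back-of-envelope total cost of ownership over a holding period.
def tco(purchase, annual_fuel, annual_maintenance, years, resale):
    """Net cost = purchase + running costs over the period - resale value."""
    return purchase + years * (annual_fuel + annual_maintenance) - resale

# Hypothetical EV: higher sticker price, low running costs, strong resale.
ev = tco(purchase=42_000, annual_fuel=800, annual_maintenance=500,
         years=5, resale=25_000)
# Hypothetical ICE car: cheaper to buy, pricier to run, weaker resale.
ice = tco(purchase=30_000, annual_fuel=2_500, annual_maintenance=1_200,
          years=5, resale=12_000)

print(f"EV:  ${ev:,}")   # $23,500
print(f"ICE: ${ice:,}")  # $36,500
```

With these illustrative inputs, the EV's lower running costs and stronger resale value more than offset its higher purchase price over the holding period.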
Some have asked me to be more specific. Quick thread below. Not my area of expertise, so probably a lot of nonsense & ignorance, but my views at this point in time. In increasing order of importance and specificity; go to the bottom for the full picture.
1) Taxing corporations is taxing everybody: competitive-dynamics analysis shows a tax hike gets passed into end-product prices. Those spending more of their total income, i.e. not the richest, are hurt the most. This is the government taking more money, not at all redistributing it towards the poorest.
2) It does not tax the super-rich friends making tax-shielded money, only the upper-middle folks making really good money (>$400k) out of a real job or a real business they own. It is a) ethically wrong, demagogic, and divisive; b) counter-productive: it fosters tax-shield strategies.
Last night I tried to figure out why $TSLA designed Dojo instead of using #Cerebras. I almost lost sleep over it and I don't have an answer yet. Any input welcome. Thread below for where I stand.
1 - A fan-out wafer of Dojo and a wafer-scale chip from Cerebras have similar transistor counts (~2.5 trillion). Dojo claims 9 PFLOPS BF16, Cerebras 2.5 PFLOPS half precision, which is about equivalent, I think. (A 3x+ ratio between the two is fair; same as the FP32/64 ratio on Ampere.)
2 - Dojo plans to scale out to ~120 wafers to reach exascale. Cerebras announced 192 yesterday at Hot Chips. Both configurations are similar and deliver exascale-class compute, and both architectures should be able to go beyond that if needed.
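The wafer counts and per-wafer figures quoted in this thread can be sanity-checked with simple arithmetic (raw, un-normalized totals, using only the numbers from the tweets above):

```python
# Back-of-envelope cluster totals from the figures quoted in the thread.
dojo_wafers, dojo_pflops = 120, 9            # BF16 PFLOPS per fan-out wafer
cerebras_wafers, cerebras_pflops = 192, 2.5  # half-precision PFLOPS per WSE-2

dojo_total = dojo_wafers * dojo_pflops / 1000            # in EFLOPS
cerebras_total = cerebras_wafers * cerebras_pflops / 1000

print(f"Dojo:     {dojo_total:.2f} EFLOPS")      # 1.08 EFLOPS
print(f"Cerebras: {cerebras_total:.2f} EFLOPS")  # 0.48 EFLOPS
```

The raw Dojo total looks roughly 2x higher, but after the ~3x BF16-to-"useful"-FLOPs normalization suggested in point 1, the two configurations land in the same ballpark, consistent with the claim that they are similar in scale.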
I was pretty excited to watch $INTC's Architecture Day yesterday - very cool upcoming parts. Intel is definitely coming back... Then I watched $TSLA's AI Day... OMG... How will others compete with them on any autonomy use case? Thread
1 - Tesla has created the most advanced SW architecture for perception, with recurrent neural nets refining their environment in real time and selectively, with spatial and temporal queuing - exactly like we humans do. @dileeplearning @karpathy - you guys should talk!
2 - The above requires monster infrastructure to gather relevant video streams and label them for training. Tesla has a million cars on the road and INTEGRATED semi-auto labeling with >1,000 labelers. This is a key success factor; the above is impossible without it.