Chris · 19 Aug · 22 tweets · 14 min read
I will be live tweeting @Tesla AI Day here. Personally looking at the data flow, latency, low power, and performance of #Dojo, how it compares to @Google #TPU version 4 and other HPC systems, and how it directly relates to #AutonomousVehicles #FSDBeta
Well as usual, Tesla is fashionably late and we are almost 18 mins past the start time. And the beat goes on...
#AIDay starts with a demo of FSD with the driver gripping the steering wheel with his left hand, in what looks to be the streets of California. #DojovsTPU #Dojo @Tesla
@elonmusk says @Tesla "is the leader in real world AI" due to FSD Beta, although there are other systems, like @Mobileye SuperVision and Huawei's autopilot, that also do L2 anywhere in the country.
.@karpathy is now presenting the case for @Tesla's camera-only system and comparing it to the human eye. The obvious problem is that the cameras Tesla uses are very low resolution (1.2 MP) compared to the human eye, and there is a disparity in dynamic range. #AIDay #TeslaVision #Dojo
Here is a comparison of the industry-standard 8 MP class of camera used by most AV systems versus the type of camera that @Tesla uses.
.@karpathy is now detailing the difficulty of 2D detection from each individual camera vs 3D detection from multi-cam. This is a known problem with camera-based systems: converting 2D detections into a 3D world view.
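To see why, here's a minimal numpy sketch with made-up pinhole intrinsics (K below is hypothetical, not Tesla's cameras): a single camera only constrains a 2D detection to a ray, so the same pixel is consistent with very different 3D positions depending on the assumed depth.

```python
import numpy as np

# Hypothetical pinhole intrinsics, purely for illustration.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 480.0],
              [   0.0,    0.0,   1.0]])

def backproject(pixel_uv, depth, K):
    """Lift a 2D pixel into 3D camera coordinates for an *assumed* depth.
    Without depth, one camera only pins the point down to a ray."""
    u, v = pixel_uv
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # unit-depth ray through the pixel
    return ray * depth

center = (900.0, 500.0)                  # 2D detection centre of some object
print(backproject(center, 10.0, K))      # a plausible 3D position at 10 m...
print(backproject(center, 40.0, K))      # ...and a very different one at 40 m
```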
.@karpathy showcases a variety of prediction networks, which are all industry standard, for example pseudo-lidar (Vidar) and a Bird's Eye View network. It's good that @Tesla is finally making progress with this, but they are simply following the direction of the industry at large.
For example, here are @Mobileye's and @Waymo's Vidar (pseudo-lidar) NN systems.
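For intuition, here's a minimal sketch of the pseudo-lidar idea: take a per-pixel depth map (predicted by a network from camera images), lift it into a 3D point cloud with the camera intrinsics, and hand that to lidar-style detectors. The intrinsics and the constant depth map below are stand-ins, not any company's real pipeline.

```python
import numpy as np

def depth_to_pointcloud(depth, K):
    """Convert a per-pixel depth map into an (H*W, 3) point cloud in the
    camera frame. This is the core transform behind pseudo-lidar."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pixels @ np.linalg.inv(K).T          # unit-depth ray for every pixel
    return rays * depth.reshape(-1, 1)          # scale each ray by its depth

K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
fake_depth = np.full((480, 640), 12.0)          # stand-in for a network-predicted depth map
cloud = depth_to_pointcloud(fake_depth, K)
print(cloud.shape)                               # (307200, 3): x, y, z per pixel
```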
What's missing in this #AIDay is prediction. @elonmusk recently detailed how they just began working on their prediction network. @Tesla still relies on conventional control algorithms; think of search algorithms like R*, or convex optimization.
So from my understanding they run the FSD Planner on other cars. So it's not actually predicting what a typical car would do in a situation but what the FSD Planner would do. That's... what...
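On the "conventional planner" point: this is not Tesla's planner, just a toy A*-style grid search to show the family of classical graph-search methods the thread is alluding to (a 4-connected occupancy grid and a Manhattan heuristic are assumed here).

```python
import heapq

def astar(grid, start, goal):
    """Toy A*-style grid search of the kind conventional planners build on.
    grid[r][c] == 1 marks an obstacle; returns a list of cells or None."""
    def h(cell):                                   # admissible Manhattan heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
    rows, cols = len(grid), len(grid[0])
    open_set = [(h(start), 0, start, None)]        # (f, g, cell, parent)
    came_from, best_g = {}, {start: 0}
    while open_set:
        _, g, cell, parent = heapq.heappop(open_set)
        if cell in came_from:                      # already expanded with a better cost
            continue
        came_from[cell] = parent
        if cell == goal:                           # walk the parent links back to start
            path = [cell]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        r, c = cell
        for nbr in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nbr
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nbr, float("inf")):
                    best_g[nbr] = ng
                    heapq.heappush(open_set, (ng + h(nbr), ng, nbr, cell))
    return None

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))                 # a path around the obstacles
```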
Other AV companies run complex multi-modal prediction networks that accurately predict the future behavior of moving agents in any given situation. @Waymo recently published a paper on TNT (Target-driveN Trajectory Prediction).
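To make "target-driven" concrete, here's a toy sketch of the idea: enumerate candidate end points, build one trajectory per target, and score the targets. The candidate targets, the straight-line decoder, and the heuristic score below are all invented for illustration; the real system learns all of these from data.

```python
import numpy as np

def toy_target_driven_prediction(history, candidate_targets, horizon=10):
    """Toy version of target-driven prediction: one trajectory per candidate
    end point, plus a softmax probability over the candidates."""
    pos = history[-1]
    velocity = history[-1] - history[-2]
    trajectories, scores = [], []
    for target in candidate_targets:
        steps = np.linspace(0.0, 1.0, horizon + 1)[1:, None]
        trajectories.append(pos + steps * (target - pos))     # straight line to the target
        scores.append(float(np.dot(target - pos, velocity)))  # "ahead" targets score higher
    probs = np.exp(scores) / np.sum(np.exp(scores))           # softmax over candidate targets
    return trajectories, probs

history = np.array([[0.0, 0.0], [1.0, 0.0]])                  # agent moving along +x
targets = np.array([[10.0, 0.0], [8.0, 4.0], [0.0, 8.0]])     # e.g. points on nearby lanes
trajs, probs = toy_target_driven_prediction(history, targets)
print(probs)                                                   # straight-ahead target dominates
```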
.@karpathy touches on 4D labeling through time. @kvogt talked about it in his presentation at 16 mins. This is also industry standard.
Now onto auto-labeling, which is off-board labeling: because you have the future frames, you can estimate accurate bounding boxes and tracking. This is also industry standard. Here is Drago at @Waymo giving a SOTA presentation on it.
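A toy illustration of why having the future frames matters (not Tesla's or Waymo's actual auto-labeling pipeline): an offline labeler can centre its smoothing window on each frame, while an online tracker can only average over the past and therefore lags the object.

```python
import numpy as np

def causal_estimate(noisy, window=5):
    """Online tracker: at frame t it can only average over past frames."""
    return np.array([noisy[max(0, t - window + 1): t + 1].mean() for t in range(len(noisy))])

def offline_estimate(noisy, window=5):
    """Offline auto-labeler: it already has the *future* frames too, so it
    can centre the window on each frame and recover a cleaner track."""
    half = window // 2
    return np.array([noisy[max(0, t - half): t + half + 1].mean() for t in range(len(noisy))])

rng = np.random.default_rng(0)
true_track = np.linspace(0.0, 50.0, 100)                    # object moving at constant speed
noisy = true_track + rng.normal(0.0, 1.0, size=100)         # per-frame detector jitter
print(np.abs(causal_estimate(noisy) - true_track).mean())   # lags behind the object
print(np.abs(offline_estimate(noisy) - true_track).mean())  # centred window: lower error
```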
The difference between @Tesla's simulation and @Waymo's is that Tesla's is based on a video game engine, which suffers from a domain gap. It looks like Tesla knows that and is trying to work on sensor simulation, which Waymo has already developed as Simulation City.
blog.waymo.com/2021/06/Simula…
Here is @Waymo's Simulation City: a realistic NN-generated 3D surfel map, not made with a video game engine (UE), which they can relight with realistic 24-hour time of day, weather, seasons, various sensor-simulation NNs, and smart imitation- and RL-learned agents that Drago has talked about.
.@Tesla is now talking about the #Dojo chip, which is a superscalar processor.
In comparison, a single @Google TPU v4 pod (4k chips) delivers 1.1 EFLOPS and has the best performance in MLPerf benchmarks, which is something you don't see mentioned. What matters is the actual benchmark performance, and we see nothing provided by @Tesla. cloud.google.com/blog/products/…
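Quick arithmetic behind that pod-level figure, assuming the commonly cited ~275 TFLOPS (bf16) peak per TPU v4 chip:

```python
chips_per_pod = 4096                 # "4k chips" per v4 pod
tflops_per_chip = 275                # assumed bf16 peak per chip
pod_exaflops = chips_per_pod * tflops_per_chip / 1e6
print(pod_exaflops)                  # ~1.13, i.e. the ~1.1 EFLOPS quoted above
```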
It's worth noting that AI accelerators like the TPU (an ASIC) are essentially arrays of multiply-accumulate units (MACs), which is all the calculation a neural network needs to do: matrix multiplication and addition.
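To make that concrete, here's a dense layer written as nothing but multiply-accumulates (a toy numpy sketch of the math, not how any accelerator actually schedules the work):

```python
import numpy as np

def dense_layer_via_macs(W, x, b):
    """One fully connected layer computed purely with multiply-accumulates:
    y[i] = b[i] + sum_j W[i, j] * x[j]."""
    y = b.copy()
    for i in range(W.shape[0]):
        acc = 0.0
        for j in range(W.shape[1]):
            acc += W[i, j] * x[j]   # one MAC per weight
        y[i] += acc
    return y

W = np.random.randn(4, 3)
x = np.random.randn(3)
b = np.random.randn(4)
assert np.allclose(dense_layer_via_macs(W, x, b), W @ x + b)
```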
CPUs are bad at this because of the memory bottleneck: they have to access registers/memory after each calculation. GPUs have hundreds of processing units and are good for brute-force parallelizable calculations, but they still run into the memory bottleneck problem.
In contrast, the TPU uses a high-speed interconnect and the hardware syncs in an instant. First the TPU loads the data into the MAC units, and as each multiplication is executed the result is passed to the next multiplier while also summing; nothing is saved to memory/registers in between.
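Here's a toy, cycle-by-cycle simulation of that dataflow: an output-stationary systolic array in which operands march between neighbouring cells, every cell does one multiply-accumulate per cycle, and only the finished sums ever come back out. It's a simplified illustration of the concept, not the TPU's exact microarchitecture.

```python
import numpy as np

def systolic_matmul(A, B):
    """Simulate C = A @ B on an output-stationary systolic array:
    A-values stream in from the left edge, B-values from the top edge,
    and each cell accumulates its own C[i, j] as the operands pass by."""
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n))
    a_reg = np.zeros((m, n))          # operand currently held in each cell (flows right)
    b_reg = np.zeros((m, n))          # operand currently held in each cell (flows down)
    for t in range(m + n + k - 2):    # enough cycles to drain the whole pipeline
        new_a = np.zeros((m, n))
        new_b = np.zeros((m, n))
        new_a[:, 1:] = a_reg[:, :-1]  # operands move one cell to the right...
        new_b[1:, :] = b_reg[:-1, :]  # ...and one cell down
        for i in range(m):            # feed skewed A values at the left edge
            s = t - i
            new_a[i, 0] = A[i, s] if 0 <= s < k else 0.0
        for j in range(n):            # feed skewed B values at the top edge
            s = t - j
            new_b[0, j] = B[s, j] if 0 <= s < k else 0.0
        C += new_a * new_b            # every cell does one multiply-accumulate
        a_reg, b_reg = new_a, new_b
    return C

A = np.random.randn(3, 4)
B = np.random.randn(4, 2)
assert np.allclose(systolic_matmul(A, B), A @ B)
```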

• • •
More from @Christiano92

23 Aug
1/ Setting the record straight: what you see here is Sim 1.0, while the industry is moving towards Sim 2.0. Examples are @Waymo Simulation City and @RaquelUrtasun's GeoSim. Ex: the below reconstruction was manually recreated in Unreal Engine 4 and would take hours/days. #TeslaAIDay
2/ Even environments recreated with procedural tools (1.0 tech) are limited by assets & textures hand-modeled by artists, introducing huge domain gaps. It's not scalable & is why Sim 1.0 is being deprecated (taking a backseat) at some AV companies while Tesla is just introducing it.
3/ .@theinformation said in Q4 2018 that Tesla's simulations were "in their infancy". Sim 1.0 is ~2015-era tech that gives perfect labeled ground truth, procedural scenario generation & reconstruction, etc. @aurora_inno goes into the details of Sim 1.0 here.
aurora.tech/blog/scaling-s…
20 Aug
So instead of a working HPC, what we saw in #Dojo is a single node that is running a rudimentary NN on a single test bench. Designing a chip is easy. What's hard is building the compiler and runtime scheduler in an HPC environment at scale, none of which @Tesla is anywhere close to.
When asked, it's brushed off as "No, but we will". This ExaPOD doesn't exist; it's a photoshopped image and won't exist for many years. That's why there are no MLPerf benchmarks. There is only one exapod in existence & that's the @Google TPU version 4 pod, in use today by @Waymo. #Dojo
By the time this is working and ready in the presented form and specs, there will already be a TPU v5. @Tesla will always be 3-4 years behind on this front. But this won't stop the fanboys, because ignorance is bliss.

7 May
With @Tesla's CJ Moore, Director of Autopilot, saying "Elon's tweet does not match engineering reality", here are Elon's tweets/statements about #FSD $TSLA:

Dec 2015: "We're going to end up with complete autonomy, and I think we will have complete autonomy in approximately two years."
January 2016: "In ~2 years, summon should work anywhere connected by land & not blocked by borders, eg you're in LA and the car is in NY"

June 2016: "I really consider autonomous driving a solved problem, I think we are less than two years away from complete autonomy, safer than humans, but regulations should take at least another year"

theguardian.com/technology/201…
