jimmah
Jan 11 8 tweets 3 min read
@1LoafOfMeat Graph is their public data with averaging applied. I can't rule out them lying. I have put significant effort into trying to find a better way that they could release the data to inform users without making the confusion worse - and I couldn't come up with one. /1
@1LoafOfMeat After pulling the NHTSA crash database and breaking it down many different ways I was unable to find a simple, clear, and accurate way to convey crash statistics. Every approach I came up with had the potential to be gamed by cherry picking thresholds and definitions. /2
@1LoafOfMeat There are dozens of road types and many categories of accidents with a panoply of causes and confounding factors. And there aren't all that many accidents so you can't get statistical significance for any narrow category. /3
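To make the small-sample point concrete, here is a rough sketch and not anything taken from the NHTSA data. It assumes crash counts behave roughly like Poisson counts and uses the normal approximation, which is itself crude for very small counts. The function name and the example counts are hypothetical.

```python
import math

def relative_half_width(count, z=1.96):
    """Approximate 95% relative uncertainty of a Poisson count.

    Assumption: normal approximation, so the interval is count +/- z*sqrt(count)
    and the relative half-width is z / sqrt(count).
    """
    if count <= 0:
        raise ValueError("count must be positive")
    return z / math.sqrt(count)

# Hypothetical counts: a broad category versus progressively narrower slices.
for n in (10_000, 400, 25):
    print(f"{n:>6} incidents: roughly +/-{relative_half_width(n):.0%}")
```

Under these assumptions the uncertainty grows as the category narrows, so two narrow slices can differ by tens of percent without the difference meaning anything.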
@1LoafOfMeat On top of this (for NHTSA data) reporting is done locally and varies by jurisdiction - and quality can be poor so there's subjectivity in deciding what to throw out. A lot of it is clearly erroneous. The NHTSA's own summaries are quite nonspecific for this reason. /4
@1LoafOfMeat Just trying to find incident rates for highway versus city driving is maddening and depends on subjective decisions. Any number you present can validly be argued away. The only simple and true statements are about broad categories with tight definitions. /5
@1LoafOfMeat Which is one reason fatalities feature so prominently in statistics. Death is well defined but injuries less so, accident even less so. Incident, crash, damage - all have subjective thresholds of many types. /6
@1LoafOfMeat And definitions are critical because adding or removing just a few 'incidents' to the list can change the 'statistics' a lot. NHTSA has hundreds of pages of categorization rules and they still aren't enough. /7
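A small sketch of that sensitivity, using made-up numbers rather than real NHTSA figures. The function names, the incident count, and the mileage are all hypothetical and only illustrate how a definitional change moves a rate built on few events.

```python
def rate_per_million_miles(incidents, miles):
    """Incidents per million miles driven."""
    return incidents / (miles / 1_000_000)

def relative_change(base_incidents, reclassified, miles):
    """Fractional change in the rate when `reclassified` incidents are
    added (positive) or removed (negative) by a change of definition."""
    base = rate_per_million_miles(base_incidents, miles)
    new = rate_per_million_miles(base_incidents + reclassified, miles)
    return (new - base) / base

# Hypothetical: 12 incidents over 100 million miles, 3 of them borderline.
miles = 100_000_000
print(f"drop 3 of 12:  {relative_change(12, -3, miles):+.0%}")
print(f"add 3 to 12:   {relative_change(12, 3, miles):+.0%}")
```

With counts this small, a judgment call on three borderline reports shifts the headline rate by a quarter in either direction.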
@1LoafOfMeat In the end, broad and simple, if perhaps vague, is probably the best we can do for public-facing information. Why trust it? Well, for one thing NHTSA gets access to the raw data, so lying in a big way would be 1) hard to get away with and 2) extremely damaging when found out. /end

More from @jamesdouma

May 23, 2022
@MerrillEarnest My take is that it's FUD. The core argument is that increasing features in an NN scales poorly, that HW3 is maxed out on current functionality, and that upgrading from HW3 is infeasible. All of these are wrong and the entire argument is a joke. /1
@MerrillEarnest This is reminiscent of arguing Tesla is unprofitable because they lose money on every car and thus *must* approach bankruptcy as they expand. Every component of the argument is flawed. /2
@MerrillEarnest Where does this claim that NNs scale badly with additional features come from and why would anyone believe it? The opposite is true: NNs scale *incredibly* well with added features - it's one of their core strengths. /3
Read 15 tweets
Jan 18, 2022
People misunderstand the value of a large fleet gathering training data. It's not the raw size of the data you collect that matters, it's the size of the set of available data you have that you can selectively incorporate into your training dataset. /1
This is a critical distinction. The set of data you choose to train with has a huge impact on the results you get from the trained network. Companies that just hoover up everything have to go back through the collected data and carefully select the items to use for training. /2
So if you put cameras on cars and just collect everything you will end up not using 99.999% of it. Collecting all of that is time consuming and expensive. Tesla doesn't do that. Tesla cars select specific items of interest to the FSD project and just upload those items. /3
Read 12 tweets
Oct 26, 2021
Today Tesla released a whitepaper describing new numerical formats being used in Dojo. These novel numerical formats are created with the needs of neural network training in mind. A brief explanation follows: 1/6
Common numerical formats in current use have shortcomings that make their use in the training of neural networks more complex and less efficient than could be ideally achieved. This paper describes two new formats: CFP8 and CFP16. 2/6
CFP8 and CFP16 address many of the shortcomings of existing formats and allow for 2x to 4x increase in performance and capacity of training hardware, with concomitant improvements in hardware size, cost, complexity, and efficiency. 3/6
Read 6 tweets
