If there's one conversation #DataScientists or #MachineLearning engineers dread, it's...

📊Explaining accuracy to a non-technical stakeholder

Too high-level, and they suspect you're hiding something. Too granular, and they'll be lost in the weeds.

My solve? Visualize F1 👇
First, I find F1 strikes a balance between simplicity and power. I earn trust by briefly explaining the downside of traditional accuracy:

"Imagine predicting fraud where only 1% of the transactions were fraudulent. If the model predicted that none were fraudulent, it would be..."
Your stakeholder will instantly understand -- "99% Accurate!"

Use that to explain the power of F1:

"F1 measures how well the model is doing at finding that 1%, and only that 1%."

I have yet to find a stakeholder who wouldn't at least hear me out after that. So next...
The big hurdle: explaining F1. What you don't do is say, "F1 is the harmonic mean of precision and recall, which are true positives over all positive predictions, and true positives over all actual positives, respectively."

Recipe for that glazed donut look -- not in a good way. 🍩
I always prepare a visual laid out on a single page (so they can take it with them and tape it to their cubicle wall and impress their friends with their stats knowledge).

First, a visual legend:
I explain the difference between Predictions (what the model thinks is the correct answer) and Reality (the correct answer).

I then show them how the model can be correct in two ways -- by predicting yes when the answer is yes, and predicting no when the answer is no.
I ask them to notice how a filled-in dot is a correct prediction, while an empty dot is an incorrect prediction. The color indicates what the right answer was.

Now for the big reveal. I give them a visual of the performance of their model:
I explain that there are 100 dots, and each one represents one percent of the observations. I rarely need to do any other context setting for stakeholders to understand this visual.
Once they've soaked that in (letting them do a few mental calculations), I introduce a glossary of metrics, again using the visual to ground them:
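If you want to prototype the 100-dot layout before designing the real graphic, here is a rough text-only sketch in Python. Everything in it is an assumption for illustration: the glyphs stand in for the two colors (shape shows the right answer, fill shows whether the model was correct), and the counts are hypothetical, not from any real model.

```python
# Glyphs stand in for the two colors: shape = what the right answer was,
# fill = whether the model got it right. These symbols are an assumption.
GLYPHS = {
    (1, 1): "●",  # answer yes, predicted yes (correct)
    (1, 0): "○",  # answer yes, predicted no (incorrect)
    (0, 0): "■",  # answer no, predicted no (correct)
    (0, 1): "□",  # answer no, predicted yes (incorrect)
}


def dot_grid(tp, fn, tn, fp, width=10):
    """Render outcome counts that sum to 100 as rows of `width` glyphs."""
    if tp + fn + tn + fp != 100:
        raise ValueError("counts must sum to 100 (one dot per 1% of observations)")
    dots = (GLYPHS[(1, 1)] * tp + GLYPHS[(1, 0)] * fn
            + GLYPHS[(0, 0)] * tn + GLYPHS[(0, 1)] * fp)
    rows = [dots[i:i + width] for i in range(0, 100, width)]
    return "\n".join(" ".join(row) for row in rows)


# Hypothetical model: out of every 100 observations, 20 are yes and 80 are no.
grid = dot_grid(tp=15, fn=5, tn=72, fp=8)
print(grid)
```

Using shape as well as fill also keeps the legend readable without relying on color at all.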
Many of them will start trying to do a few of the calculations in their head -- which is great! That means they're engaged!

Once any questions they have are out of the way, it's time for the final act -- their model's performance metrics:
I explain that these ratios just describe the 100-dot visual in a few numbers.
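As a sketch of how a handful of ratios can fall out of the dot counts, the Python below computes accuracy, precision, recall, and F1 from four tallies. The counts are hypothetical, the function name is illustrative, and which metrics belong in the glossary is an assumption, since the original graphic is not reproduced here.

```python
def describe(tp, fn, tn, fp):
    """Summarize the four outcome counts as a few ratios."""
    total = tp + fn + tn + fp
    predicted_yes = tp + fp   # every dot where the model said yes
    actual_yes = tp + fn      # every dot where the answer really was yes
    precision = tp / predicted_yes if predicted_yes else 0.0
    recall = tp / actual_yes if actual_yes else 0.0
    # F1 is the harmonic mean of precision and recall (0 if both are 0).
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a 100-dot visual.
metrics = describe(tp=15, fn=5, tn=72, fp=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, precision is 15 of the 23 yes-predictions and recall is 15 of the 20 real yeses, which is exactly the kind of arithmetic stakeholders tend to do in their heads from the dots.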

I go back to the imbalanced class example -- the model that predicted there was no fraud would have a very low score, while their model has a high score!
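To back up the "99% accurate" trap with numbers, here is a minimal Python sketch. The data is hypothetical (1,000 transactions, 10 of them fraudulent), the helper names are illustrative rather than from any library, and scoring F1 as 0 when there are no positives at all is a convention assumed here.

```python
def confusion_counts(actual, predicted):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(actual, predicted):
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / len(actual)


def f1_score(actual, predicted):
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when there are no actual or predicted positives.
    return 0.0 if denom == 0 else 2 * tp / denom


# Hypothetical data: 1% of transactions are fraudulent (label 1).
actual = [1] * 10 + [0] * 990
predict_none = [0] * 1000  # the "model" that never flags fraud

print(accuracy(actual, predict_none))  # 0.99
print(f1_score(actual, predict_none))  # 0.0
```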

Now we get to have some A+ conversations:
Introducing Drift
"We always check whether the model's performance is changing over time. Here are the same metrics, but only for observations in the last month."

Segmentations
"I know segment X is really important to you. Here's how the model does with those observations."
Sources of Errors
"The model is really struggling with observations that have characteristic Y. Here are the metrics for just those observations. Do you have any ideas how these observations might be different than others?"
Caveat: This only works for binary classification models. But (thread for another day) I find those also strike an amazing balance between simplicity and power, so it's my default way to define the model problem.
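Staying with the binary case: the drift, segment, and error-source conversations all come down to recomputing the same metric on a slice of the observations. Here is a rough Python sketch of that idea. The records, the field names (`month`, `segment`), and the helper names are all hypothetical, made up for illustration.

```python
def f1(records):
    """F1 for a list of records with 0/1 'actual' and 'predicted' fields."""
    tp = sum(1 for r in records if r["actual"] == 1 and r["predicted"] == 1)
    fp = sum(1 for r in records if r["actual"] == 0 and r["predicted"] == 1)
    fn = sum(1 for r in records if r["actual"] == 1 and r["predicted"] == 0)
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when the slice has no positives at all.
    return 0.0 if denom == 0 else 2 * tp / denom


def f1_where(records, keep):
    """F1 restricted to the records for which keep(record) is true."""
    return f1([r for r in records if keep(r)])


# Hypothetical observations; every value here is invented for the example.
records = [
    {"actual": 1, "predicted": 1, "month": "2024-01", "segment": "X"},
    {"actual": 1, "predicted": 1, "month": "2024-01", "segment": "other"},
    {"actual": 0, "predicted": 0, "month": "2024-01", "segment": "X"},
    {"actual": 0, "predicted": 1, "month": "2024-01", "segment": "other"},
    {"actual": 1, "predicted": 0, "month": "2024-02", "segment": "X"},
    {"actual": 1, "predicted": 1, "month": "2024-02", "segment": "other"},
    {"actual": 0, "predicted": 0, "month": "2024-02", "segment": "X"},
    {"actual": 0, "predicted": 0, "month": "2024-02", "segment": "other"},
]

overall = f1(records)
last_month = f1_where(records, lambda r: r["month"] == "2024-02")  # drift check
segment_x = f1_where(records, lambda r: r["segment"] == "X")       # segment view

print(f"overall: {overall:.2f}")
print(f"last month: {last_month:.2f}")
print(f"segment X: {segment_x:.2f}")
```

Passing the slice as a function keeps one metric definition for every conversation, so the numbers stay comparable across slices.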
🚨: Make sure the two colors you choose are color-blindness friendly! Red/green meaning no/yes is powerful, and you can use it to your advantage, but not all red/green combos are intelligible to a colorblind person.

Here’s a tool for checking your colors: davidmathlogic.com/colorblind/#%2…
But how do you explain whether the F1 is “good enough”? Another thread for another time!

• • •

Thread by Data Scientist | Kirsten Lum
