Mustafa Suleyman

Jun 30 • 6 tweets • 2 min read

We're taking a big step towards medical superintelligence. AI models have aced multiple-choice medical exams – but real patients don’t come with ABC answer options. Now MAI-DxO can solve some of the world’s toughest open-ended cases with higher accuracy and lower costs.

While AI has achieved near-perfect scores on the US Medical Licensing Exam, we set a higher benchmark: 304 cases from the New England Journal of Medicine. These are some of the toughest and most diagnostically complex cases a physician can face.

Microsoft AI built MAI-DxO to simulate a virtual panel of physicians with different approaches collaborating to find a diagnosis on each case. They also included the ability to set a budget to avoid infinite testing (higher costs, longer wait times, etc.).
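To make the two ideas above concrete (a panel that pools opinions, and a budget that caps testing), here is a minimal Python sketch. It is not Microsoft's implementation: the `Panel` class, the `panel_vote` helper, the test names, and the dollar costs are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Panel:
    """Hypothetical budget gate: tracks spending and refuses over-budget tests."""
    budget: float
    spent: float = 0.0
    ordered: list = field(default_factory=list)

    def request_test(self, name: str, cost: float) -> bool:
        # Reject any test that would push total spending past the budget.
        if self.spent + cost > self.budget:
            return False
        self.spent += cost
        self.ordered.append(name)
        return True


def panel_vote(proposals: list) -> str:
    """Return the diagnosis proposed most often by the (hypothetical) panel members."""
    return Counter(proposals).most_common(1)[0][0]


# Illustrative usage with made-up costs.
panel = Panel(budget=500)
print(panel.request_test("cbc", 100))         # True: 100 <= 500
print(panel.request_test("mri", 1200))        # False: would exceed the budget
print(panel.request_test("chest_xray", 400))  # True: total is exactly 500
print(panel_vote(["pneumonia", "pneumonia", "lupus"]))  # pneumonia
```

The budget check is a hard cap here; a real system could just as well weigh each test's cost against its expected diagnostic value.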

What they found:
- MAI-DxO boosted performance of every model tested on those 304 cases
- 85.5% solve rate vs. 20% by a group of physicians
- Its higher accuracy came with LOWER overall testing costs than lone LLMs or physicians

MAI-DxO in action, tackling one of those complex cases:

This research is just the first step on a long, exciting journey. We’re excited to keep testing and learning with our healthcare partners in pursuit of better, more accessible care for people everywhere. More on the blog today: microsoft.ai/new/the-path-t…

More from @mustafasuleyman

Mustafa Suleyman

@mustafasuleyman

Apr 5

ICYMI: we made a lot of Copilot announcements this morning! Some of the highlights of what’s rolling out today + in the coming weeks 🧵

Copilot can remember you now, from the name of your dog to whether you feel most productive first thing in the morning. You’re always in control, and can delete any conversation at any time. This is the start of really personalizing your Copilot.

We’re also experimenting with personalization through Copilot’s appearance. Maybe you want yours to reflect your music taste or a love of Clippy?

Read 10 tweets

Mustafa Suleyman

@mustafasuleyman

Mar 19

You can't just be right, you have to know you're right. Good advice for LLMs, according to new Johns Hopkins research. Sometimes no answer is better than a wrong one – life or death choices in medicine, for example, or big financial decisions. 🧵

We know more compute results in higher accuracy, but are the models more confident those answers ARE accurate too? And how do we teach them when to say “I don’t know”? That’s what the research team wanted to find out.

In the study, they measured how different combinations of compute budget and confidence thresholds (being at least 50% sure of the answer, etc.) affected models’ performance on a benchmark math test.
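As a rough illustration of what a confidence threshold does, the Python sketch below answers only when the model's confidence clears the threshold and abstains otherwise. The function name, the `wrong_penalty` parameter, and the sample data are assumptions made for this example, not the study's actual scoring code or results.

```python
def score_with_abstention(predictions, threshold, wrong_penalty=0.0):
    """Score (confidence, is_correct) pairs, abstaining below the threshold.

    Hypothetical scoring rule: +1 per correct answer, -wrong_penalty per
    wrong answer, 0 for every abstention.
    """
    answered = [(conf, ok) for conf, ok in predictions if conf >= threshold]
    correct = sum(1 for _, ok in answered if ok)
    wrong = len(answered) - correct
    return {
        "answered": len(answered),
        "accuracy_when_answering": correct / len(answered) if answered else None,
        "score": correct - wrong_penalty * wrong,
    }


# Made-up predictions: (model confidence, whether the answer was right).
preds = [(0.9, True), (0.6, True), (0.55, False), (0.3, False)]

for t in (0.0, 0.5, 0.8):
    print(t, score_with_abstention(preds, threshold=t, wrong_penalty=1.0))
```

With a penalty for wrong answers, raising the threshold trades coverage for reliability: fewer questions are answered, but a larger share of the answers given are correct.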

Read 9 tweets

Mustafa Suleyman

@mustafasuleyman

Jan 29

Today we’ve made Think Deeper free and available for all users of Copilot.

This now gives everyone access to OpenAI’s world-class o1 reasoning model in Copilot, everywhere, at no cost.

I urge you to give it a try. It’s truly magical. Think Deeper helps you:

Get in-depth advice on how to manage a career change, with detailed breakdowns of educational milestones and options, resources on where to look for roles, strategies for getting in the door and industry trends you absolutely need to know.

Plan that epic project. Brain dump everything into Think Deeper and watch it churn through it all and spit out a clean, crisp step by step guide to making it happen. I've tried this on a few things (fitness routine, big launch coming up) and it’s genuinely so helpful.

Read 6 tweets

Mustafa Suleyman

@mustafasuleyman

Jan 18

https://twitter.com/emollick/status/1879633485004165375

After Ethan's post, I went on a deep dive into this study! I could go on and on about the results but if I had to boil it down to my biggest takeaways...🧵

The setup: For 6 weeks, students used Copilot in their computer lab 2x/week, guided by teachers on selected topics and grammar/writing tasks.
The results: A pen-and-paper test showed their scores improving by 0.3 standard deviations, the equivalent of almost 2 years of learning.

• Many of these students had never even used a computer before. They spent the beginning of the program figuring out how to navigate a PC, setting up user accounts, being taught how to prompt. Makes the learning curve even more remarkable.

Read 9 tweets

Mustafa Suleyman

@mustafasuleyman

Jun 8, 2023

Very excited to announce my new book: The Coming Wave

Today's AI is only the start. A wave of emerging technologies will help address global challenges & create vast wealth. But they will also create upheaval on a once unimaginable scale.

THE-COMING-WAVE.COM

These are ideas I've been thinking about for over a decade. This is my attempt to understand how and why technology naturally proliferates, and what society needs to do to remain in control.

I argue that “containing” this coming wave is the defining challenge of the century.

As the public conversation around AI has exploded, it's more important than ever for those of us driving development to critically reflect on what’s unfolding.

I hope it’s useful. It’s intended to provoke debate and encourage everyone to develop new strategies for containment.

Read 4 tweets
