Alan Karthikesalingam

@alan_karthi

May 17 • 6 tweets • 7 min read

So happy to share #MedPaLM2 - our team's evolution of Med-PaLM. A new state of the art for medical question-answering!

Med-PaLM 2 scores 86.5% on MedQA-USMLE, exceeding Med-PaLM's score by >19% 🤯, & 81.8% on PubMedQA...

More here: arxiv.org/pdf/2305.09617…

We believe in rigorous, careful evaluation. Physicians even preferred #MedPaLM2's long-form answers to answers from other real 🇮🇳🇺🇸🇬🇧 physicians along 8/9 axes of quality including medical accuracy (consensus w/medical opinion) and reasoning, with less likelihood of harm

Med-PaLM 2's superiority over Med-PaLM went far beyond exam performance. To highlight the real-world importance of nuanced evaluation, we introduce a new dataset of "adversarial" questions designed specifically to probe LLM weaknesses, including #HealthEquity

Lay raters also consistently find Med-PaLM 2 more helpful, and find that it directly addresses the intent behind a medical question.

I can't believe I get to say this but you can see a summary by @sundarpichai with a sneak peek at where our research is heading next! Also co-senior authors' feeds at @AziziShekoofeh @vivnat and first authors @taotu831 @thekaransinghal @Mysiak
👀 youtube.com/clip/Ugkxb7W_k…

Grateful for our amazing team @GoogleAI @GoogleHealth @DeepMind @AziziShekoofeh @thekaransinghal @vivnat @Mysiak @weballergy @clark_kev @stephenpfohl @hcolelewis @ymatias @greg_corrado

More from @alan_karthi

Dec 27, 2022

💡New paper - Large Language Models Encode Clinical Knowledge💡 Our work @GoogleHealth @GoogleAI @DeepMind advances the state of the art in 7 medical question-answering tasks - including achieving 67% on MedQA (USMLE questions), improving on prior work by >17%

arxiv.org/abs/2212.13138

1/n

https://twitter.com/vivnat/status/1607609299894947841

Careful evaluation is key for LLMs in safety-critical settings. We pilot a framework for clinician and layperson evaluation of LLMs’ outputs. Deeper human inspection reveals gaps in comprehension + reasoning (2/n)

We approach these gaps with instruction prompt tuning. We show that this helps to align a model, "MedPaLM", better to the medical domain - with smaller gaps in reasoning, comprehension, safety and helpfulness

(3/n)

Nov 5, 2021

Our research @GoogleHealth @GoogleAI @DeepMind published in Medical Image Analysis: goo.gle/31kUam7
Wise doctors know when they don’t know, and medical AI should too. In dermatology this is critical, as many rare skin conditions occur too infrequently for AI to learn (1/n)

https://twitter.com/GoogleHealth/status/1456660083102916614

For AI researchers, detecting conditions a model has not seen in training is called “out-of-distribution (OOD) detection”. Doing this in medical AI is significantly harder than in most computer vision work, because the differences between rare + common diseases can be subtle

Using our large-scale pre-training advances and a novel "HOD" loss, we achieved an AUC of 0.83 on a new benchmark for this "near-out-of-distribution" detection challenge - to evaluate how well a dermatology AI system recognises a previously-unseen condition.
