Sully
May 21 · 12 tweets · 3 min read
The medical industry will never be the same again.

Google has released Med-PaLM 2, their LLM specifically for medical purposes.

It scored higher than humans on a medical exam.

Here's the breakdown, and a glimpse of the future of healthcare: [Image]
Alright, so what is Med-PaLM? It's Google's version of PaLM 2, but fine-tuned and trained specifically for medical purposes.

That means it's much MUCH better than a generic LLM at
- understanding users' questions
- understanding various health scenarios
- synthesizing answers
And here is the crazy part. It scored 86.5% on the MedQA dataset, the highest out of any LLM.

The dataset contains general USMLE-style questions. These are the same kind of questions a person applying for a medical license would answer in an exam beforehand. [Image]
When they did an independent long-form evaluation with physician raters, they found that the answers were comparable to physicians' answers! (Basically, doctors rated the LLM's answers.)

Yes, you heard that right: Med-PaLM 2 was able to answer the questions as well as doctors. [Image]
It actually gets even better...

They asked a group of doctors to look at answers given by Med-PaLM 2 and by other doctors for over 1k health-related questions.

The doctors actually preferred the ones from the LLM over the ones given by their colleagues in most cases. [Image]
They also looked at how people with no medical background rated the answers, and it was a similar story.

The evaluations were very very close for Med-PaLM 2 vs physicians
Can you imagine a world where we have no more misdiagnoses due to negligent doctors?

There should be no reason why you'd get the wrong advice from a doctor that's using some AI assistant.
The report also talks about how there are still quite a lot of limitations, namely that:

- it hasn't been extensively tested in a real-world clinic setting
- evaluations were performed on specific datasets (might not be applicable to all medical scenarios)
- the outcome of these models in practice might vary depending on features like answer length and the level of detail provided, which can change the perceived quality of responses
At the end of the day, I think medical-focused LLMs have the power to turn healthcare on its head.

Doctors + AI = better diagnoses, quicker turnaround, more time for doctors to spend with patients (instead of grunt work)

Better healthcare ftw
That's it!

If you enjoyed this thread:

Give me a follow @SullyOmarr for more of these :)
Oh yeah, link to paper:
arxiv.org/pdf/2305.09617…

More from @SullyOmarr

May 20
Quick comparison of Claude vs GPT-4.

Speed on GPT4 is a huge issue, and it's still slower despite it not being peak times.

How do you think the outputs compare?
At first GPT4 seems more comprehensive, but is it really? Ex:

I didn't even ask for that? It kinda made it up with random numbers. [Image]
Compared to Claude:
Actually gives me actionable things I can do (although I didn't specify my product was acne related). [Image]
Read 4 tweets
May 19
Made agents shareable so it's easier to share them with other people (instead of screenshotting them)

Uses Vercel OG image to dynamically create the preview image.

Hoping it increases the social aspect of it a bit more (will be doing more cool stuff around this later too). [Image]
Here's one of it scraping news sites and creating an outline for a podcast using the latest news from Tesla: app.cognosys.ai/s/gruPwp0
And rip, it's supposed to look like this, not sure why it didn't preview the whole image. [Image]
Read 4 tweets
May 15
Spoke with a few tech-savvy friends who have heard of ChatGPT but barely use it.

Kinda surprised me, so I asked them why they don't use it more often.

The answer? They didn't know what to use it for.

And when I think about it, it makes a lot of sense.

Let me explain:
They basically told me that they didn't know what it was capable of other than "write me a paper" or "write my marketing copy".

I think as these models get more capable, they're going to be more confusing for the average person to use.
Confusing in the sense that they just have no idea what to do.

Plus, since ChatGPT/Bard/Claude etc. aren't perfect, they test it once, and if it falls short of their needs, they ditch it.
Read 8 tweets
May 13
Google's new model, Bison, is here.

I decided to compare it to GPT4 and benchmark them in 4 real world scenarios, looking at: output, reasoning and cost.

Here's what I found:
First up Coding: Here's the prompt I gave it:

Create the backend api for a flask application. The application should support the basic endpoints needed to create a netflix-like application.

First pic is GPT4, second is Bison. [Image] [Image]
Then a follow up: "Ok now we want to implement some sort of rate limiter so that users cannot hammer our servers. How would you update this?"

Overall I'd give this to GPT4. It was able to reason and understand my prompt a lot better. Only downside is GPT4 is 15x more expensive. [Image] [Image]
Read 12 tweets
May 10
Google's keynote 🤯 for AI. Here's a brief overview:

About 25 different AI products. Biggest LLM news

PaLM 2
- 4 different sizes: Gecko (runs on mobile), Otter, Bison, Unicorn

Med-PaLM 2
- tuned for medical purposes. Can analyze X-rays to help radiologists.

Sec-PaLM
Bard getting A HUGE upgrade.

It's now fully running on PaLM 2:
- Much better at coding
- Can search GitHub URLs
- Will cite code from sources
- Export Python to Colab
- Export code to @Replit
- Interact with partnered tools/Google apps directly in Bard
- Support image prompts
Bard also allows support for different "plugins"
- Spotify
- Adobe Firefly
- Khan Academy
- ZipRecruiter, plus more

It's also available globally with no waitlist.
Read 4 tweets
May 8
Vector databases & embeddings are the current hot thing in AI.

Pinecone, a vector DB company, just raised $100M at a ~$1B valuation.

Shopify, Brex, HubSpot and others use them for their AI apps.

But what are they, how do they work and why are they SO crucial in AI? Let's find out. [Image]
Ok so first, what are vector embeddings? You've likely seen this word thrown around a billion times all over Twitter.

The simple explanation is:

Embeddings are just N-dimensional vectors of numbers. They can represent anything: text, music, videos, etc. We'll focus on text. [Image]
The process of creating an embedding is straightforward. It involves an embedding model (ex: Ada from OpenAI).

You send your text to the model, and it creates a vector representation of that data for you, which can be stored and used later on.
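Here's a rough sketch of that flow in Python. Big caveat: `toy_embed` is a made-up stand-in that just hashes words into buckets, not a real embedding model like Ada, and `DIM`, `store`, and the sample texts are all hypothetical. It only shows the shape of the idea: text goes in, a vector comes out, you store it, and later you compare vectors with cosine similarity.

```python
import hashlib
import math

DIM = 16  # assumption for the demo; real models use hundreds or thousands of dimensions


def toy_embed(text: str, dim: int = DIM) -> list[float]:
    """Hypothetical stand-in for an embedding model: hash each word into a bucket."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    # Scale to unit length so vectors of different-sized texts are comparable.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# "Store" step: keep each text next to its vector (a vector DB does this at scale).
texts = [
    "how do vector databases work",
    "best pizza places in town",
    "what are text embeddings",
]
store = {text: toy_embed(text) for text in texts}


def most_similar(query: str, vectors: dict[str, list[float]]) -> str:
    """Return the stored text whose vector is closest to the query's vector."""
    query_vec = toy_embed(query)
    return max(vectors, key=lambda t: cosine_similarity(query_vec, vectors[t]))


print(most_similar("explain vector databases", store))
```

With a real model the vectors capture meaning rather than just shared words, so don't read anything into the toy scores; the store-then-compare pattern is the part that carries over.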
Read 16 tweets
