Sully

@SullyOmarr

May 21 • 12 tweets • 3 min read

The medical industry will never be the same again.

Google has released Med-PaLM 2, their LLM built specifically for medical purposes.

It scored higher than humans on a medical exam.

Here's the breakdown, and a glimpse of the future of healthcare:

Alright, so what is Med-PaLM 2? It's Google's version of PaLM 2, but fine-tuned and trained specifically for medical purposes.

That means it's much, MUCH better than a generic LLM at:
- understanding users' questions
- understanding various health scenarios
- synthesizing answers

And here is the crazy part. It scored 86.5% on the MedQA dataset, the highest of any LLM.

The dataset contains general USMLE-style questions. These are the same kind of questions a person applying for a medical license would answer in an exam beforehand.

When they did an independent long-form evaluation with physician raters, they found that the answers were comparable to physicians' answers! (Basically, doctors rated the LLM's answers.)

Yes, you heard that right: Med-PaLM 2 was able to answer the questions about as well as doctors.

It actually gets even better..

They asked a group of doctors to look at answers given by Med-PaLM 2 and by other doctors for over 1k health-related questions.

The doctors actually preferred the ones from the LLM over the ones given by their colleagues in most cases.

They also looked at how people with no medical background liked the answers, and it was a similar story.

The evaluations were very, very close for Med-PaLM 2 vs. physicians.

Can you imagine a world where we have no more misdiagnoses due to negligent doctors?

There should be no reason why you'd get the wrong advice from a doctor that's using some AI assistant.

The report also talks about how there are still quite a lot of limitations, namely that:

- It hasn't been extensively tested in a real-world clinic setting.
- Evaluations were performed on specific datasets (might not be applicable to all medical scenarios).
- The outcome of these models in practice might vary depending on features like answer length and the level of detail provided, which can change the perceived quality of responses.

At the end of the day, I think medical-focused LLMs have the power to turn healthcare on its head.

Doctors + AI = better diagnoses, quicker turnaround, and more time for doctors to spend with patients (instead of grunt work)

Better healthcare ftw

@SullyOmarr

That's it!

If you enjoyed this thread:

Give me a follow @SullyOmarr for more of these :)

https://twitter.com/1328913688892346370/status/1660318717174071297

Oh yeah, link to the paper:
arxiv.org/pdf/2305.09617…

More from @SullyOmarr

Sully

@SullyOmarr

May 20

Quick comparison of Claude vs GPT-4.

Speed on GPT-4 is a huge issue, and it's still slower even though it isn't peak time.

How do you think the outputs compare?

At first GPT-4 seems more comprehensive, but is it really? Ex:

I didn't even ask for that? It kinda made it up with random numbers.

Compared to Claude:
It actually gives me actionable things I can do (although I didn't specify my product was acne-related).

Sully

@SullyOmarr

May 19

Made agents shareable so it's easier to share them with other people (instead of screenshotting them).

Uses Vercel OG image to dynamically create the preview image.

Hoping it increases the social aspect a bit more (will be doing more cool stuff around this later too).

Here's one of it scraping news sites and creating an outline for a podcast using the latest news from Tesla: app.cognosys.ai/s/gruPwp0

And RIP, it's supposed to look like this. Not sure why it didn't preview the whole image.

Sully

@SullyOmarr

May 15

Spoke with a few tech-savvy friends who have heard of ChatGPT but barely use it.

Kinda surprised me, so I asked them why they don't use it more often.

The answer? They didn't know what to use it for.

And when I think about it, it makes a lot of sense.

Let me explain:

They basically told me that they didn't know what it was capable of other than "write me a paper" or "write my marketing copy".

I think as these models get more capable, they're going to be more confusing for the average person to use.

Confusing in the sense that they just have no idea what to do.

Plus, since ChatGPT/Bard/Claude etc. aren't perfect, they test it once, and if it falls short of their needs, they ditch it.

Sully

@SullyOmarr

May 13

Google's new model, Bison, is here.

I decided to compare it to GPT-4 and benchmark them in 4 real-world scenarios, looking at output, reasoning, and cost.

Here's what I found:

First up, coding. Here's the prompt I gave it:

Create the backend api for a flask application. The application should support the basic endpoints needed to create a netflix-like application.

First pic is GPT-4, second is Bison.

Then a follow up: "Ok now we want to implement some sort of rate limiter so that users cannot hammer our servers. How would you update this?"

Overall, I'd give this to GPT-4. It was able to reason and understand my prompt a lot better. The only downside is that GPT-4 is 15x more expensive.

Sully

@SullyOmarr

May 10

Google's keynote 🤯 for AI. Here's a brief overview:

About 25 different AI products. Biggest LLM news:

PaLM 2
- 4 different sizes: Gecko (runs on mobile), Otter, Bison, Unicorn

Med-PaLM 2
- Tuned for medical purposes. Can analyze x-rays to help radiologists.

Sec-PaLM
- PaLM tuned for security

@Replit

Bard getting A HUGE upgrade.

It's now fully running on PaLM 2:
- Much better at coding
- Can search GitHub URLs
- Will cite code from sources
- Export Python to Colab
- Export code to @Replit
- Interact with partnered tools/Google apps directly in Bard
- Supports image prompts

Bard also supports different "plugins":
- Spotify
- Adobe Firefly
- Khan Academy
- ZipRecruiter + more

It's also available globally with no waitlist.

Sully

@SullyOmarr

May 8

Vector databases & embeddings are the current hot thing in AI.

Pinecone, a vector DB company, just raised $100M at a ~$1B valuation.

Shopify, Brex, HubSpot and others use them for their AI apps.

But what are they, how do they work, and why are they SO crucial in AI? Let's find out.

Ok, so first, what are vector embeddings? You've likely seen this word thrown around a billion times all over Twitter.

The simple explanation is:

Embeddings are just N-dimensional vectors of numbers. They can represent anything: text, music, videos, etc. We'll focus on text.

The process of creating an embedding is straightforward. It involves an embedding model (e.g., Ada from OpenAI).

You send your text to the model, and it creates a vector representation of that data for you, which can be stored and used later on.
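
To make that flow concrete, here's a rough sketch in Python. Big caveat: `toy_embed` is a made-up stand-in for a real embedding model (it just hashes words into buckets, so it captures word overlap, not meaning), and the 16 dimensions, the `store` dict, and `most_similar` are all assumptions for illustration, not any vendor's API. The shape of the workflow is the point: text in, vector out, store it, compare vectors later.

```python
import hashlib
import math
from typing import Dict, List

DIM = 16  # assumption: tiny on purpose; real models use far more dimensions


def toy_embed(text: str, dim: int = DIM) -> List[float]:
    """Hypothetical stand-in for an embedding model: hash each word into a bucket."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    # Scale to unit length so vectors of different-sized texts are comparable.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query: str, store: Dict[str, List[float]]) -> str:
    """Return the stored text whose vector is closest to the query's vector."""
    query_vec = toy_embed(query)
    return max(store, key=lambda text: cosine_similarity(query_vec, store[text]))


# "Store" step: embed each text once and keep the vector for later lookups.
documents = [
    "how to reset my password",
    "best pizza places in town",
    "troubleshooting login problems",
]
store = {text: toy_embed(text) for text in documents}

if __name__ == "__main__":
    print(most_similar("reset password", store))
```

With a real embedding model you'd swap `toy_embed` for a call to that model and keep everything else roughly the same. A vector DB is basically the `store` plus a fast version of `most_similar`.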
