Interesting failure mode of #GPT4!

It can't play "Set", a card game that is trivial to solve with a 10-line Python program.

--> it can explain the game, can abstract it, write a program to solve it, but can't actually *play* it.

Check the full convo in the chat below:
First, I asked about the game (great intro if you don't know the game).
Then, I asked #GPT4 to come up with a strategy to solve it. The strategy is sound, but it doesn't consistently map between the numbers and the attributes:
Ok, so let's try playing: (spoiler: it provides only wrong answers!)
Funnily, it can write a program that does it correctly:
which returns:
Set 1: (2121, 1213, 3332)
Set 2: (3221, 3113, 3332)
Set 3: (1331, 3132, 2233)
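
The program itself does not appear in the text above. Here is a minimal sketch of what such a solver might look like, assuming each card is a four-digit string with digits 1 to 3 (one digit per attribute), as the output suggests. The names `is_set`, `find_sets`, and `table` are hypothetical, and the table holds only the cards seen in the output, not the full layout from the chat:

```python
from itertools import combinations

def is_set(a, b, c):
    # For each attribute position, the three digits must be all equal
    # (1 distinct value) or all different (3 distinct values), never 2.
    return all(len({x, y, z}) != 2 for x, y, z in zip(a, b, c))

def find_sets(cards):
    # Brute force: test every combination of three cards.
    return [trio for trio in combinations(cards, 3) if is_set(*trio)]

# Assumption: a stand-in table built from the cards in the output above.
table = ["2121", "1213", "3332", "3221", "3113", "1331", "3132", "2233"]

for i, trio in enumerate(find_sets(table), 1):
    print(f"Set {i}: ({', '.join(trio)})")
```

Run on this stand-in table, the sketch prints every valid Set among those cards, including the three listed above (possibly with the cards in a different order).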

But #GPT4 can't simulate this program properly, and when told that its output is wrong, it suggests the algorithm is wrong and edits the code:
This again showcases the promise of Toolformer and external plugins for executing code.
Might be of interest to @fchollet
#Bard can't play Set either. At least, it hallucinated a valid Set (but the cards were not on the table).

#AI #LLM #NLP
