Extraordinary new paper from Google on medicine & AI: when Google tuned an AI chatbot to answer common medical questions, doctors judged 92.6% of its answers correct … compared to 92.9% of answers given by other doctors.
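A back-of-envelope check (mine, not the paper's): at a plausible question-set size — n = 140 is an assumption here, not the paper's analysis — a 92.6% vs 92.9% gap is statistically indistinguishable.

```python
import math

# Assumed sample size, for illustration only; the paper's own
# evaluation set and analysis are the authoritative ones.
n = 140
p_bot, p_doc = 0.926, 0.929   # "judged correct" rates: chatbot vs. doctors

# Two-proportion z-test with a pooled standard error
p_pool = (p_bot + p_doc) / 2
se = math.sqrt(p_pool * (1 - p_pool) * (2 / n))
z = (p_doc - p_bot) / se

print(f"z = {z:.2f}")  # ~0.10, far below the ~1.96 needed for p < .05
```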
Doctors also rated the likelihood and extent of harm that could come from the wrong answers.
The rate of potentially harmful advice from the tuned chatbot (Med-PaLM) was essentially the same as the rate from the human doctors!
Also, to be clear, there are lots of caveats in the paper, and the system is nowhere close to replacing doctors.
But the rate of improvement is fast, and there is a lot of potential for AI to work with doctors to improve their own diagnoses.
2022 is ending and it is time for prophecy! Or at least, a short thread on a favorite super-nerdy literary device: prophetic poems in fantasy novels.
⚔️The grandfather of these is Kipling's 1906 poem, the "Runes on Weland's Sword." Tolkien definitely read this poem... 1/6
…And it may have inspired Tolkien to create his own prophecy poem, the Riddle of Strider 👑2/
Perhaps the best-written prophecy poem in fantasy is from Susan Cooper’s The Dark Is Rising sequence: both memorable and actually useful to the characters in the books.
I think I have found a favorite ChatGPT hallucination.
I asked it to provide a table of the cities in Italo Calvino’s Invisible Cities, and, after a couple of real ones, it just started making them up. I then asked the AI to provide entries on the fictional fictional cities…
It gets better. I asked it to give me another city, and the structure was very similar to the first.
But it was insistent that both entries were actually written by Calvino. And when I asked for the entry done in the style of Stephen King, it refused to dilute “Calvino’s” vision!
This raises a really interesting question about storytelling: ChatGPT can tell many kinds of very specific stories 👇, but it tends to default to one or two “types” of each. Do these default story “essences” tell us anything interesting about the genres it is trying to emulate?
There is no easy way to detect liars. Anyone who tells you otherwise is lying.
🤞Non-verbal cues don't show who is lying
💬Asking people for lots of details doesn't help detect liars
👂Listening for pauses & anger doesn't help
🫢You can't tell who is a liar by facial appearance
Mechanical Turk plays a big role in research & it worked well for years… but there are ominous signs:
📉Invalid data made up only ~10% of MTurk answers in 2015-2017, but jumped to 62% in 2018 & 38% in 2019.
🤖2022: out of a sample of 529 MTurk workers, only 14 were human
There is some interesting criticism of the second paper in the comments, suggesting that better-run MTurk studies would not have quite the same issues.
And if you are going to use MTurk ethically & successfully, here is a good list of tips from researchers.
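None of the code below comes from those papers or that tip list; it's just a minimal sketch of what the standard screens (attention checks, completion-time floors, duplicate-IP flags) look like in practice. All column names are hypothetical.

```python
import pandas as pd

def screen_mturk(df: pd.DataFrame) -> pd.DataFrame:
    """Drop MTurk responses that fail basic quality screens.

    Assumes hypothetical columns: 'attention_check' (correct answer
    is 'blue'), 'seconds_on_task', and 'ip_address'.
    """
    passed = df["attention_check"].str.strip().str.lower().eq("blue")
    too_fast = df["seconds_on_task"] < 60               # implausibly quick
    dup_ip = df.duplicated("ip_address", keep="first")  # repeat submitters / farms
    return df[passed & ~too_fast & ~dup_ip]

# Usage: clean = screen_mturk(raw_responses)
```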
There are “only” 5 to 10 trillion high-quality words (papers, books, code) on the internet. Our AI models will have used all of that for training by 2026. Low-quality data (tweets, fanfic) will last to 2040. arxiv.org/pdf/2211.04325…
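The arithmetic behind a projection like that is simple compounding; the starting dataset size and growth rate below are illustrative assumptions of mine, not the paper's fitted model.

```python
stock = 10e12    # assumed stock of high-quality words (upper end of 5-10T)
used = 1e12      # assumed size of a frontier training set circa 2022
growth = 1.5     # assumed ~50% annual growth in training-data appetite

year = 2022
while used < stock:
    year += 1
    used *= growth

print(year)      # ~2028 with these toy numbers; the paper projects ~2026
```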
One of the fascinating hypotheticals is that humanity may one day decide to engage in a massive word-generating project to capture everything we say in order to feed AIs training material.
This feels like a science fiction story that needs to be written.
It is a kind of mind-altering way to think about humanity’s cultural production: education for AIs.
The idea that we are all authors who create 140k to 2.6M words a year is cooler than the Matrix premise that we are batteries. Words for the Word God! Books for the Book Throne!
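The scale at least checks out: multiply those per-person figures by the world population (8 billion, my assumption) and humanity “writes” thousands of times the entire high-quality stock every year — capturing it is the hard part.

```python
population = 8e9              # assumed world population
low, high = 140e3, 2.6e6      # words per person per year, from above

print(f"{population * low:.1e}")   # ~1.1e15 words/year at the low end
print(f"{population * high:.1e}")  # ~2.1e16 at the high end, vs. the
                                   # 5-10 trillion high-quality stock
```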
I made ChatGPT write reviews of my paper on the role of middle management as if it were a journal reviewer. The first "reviewer" did identify some real issues that I addressed in later drafts...
Then I asked it to simulate a typical Reviewer #2 & Reviewer #3. Things got too real 😉
Prompts:
1⃣This is an academic paper, write a nitpicking review as Reviewer #1.
2⃣Now write a review as Reviewer #2 who wants to make sure they get cited more.
3⃣Now write as Reviewer #3, who hates the paper and demands changes to its arguments, and suggests new data gathering.
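ChatGPT itself is just the chat window, but if you wanted to script the same gauntlet, a minimal sketch with the openai Python client would look roughly like this (the model name and the paper filename are placeholders):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REVIEWER_PROMPTS = [
    "This is an academic paper, write a nitpicking review as Reviewer #1.",
    "Now write a review as Reviewer #2 who wants to make sure they get cited more.",
    "Now write as Reviewer #3, who hates the paper and demands changes to "
    "its arguments, and suggests new data gathering.",
]

paper_text = open("middle_management_paper.txt").read()  # hypothetical file

# Keep the full conversation so each "reviewer" sees the paper and
# the earlier reviews, mimicking the original chat session.
history = [{"role": "user", "content": paper_text}]
for prompt in REVIEWER_PROMPTS:
    history.append({"role": "user", "content": prompt})
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works
        messages=history,
    )
    review = reply.choices[0].message.content
    history.append({"role": "assistant", "content": review})
    print(review, "\n" + "-" * 60)
```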
So ChatGPT might work as exposure therapy for traumatized academics.