How to get a URL link on the X (Twitter) app
While AI has achieved near-perfect scores on the US Medical Licensing Exam, we set a higher benchmark: 304 cases from the New England Journal of Medicine. These are some of the toughest and most diagnostically complex cases a physician can face.
We know more compute results in higher accuracy, but are the models more confident those answers ARE accurate too? And how do we teach them when to say “I don’t know”? That’s what the research team wanted to find out.
https://twitter.com/emollick/status/1879633485004165375
The setup: For 6 weeks, students used Copilot in their computer lab 2x/week, guided by teachers on selected topics and grammar/writing tasks.
These are ideas I've been thinking about for over a decade. This is my attempt to understand how and why technology naturally proliferates, and what society needs to do to remain in control.