Stories are persistent: this paper traces fairy tales back across languages & cultures to common ancestors, arguing that the oldest go back at least 6,000 years. One of the oldest became the myth of Sisyphus & Thanatos in ancient Greece. 1/
That may be just the start: this paper argues some stories may go back 100,000 years. Many cultures, including Aboriginal Australian & Ancient Greek, tell stories of the Pleiades, the 7 sisters star cluster, having a lost star. This was true 100k years ago! 2/ dropbox.com/s/np0n4v72bdl3…
Stories share similar arcs: Analyzing 1.6k novels, this paper argues there are only 6 basic ones:
1 Rags to Riches (rise)
2 Riches to Rags (fall)
3 Man in a Hole (fall rise)
4 Icarus (rise fall)
5 Cinderella (rise fall rise)
6 Oedipus (fall rise fall) 3/ epjdatascience.springeropen.com/articles/10.11…
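To make the six shapes concrete, here is a minimal sketch that labels a sequence of sentiment scores with the closest arc. It is not the paper's method: the template curves (a line and sine waves), the correlation matching, and the names `TEMPLATES`, `pearson` and `classify_arc` are all assumptions made for illustration.

```python
import math

# Idealized arc shapes over story time t in [0, 1].
# These curves are illustrative assumptions, not taken from the paper.
TEMPLATES = {
    "Rags to Riches": lambda t: t,                          # rise
    "Riches to Rags": lambda t: -t,                         # fall
    "Man in a Hole": lambda t: -math.sin(math.pi * t),      # fall, rise
    "Icarus": lambda t: math.sin(math.pi * t),              # rise, fall
    "Cinderella": lambda t: math.sin(2 * math.pi * t),      # rise, fall, rise
    "Oedipus": lambda t: -math.sin(2 * math.pi * t),        # fall, rise, fall
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def classify_arc(scores):
    """Return the name of the template that correlates best with the scores."""
    n = len(scores)
    if n < 3:
        raise ValueError("need at least 3 sentiment scores")
    ts = [i / (n - 1) for i in range(n)]  # normalize position to [0, 1]
    return max(
        TEMPLATES,
        key=lambda name: pearson(scores, [TEMPLATES[name](t) for t in ts]),
    )


print(classify_arc([5, 2, 0, 2, 5]))  # a story that falls and then recovers
```

In practice the scores would come from a sentiment pass over sliding windows of the text; that step is left out here.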
Stories have links to cultural values. You can make predictions about economic factors from the stories people tell, as this 👇cool paper shows 4/
The stories organizations tell matter, too: When firms share stories in which their executives were clever but sneaky, the result is less helping & more deviance! Firms that share stories about low-level people upholding values have increased helping & decreased deviance. 5/
Firms also transmit learning through stories. This paper shows stories of failure work best. They are more easily applied than stories of success, especially if the story is interesting & you believe that it is important to learn from mistakes. But make sure it is a true story. 6/
Entrepreneurs especially rely on stories, as all you have initially is your pitch - a story about your startup. You have to use that to get people to give you resources, buy your product, join your company, etc. Here is a thread on how to do that: 7/
One of the most fascinating examples of the power of stories in startups is how the Theranos fraud relied on Elizabeth Holmes’s ability to tell a compelling story, which involved her tapping into the archetypes of what we expect an entrepreneur to be (black turtleneck & all) 👇
“GPT-4.5, Give me a secret history ala Borges. Tie together the steel at Scapa Flow, the return of Napoleon from exile, betamax versus VHS, and the fact that Kafka wanted his manuscripts burned. There should be deep meanings and connections”
“Make it better” a few times…
It should have integrated the scuttling of the High Seas Fleet better but it knocked the Betamax thing out of the park
🚨Our Generative AI Lab at Wharton is releasing its first Prompt Engineering Report, empirically testing prompting approaches. This time we find: 1) Prompting “tricks” like saying “please” do not help consistently or predictably 2) How you measure against benchmarks matters a lot
Using social science methodologies for measuring prompting results helped give us some useful insights, I think. Here’s the report, the first of hopefully many to come. papers.ssrn.com/sol3/papers.cf…
This is what complicates things. Making a polite request ("please") had huge positive effects in some cases and negative ones in others. Similarly being rude ("I order you") helped in some cases and not others.
There was no clear way to predict in advance which would work when.
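A small sketch of why the measurement choice matters: the same repeated-trial results give very different scores depending on how strict the pass standard is. The data below is a made-up toy example, and the thresholds (100%, 90%, 51%) and the `pass_rate` helper are assumptions for illustration, not figures or code from the report.

```python
def pass_rate(trials_by_question, threshold):
    """Fraction of questions whose share of correct trials is at least `threshold`."""
    if not trials_by_question:
        raise ValueError("no questions to score")
    passed = 0
    for trials in trials_by_question:
        share_correct = sum(trials) / len(trials)  # True counts as 1
        if share_correct >= threshold:
            passed += 1
    return passed / len(trials_by_question)


# Hypothetical toy data: 4 questions, each asked 10 times with the same prompt.
results = [
    [True] * 10,                # always right
    [True] * 9 + [False],       # right 9 times out of 10
    [True] * 6 + [False] * 4,   # right 6 times out of 10
    [True] * 3 + [False] * 7,   # right 3 times out of 10
]

print(pass_rate(results, 1.0))   # must be right every time -> 0.25
print(pass_rate(results, 0.9))   # right at least 90% of the time -> 0.5
print(pass_rate(results, 0.51))  # right more often than not -> 0.75
```

Same model, same answers, and the headline score triples depending on the standard you pick.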
The significance of Grok 3, outside of X drama, is that it is the first full model release that we definitely know is at least an order of magnitude larger than GPT-4 class models in training compute, so it will help us understand whether the 1st scaling law (pre-training) holds up.
It is possible that Gemini 2.0 Pro is a RonnaFLOP* model, but we are only seeing the Pro version, not the full ultra.
* AI trained on 10^27 FLOPs of compute, an order of magnitude more than the GPT-4 level (I have been calling them Gen3 models because it is easier)
And I should also note that everyone now hides their FLOPs used for training (except for Meta) so things are not completely clear.
There is a lot of important stuff in this new paper by Anthropic that shows how people are actually using Claude. 1) The tasks that people are asking AI to do are some of the highest-value (& often intellectually challenging) 2) Adoption is uneven, but already high in many fields
This is just based on Claude usage, which is why adoption by field is less of a big deal (Claude is popular in different fields than ChatGPT) than the breakdowns at the task level, because they represent what people are willing to let AI do for them.
Thoughts on this post: 1) It echoes what we have been hearing from multiple labs about the confidence of scaling up to AGI quickly 2) There is no clear vision of what that world looks like 3) The labs are placing the burden on policymakers to decide what to do with what they make
I wish more AI lab leaders would spell out a vision for the world, one that is clear about what they think life will actually be like for humans living in a world of AGI
Faster science & productivity, good - but what is the experience of a day in the life in the world they want?
To be clear, it is completely possible to tell a very positive vision of the future of humans and AI (heck, just steal from The Culture or Long Way to an Angry Planet or something), and I think that would actually be a really useful exercise, showing where the labs hope we all go