We ran a 5-month long RCT in Kenya with 640 Kenyan entrepreneurs, over 4,000 measures of firm performance, and thousands of interactions w/ a generative AI mentor that we built using GPT-4. Participants interacted w/ the mentor through WhatsApp. Example interaction shown here. 2/
In contrast to excellent recent work by folks like @whitneywzhang, @LindseyRRaymond, @erikbryn, Dell'Acqua, @klakhani, @emollick + @H_DigInnovation, which finds positive impacts of gen AI on well-defined tasks, we find no effect of gen AI, despite considerable usage. BUT... 3/
This null effect masks considerable and important heterogeneity: initially low-performing entrepreneurs saw a 10% performance decline, whereas high performers experienced a 20% performance boost from AI use! 4/
The direction of our treatment effect heterogeneity is another difference between our work and the work of folks like @Danielle__Li, @eddy_mac_3, @shakked_noy and others, who have found that generative AI *reduces* differences in productivity. 5/
Exploratory analysis of the interaction logs with the AI mentor suggests this heterogeneity stems from the *types* of things low performers ask about: more challenging tasks that may not be well-suited to AI (or human) assistance. 6/
Our results are not inconsistent with other studies; in fact, if low performers select into asking for assistance with more difficult tasks, you could get results that look like ours, even if low performers benefit more from AI on any given task. 7/
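To make the selection argument concrete, here is a toy sketch. Every number in it is made up for illustration and none comes from the paper: it assumes two task types ("easy" and "hard"), assumes low performers do at least as well as high performers from AI on each task type, and assumes low performers mostly ask about hard tasks.

```python
# Toy illustration only: every number below is hypothetical, not from the paper.

# Assumed effect of AI advice on performance for a single task, by performer
# type and task difficulty. Low performers gain more (or lose less) than high
# performers on every task type.
PER_TASK_EFFECT = {
    "low": {"easy": 0.30, "hard": -0.20},
    "high": {"easy": 0.20, "hard": -0.30},
}

# Assumed share of each group's questions that concern hard tasks.
# This is the selection step: low performers mostly ask about hard tasks.
HARD_SHARE = {"low": 0.8, "high": 0.2}


def average_effect(group: str) -> float:
    """Average AI effect for a group, weighting task types by what it asks about."""
    hard = HARD_SHARE[group]
    effects = PER_TASK_EFFECT[group]
    return (1 - hard) * effects["easy"] + hard * effects["hard"]


if __name__ == "__main__":
    for group in ("low", "high"):
        print(group, round(average_effect(group), 2))
```

Under these assumed numbers the low-performer average comes out negative and the high-performer average positive, even though low performers do better than high performers on each task type taken separately. The gap comes entirely from the mix of tasks each group brings to the AI.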
Overall, this suggests that for gen AI to add value to workers in more open-ended contexts (for instance, entrepreneurs), we also need to expand access to complementary resources and skills 8/
Many more interesting nuggets and examples in the paper, which again was led by the fantastic @nickgotis. He is going to be on the job market soon and is fantastic, so consider hiring him into your department! Here's a link to his website: 9/ nicholasotis.com
This is, of course, work-in-progress, so thoughts, comments, and feedback of all sorts are much appreciated!
There's tons of buzz about the importance of prompting genAI models well - @theindicator by @planetmoney just had an episode about prompt engineers!
At the same time, folks such as @emollick say prompting will soon be less important. So who's correct? 2/ npr.org/2024/07/05/119…
We ran an online @Prolific experiment with 1,891 participants, collecting and analyzing over 18k prompts and 180k images.
People were randomized into 1 of 3 models: DALL-E 2, DALL-E 3, or DALL-E 3 w/ prompt revision, and asked to recreate a "target image" as best they could. 3/