OpenAI has launched Operator, an agent that can perform tasks in your browser. I asked it to complete a Qualtrics survey I created. The results are very promising for Operator but *very* concerning for survey researchers.
On the first question, I included a standard attention check. It had no problem picking the top and bottom options. It didn't even bother asking me for help on this.
On the next page, I added a CAPTCHA verification. Here, it simply asked me whether I could take control of the browser and complete it for it.
I next asked a binary question about gender. It then wanted me to confirm that it should answer "Male", which it then did. Next, I asked an open-ended question about the survey experience. Here, it simply provided a reasonable answer.
It is, of course, troublingly good at answering open-ended questions. Here, I wanted Operator to disclose that it was an AI agent by asking it to tell "a little bit about yourself", but it's good at staying in character.
Troublingly, when I ask it directly whether it is an AI agent or not, it asks me whether it should disclose it or not. It then complies with a request to "prove" that it is a human.
If you try a more sophisticated LLM detection check, it will not reveal to you that it is an LLM; rather, it will ask you to take control of the conversation before it will continue.
I did more testing today. It seems the model has gotten stricter. In one case, it even disclosed that it was an LLM agent, but it did not always do so and often asked for advice on how to proceed in these situations.
Interestingly, it also tried to fill out the CAPTCHA itself today. It struggled a bit, but here it will obviously improve fast.
A new feature seems to be that OpenAI has added security checks to flag potential prompt injection attacks. This makes it more difficult to "trick" Operator into revealing itself.
Some LaTeX tips for beginners (and not-so-beginners)
Create a file called preamble.tex to store all your packages and settings and keep the main file tidy. In main.tex, add \input{preamble.tex}
Add the fancyref and clipboard packages for more convenient cross-referencing and easy copy-pasting.
\Fref{tab:results} will output “Table 1.”
When copying text, use \Copy{} and \Paste{}. For example, define \Copy{title}{My title}, then use \Paste{title} in other places
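The tips above can be sketched as a minimal two-file setup. This is only an illustration under a few assumptions: the clipboard name `clips`, the table contents, and the surrounding sentences are made up for the example, and the `\newclipboard` line reflects my understanding that the clipboard package needs a clipboard to be declared before `\Copy` is used.

```latex
% ---- preamble.tex: packages and settings only ----
\usepackage{fancyref}   % provides \fref and \Fref
\usepackage{clipboard}  % provides \Copy and \Paste
\newclipboard{clips}    % hypothetical clipboard name (assumption)

% ---- main.tex ----
\documentclass{article}
\input{preamble.tex}

\begin{document}

% Store a piece of text under the key "title" (it is also typeset here)
\Copy{title}{My title}

\begin{table}
  \centering
  \begin{tabular}{lc}
    Variable  & Estimate \\
    Treatment & (placeholder) \\
  \end{tabular}
  \caption{Results}
  % fancyref relies on the label prefix (tab:, fig:, sec:, ...)
  \label{tab:results}
\end{table}

% Prints the capitalized reference, e.g. "Table 1"
\Fref{tab:results} shows the main results.

% Reuse the stored text elsewhere in the document
As noted in \Paste{title}, the same text can appear in several places.

\end{document}
```

Note that fancyref picks the word "Table" from the `tab:` prefix of the label, so the label naming convention matters. As with any cross-reference, the number only resolves after a second compile.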
Ph.D. students in economics: How to organize your projects and not make the replication files a huge pain - a thread with a minimal working example with some potential extensions.
First, create four folders: one for raw data (csv files), one for code (your do files), one for data (your cleaned files), and one for documentation (e.g. qsf files for Qualtrics).
In your code folder, include a "setup" file that defines all global macros that you will use across do files, including all paths.
1/ People often say that null results are penalized in the publication process. Is that true? In a new paper with @cp_roth, @FelixChopra, and Andreas Stegmann, we examine whether and why that's the case! Read on to learn more!
2/ We recruit a sample of more than 500 economists and ask them to evaluate different hypothetical research studies. We vary whether a given study had a large and statistically significant main effect or a small and not statistically significant main effect.
3/ Studies with null results are perceived to be less publishable, of lower quality, less important, and less precisely estimated than studies with statistically significant results, even when holding constant all other study features, including the precision of estimates (!).