Ingar Haaland

@Ingar30

Mar 13 • 10 tweets • 4 min read

OpenAI has launched Operator, an agent that can perform tasks in your browser. I asked it to complete a Qualtrics survey I created. The results are very promising for Operator but *very* concerning for survey researchers.

On the first question, I included a standard attention check. It had no problem picking the top and bottom options. It didn't even bother asking me for help on this.

On the next page, I added a CAPTCHA verification. Here, it simply asked me whether I could take control of the browser and complete it for it.

I next asked a binary question about gender. It then wanted me to confirm that it should answer "Male", which it then did. Next, I asked an open-ended question about the survey experience. Here, it simply provided a reasonable answer.

It is, of course, troublingly good at answering open-ended questions. Here, I wanted Operator to disclose that it was an AI agent by asking it to tell "a little bit about yourself", but it's good at staying in character.

Troublingly, when I ask it directly whether it is an AI agent or not, it asks me whether it should disclose it or not. It then complies with a request to "prove" that it is a human.

https://x.com/andre_quentin/status/1900210369509155310

If you try a more sophisticated LLM detection check, it will not reveal that it is an LLM; rather, it will ask you to take control of the conversation before it continues.

https://x.com/andre_quentin/status/1900210369509155310

I did more testing today. It seems the model has gotten stricter. In one case, it even disclosed that it was an LLM agent, but it did not always do so and often asked for advice on how to proceed in these situations.

Interestingly, it also tried to fill out the CAPTCHA itself today. It struggled a bit, but here it will obviously improve fast.

A new feature seems to be that OpenAI has added security checks to flag potential prompt injection attacks. But this makes it more difficult to "trick" Operator into revealing itself.
