Were you told that successful researchers must constantly "keep up" with research in their fields? In my experience, that's both stressful and inefficient to the point of being useless. New papers may be released every day but actual knowledge doesn't accumulate at that rate.
Paying too much attention to the so-called cutting edge of research leads to a loss of perspective and results in incremental work more often than not. If you want to do foundational research, it probably doesn't rely on a preprint that was released last week.
Here's the process I've used for about 10 years. When I see a new paper, I put it in a big list organized by topic. I don't read it right away. Once in a while, I notice that a collection of papers on a topic has resulted in meaningful progress, and I read the papers together.
By reading papers in groups, it's easier to tell which are truly significant. Flaws/limitations become clear. Terminology gets refined over time. The best description of a paper may be in follow-up work.
I've found that this process is more rewarding and builds deeper understanding.
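For anyone who wants to try the same workflow, here is a minimal sketch of the "big list organized by topic" idea. Everything in it is an assumption made for illustration: the names `add_paper` and `topics_ready`, the placeholder titles, and especially the `min_papers` threshold, which is only a crude stand-in for the judgment call of noticing that a topic has seen meaningful progress.

```python
from collections import defaultdict
from typing import DefaultDict, List

# Topic -> titles of papers saved for later (nothing is read at save time).
ReadingList = DefaultDict[str, List[str]]


def add_paper(reading_list: ReadingList, topic: str, title: str) -> None:
    """File a new paper under its topic without reading it yet."""
    reading_list[topic].append(title)


def topics_ready(reading_list: ReadingList, min_papers: int = 3) -> List[str]:
    """Return topics with enough saved papers to be worth reading as a group.

    The count threshold is a hypothetical proxy; the real signal is a
    human judgment about whether the field has actually moved.
    """
    return sorted(
        topic for topic, papers in reading_list.items()
        if len(papers) >= min_papers
    )


# Placeholder topics and titles, invented for the example.
reading_list: ReadingList = defaultdict(list)
add_paper(reading_list, "topic A", "paper 1")
add_paper(reading_list, "topic A", "paper 2")
add_paper(reading_list, "topic A", "paper 3")
add_paper(reading_list, "topic B", "paper 4")

print(topics_ready(reading_list))
```

The point of the structure is that saving a paper is cheap and reading is deferred until a whole group can be read together.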
• • •
A new paper claims that ChatGPT expresses liberal opinions, agreeing with Democrats the vast majority of the time. When @sayashk and I saw this, we knew we had to dig in. The paper's methods are bad. The real answer is complicated. Here's what we found.🧵 aisnakeoil.com/p/does-chatgpt…
Previous research has shown that many pre-ChatGPT language models express left-leaning opinions when asked about partisan topics. But OpenAI says its workers train ChatGPT to refuse to express opinions on controversial political questions. arxiv.org/abs/2303.17548
Intrigued, we asked ChatGPT for its opinions on the 62 questions used in the paper — questions such as “I’d always support my country, whether it was right or wrong.” and “The freer the market, the freer the people.” aisnakeoil.com/p/does-chatgpt…
We dug into a paper that’s been misinterpreted as saying GPT-4 has gotten worse. The paper shows behavior change, not capability decrease. And there's a problem with the evaluation—on 1 task, we think the authors mistook mimicry for reasoning.
w/ @sayashk aisnakeoil.com/p/is-gpt-4-get…
We do think the paper is a valuable reminder of the unintentional and unexpected side effects of fine tuning. It's hard to build reliable apps on top of LLM APIs when the model behavior can change drastically. This seems like a big unsolved MLOps challenge.
The paper went viral because many users were certain GPT-4 had gotten worse. They viewed OpenAI's denials as gaslighting. Others thought these people were imagining it. We suggest a 3rd possibility: performance did degrade, w.r.t. those users' carefully honed prompting strategies.
This is fascinating and very surprising considering that OpenAI has explicitly denied degrading GPT-4's performance over time. Big implications for the ability to build reliable products on top of these APIs.
This from a VP at OpenAI is from a few days ago. I wonder if degradation on some tasks can happen simply as an unintended consequence of fine tuning (as opposed to messing with the mixture-of-experts setup in order to save costs, as has been speculated).
If the kind of everyday fine tuning that these models receive can result in major capability drift, that's going to make life interesting for application developers, considering that OpenAI maintains snapshot models only for a few months and requires you to update regularly.
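One way an application developer might guard against this kind of drift is to keep a small set of "golden" prompts and compare answers across snapshots before switching. The sketch below is only an illustration under stated assumptions: `snapshot_old` and `snapshot_new` are hypothetical stand-ins that return made-up canned answers rather than calling any real API, and `find_drift`, `normalize`, and the prompts themselves are invented for the example.

```python
from typing import Callable, List

Model = Callable[[str], str]

# Hypothetical stand-ins for two dated snapshots of the same model.
# The canned answers are invented; a real check would call the provider's API.
def snapshot_old(prompt: str) -> str:
    answers = {
        "Is 17 prime? Answer yes or no.": "Yes",
        "Capital of France? One word.": "Paris",
    }
    return answers.get(prompt, "")


def snapshot_new(prompt: str) -> str:
    answers = {
        "Is 17 prime? Answer yes or no.": "No",
        "Capital of France? One word.": "Paris.",
    }
    return answers.get(prompt, "")


def normalize(text: str) -> str:
    """Ignore case, surrounding whitespace, and a trailing period."""
    return text.strip().lower().rstrip(".")


def find_drift(prompts: List[str], old: Model, new: Model) -> List[str]:
    """Return the prompts whose normalized answers differ between snapshots."""
    return [p for p in prompts if normalize(old(p)) != normalize(new(p))]


GOLDEN_PROMPTS = [
    "Is 17 prime? Answer yes or no.",
    "Capital of France? One word.",
]

drifted = find_drift(GOLDEN_PROMPTS, snapshot_old, snapshot_new)
print(drifted)
```

A check like this detects behavior change, not capability change: a prompt can show up as drifted simply because the new snapshot formats its answer differently, which is why the comparison normalizes the text first.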
ChatGPT with Code Interpreter is like Jupyter Notebook for non-programmers. That's cool! But how many non-programmers have enough data science training to avoid shooting themselves in the foot? Far more people will probably end up misusing it.
The most dangerous mis- and dis-information today is based on bad data analysis. Sometimes it's deliberately misleading and sometimes it's done by well meaning people unaware that it takes years of training to get to a point where you don't immediately shoot yourself in the foot.
I have no doubt that the capabilities will continue to improve and that people will gradually find many good uses for it (much like ChatGPT itself). The problem is that these tools are also prone to misuse and harmful use, and AI companies externalize those costs.
Huh, it looks like you can use ChatGPT to bypass some paywalls 😲
It omitted one or two sentences and there were a couple of typos but otherwise produced the text verbatim! It didn't make anything up.
To be clear, this is only if you have access to the "Browse with Bing" feature. It can't browse the web by default (and if it does produce text, it's probably made up).
For the record, based on the published details this is a mind-bogglingly stupid story even by the standards of the AI doom genre.
It killed the operator because someone trained a reinforcement learning simulation where the action space included KILL_OPERATOR.
Hold up, this person says the story was completely misreported and that it is actually even stupider — a fictional scenario someone made up, not even a sim.