Gave a talk today to @cepr_org; thanks for the invite from @akorinek. Great discussion by @Afinetheorem. Here are my slides in PNG form b/c of the link thing 1/N
The "there should be a rule/law/training/office" impetus after every bad event is both a) understandable and b) how you get the kinds of organizations most people hate to be in
For the specific thing I'm sub-tweeting: who exactly do ppl think would provide the oversight that would prevent this? Almost all the peculiarities of research orgs are a reaction to how difficult the assessment of research is in practice; some putative office of research non-fakery staffed w/ non-experts faces the same problem but w/o a chance of solving it
TBC, I'm not diminishing how bad this is---it's terrible & unlike a lot of research fraud we see, it actually mattered---I'm sure it changed how many people were thinking, making career decisions, making investment decisions and so on.
I wanted to do a thread about how I've been thinking about Gen AI & production. So, imagine a job that's a sequence of tasks to be done 1/n
One of these tasks *might* be doable with AI, so we give it a shot by asking nicely:
It will almost always give it a shot ("let me delve into that for you...") but then you have to evaluate the output to decide if it is satisfactory for your particular purpose
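The shape of that workflow can be sketched in a few lines. This is a toy stand-in (all function names here are hypothetical, not from any real library): a job is a sequence of tasks, one task is delegated to a model, and the human still has to review the draft before accepting it.

```python
# Toy sketch of "job = sequence of tasks, one maybe doable with AI".
# Everything here is a hypothetical stand-in, not a real API.

def do_by_hand(task):
    return f"human result for {task}"

def ask_model(task):
    # Stand-in for an LLM call; it always "gives it a shot".
    return f"Let me delve into {task} for you..."

def is_satisfactory(output):
    # The human still judges the output against their particular purpose;
    # here a deliberately silly acceptance rule stands in for that judgment.
    return "delve" not in output

def run_job(tasks, ai_task):
    results = []
    for task in tasks:
        if task == ai_task:
            draft = ask_model(task)
            # If the model's attempt fails review, fall back to doing it by hand.
            results.append(draft if is_satisfactory(draft) else do_by_hand(task))
        else:
            results.append(do_by_hand(task))
    return results
```

The point of the sketch is where the cost sits: the model call is cheap, but `is_satisfactory` (the evaluation step) is the part the human cannot skip.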
I'll do a longer thread---and this little toy example just scratches the surface of what's possible---but edsl (pip install edsl) is an open source python package & domain specific language for asking LLMs questions
You can ask questions of various types (free text, multiple choice, numerical), combine them into surveys, add skip logic, parameterize w/ data (to create data labeling & coding flows) & then administer to one or more models. It abstracts away from the model-specific details
And lets you express the job you want to do in a declarative rather than procedural way e.g., I want to run this survey, with these scenarios, with these agents, with these models etc. rather than writing out all these loops yourself.
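To see what "declarative rather than procedural" buys you, here is a minimal toy stand-in. This is NOT the real edsl API (see its docs for that); it's a sketch of the pattern: you declare questions, scenarios, and models, and the runner writes the nested loops for you.

```python
# Toy illustration of the declarative survey pattern. Hypothetical names;
# not the edsl API, just the shape of the idea.
from itertools import product

class ToySurvey:
    def __init__(self, questions):
        self.questions = questions
        self.scenarios = [{}]
        self.models = ["default-model"]

    def by_scenarios(self, scenarios):
        self.scenarios = scenarios
        return self

    def by_models(self, models):
        self.models = models
        return self

    def run(self, answer_fn):
        # The cross product of questions x scenarios x models is handled
        # here, instead of the user writing the loops procedurally.
        return [
            {"question": q, "scenario": s, "model": m,
             "answer": answer_fn(q.format(**s), m)}
            for q, s, m in product(self.questions, self.scenarios, self.models)
        ]

results = ToySurvey(["Is {item} effective?"]).by_scenarios(
    [{"item": "aspirin"}, {"item": "ibuprofen"}]
).by_models(["model-a", "model-b"]).run(
    lambda prompt, model: f"{model} says yes"  # stub for the actual LLM call
)
```

One declared question, two scenarios, two models: four model calls, no loops written by the user. That's the abstraction the thread is describing.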
What we do, in a nutshell, is let the LLM propose hypotheses, design experiments, RUN those experiments in simulation and then have it estimate the results
For example, here’s the experiment it sets up for a hiring scenario