Some reflections on media coverage of tech/science/research. It seems to me that there are, broadly speaking, two separate paths of origin for these stories: In one, the journalist sees something that they think the public should be informed of, and digs into the scholarship.
>>
In the other, the researchers have something they want to draw the world's attention to. But there are two subcases here:
Subcase 2a: Researchers (usually in academia) who see a need for the public to be informed, either acutely (ppl need this info NOW) or long-term (science literacy).
>>
Subcase 2b: PR orgs (usually in industry) want the public to know about their cool research, because it serves as positive marketing for a company, halo effect of competence, etc etc.
>>
(Universities have PR orgs, too, which I think serve the dual purpose of helping scholars reach the public in cases of 2a above and promoting the university --- but here what's being sold is the university as a place of scholarship...)
>>
I'm seeing what I think is a pattern of tech co PR orgs + journalists together making news out of "new paper!!!1!" where the paper has not been peer reviewed. Sometimes I get asked to comment on these papers, with really fast turnaround.
>>
The typical pattern is that the paper is "embargoed" so that only some journalists, and in some cases the people they ask for comment, can see it.
>>
In the cases I've seen, the papers have been preprints (i.e., about to go up on arXiv) and not peer-reviewed. Pushing out PR like this thus feels like circumventing the peer-review process while claiming to do scientific research.
>>
I'd like to see the tech media refuse to bite on that bait when Meta/Google/Amazon/whoever says: we'll give you early access to this paper before we "publish" it tomorrow. If it hasn't been peer reviewed, getting a couple of other researchers to skim & comment is no substitute.
That's all just contributing to the #AIhype cycle, it's not informing the public, and it's not like there's something there that can't wait for peer review.
>>
I can imagine folks saying: It's better that these big tech cos put the research out rather than keeping it behind closed doors. But I'm really not convinced, for at least two reasons:
>>
1. With all the big AI lab papers flooding arXiv (and our conferences), we still aren't getting transparency on the really key issues, like how content moderation is actually done, what's actually in the training data, how users are targeted & user data monetized.
>>
(Never forget that Google wanted to suppress the Stochastic Parrots paper, which wasn't even talking about anything specific being done AT GOOGLE.)
>>
2. Big tech attracts researchers with money, sure, but also with prestige. If researchers in the big tech labs don't get to publish, there will be less incentive to go there. But that incentive is fully compatible with peer review, rather than publication by press release. /fin (for now)
Precision grammars (grammars as software) can be beneficial for linguistic hypothesis testing and language description. In a new @NEJLangTech paper (Howell & Bender 2022) we ask: to what extent can they be built automatically?
Built automatically out of what? Two rich sources of linguistic knowledge:
1. Collections of IGT (interlinear glossed text), reflecting linguistic analysis of the language (see the toy sketch after this list)
2. The Grammar Matrix customization system, a distillation of typological and syntactic analyses
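For readers who haven't worked with IGT: here's a minimal sketch (in Python, purely illustrative) of what a single glossed example might look like as a data record. The field names and the German sentence are my own stand-ins, not the AGGREGATION project's actual data format.

```python
# Hypothetical sketch of one IGT (interlinear glossed text) record:
# a source-language line, a morpheme-by-morpheme gloss, and a free translation.
# Field names and the example sentence are illustrative only.
igt_record = {
    "language": "deu",                      # ISO 639-3 code for the language
    "source": "Die Katze schläft",          # transcription line
    "gloss": "DEF.F.SG cat.SG sleep.3SG",   # aligned morphological gloss
    "translation": "The cat is sleeping",   # free translation line
}

# A system mining many such records could look for recurring patterns
# (word order, agreement, case marking) to feed into grammar customization.
for field, value in igt_record.items():
    print(f"{field:>12}: {value}")
```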
>>
This is the latest update from the AGGREGATION project (underway since ~2012), and it builds on much previous work by @OlgaZamaraeva, Goodman, @fxia8, @ryageo, Crowgey, Wax, and others!
I see lots and lots of that distraction. Every time one of you talks about LLMs, DALL-E, etc. as "a step towards AGI" or "reasoning" or "maybe slightly conscious" you are setting up a context in which people are led to believe that "AIs" are here that can "make decisions".
>>
And meanwhile OpenAI/Cohere/AI2 put out a weak-sauce "best practices" document which proclaims "represent diverse voices" as a key principle ... without any evidence of engaging with the work of the Black women scholars leading this field.
Without actually being in conversation (or better, if you could build those connections, in community) with the voices you said "we should represent" but then ignore/erase/avoid, you can't possibly see the things that the "gee-whiz look AGI!" discourse is distracting from.
This latest example comes from The Economist. It is a natural human reaction to *make sense of* what we see, but we have to keep in mind that all of that meaning-making is on our side, not the machines'.
And can I just add that the tendency of journalists who write like this to center their own experience of awe---instead of actually informing the public---strikes me as quite self-absorbed.
I not infrequently see an argument that goes: "Making ethical NLP (or "AI") systems is too hard because humans haven't agreed on what is ethical/moral/right."
This always feels like a cop-out to me, and I think I've put my finger on why:
>>
That argument presupposes that the goal is to create autonomous systems that will "know" how to behave "ethically".
tl;dr blog post by new VP of AI at Halodi says the quiet parts out loud: "AI" industry is all about surveillance capitalism, sees gov't or even self-regulation as needless hurdles, and the movers & shakers are uninterested in building things that work. A thread:
First, here's the blog post, so you have the context:
1. No, LLMs can't do literature reviews.
2. Anyone who thinks a literature review can be automated doesn't understand what the purpose of a literature review is.
3. The web page linked to provides exactly 0 information about how this system was evaluated or even what it is designed for. And they are targeting it at researchers? I sure hope researchers are more critical than they seem to expect.