Awesome -- but how does one possibly map all these frontiers without employing actual people to read and annotate hundreds upon hundreds of abstracts at a time?
Well, that's what advanced Natural Language Processing #NLP is for
Specifically, zero-shot multi-class classification as enabled by fine-tuned LLMs
I'm gonna go full-disclosure here, so take notes: DeBERTa v3 (large) fine-tuned on WANLI
WANLI is a dataset of 107,885 NLI examples; you can read all about it here: arxiv.org/abs/2201.05955
DeBERTa v3 is a transformer-based LM, a sophisticated descendant of BERT raised on pre-training steroids + some wicked smart innovations in the attention mechanism arxiv.org/abs/2111.09543
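In case you want to play along at home, here's a minimal sketch of what the classification step can look like with Hugging Face's zero-shot pipeline. The model path and the candidate fields below are placeholders for illustration, not my exact setup:

```python
from transformers import pipeline

# Illustrative sketch: zero-shot multi-label classification of an abstract
# with an NLI-fine-tuned DeBERTa-v3-large checkpoint.
# NOTE: the model path is a placeholder -- swap in whichever NLI checkpoint you use.
classifier = pipeline(
    "zero-shot-classification",
    model="path/to/deberta-v3-large-nli-checkpoint",
)

# A made-up abstract and a made-up set of candidate fields, purely for demonstration.
abstract = (
    "We present a graph neural network approach to predicting protein-ligand "
    "binding affinities from 3D structural data."
)
candidate_fields = ["machine learning", "structural biology", "drug discovery", "astrophysics"]

# multi_label=True scores each field independently, so one abstract
# can belong to several fields at once.
result = classifier(abstract, candidate_fields, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```

Under the hood, each candidate field is turned into an entailment hypothesis ("This text is about X"), and the NLI model scores how strongly the abstract entails it -- no task-specific training data needed.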
Of course, what took months was not the workflow implementation per se, but rather assessing various models until I found one that just did the job
Nothing stated in this thread is really ground-breaking -- so the take-home message is different:
I truly believe that modern science suffers from discoverability issues and information asymmetries
Vast expanses of knowledge remain under-utilized, never making it into the integral of our collective processing capacity
Such asymmetries will only compound over time as we put ever more research out there
It is my conviction that "AI"-powered tools (scare quotes very much intended, nothing sentient here) present a unique opportunity to remedy the situation by making science easier to navigate
Anyhow, now that I've got a sci-field map at hand, I will start pivoting the website towards design patterns that better accommodate these emerging themes and topics