(((ل()(ل() 'yoav))))👾
Jul 5, 2020 · 5 tweets · 4 min read
A bunch of works from my group(s) coming up at #acl2020nlp tomorrow. Watch the videos and come visit us in the Q&A sessions!
In work with @lambdaviking @gail_w @royschwartz02 @nlpnoah and @yahave we provide *theoretical* results (yes, with proofs) about what can and cannot be represented by various kinds of RNNs, and under what conditions.
virtual.acl2020.org/paper_main.43.…

+ blog:
lambdaviking.com/post/rr-hierar…
If you are working in computational social sciences, digital humanities, etc, check out the work with @hila_gonen, Ganesh Jawahar, and @zehavoc.

We present a *simple* and *effective* method for identifying word usage change across corpora.

virtual.acl2020.org/paper_main.51.…
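
(The tweet does not spell out the method, so the following is an assumed illustration rather than the paper's recipe: one common nearest-neighbour-overlap way to flag usage change across corpora, sketched with gensim.)

```python
# Assumed illustration only -- not claimed to be the paper's exact method.
# Idea: train separate embeddings on each corpus; words whose nearest-neighbour
# sets barely overlap across the two spaces are candidates for usage change.
from gensim.models import Word2Vec

def usage_change_candidates(corpus_a, corpus_b, k=50, top=20):
    """corpus_a / corpus_b: lists of tokenized sentences (lists of strings)."""
    emb_a = Word2Vec(sentences=corpus_a, vector_size=100, min_count=5).wv
    emb_b = Word2Vec(sentences=corpus_b, vector_size=100, min_count=5).wv
    shared = set(emb_a.key_to_index) & set(emb_b.key_to_index)
    overlap = {}
    for w in shared:
        nn_a = {v for v, _ in emb_a.most_similar(w, topn=k) if v in shared}
        nn_b = {v for v, _ in emb_b.most_similar(w, topn=k) if v in shared}
        overlap[w] = len(nn_a & nn_b)  # small overlap -> likely usage change
    return sorted(shared, key=overlap.get)[:top]
```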
This work with @hilleltt, Micah Shlain, and Shoval Sadde is a *demonstration* of a system that allows you to perform powerful syntactic queries over large textual corpora, efficiently, and without needing to care much about syntax.

virtual.acl2020.org/paper_demo.44.…

demo in vid, more in Q&A!
Together with @aryeh_tiktinsky and @rtsarfaty we present a system that enriches and re-arranges universal dependency trees so that they expose event structure better. Come talk to us about both the utility and the underlying linguistic theories.

Demo + Video:
virtual.acl2020.org/paper_demo.59.…


More from @yoavgo

Jun 29, 2023
yes, the neural LMs learn a model of the world as projected through documents (and soon also images) found on the internet. the big question is to what extent this projection---even in the best case---provides an accurate or complete representation of the real world.
here the @emilymbender camp says "nothing whatsoever", and i and others say "to some extent, and certainly enough to be useful", but i don't think there is any doubt that it's very far from being complete or accurate.
@OmerAlali7 (And a vector representation actually weakens the argument, in my view, rather than strengthening it: a plain vector space and its geometry is a very weak representation of knowledge. I think there is much more than that in the full models.)
Read 5 tweets
Dec 23, 2021
for me there are two, both from my CS undergrad, both were the first assignment in a course, both involve programming, and both beautifully captured the essence of the course in a single, simple-to-explain, and somewhat open-ended assignment that you had to figure out on your own
the first is in the Compilation course by Prof. Mayer Goldberg (no family relation): we had to write an interpreter for a simple-but-Turing-complete language (easy), and then we had to write a compiler from high-level code to this language.
the second is in an NLP/NLG course by Prof @melhadad (who then became my MSc and PhD adviser), where the assignment was basically "Write a program where the input is a sequence of numbers, such as 1,2,3,7,9,10 and the output is a succinct description of the sequence in English".
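
(A toy sketch of the kind of program the assignment asks for; my own illustration, not the course's reference solution: collapse consecutive runs of numbers and verbalize them in English.)

```python
# Toy illustration of the assignment (not the official solution):
# collapse consecutive runs of numbers and describe them in English.
def describe(numbers):
    """[1, 2, 3, 7, 9, 10] -> 'the numbers 1 through 3, 7, and 9 through 10'"""
    runs = []
    start = prev = numbers[0]
    for n in numbers[1:]:
        if n == prev + 1:          # still inside a consecutive run
            prev = n
        else:                      # run ended: store it, start a new one
            runs.append((start, prev))
            start = prev = n
    runs.append((start, prev))
    parts = [f"{a} through {b}" if a != b else str(a) for a, b in runs]
    if len(parts) == 1:
        listing = parts[0]
    elif len(parts) == 2:
        listing = f"{parts[0]} and {parts[1]}"
    else:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return "the numbers " + listing

print(describe([1, 2, 3, 7, 9, 10]))
```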
Read 6 tweets
Dec 9, 2021
Some of you may have noticed, about a month ago, that ynet added an option to have articles read aloud in Hebrew, and it actually worked quite well.

The read-aloud relied to a large extent on the nakdan (Hebrew vocalization) technology developed at Dicta, a very impressive project led by Avi Shmidman, with the main development by Avi and Shaltiel Shmidman (plus some involvement of mine, and of Moshe Koppel).
But why am I telling you about this? To describe the challenges of texts without vowel marks, and to take pride in our nice achievement? That too, but it's not the main point.

The main point is this lovely story:
The company that supplied ynet's read-aloud solution simply used the nakdan's API without asking and without talking to us, integrated it into a central product of theirs, and didn't even send an email to try to reach a more orderly agreement. Amazing.
Read 6 tweets
Sep 27, 2021
This paper is now finally on arxiv!

We (@yanaiela @rtsarfaty Vika Basmov) define a new core NLP task we call "Text-based NP Enrichment" (TNE), which we believe is both:

(a) very useful if performed with high accuracy,

(b) a better benchmark for reading comprehension than QA is.
The task definition is very simple: for every pair of base-NPs in the text (in our dataset a text is ~3 paragraphs long), decide if they can be related by a preposition, and if so, which.

Why is this task interesting? We argue that it's a core component of reading comprehension.
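
(To make the I/O shape of the task concrete, a tiny sketch; the sentence, NPs, links, and names below are illustrative, not the actual dataset format.)

```python
# Illustrative sketch of the TNE input/output shape (not the dataset's format):
# for every ordered pair of base-NPs, predict a linking preposition or None.
from itertools import permutations

text = "The CEO of the company announced a merger with a rival."
base_nps = ["The CEO", "the company", "a merger", "a rival"]

# Toy stand-in for a trained model's predictions over this text.
toy_links = {
    ("The CEO", "the company"): "of",   # "the CEO of the company"
    ("a merger", "a rival"): "with",    # "a merger with a rival"
}

for anchor, complement in permutations(base_nps, 2):
    prep = toy_links.get((anchor, complement))  # None = no prepositional relation
    if prep is not None:
        print(f"{anchor} --[{prep}]--> {complement}")
```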
When reading text, we identify noun-phrases (NPs), and integrate each new one into a network of NPs that we maintain.

This network is essential for "understanding".

One famous edge type is "coreference": indicating two NPs refer to the same entity.

But there are so many more!
Read 16 tweets
Sep 25, 2021
so, what do i think?
let's start w/ a disclaimer: i'm not active in summarization research, and only skimmed the paper, focusing on the eval part.

i think not using ROUGE but rather a human eval (for the main metric) is a nice step forward from a visible player like OpenAI, but,
but, it is also really the bare minimum of an eval. and it is far from being a good one (for starters, hardly any details are given re eval guidelines, or what the evaluators were instructed to evaluate). this is sort-of excusable here, since models are so far from human level,
but this most def won't be a good eval when trying to claim good performance. there has been a lot of prev work in the summarization community on how to properly eval (albeit not for this task of book-length summarization). i rec looking at what they did. start with Pyramid.
Read 4 tweets
Aug 29, 2021
a bit more on this: "oh the new large DL models in NLP are so soul-less, they only consider form and don't truly understand meaning, they are black-boxes, they expose and amplify societal biases in the data, etc etc etc":
well, all true, but at least they work. like, previous-gen models *also* didn't understand meaning, and *also* considered only form. they were just much worse at this. so much worse that no one could ever imagine that they capture any kind of meaning whatsoever. they didn't work.
(not that the current ones "work". but they do "work" much better than the previous gen models. much, much better.)
Read 10 tweets
