Pei Zhou
Sep 3, 2021 • 6 tweets • 5 min read
🚨 Can response generation models read between the lines? Our 🆕 #EMNLP2021 paper probes RG models to see if they can identify common sense reasons by annotating CS explanations in dialogues and evaluating RG models for CS reasoning capabilities.
We formalize CS as a *latent variable* that helps explain the observed variable “response” in the RG process and instantiate CS using textual explanations of the response.
To collect annotations on CS explanations that justify dialogue responses, we first generate candidates by adopting a large T5 model trained on a story explanation dataset, GLUCOSE (@nasrinmmm et al). Next, we conduct a carefully designed two-stage human verification process.
To understand whether RG models can comprehend implicit CS, we *corrupt* explanations to break the logical coherence or grammar and compare model behaviors between a valid explanation and a corrupted one.
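A minimal sketch of this valid-vs-corrupted comparison, not the paper's implementation. The two corruption functions (`shuffle_words`, `swap_clauses`), the `probe` helper, and the `score` callable are all hypothetical names; `score` is assumed to return something like a model's log-likelihood of the response given the dialogue history and an explanation.

```python
import random
from typing import Callable, Dict

# Hypothetical scorer signature: (history, explanation, response) -> float,
# e.g. an RG model's log-likelihood of the response. Higher means more likely.
Scorer = Callable[[str, str, str], float]


def shuffle_words(explanation: str, rng: random.Random) -> str:
    """Break grammar by permuting word order (one assumed corruption type)."""
    words = explanation.split()
    if len(set(words)) < 2:
        return explanation  # nothing to reorder
    shuffled = words[:]
    while shuffled == words:  # retry until the order actually changes
        rng.shuffle(shuffled)
    return " ".join(shuffled)


def swap_clauses(explanation: str, connective: str = " because ") -> str:
    """Break logical coherence by swapping cause and effect around a connective."""
    left, sep, right = explanation.partition(connective)
    if not sep:
        return explanation  # connective not found, leave unchanged
    return f"{right}{connective}{left}"


def probe(score: Scorer, history: str, response: str, explanation: str,
          corruptions: Dict[str, Callable[[str], str]]) -> Dict[str, float]:
    """Return score(valid) - score(corrupted) for each corruption.

    A gap near zero suggests the model does not distinguish the valid
    explanation from the corrupted one.
    """
    valid = score(history, explanation, response)
    return {name: valid - score(history, corrupt(explanation), response)
            for name, corrupt in corruptions.items()}


def toy_score(history: str, explanation: str, response: str) -> float:
    """Stand-in scorer for illustration only; NOT a real RG model."""
    return 0.0 if explanation.endswith("they worked late") else -1.0


if __name__ == "__main__":
    rng = random.Random(0)
    gaps = probe(
        toy_score,
        history="A: You look exhausted today.",
        response="B: Yeah, I need some coffee.",
        explanation="the speaker is tired because they worked late",
        corruptions={"shuffle": lambda e: shuffle_words(e, rng),
                     "swap": swap_clauses},
    )
    print(gaps)
```

In practice `toy_score` would be replaced by a call into the RG model under test; the probing logic stays the same.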
We find that SOTA RG models fail to understand the CS that justifies proper responses, according to their performance on our probing settings, and some models do not even distinguish gibberish sentences! Fine-tuning on in-domain dialogues and verified explanations does not help.
We hope our study motivates more research in making RG models emulate human reasoning processes!
Paper: arxiv.org/abs/2104.09574
Project Page: sites.google.com/usc.edu/cedar
Huge thanks to my co-authors @PegahJM, @HJCH0, @billyuchenlin, @jay_mlr, and @xiangrenNLP
