Check out our new awesome word aligner, AWESOME aligner by @ZiYiDou 😀: github.com/neulab/awesome…
* Uses multilingual BERT and can align sentence pairs in any of the languages it covers
* No additional training needed, so you can align even a single sentence pair!
* Excellent accuracy
1/3
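A minimal sketch of how word alignments might be read off contextual embeddings without any extra training. The thread does not spell out the extraction step, so everything here is an assumption for illustration: the toy vectors stand in for multilingual BERT outputs, and `align` and its `threshold` value are hypothetical names, not the aligner's actual API. The idea is to score every source/target word pair by dot product, normalize in both directions with a softmax, and keep pairs that both directions agree on.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def align(src_vecs, tgt_vecs, threshold=0.5):
    """Return (src_index, tgt_index) pairs that both directions agree on.

    src_vecs / tgt_vecs: one embedding (list of floats) per word.
    threshold: hypothetical cutoff, chosen only for this toy example.
    """
    # Dot-product similarity between every source and target word.
    sim = [[sum(a * b for a, b in zip(s, t)) for t in tgt_vecs]
           for s in src_vecs]
    # Source-to-target: normalize each row.
    src_to_tgt = [softmax(row) for row in sim]
    # Target-to-source: normalize each column.
    tgt_to_src = [softmax([sim[i][j] for i in range(len(src_vecs))])
                  for j in range(len(tgt_vecs))]
    # Keep a pair only if it is likely in both directions.
    return [(i, j)
            for i in range(len(src_vecs))
            for j in range(len(tgt_vecs))
            if src_to_tgt[i][j] > threshold and tgt_to_src[j][i] > threshold]

# Toy embeddings (assumed, not real model output): the two sentences
# use the same two "meanings" in swapped word order.
source = [[5.0, 0.0], [0.0, 5.0]]
target = [[0.0, 5.0], [5.0, 0.0]]
pairs = align(source, target)
```

With real embeddings, each vector would come from one pass of the encoder over a single sentence, which is why a lone sentence pair is enough input.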
A paper describing the methodology will appear at #EACL2021: arxiv.org/abs/2101.08231
The model is trained on parallel data using contrastive and self-training losses. But it generalizes zero-shot to new language pairs without any training data! 2/3
Why do we need word alignments in the first place? We use them for lexicon learning, model analysis, and cross-lingual learning. For example, AWESOME aligner gives better results for cross-lingual annotation projection in NER. Check it out; we welcome comments and issues! 3/3

More from @gneubig

9 Mar 20
Super-excited about our new #ICASSP2020 paper on "Universal Phone Recognition with a Multilingual Allophone System" arxiv.org/abs/2002.11800

We create a multilingual ASR model that can do zero-shot phone recognition in up to 2,186 languages! How? A little linguistics :) 1/5
In our speech there are phonemes (sounds that can support lexical contrasts in a *particular* language) and their corresponding phones (the sounds that are actually spoken, which are language *independent*). Most multilingual ASR models conflate these two concepts. 2/5
We create a model that first recognizes language-independent phones, and then converts these phones to language-specific phonemes. This makes our underlying representations of phones more universal and generalizable across languages. 3/5
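A rough sketch of the phone-to-phoneme step, under stated assumptions: the thread does not say how the conversion is done, so the max-over-allophones rule, the tiny `ALLOPHONES` table, and the function name `phoneme_scores` are all illustrative choices, not the paper's confirmed implementation. The point is only that a shared set of phone scores can be mapped to different phoneme inventories per language.

```python
# Hypothetical allophone table: for each language, each phoneme lists
# the language-independent phones that can realize it.
ALLOPHONES = {
    "eng": {
        "p": ["p", "pʰ"],   # plain and aspirated [p] both count as /p/
        "t": ["t", "tʰ"],
    },
}

def phoneme_scores(phone_scores, language):
    """Map universal phone scores to one language's phoneme scores.

    A phoneme's score is taken as the best score among its allophones
    (an assumed pooling rule for this sketch).
    """
    mapping = ALLOPHONES[language]
    return {
        phoneme: max(phone_scores.get(phone, float("-inf"))
                     for phone in phones)
        for phoneme, phones in mapping.items()
    }

# Made-up phone scores for one audio frame.
frame = {"p": 0.1, "pʰ": 2.3, "t": 0.4, "tʰ": -1.0}
scores = phoneme_scores(frame, "eng")
best = max(scores, key=scores.get)
```

Supporting another language in this sketch means adding another entry to the table, while the phone scores themselves stay shared.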