Check out our new awesome word aligner, AWESOME aligner by @ZiYiDou 😀: github.com/neulab/awesome…
* Uses multilingual BERT, so it can align words for any language pair the model covers
* No additional training needed, so you can align even a single sentence pair!
* Excellent accuracy 1/3
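The core extraction step can be sketched in a few lines. This is a minimal illustration with toy vectors standing in for mBERT's contextual embeddings (real use would embed both sentences with multilingual BERT first); the softmax-and-intersect scheme and the `threshold` value are simplifying assumptions, not the tool's exact defaults:

```python
import numpy as np

def extract_alignments(src_emb, tgt_emb, threshold=0.001):
    """Align words via a similarity matrix between contextual embeddings,
    normalized in both directions; keep pairs probable in both."""
    sim = src_emb @ tgt_emb.T                                  # (src_len, tgt_len)
    # softmax over target words for each source word, and vice versa
    p_src2tgt = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)
    p_tgt2src = np.exp(sim) / np.exp(sim).sum(axis=0, keepdims=True)
    # a pair (i, j) counts as aligned only if both directions agree
    agree = (p_src2tgt * p_tgt2src) > threshold
    return sorted(zip(*np.nonzero(agree)))
```

Because the embeddings come from a pretrained multilingual model, nothing here needs parallel training data, which is what makes single-sentence-pair alignment possible.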
A paper describing the methodology will appear at #EACL2021: arxiv.org/abs/2101.08231
The model is trained on parallel data using contrastive and self-training losses. But it generalizes zero-shot to new language pairs without any training data! 2/3
Why do we need word alignments in the first place? We use them for lexicon learning, model analysis, and cross-lingual learning. For example, AWESOME aligner improves annotation projection for cross-lingual NER. Check it out, and we welcome comments/issues! 3/3
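Annotation projection with word alignments is simple to sketch: copy each source-side label across its alignment link. The helper below is a hypothetical illustration, not the paper's pipeline (real projection also has to handle fertility and span consistency):

```python
def project_labels(src_labels, alignments, tgt_len, default="O"):
    """Project token-level labels from source to target along
    word-alignment links (i, j); unaligned target tokens get `default`."""
    tgt_labels = [default] * tgt_len
    for i, j in alignments:
        tgt_labels[j] = src_labels[i]
    return tgt_labels

# e.g. NER tags for "Barack Obama visited Paris", monotone alignment
src = ["B-PER", "I-PER", "O", "B-LOC"]
links = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(project_labels(src, links, 4))  # ['B-PER', 'I-PER', 'O', 'B-LOC']
```

Better alignments mean fewer labels copied to the wrong target token, which is why aligner quality shows up directly in cross-lingual NER scores.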
Super-excited about our new #ICASSP2020 paper on "Universal Phone Recognition with a Multilingual Allophone System" arxiv.org/abs/2002.11800
We create a multi-lingual ASR model that can do zero-shot phone recognition in up to 2,186 languages! How? A little linguistics :) 1/5
In our speech there are phonemes (sounds that can support lexical contrasts in a *particular* language) and their corresponding phones (the sounds that are actually spoken, which are language *independent*). Most multilingual ASR models conflate these two concepts. 2/5
We create a model that first recognizes language-independent phones, and then converts these phones to language-specific phonemes. This makes our underlying representations of phones more universal and generalizable across languages. 3/5
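The phone-to-phoneme step can be sketched as taking, for each phoneme, the best score among its allophones. The tiny inventory below is illustrative only (e.g. English aspirated [pʰ] and plain [p] are both realizations of /p/), not the paper's actual allophone database:

```python
# Hypothetical mini-inventory: each phoneme maps to its allophone phones.
ALLOPHONES = {"p": ["p", "pʰ"], "t": ["t", "tʰ", "ɾ"], "a": ["a"]}
PHONES = ["p", "pʰ", "t", "tʰ", "ɾ", "a"]  # universal phone label set

def phones_to_phonemes(phone_probs):
    """Collapse a universal phone distribution into language-specific
    phoneme scores by maxing over each phoneme's allophones."""
    idx = {ph: i for i, ph in enumerate(PHONES)}
    return {pm: max(phone_probs[idx[ph]] for ph in allos)
            for pm, allos in ALLOPHONES.items()}

probs = [0.1, 0.6, 0.05, 0.05, 0.1, 0.1]   # acoustic model favors [pʰ]
scores = phones_to_phonemes(probs)
print(max(scores, key=scores.get))          # "p": aspirated [pʰ] maps to /p/
```

Because the acoustic model only ever predicts universal phones, swapping in a different language just means swapping the allophone map, which is what enables zero-shot recognition in unseen languages.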