#acl2022nlp
What happens inside a multilingual neural cognate prediction model?
We show that predicting cognates between modern Romance languages latently teaches the model about their proto-forms, enabling Latin reconstruction without fine-tuning the encoders on that task! 🧵
In layman's terms: learning to predict cognates (words descended from a common ancestor) between related languages (French, Italian, Spanish, Portuguese, Galician, Catalan, Occitan, Romanian, and Aromanian) gives the model 'intuitive' knowledge about their parent, Latin!
How? We have no idea! The model does latently learn a phonetic "language model" (similar phones end up close to one another across languages), but it apparently does not pick up phonotactic information (the ordering of sounds).
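To illustrate the "similar phones end up close" finding, here is a minimal probing sketch (not the paper's code; the embedding values and (language, phone) keys are made-up stand-ins for rows of a trained phone-embedding matrix):

```python
# Hypothetical probe: do similar phones sit close together across languages?
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy stand-ins for rows of a trained phone-embedding matrix,
# keyed by (language, IPA phone); in practice, read from the model.
emb = {
    ("fr", "p"): np.array([0.9, 0.1, 0.0]),
    ("es", "p"): np.array([0.8, 0.2, 0.1]),  # same phone, other language
    ("es", "a"): np.array([0.0, 0.1, 0.9]),  # unrelated phone
}

# Expectation under the thread's claim: the cross-language pair of the
# same phone scores high, the dissimilar pair scores low.
print(cosine(emb[("fr", "p")], emb[("es", "p")]))  # high
print(cosine(emb[("fr", "p")], emb[("es", "a")]))  # low
```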
Not all architectures manage this: recurrent multilingual models with one encoder and one decoder per language (with or without shared embeddings) outperform all the others, as multilinguality is vital for this task.
(However, sharing encoders and decoders across languages decreases accuracy; see the sketch below.)
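For the curious, a minimal sketch of that winning setup (not the authors' code; layer sizes, module names, and the attention-free GRU choice are assumptions): a recurrent seq2seq with one encoder and one decoder per language and an optionally shared phone-embedding table:

```python
# Hypothetical per-language encoder/decoder seq2seq, as described above.
import torch
import torch.nn as nn

LANGS = ["fr", "it", "es", "pt", "gl", "ca", "oc", "ro", "rup"]

class PerLanguageSeq2Seq(nn.Module):
    def __init__(self, n_phones=100, emb_dim=64, hid_dim=128):
        super().__init__()
        # Shared phone embeddings (the thread says both variants work).
        self.embed = nn.Embedding(n_phones, emb_dim)
        # One GRU encoder and one GRU decoder per language.
        self.encoders = nn.ModuleDict(
            {l: nn.GRU(emb_dim, hid_dim, batch_first=True) for l in LANGS})
        self.decoders = nn.ModuleDict(
            {l: nn.GRU(emb_dim, hid_dim, batch_first=True) for l in LANGS})
        self.out = nn.Linear(hid_dim, n_phones)

    def forward(self, src_ids, src_lang, tgt_ids, tgt_lang):
        # Encode the source cognate with the source language's encoder...
        _, h = self.encoders[src_lang](self.embed(src_ids))
        # ...and decode with the target language's decoder (teacher forcing).
        dec_out, _ = self.decoders[tgt_lang](self.embed(tgt_ids), h)
        return self.out(dec_out)  # per-step phone logits

model = PerLanguageSeq2Seq()
src = torch.randint(0, 100, (2, 7))  # batch of source phone sequences
tgt = torch.randint(0, 100, (2, 9))  # batch of target phone sequences
print(model(src, "fr", tgt, "es").shape)  # torch.Size([2, 9, 100])
```

In this setup, proto-form reconstruction would amount to plugging a new Latin decoder onto the frozen per-language encoders, which is consistent with the thread's claim that no encoder fine-tuning is needed.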