Vivek Gupta · Apr 30
#NAACL2025

(1/8) Excited to present our paper "Leveraging LLM For Synchronizing Information Across Multilingual Tables" at #NAACL2025! 🎉

Tackling the challenge of outdated info in low-resource language Wikipedia tables, we propose a novel LLM-based zero-shot approach using task decomposition.💡

🔗 Check out our work here: zero-shot-llm-infosync.github.io/zero-shot-llm-…
(2/8) The problem is huge; the language gap is real.

🌍 300+ languages on Wikipedia

🇬🇧 English dominates: 23% of articles, 76% of edits

🌐 Outside the top 25 languages, active editors, pages, and edits make up <1%.

#Wikipedia #LanguageBarrier #DigitalDivide
(3/8) We tackle the problem of information mismatch across tables in different languages, where data can be outdated, unmapped, or missing cultural context.

This is a significant challenge given the vast number of languages on platforms like Wikipedia.
(4/8) 🚀 Proposing a 2-step approach:
🔄 Information Alignment
📈 Information Updation

📝 Prior work (Khincha et al., ACL 2023) missed info updation and didn’t leverage SOTA LLMs for these tasks! #AI #MachineLearning #LLMs #Innovation

We propose a new dataset for updating tables, using past versions as outdated references. #AI #DataScience #TableUpdate
(5/8) For information alignment, we use InfoSync (Khincha et al., ACL 2023 Findings) and combine it with an LLM ensemble to enhance performance.
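One common way to combine alignment predictions from several LLMs is majority voting over candidate row pairs. A minimal sketch under that assumption — the paper's exact ensembling strategy may differ, and `ensemble_align` is a hypothetical name:

```python
from collections import Counter

def ensemble_align(predictions):
    """Majority-vote ensemble over row alignments.

    predictions: one dict per model, mapping a source-table row key
    to its predicted target-table row key. An alignment is kept only
    if a strict majority of models agree on the same (src, tgt) pair.
    """
    votes = Counter()
    for pred in predictions:
        for pair in pred.items():          # pair = (src_row, tgt_row)
            votes[pair] += 1
    threshold = len(predictions) // 2 + 1  # strict majority
    return {src: tgt for (src, tgt), n in votes.items() if n >= threshold}
```

For example, with three models where two agree that the English row "Born" aligns to the same Hindi row, that alignment survives the vote while single-model guesses are dropped.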
(6/8) For information updation, our paper explores how state-of-the-art LLMs can enhance the synchronization of multilingual entity-centric tables.

🔄 Hierarchical Task Decomposition Prompt
Break the task into multiple prompts for sequential inference: Translation, Knowledge-graph conversion, Merging, Update from KG, and Back-translation
#TaskDecomposition

We investigate zero-shot prompts and how to improve LLM performance on this task.
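The staged decomposition above can be sketched as a chain of zero-shot prompts, where each stage's output feeds the next. The prompt wordings, stage names, and the `call_llm` hook below are illustrative assumptions, not the paper's actual prompts:

```python
# Hypothetical sketch of a hierarchical task-decomposition pipeline:
# translation -> KG conversion -> merging -> update -> back-translation.
# `call_llm` stands in for any chat-completion API the reader prefers.

PIPELINE = [
    ("translate",      "Translate this {src_lang} table to English:\n{payload}"),
    ("to_kg",          "Convert this table into (entity, attribute, value) triples:\n{payload}"),
    ("merge",          "Merge these triples with the reference table's triples, preferring newer values:\n{payload}"),
    ("update",         "Rewrite the outdated table using the merged triples:\n{payload}"),
    ("back_translate", "Translate this table back to {src_lang}:\n{payload}"),
]

def synchronize(table: str, src_lang: str, call_llm) -> dict:
    """Run the staged prompts sequentially, keeping every intermediate output."""
    outputs = {}
    payload = table
    for name, template in PIPELINE:
        prompt = template.format(src_lang=src_lang, payload=payload)
        payload = call_llm(prompt)   # output of this stage feeds the next
        outputs[name] = payload
    return outputs
```

Keeping the intermediate outputs makes it easy to inspect where a synchronization run went wrong, which is one practical benefit of decomposing the task instead of using a single monolithic prompt.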
(7/8) Key contributions include using SOTA LLMs for information alignment and introducing the InfoUpdate dataset, the first human-annotated benchmark for information updation.

We propose a multi-step decomposition zero-shot LLM approach that boosts information factuality and coherence across multilingual entity-centric tables.
(8/8) Catch our oral presentation (I'm presenting) on Wednesday, April 30th, during Session D: Oral/Poster 3, in Ballroom B, at 4 PM.

Joint work with Sidharth Khincha, Ankita Anand, Tushar Kataria, @DanRothNLP

@SCAI_ASU, @IITGuwahati, @UUtah, @cogcomp

Looking forward to seeing you there! 👋🗓️📍 #NLP #LLMs #Multilingual #InformationSynchronization
7.1 We also introduce a new evaluation metric based on tri-aligned, bi-aligned, and un-aligned rows.
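One plausible reading of this metric is to count, over three language versions of a table, how many rows appear in all three (tri-aligned), exactly two (bi-aligned), or only one (un-aligned). A sketch under that assumption, with rows keyed by a shared identifier — the paper's exact definition may differ:

```python
from collections import Counter

def alignment_counts(tables):
    """Count tri-/bi-/un-aligned rows across three language tables.

    tables: dict mapping a language code to the set of row keys
    present in that language's version of the table.
    Returns (tri, bi, un) counts over the union of all row keys.
    """
    appearances = Counter()
    for rows in tables.values():
        for key in rows:
            appearances[key] += 1
    tri = sum(1 for n in appearances.values() if n == 3)
    bi  = sum(1 for n in appearances.values() if n == 2)
    un  = sum(1 for n in appearances.values() if n == 1)
    return tri, bi, un
```

For instance, a row present in the English, Hindi, and French tables counts as tri-aligned; one present only in English counts as un-aligned.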
7.2 We outperform all traditional baselines, even ones with multiple prompts.

We add more information, correct outdated information, and reduce hallucinated information.