Vivek Gupta
Assistant Professor @SCAI_ASU; PostDoc @cogcomp @Penn; ed. @UUtah, @iitkanpur. @Bloomberg @MSFTResearch Fellow; ex-@MetaAI @IBM @Verisk @samsungresearch @Synopsys

Apr 30, 10 tweets

#NAACL2025

(1/8) Excited to present our paper "Leveraging LLM For Synchronizing Information Across Multilingual Tables" at #NAACL2025! 🎉

Tackling the challenge of outdated info in low-resource language Wikipedia tables, we propose a novel LLM-based zero-shot approach using task decomposition.💡

🔗 Check out our work here: zero-shot-llm-infosync.github.io/zero-shot-llm-…

(2/8) - The problem is huge; the language gap is real.

🌍 300+ languages on Wikipedia

🇬🇧 English dominates: 23% of articles, 76% of edits

🌐 Outside the top 25 languages, active editors, pages, and edits make up <1%.

#Wikipedia #LanguageBarrier #DigitalDivide

(3/8) We tackle the problem of information mismatch in tables for different languages, where data can be outdated, unmapped, or missing cultural context.

This is a significant challenge given the vast number of languages on platforms like Wikipedia.

(4/8) 🚀 Proposing a 2-step approach:
🔄 Information Alignment
📈 Information Updation

📝 Prior work (Khincha et al., ACL 2023) missed info updation and didn’t leverage SOTA LLMs for these tasks! #AI #MachineLearning #LLMs #Innovation

We propose a new dataset for updating tables, using past versions as outdated references. #AI #DataScience #TableUpdate

(5/8) For information alignment, we build on InfoSync (Khincha et al., ACL 2023 Findings) and combine it with an LLM ensemble to enhance performance.
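To make the ensembling idea concrete, here is a minimal sketch (not the paper's exact implementation): it assumes each LLM proposes a set of (source key, target key) alignment pairs, and we keep the pairs a majority of models agree on. The model names and data below are purely illustrative.

```python
from collections import Counter

def ensemble_alignments(per_model_alignments, min_votes=2):
    """Majority vote over key alignments proposed by several LLMs.

    per_model_alignments: list of sets of (source_key, target_key) pairs,
    one set per model. Pairs proposed by at least `min_votes` models are
    kept as the ensemble alignment.
    """
    votes = Counter(pair for pairs in per_model_alignments for pair in pairs)
    return {pair for pair, count in votes.items() if count >= min_votes}

# Hypothetical usage: alignments proposed by three different LLMs for an
# English vs. Hindi infobox describing the same entity.
model_a = {("Born", "जन्म"), ("Occupation", "पेशा"), ("Spouse", "जीवनसाथी")}
model_b = {("Born", "जन्म"), ("Occupation", "पेशा")}
model_c = {("Born", "जन्म"), ("Spouse", "जीवनसाथी")}

print(ensemble_alignments([model_a, model_b, model_c]))
# every pair above gets at least two votes, so all three are kept
```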

(6/8) For information updation, our paper explores how state-of-the-art LLMs can enhance the synchronization of multilingual entity-centric tables.

🔄 Hierarchical Task Decomposition Prompt
Break the task into multiple prompts run in sequence: Translation, Knowledge-graph conversion, Merging, Update from KG, and Back-Translation (see the sketch below).
#TaskDecomposition
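As a rough illustration of the decomposition (a minimal sketch, not the paper's actual prompts; `call_llm` and the prompt strings are placeholders), each stage is a separate zero-shot call whose output feeds the next:

```python
def call_llm(prompt: str) -> str:
    # Placeholder: plug in whatever chat-completion client you use.
    raise NotImplementedError

def synchronize_tables(outdated_table_lr: str, updated_table_en: str) -> str:
    # 1. Translation: bring the low-resource-language table into English.
    translated = call_llm(f"Translate this table into English:\n{outdated_table_lr}")

    # 2. Knowledge-graph conversion: turn both tables into (entity, attribute, value) triples.
    kg_old = call_llm(f"Convert this table into knowledge-graph triples:\n{translated}")
    kg_new = call_llm(f"Convert this table into knowledge-graph triples:\n{updated_table_en}")

    # 3. Merging: combine the two triple sets, preferring newer values on conflict.
    kg_merged = call_llm(
        "Merge these two sets of triples, keeping the more recent value when they conflict:\n"
        f"OLD:\n{kg_old}\nNEW:\n{kg_new}"
    )

    # 4. Update from KG: rewrite the table from the merged triples.
    updated_en = call_llm(f"Rewrite the table using these triples:\n{kg_merged}")

    # 5. Back-translation: return the updated table to the original language.
    return call_llm(f"Translate this table back into the original language:\n{updated_en}")
```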

We investigate zero-shot prompts and how to improve LLM performance on this task.

(7/8) Key contributions include using SOTA LLMs for information alignment and introducing the InfoUpdate dataset, the first human-annotated benchmark for information updation.

We propose a multi-step decomposition zero-shot LLM approach that boosts information factuality and coherence across multilingual entity-centric tables.

(8/8) Catch our oral presentation (I'm presenting) on Wednesday, April 30th, during Session D: Oral/Poster 3, in Ballroom B, at 4 PM.

Joint work with Sidharth Khincha, Ankita Anand, Tushar Kataria, @DanRothNLP

@SCAI_ASU, @IITGuwahati, @UUtah, @cogcomp

Looking forward to seeing you there! 👋🗓️📍 #NLP #LLMs #Multilingual #InformationSynchronization

7.1 We also introduce a new evaluation metric based on tri-aligned, bi-aligned, and un-aligned rows.
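Very roughly (the exact scoring is defined in the paper; the encoding below is an assumption for illustration), the idea is to bucket each row of the synchronized table by how widely it is aligned and report the breakdown:

```python
from collections import Counter

def alignment_breakdown(row_labels):
    """row_labels: per-row category, one of 'tri', 'bi', 'un' (assumed encoding).
    Returns the fraction of rows in each alignment category."""
    counts = Counter(row_labels)
    total = sum(counts.values()) or 1
    return {cat: counts.get(cat, 0) / total for cat in ("tri", "bi", "un")}

# Hypothetical example: 10 rows of a synchronized table
print(alignment_breakdown(["tri"] * 6 + ["bi"] * 3 + ["un"]))
# {'tri': 0.6, 'bi': 0.3, 'un': 0.1}
```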

7.2 We outperform all traditional baselines, even ones that use multiple prompts.

We add more information, correct outdated information, and also reduce hallucinated information.
