Latest Twitter Threads by @AntoineYang2 on Thread Reader App

Mar 1, 2023 • 5 tweets • 3 min read

Introducing Vid2Seq, a new visual language model for dense video captioning. To appear at #CVPR2023.

Work done @Google w/ @NagraniArsha P.H. Seo @antoine77340 @jponttuset I. Laptev J. Sivic @CordeliaSchmid.

Page: antoyang.github.io/vid2seq.html
Paper: arxiv.org/abs/2302.14115

🧵/5

Most video captioning systems can only describe a single event in short videos. But natural videos may contain numerous events. So we focus on the dense video captioning task, which requires temporally localizing and captioning all events in untrimmed minutes-long videos 🎞️.

2/5
