Latest Twitter Threads by @RoyiRassin on Thread Reader App

Nov 7, 2024 • 12 tweets • 4 min read

How diverse are the outputs of text-to-image models and how can we measure that? In our new work, we propose a measure based on LLMs and Visual-QA (VQA), and show NONE of the 12 models we experiment with are diverse. 🧵
1/11

Project page (with interactive results to explore): royira.github.io/GRADE/
Code: github.com/RoyiRa/GRADE-Q…
HF: huggingface.co/papers/2410.22…
arXiv: arxiv.org/abs/2410.22592

Oct 20, 2022 • 8 tweets • 4 min read

New work! :D

We show evidence that DALL-E 2, in stark contrast to humans, does not respect the constraint that each word has a single role in its visual interpretation.

Work with @ravfogel and @yoav.
BlackboxNLP @ #emnlp2022

Below, "a person is hearing a bat"

We detail three types of behavior that are inconsistent with the single-role-per-word constraint.
