Research in @UNCNLP. Research intern @MSFTResearch. Incoming PhD @Berkeley_ai
May 22, 2023 • 10 tweets • 6 min read
🖼️🎞️🔊📄Excited to introduce Composable Diffusion (CoDi), a new generative-AI foundation model that can take any combo of input modalities & generate any combo of output modalities (text, audio, image, video)! codi-gen.github.io @yzy_ai @ChenguangZhu2 @mohitban47 🧵👇 #CoDi
Many existing models are restricted to generating one modality from another, like text-to-image, text-to-audio, or audio-to-image. CoDi, on the other hand, can generate multiple modalities in parallel, and its input is not limited to a subset of modalities like text or image.