AK (@ak92501), 21 Dec
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
abs: arxiv.org/abs/2112.10741

Samples from a 3.5B-parameter text-conditional diffusion model using classifier-free guidance are preferred by human evaluators over those from DALL-E.
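For readers new to classifier-free guidance, here is a minimal sketch of the usual update rule: the guided noise prediction starts from the unconditional prediction and moves toward the text-conditional one, scaled by a guidance factor. The function name and the plain-list representation are assumptions made for illustration and are not taken from the GLIDE code; a real model would apply this to tensors at each sampling step.

```python
from typing import List


def classifier_free_guidance(
    eps_cond: List[float], eps_uncond: List[float], scale: float
) -> List[float]:
    """Combine conditional and unconditional noise predictions.

    Hypothetical helper: computes eps_uncond + scale * (eps_cond - eps_uncond)
    element by element. A scale of 1.0 returns the conditional prediction;
    larger scales push further in the direction of the text condition.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Toy values standing in for model outputs, with a guidance scale of 3.0.
guided = classifier_free_guidance([1.0, 2.0], [0.0, 1.0], 3.0)
print(guided)
```

With a scale above 1.0 the result extrapolates past the conditional prediction, which is generally understood to trade sample diversity for closer adherence to the prompt.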
Text-conditional image inpainting examples from GLIDE. The green region is erased, and the model fills it in conditioned on the given prompt. The model is able to match the style and lighting of the surrounding context to produce a realistic completion.
Random image samples on MS-COCO prompts. XMC-GAN samples are taken from Zhang et al. (2021). DALL-E samples are generated at temperature 0.85, and the best of 256 is selected using CLIP reranking. GLIDE samples use CLIP guidance with scale 2.0 and classifier-free guidance with scale 3.0.
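The "best of 256 using CLIP reranking" step amounts to scoring every candidate against the prompt and keeping the top one. The sketch below shows only that selection logic. The `rerank_best` name and the `score` callable are hypothetical: in practice the score would be a CLIP image-text similarity, which is replaced here by a toy function so the example runs on its own.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def rerank_best(candidates: Sequence[T], score: Callable[[T], float]) -> T:
    """Return the highest-scoring candidate.

    Hypothetical helper: `score` stands in for an image-text similarity
    model such as CLIP; it is not an API of any particular library.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Toy usage: strings stand in for generated images, length for the score.
samples = ["a", "bbb", "cc"]
best = rerank_best(samples, lambda s: float(len(s)))
print(best)
```

Reranking needs all candidates generated up front, so its cost grows with the number of samples drawn, whereas guidance changes the sampling process itself.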
github: github.com/openai/glide-t…
text2im notebook: github.com/openai/glide-t…

# This notebook supports both CPU and GPU.
# On CPU, generating one sample may take on the order of 20 minutes.
# On a GPU, it should be under a minute.
