Christian Bluethgen
Oct 11, 2022, 10 tweets
🎉 #StableDiffusion can be fine-tuned to generate medical images, and the outputs can be controlled using natural language text prompts!

In our latest work, we use SD to create synthetic chest x-rays and insert pathologies like pleural effusions.

🧵 #Radiology #AI #StanfordAIMI
[Image: Original and refined synthe...]
#StableDiffusion is a #LatentDiffusionModel: it performs its generative tasks efficiently on low-dimensional representations of high-dimensional training inputs. SD's VAE latent space preserves the relevant information contained in CXR, so they can be reconstructed from it with high fidelity.
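To put a number on "low-dimensional", here is a minimal sketch of the size arithmetic. It assumes the commonly cited Stable Diffusion v1 configuration (8x spatial downsampling, 4 latent channels) and a 512x512 RGB input. None of these values are stated in this thread, and the function names are made up for illustration.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Return (channels, height, width) of the VAE latent for one image.

    downsample=8 and latent_channels=4 are assumed SD v1 defaults,
    not values taken from the thread.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, image_channels=3, downsample=8, latent_channels=4):
    """Number of pixel values represented by each latent value."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (image_channels * height * width) / (c * h * w)


shape = latent_shape(512, 512)        # assumed input resolution
ratio = compression_ratio(512, 512)
print(shape, ratio)
```

Under these assumptions the denoising network works on far fewer values than the pixel grid contains, which is where the efficiency comes from.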
#StableDiffusion’s output can be controlled at inference time by using text prompts, but it is unclear how many medical imaging concepts SD has incorporated. Simple text prompts show how hard it can be to get realistic-looking medical images out-of-the-box without specific training.
If SD’s frozen #CLIP text encoder does not include enough medical concepts to work with radiology prompts, how about swapping it for a domain-specific one like PubMedBERT and projecting the embeddings? The results did not resemble CXR visually or quantitatively. Yikes!
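What "projecting the embeddings" could look like, as a toy sketch: each token embedding from the domain-specific encoder is passed through a linear map so that its width matches what the U-Net's conditioning input expects. The thread does not say how the projection was built, so the linear layer, the toy sizes, and the function names below are all assumptions.

```python
import random


def make_projection(d_in, d_out, seed=0):
    """Create a randomly initialised linear layer (weight, bias).

    In practice this layer would be trained; here it is only initialised.
    """
    rng = random.Random(seed)
    scale = 1.0 / d_in ** 0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(d_in)] for _ in range(d_out)]
    bias = [0.0] * d_out
    return weight, bias


def project(tokens, weight, bias):
    """Apply y = W x + b to every token embedding in a sequence."""
    return [
        [sum(w * x for w, x in zip(row, tok)) + b for row, b in zip(weight, bias)]
        for tok in tokens
    ]


# Hypothetical toy sizes: 2 tokens of width 4, projected to width 3.
domain_tokens = [[0.1, -0.2, 0.3, 0.4], [0.0, 0.5, -0.1, 0.2]]
weight, bias = make_projection(d_in=4, d_out=3)
conditioning = project(domain_tokens, weight, bias)
print(len(conditioning), len(conditioning[0]))
```

Matching the shapes is the easy part. The U-Net was trained against CLIP's embedding space, so a new encoder plus projection has to land in a space the U-Net can interpret, which may be one reason this approach struggled.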
How about teaching the model new concepts? Using #TextualInversion (@RinonGal et al., 2022), we introduce new tokens like <lungxray> and train them on a small set of CXR. The results are visually and quantitatively better, but still far from the medical reality. ("a photo of a <lungxray>")
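The core idea of textual inversion can be sketched in a few lines: add one row to the embedding table for the new token, keep every existing row frozen, and optimise only the new row. The toy below swaps the real objective (the diffusion denoising loss, backpropagated through the frozen text encoder and U-Net) for a plain squared error toward a made-up target vector. The vocabulary, vectors, and helper names are hypothetical.

```python
def add_token(vocab, table, new_token, init_from):
    """Append a new token whose embedding starts as a copy of an existing one."""
    if new_token in vocab:
        raise ValueError(f"{new_token} already exists")
    vocab[new_token] = len(table)
    table.append(list(table[vocab[init_from]]))
    return vocab[new_token]


def train_new_token(table, token_id, target, lr=0.1, steps=100):
    """Gradient descent on squared error that updates only row `token_id`."""
    row = table[token_id]
    for _ in range(steps):
        grad = [2.0 * (r - t) for r, t in zip(row, target)]
        for i, g in enumerate(grad):
            row[i] -= lr * g
    return row


# Toy 2-dimensional embedding table (hypothetical values).
vocab = {"photo": 0, "xray": 1}
table = [[0.5, 0.1], [0.2, 0.9]]
frozen_copy = [list(r) for r in table]

new_id = add_token(vocab, table, "<lungxray>", init_from="xray")
target = [1.0, -1.0]  # stand-in for what the real diffusion loss would pull toward
learned = train_new_token(table, new_id, target)
print(learned)
```

Because only a single embedding vector is learned, the method is cheap, but it can only express the new concept through what the frozen U-Net already knows how to draw. That fits the "better, but still far from the medical reality" result.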
Fine-tuning the U-Net of the #StableDiffusion pipeline with a semantic prior, as proposed by @natanielruizg et al. (2022), finally resulted in visually convincing chest x-rays with a visible absence or presence of pleural effusion, depending on the text prompt.
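The approach by Ruiz et al. (DreamBooth) combines two terms: the usual denoising loss on the new training images, plus a weighted "prior preservation" loss on images that the original model generated for a generic class prompt. A minimal sketch of that combination follows. It assumes the thread's "semantic prior" refers to this term, and it uses short lists of numbers in place of noise tensors.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preservation_loss(instance_pred, instance_noise,
                            prior_pred, prior_noise, prior_weight=1.0):
    """Instance denoising loss plus weighted prior-preservation loss.

    *_pred is the noise predicted by the U-Net, *_noise the noise that was
    actually added. prior_weight is a tunable hyperparameter (value assumed).
    """
    instance_term = mse(instance_pred, instance_noise)
    prior_term = mse(prior_pred, prior_noise)
    return instance_term + prior_weight * prior_term


# Made-up numbers, only to show how the two terms combine.
loss = prior_preservation_loss(
    instance_pred=[1.0, 2.0], instance_noise=[1.0, 0.0],
    prior_pred=[0.0, 1.0], prior_noise=[0.0, 3.0],
    prior_weight=0.5,
)
print(loss)
```

The second term discourages the fine-tuned U-Net from forgetting what the generic class looks like while it learns the new images.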
Finally, we wanted to see how well a classification model pretrained on real CXR does on generated data: DenseNet-121 was able to predict pleural effusions with an accuracy of 95% in the synthetic samples created with our best-looking approach.
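A sketch of how such an accuracy figure can be computed: the expected label of each synthetic image comes from the prompt that generated it (effusion requested or not), and the classifier's thresholded probability is compared against it. This labelling scheme and the 0.5 threshold are assumptions, and the probabilities below are invented for illustration, not results from the paper.

```python
def predict_labels(probabilities, threshold=0.5):
    """Turn classifier probabilities into binary labels (1 = pleural effusion)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Invented example: four synthetic images and the label implied by each prompt.
classifier_probs = [0.91, 0.08, 0.77, 0.35]
prompt_labels = [1, 0, 1, 1]

predicted = predict_labels(classifier_probs)
score = accuracy(predicted, prompt_labels)
print(predicted, score)
```

A high score here means the classifier sees what the prompt asked for. It does not by itself show that the images are anatomically realistic.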
Our work highlights the power of pretrained large multi-modal models like #StableDiffusion and gives a glimpse of how much there is to explore for the medical imaging domain! Can’t wait to test this on other modalities and pathologies to increase the diversity of the output.
This work results from working with the brilliant @PierreChambon6, @Dr_ASChaudhari and @curtlanglotz at @StanfordAIMI.

It builds on the work of @robrombach, @RinonGal and @natanielruizg (and many others) and on the great resources that @huggingface, @EMostaque and @StabilityAI are providing.
🎉 That's it! 🎉

To read more about the project:
📝 Full preprint (arXiv): arxiv.org/abs/2210.04133
🖼️ Website with sample gallery: bit.ly/3T1MD1g

More from @cxbln

Oct 4, 2023
Worrying if my job was jeopardized by AI this week or if we’re still good, I read a new paper evaluating #GPT4V - a #GPT4 version handling image and text inputs. It produces *impressive* radiology reports. But let’s delve deeper into some of the results... #radiology #AI
Here, GPT4V correctly identified a fracture of the 5th metatarsal bone. However, this is not a Jones fracture (which is in the proximal part of the bone and sometimes doesn’t heal well, requiring more aggressive management). Almost correct ≠ Correct, esp. in medicine.
Here, the model correctly identified a suspicious pulmonary nodule but incorrectly described its location and explicitly hallucinated its size. Additionally, it inferred a lack of pathologically enlarged lymph nodes, which is impossible to determine from just one slice.
Nov 24, 2022
🎉Introducing RoentGen, a generative vision-language foundation model based on #StableDiffusion, fine-tuned on a large chest x-ray and radiology report dataset, and controllable through text prompts!

@PierreChambon6 @Dr_ASChaudhari @curtlanglotz

🧵 #Radiology #AI #StanfordAIMI
[Image: Text-conditioned synthesis of CXR. Each image was hand-picke...]
#RoentGen is able to generate a wide variety of radiological chest x-ray (CXR) findings with high fidelity and a high level of detail. Of note, this is without being explicitly trained on class labels.
[Image: Synthetic images created by prompting a fine-tuned model for...]
Built on previous work, #RoentGen is a fine-tuned latent diffusion model based on #StableDiffusion. Free-form medical text prompts are used to condition a denoising process, resulting in high-fidelity yet diverse CXR, improving on a typical limitation of GAN-based methods.