Christian Bluethgen
Nov 24 13 tweets 10 min read
🎉 Introducing RoentGen, a generative vision-language foundation model based on #StableDiffusion, fine-tuned on a large chest x-ray and radiology report dataset, and controllable through text prompts!

@PierreChambon6 @Dr_ASChaudhari @curtlanglotz

🧵 #Radiology #AI #StanfordAIMI
[Image: Text-conditioned synthesis of CXR. Each image was hand-picke…]
#RoentGen is able to generate a wide variety of radiological chest x-ray (CXR) findings with high fidelity and a high level of detail. Of note, this is without being explicitly trained on class labels.
[Image: Synthetic images created by prompting a fine-tuned model for…]
Built on previous work, #RoentGen is a fine-tuned latent diffusion model based on #StableDiffusion. Free-form medical text prompts are used to condition a denoising process, resulting in high-fidelity yet diverse CXR, improving on a typical limitation of GAN-based methods.
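To make "text prompts condition a denoising process" concrete, here is a toy sketch, not the RoentGen code. It runs a deterministic DDIM-style reverse loop on a single number instead of an image latent. The prompt table `TOY_CONCEPTS`, the cosine schedule in `alpha_bar`, and the oracle `predict_noise` are all assumptions made for illustration; in a real latent diffusion model the noise predictor is a trained U-Net conditioned on text-encoder embeddings, and the sampler used for RoentGen is not stated in this thread.

```python
import math
import random

# Hypothetical lookup standing in for a text encoder plus a trained denoiser:
# each prompt maps to the clean value the toy "model" has learned for it.
TOY_CONCEPTS = {
    "no finding": 0.0,
    "right pleural effusion": 1.0,
}


def alpha_bar(t, num_steps):
    """Cumulative signal fraction: exactly 1.0 at t=0, small at t=num_steps."""
    return math.cos(0.5 * math.pi * t / (num_steps + 1)) ** 2


def predict_noise(x_t, t, prompt, num_steps):
    """Oracle noise predictor conditioned on the prompt (toy stand-in for a U-Net)."""
    a = alpha_bar(t, num_steps)
    x0 = TOY_CONCEPTS[prompt]
    return (x_t - math.sqrt(a) * x0) / math.sqrt(1.0 - a)


def sample(prompt, num_steps=50, seed=0):
    """Start from Gaussian noise and denoise step by step, guided by the prompt."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)
    for t in range(num_steps, 0, -1):
        a_t = alpha_bar(t, num_steps)
        a_prev = alpha_bar(t - 1, num_steps)
        eps = predict_noise(x, t, prompt, num_steps)
        # Estimate the clean sample, then re-noise it to the previous level.
        x0_pred = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
        x = math.sqrt(a_prev) * x0_pred + math.sqrt(1.0 - a_prev) * eps
    return x
```

Different seeds start from different noise, but the prompt decides where the loop ends up. With a learned (imperfect) predictor and stochastic sampling, the same prompt would instead yield varied outputs.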
Context: Latent diffusion models like #StableDiffusion trained on large natural image-text datasets like @laion_ai’s #LAION-5B are able to generate highly realistic images controlled by text prompts, but their knowledge about specific domains like medical imaging is limited.
Few-shot fine-tuning of #StableDiffusion with a prior-preserving loss (#DreamBooth) previously allowed us to insert pathologies in generated CXR by text prompt, but the generated images show comparatively little diversity and are constrained to the classes used during training.
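For readers unfamiliar with the prior-preserving loss, a minimal sketch of its common noise-prediction form follows. It is an illustration under assumptions, not our training code: the function names and the `prior_weight` parameter are hypothetical, and plain Python lists stand in for tensors. The idea is to add a second term, computed on images the original model generated for a generic class prompt, so that few-shot fine-tuning does not erase what the model already knew.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences of floats."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preserving_loss(instance_pred, instance_noise,
                          prior_pred, prior_noise, prior_weight=1.0):
    """Denoising loss on the few new examples plus a weighted loss on
    class-prior samples generated by the model before fine-tuning."""
    instance_term = mse(instance_pred, instance_noise)
    prior_term = mse(prior_pred, prior_noise)
    return instance_term + prior_weight * prior_term
```

Setting `prior_weight` to 0 reduces this to ordinary fine-tuning on the few new examples.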
After scaling to tens of thousands of CXR image-report pairs, SD starts replacing previously learned concepts in favor of medical domain-specific concepts like radiographic abnormalities (e.g., pleural effusions) with increasing levels of correctness and new abilities.
#RoentGen developed the ability to control CXR appearance with appropriate medical terminology and concepts. Note how in the first image, the generated images are in line with the radiological convention of displaying the right patient side on the left side of the image.
Compared to previous work, the outputs show a high degree of diversity. Note the variable appearance of the right-sided pleural effusion with varying amounts of interlobar fluid (top row, white arrowheads) for “big right (left) sided pleural effusion with adjacent atelectasis”.
[Image: Intra-prompt generation diversity. Top row: four samples for…]
Why synthetic CXR? They can be used to improve downstream tasks! Fine-tuning RoentGen on fixed training data yields a 5% improvement of a classifier trained jointly on synthetic and real images, and a 3% improvement when trained on a larger but purely synthetic training set.
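As a rough picture of the "jointly on synthetic and real images" setup, the sketch below mixes a real training list with a sample of synthetic items. It is a hedged illustration only: `build_training_set` and its `synthetic_per_real` ratio are hypothetical names, and the actual mixing protocol and ratios are described in the preprint, not here.

```python
import random


def build_training_set(real, synthetic, synthetic_per_real=1.0, seed=0):
    """Return a shuffled list of (item, source) pairs.

    synthetic_per_real is the number of synthetic items to add per real item,
    capped by how many synthetic items are available.
    """
    if synthetic_per_real < 0:
        raise ValueError("synthetic_per_real must be non-negative")
    rng = random.Random(seed)
    n_synthetic = min(len(synthetic), round(len(real) * synthetic_per_real))
    chosen = rng.sample(synthetic, n_synthetic)
    mixed = [(item, "real") for item in real]
    mixed += [(item, "synthetic") for item in chosen]
    rng.shuffle(mixed)
    return mixed
```

Passing an empty `real` list with a non-empty `synthetic` list yields an empty set under this ratio-based design, so a purely synthetic training set would need its own size parameter.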
Outside of data augmentation, this high level of control over the generated output also opens up new ways for data sharing (sharing models instead of the data itself) and education, and could be used to mitigate data imbalances and biases.
To balance the benefits of open-source science against the risk of improper use of generative models, we aim to share the weights in accordance with the data usage agreement of MIMIC-CXR.
Weights can be requested in a tiered release at: forms.gle/Ggu2Kbu2MjMjxw…
This project was a strong team effort accomplished by @PierreChambon6 Jean Benoit Delbrouck @sluijsjr @MPolacin @JMZambranoC @curtlanglotz @Dr_ASChaudhari from @StanfordAIMI and @StanfordRad and made possible with the help of @iScienceLuvr and S. Purohit from @StabilityAI.
That’s it! If you want to know more, check out the following resources:
Full preprint (arXiv): arxiv.org/abs/2211.12737
Project website: stanfordmimi.github.io/RoentGen/

#StanfordAIMI @StanfordAIMI @StanfordHAI

More from @cxbln

Oct 11
🎉 #StableDiffusion can be fine-tuned to generate medical images, and the outputs can be controlled using natural language text prompts!

In our latest work, we use SD to create synthetic chest xrays and insert pathologies like pleural effusions.

🧵 #Radiology #AI #StanfordAIMI
[Image: Original and refined synthe...]
#StableDiffusion is a #LatentDiffusionModel and performs its generative tasks efficiently on low-dimensional representations of high-dimensional training inputs. SD's VAE latent space preserves relevant information contained in CXR; they can be reconstructed with high fidelity.
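To give a sense of how "low-dimensional" the latent space is, the sketch below computes the latent shape for a given image size. It assumes the Stable Diffusion v1 autoencoder layout (8x spatial downsampling, 4 latent channels); the helper names are hypothetical and the defaults would need changing for other autoencoders.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape (channels, height, width) of the VAE latent for one image."""
    if height % downsample or width % downsample:
        raise ValueError("image sides must be multiples of the downsampling factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, image_channels=3,
                      downsample=8, latent_channels=4):
    """How many pixel values are represented by each latent value."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (image_channels * height * width) / (c * h * w)
```

For a 512x512 RGB input this gives a 4x64x64 latent, so the denoising network works on 48 times fewer values than the pixel image contains.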
#StableDiffusion’s output can be controlled at inference time by using text prompts, but it is unclear how many medical imaging concepts SD incorporates. Simple text prompts show how hard it can be to get realistic-looking medical images out-of-the-box without specific training.