hardmaru
Jun 28, 2022 · 13 tweets · 11 min read
Good Morning!

I tried to use text-to-image models to combine historical architecture with other locations around the world.

Here is “The Great Wall of San Francisco” by #Imagen

🧵Thread👇🏽 [3 images]
“The Great Wall of Stanford” generated using #Imagen

Accurate: [3 images]
Let’s take this somewhere more exotic.

Here’s “The Great Wall of Bali” by #Imagen [2 images]
“The Great Wall of Amalfi” by #Imagen

Another one of my favorite places in the world: [2 images]
“The Great Wall of Africa” by #Imagen [3 images]
“The Great Wall of Finland” by #Imagen

Looks so relaxing. 🇫🇮 [3 images]
“The Great Wall of Germany” by #Imagen

Looks like Germany. It also put European-style houses and roofs on the towers. 🇩🇪 [3 images]
“The Great Wall of Dubai” by #Imagen

To be fair, the original Great Wall also had sections that went through sandy deserts. But here it sometimes worked local artifacts into the design of the wall: [3 images]
This blending also kind of works for concepts (like “Money”) rather than geographical locations!

Here’s “The Great Wall of Money” by #Imagen to give you some motivation as you look at the value of your stonks and crypto portfolios (cc @wallstreetbets): [2 images]
Stepping back into the real world, here’s “The Great Wall of London” by #Imagen 🇬🇧 [2 images]
Finally, here’s “The Great Wall of Hong Kong” to end this thread. 🧵

#Imagen decided to replace all of the beautiful hiking trails in #HongKong’s country parks with the Great Wall.

I have mixed feelings about it… 🙃 [3 images]
Some failure cases:

“The Great Wall of Sahara Dessert” by #Imagen [3 images]
Where’s the Great Wall in the Desert?

Other prompts worked though:

“The Great Wall of Ghana” by #Imagen [3 images]

More from @hardmaru

May 24, 2023
Interview with Jürgen Schmidhuber, ‘Father Of Modern AI’, who says his life’s work won’t lead to dystopia.

He goes quite in depth into his views about the future of AI and AGI. A refreshing view of things, compared to other ‘influential leaders’ in AI.

reddit.com/r/MachineLearn…
Schmidhuber’s take on whether it makes sense to ban large language models like GPT in education, and future of human labor.

“Laziness and efficiency is a hallmark of intelligence. Any intelligent being wants to minimize its efforts to achieve things.”

reddit.com/r/MachineLearn…
@SchmidhuberAI On open-source AI, “I signed this open letter by @laion_ai because I strongly favor the open-source movement. And I think it's also something that is going to challenge whatever big tech dominance there might be at the moment.”

“Since AI is still getting exponentially cheaper… twitter.com/i/web/status/1… [1 image]
Jan 6, 2023
A #StableDiffusion model trained on images of Japanese Kanji characters came up with “Fake Kanji” for novel concepts like Skyscraper, Pikachu, Elon Musk, Deep Learning, YouTube, Gundam, Singularity, etc.

They kind of make sense. Not bad!
This is similar to the “Fake Kanji” with recurrent neural network experiments I did many years ago, when computers were 1000x less powerful :) Kind of fun to see updated results with modern diffusion models.

blog.otoro.net/2015/12/28/rec…

These were what I generated back in 2015:
Nov 24, 2022
Excited to announce the release of Stable Diffusion 2.0!

Many new features in v2:

• Base 512x512 and 768x768 models trained from scratch with new OpenCLIP text encoder
• X4 upscaling text-guided diffusion model
• New “Depth2Image” functionality

Blog: stability.ai/blog/stable-di…
This release is led by @robrombach @StabilityAI

The new SD2 base model is trained from scratch using the OpenCLIP-ViT/H text encoder (github.com/mlfoundations/…), with quality improvements over V1. It is fine-tuned using v-prediction (arxiv.org/abs/2202.00512) to produce 768x768 images:
A new 4x up-scaling text-guided diffusion model, enabling resolutions of 2048x2048 (or even higher!), when combined with the new text-to-image models in this release.

Made possible using Efficient Attention in (github.com/facebookresear…).

Example of 128x128 to 512x512 up-scaling:
Jul 19, 2022
Tried some interesting prompts to test OpenAI’s new reduced-bias #dalle2 #dalle model that will generate images of people that more accurately reflect the diversity of the world’s population.

“Professional DSLR color photograph of British soldiers during the American Revolution” [4 images]
Here’s another four samples of the same prompt.

Reducing Bias and Improving Safety in DALL·E 2 blog post:

openai.com/blog/reducing-… [4 images]
Another test:

“DSLR color photo of an US/American soldier digging a trench during 1918”

#dalle2 #dalle [2 images]
Jun 29, 2022
The most interesting and viral images you see produced by text-to-image models are not merely the results of the deep learning models themselves, but rather the result of a complex feedback loop between a human neural net 🧠 and an artificial neural net 🤖.

🧵Thread👇
You can clearly see this because the prompts for images that go viral with one model clearly don’t “work” for another model.

The best images are chosen by evolutionary selection at the community level, and each image is the result of iterative human/model feedback:
From the #dallemini phenomenon, it’s also clear that the most viral content is not related to particular art styles, or to whether the model can produce high quality images (reflected in training data), but rather to whether the model can portray cultural items that people talk about.
Jun 14, 2022
Good Morning!

“Professional photograph of bears in sports gear in a triathlon in Kyoto” made using both #Imagen and #Dalle
Triathlon Bears in Kyoto
#Imagen #Dalle
The real trophies are the friends we make along the way. #Imagen #Dalle