It is extremely instructive to follow the exponential growth of diffusion models in AI art. At present there are three major players: @openAI with Dall-E, @midjourney founded by @DavidSHolz, and @StabilityAI founded by @EMostaque. Each has taken a different path in releasing its product.
More capable systems exist from Google (e.g. Imagen, imagen.research.google) and Facebook (Make-A-Scene, ai.facebook.com/blog/greater-c…), but these aren't available to the public.
I want to discuss how these players release their AI products into the wild. #dalle2 has been available since July 20th, after opening up a restrictive waitlist 2 months earlier. I have had the chance to refine my prompting skills since then.
When you first access a tool like this, you don't know how to prompt what you imagine. You learn either through experimentation or through examples from other users. Twitter was the medium where you could learn from others. @hardmaru generated a ton of interesting ideas.
The other tool that was out there was @midjourney. I did not have access (I'm not an artist) until recently. I was surprised to discover that you can only use it via Discord. I find Discord too noisy, and I had always assumed that using it was a matter of convenience rather than a design choice.
@StabilityAI mirrored the same roll-out on Discord. They employed a latent diffusion model that, serendipitously, could run on a single GPU.
Stable Diffusion is trained on an aesthetic subset of images (i.e. those with an aesthetics score of 6 or more). Thus, unlike Dall-E 2, it's biased towards a certain kind of aesthetic. huggingface.co/datasets/Chris…
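As a rough illustration of what selecting an "aesthetic subset" means, the sketch below keeps only the records whose predicted aesthetics score is 6 or more. The record fields, the sample scores, and the example.com URLs are all made up for illustration; this is not the actual dataset tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    url: str                # hypothetical field names, not the real dataset schema
    caption: str
    aesthetic_score: float  # assumed to come from a separate aesthetics predictor


def aesthetic_subset(records, min_score=6.0):
    """Keep only the records scoring at or above the threshold."""
    return [r for r in records if r.aesthetic_score >= min_score]


# Made-up sample data, for illustration only.
records = [
    ImageRecord("https://example.com/a.jpg", "sunset over a lake", 6.4),
    ImageRecord("https://example.com/b.jpg", "blurry receipt", 3.1),
    ImageRecord("https://example.com/c.jpg", "oil painting of a fox", 6.0),
]

subset = aesthetic_subset(records)
print([r.caption for r in subset])
```

Whatever the scoring model prefers is what the filtered training set over-represents, which is where the bias comes from.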
We often focus too much on how AI is trained but overlook the design of its community of users. There is a clear difference in how people engage with these tools, and I would like to discuss these differences.
Stable Diffusion came out first on Discord. They shut it down once they open-sourced the neural network. The interesting feature of SD was that you could borrow the seeds used by other users and recreate the images or variants of the images. You can't do this with Dall-E 2.
Thus the Discord server offered a means for users to reuse images created by other users. With a seed, you can generate new variants by tweaking the original prompt. Discord's shared space thus became a hive-mind brainstorming space.
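To make the role of the seed concrete, here is a toy sketch. It is not Stable Diffusion; `toy_generate` is a hypothetical stand-in for a sampler, assuming only that the output is a deterministic function of the prompt and the seed. The same seed with the same prompt recreates the output, while the same seed with a tweaked prompt starts from the same noise and yields a variant.

```python
import hashlib
import random


def initial_noise(seed, size=4):
    """Deterministic pseudo-random 'latent' noise for a given seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def prompt_signal(prompt):
    """Map a prompt to a number in [0, 1). A toy stand-in for text conditioning."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") / 2**32


def toy_generate(prompt, seed):
    """Hypothetical stand-in for a sampler: depends only on (prompt, seed)."""
    signal = prompt_signal(prompt)
    return [n + signal for n in initial_noise(seed)]


original = toy_generate("a castle at dusk", seed=1234)
recreated = toy_generate("a castle at dusk", seed=1234)               # borrowed seed, same prompt
variant = toy_generate("a castle at dusk, oil painting", seed=1234)   # borrowed seed, tweaked prompt
unrelated = toy_generate("a castle at dusk", seed=99)                 # different seed

print(original == recreated)   # the borrowed seed reproduces the output
print(original == variant)     # the tweaked prompt changes it
```

Without the seed, a user who only sees the prompt starts from different noise and lands somewhere else, as in the `unrelated` case.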
This reuse of images is also available with Midjourney, but the engagement is different. A user can only upscale or create variants of a generated image; the seeds are not made available. The consequence is that you can always discover an image's ancestry.
Ancestry is very important because it's necessary to recreate images. Recreating images is next to impossible in Dall-E 2.
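One way to picture ancestry is as a chain of parent links, where every upscale or variation records the image it came from. The sketch below is a hypothetical data model, not how Midjourney actually stores anything; the node fields, operation names, and image ids are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ImageNode:
    image_id: str
    operation: str                   # assumed labels: "prompt", "variation", "upscale"
    parent_id: Optional[str] = None  # None marks an image generated directly from a prompt


def ancestry(image_id: str, nodes: Dict[str, ImageNode]) -> List[ImageNode]:
    """Walk the parent links from an image back to its root, nearest ancestor first."""
    chain = []
    seen = set()
    current = image_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against accidental cycles in the records
        node = nodes[current]
        chain.append(node)
        current = node.parent_id
    return chain


# Made-up lineage: a prompt result, a variation of it, then an upscale of the variation.
nodes = {
    "img-1": ImageNode("img-1", "prompt"),
    "img-2": ImageNode("img-2", "variation", parent_id="img-1"),
    "img-3": ImageNode("img-3", "upscale", parent_id="img-2"),
}

lineage = ancestry("img-3", nodes)
print(" <- ".join(f"{n.image_id} ({n.operation})" for n in lineage))
```

Replaying the recorded operations from the root is what makes an image recreatable; when no such record exists, there is nothing to replay.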
David Holz's (@DavidSHolz) Midjourney takes the genetic approach of ArtBreeder and fuses it with a Stable Diffusion network. It's an open-ended ecosystem for artists to discover novel ways of expression. artbreeder.com/beta/image/93c…
In contrast, the open-sourcing of Stable Diffusion leads to another kind of open-ended evolution. Programmers are now tweaking the algorithm to add new capabilities such as img2img, inpainting, reverse text, etc. This can only lead to the development of more innovative workflows.
It feels like drinking from a firehose when you enter the Discord servers of Midjourney or StabilityAI. It's too much to grasp too soon! Many are in panic and there's a noticeable interest in "AI alignment." David Holz says it's like swimming in water. theverge.com/2022/8/2/23287…
You can either drown by consuming too much water or go with the flow and accept the richness of collaboration. In AI, we must accept that new technologies lead to new ways of design. Therefore how we design human interaction enabled by AI should not be overlooked.
The utter failure of the AI and AI Alignment community is its failure to realize the importance of collaborative design. Humans, after all, have designed our societies just as we've designed our technologies. Treating these as separate things solves just half the problem.
So here we are today, inundated with AI art, yet we are still groping in the dark as to how to deal with it. We can't deal with it unless we change the framing of what it means to be human, an area of study that civilization has been ignoring for too long.

Thread by Carlos E. Perez
