Prompting Sucks:
authorized fine-tunings are the future

(a thread for AI-gen developers and enthusiasts)

#AIart #AIArtwork
[Image: AI-generated by #dalle2 + the artist; horses cantering]
[Image: AI-generated by #dalle2 + the artist; a portrait]
This current period of prompt engineering is going to be short-lived. Most people are not here for the new niche of prompt design. They use these tools because there are things they want to see and experience in high fidelity, right now.
The minute someone can provide a better experience than:

"brilliant 8K [shot type] [film title], featuring [subject] [action], [environment], [lighting type], directed by [director/production company]"

with truly excellent results, people will come running.
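Part of why this is a UX problem: the template above is really just string interpolation with half a dozen free variables the user has to fill correctly. A minimal sketch (the slot names are my own, not any tool's API):

```python
# A fill-in-the-blank prompt builder matching the template above.
# Slot names are illustrative, not part of any model's official API.
TEMPLATE = (
    "brilliant 8K {shot_type} {film_title}, featuring {subject} "
    "{action}, {environment}, {lighting_type}, directed by {director}"
)

def build_prompt(**slots: str) -> str:
    """Fill every slot; raises KeyError if the user forgets one."""
    return TEMPLATE.format(**slots)

prompt = build_prompt(
    shot_type="portrait from the film",
    film_title="Blade Runner 2049",
    subject="a lone replicant",
    action="walking through rain",
    environment="neon-lit city street",
    lighting_type="hard sodium-vapor light",
    director="Denis Villeneuve",
)
```

Forget one slot, misorder the style cues, or pick the wrong director reference and the output degrades; that fragility is exactly the burden fine-tunings can remove.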
What then can AI devs do to provide a more consistent, high-quality experience and reduce prompting complexity for people engaging with these AIs?
1. Few-shot learning (fine-tuning) — already being embraced by creators like @rainisto and @Nitrosocke to create artistically consistent outputs. A curated set of roughly 20-40 images teaches the model unified styles, consistent characters, compositional structures, etc.
2. Reinforcement Learning — can be used to feed these consistent model outputs back to the model, formally, thus retraining and *reweighting* model output priorities.
3. Retrain the model with an order-of-magnitude increase to the dataset — this is unlikely to happen because there just aren't that many quality images lying around that haven't already been used to train these models.
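For option 1, the curation step is mostly pairing each image with a caption that carries one unifying tag so the model learns a single consistent concept. A rough sketch of assembling such a training manifest (the JSON layout here is my own guess; the exact format expected varies by fine-tuning toolkit):

```python
import json
from pathlib import Path

def build_manifest(image_dir: str, style_tag: str) -> list:
    """Pair each curated image with a caption carrying one unifying
    style tag, so the fine-tune learns a single consistent concept."""
    entries = [
        {"image": path.name, "caption": f"a photo in the style of {style_tag}"}
        for path in sorted(Path(image_dir).glob("*.png"))
    ]
    # 20-40 curated images is the ballpark the thread suggests.
    Path(image_dir, "manifest.json").write_text(json.dumps(entries, indent=2))
    return entries
```

The curation itself (choosing which 20-40 images express the style) is the artistic work; the manifest is just bookkeeping.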
Side note: If you somehow happen to be sitting on such a dataset, you're starting to realize that it has incredible NEW value, e.g. if you're Hollywood right now, you're spinning up new legal teams, new contracts, maybe your own models, everything.
Current devs are therefore likely to stick with option 2, bypassing a new war over the most exclusive data and instead creating their own datasets. After all, they have image generators; why not use them?
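The feedback loop in option 2 can be caricatured in a few lines: generate samples, score them with a preference/reward signal, and nudge the model's parameters toward the high-reward ones. This is a toy reward-weighted update to show the "reweighting" idea, not any lab's actual RL pipeline:

```python
def reward_weighted_update(weights, samples, reward_fn, lr=0.1):
    """Nudge a parameter vector toward samples in proportion to the
    normalized reward each sample earns (the 'reweighting' idea)."""
    total = sum(reward_fn(s) for s in samples) or 1.0
    for sample in samples:
        share = reward_fn(sample) / total
        for i, value in enumerate(sample):
            weights[i] += lr * share * (value - weights[i])
    return weights
```

The key point survives the simplification: whatever the reward function prefers is what the retrained model drifts toward, which is why the choice of audience feeding that signal matters so much below.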
Unfortunately, most of these developers are not image and art experts. They will instead think like product managers and look toward the desired outputs of their current core audiences.
Who is the current core audience, and what are they producing right now? Lots of people and lots of stuff, to be fair. But IMO the majority are producing the kinds of things that fascinate people at the beginning of their artistic journey, not the end.
So product managers at the big model companies are busy building for their current audience of the early art-curious, rather than the total potential audience who would benefit most from artistic precision and expertise.
We can already see it in Stable Diffusion v2.0 and Midjourney 4. The models now prioritize the fetish of hyper-reality (deep saturation, concept art, graphics, etc.) over true photographic representation. Getting past that requires long lists of negative prompts and hacks.
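Those negative-prompt workarounds amount to shipping an "anti-style" list with every request. A sketch of how a front-end might bundle them (the parameter names follow common SD tooling convention, not a guaranteed API, and the list is illustrative, not exhaustive):

```python
# The hyper-real "house style" the models now default to, expressed
# as a reusable negative-prompt list (illustrative, not exhaustive).
ANTI_HYPERREAL = [
    "oversaturated", "concept art", "3d render",
    "hdr", "illustration", "deep color grade",
]

def compose_request(positive: str, negatives=ANTI_HYPERREAL) -> dict:
    """Bundle a positive prompt with its negative-prompt string."""
    return {"prompt": positive, "negative_prompt": ", ".join(negatives)}

req = compose_request("natural-light portrait, 35mm film grain")
```

That every photographic request now needs this boilerplate is itself the evidence: the model's defaults have drifted away from a large class of users.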
"So what? Not everybody's into cinematic portraits!"

That's fine. But these model changes live in everything: the lighting, the sculpts, the color grade, the environment, the depictions, the actions, etc.
If you are someone who DOESN'T want these aesthetic priorities in your outputs, then bad news for you.
These are all indications that these early general-purpose models are susceptible to trends and are not the total solution — and that the AI-gen arms race won't necessarily be won by the largest dataset, the quickest build, or the most marketing videos, because those don't help.
The smartest developers will partner with dataset owners, creatives, and other experts to produce specific models that deliver high-quality results. They will promote the strengths of these models, be experts on them, and be honest about what they can and cannot do well.
This will allow for the participation of individual creators and larger rights holders within an AI-centric model economy that is sustainable and beneficial to all. It's a massive stream of potential new revenue that is largely being ignored in current AI-gen convos.
How could it work?:

AI developers could reach out to individual creatives, taste makers, editors, producers, rights houses, etc, to begin work on authoring fine-tunings. These fine-tunings would then be marketable by both parties to the end user.
How could it work (cont.):

End users could select from licensed fine-tunings, make their own, and mix and match them, both in isolation and in combination with other prompt types.
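Mix-and-match is plausible because fine-tunings can be stored as weight deltas on top of a shared base model, and deltas combine linearly. A toy sketch of that merge (how LoRA-style blending is often approximated; real pipelines operate per-layer on tensors, not flat lists):

```python
def merge_finetunes(base, deltas, mix):
    """Linearly blend fine-tuning weight deltas onto base weights:
    merged = base + sum(alpha_k * delta_k). A toy 1-D stand-in for
    the per-layer tensor merge real pipelines perform."""
    merged = list(base)
    for delta, alpha in zip(deltas, mix):
        for i, d in enumerate(delta):
            merged[i] += alpha * d
    return merged
```

The mix weights are exactly where licensing could attach: each alpha is a measurable "how much of creator X's fine-tuning is in this output."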
Thanks for reading. To anybody who made it this far, I hope this thread is useful :)
I also want to use this thread to acknowledge the major innovations that have taken place around prompting, one each from the big three:

Outpainting - #Dalle2
IMG:IMG - #stablediffusion
IMG + IMG + prompt: IMG - #midjourney 4

More of this, please.
Finally, if you do want to know more about those negative prompts and hacks, check out @GanWeaving

Thread by Stephen Loggins Parker (@Stephen_Parker).