Alex J. Champandard
Nov 3 · 5 tweets · 2 min read
In NVIDIA's new paper on #Diffusion Models, they show how more denoisers (one for each stage) and more embeddings (text, image) help with quality!

TL;DR: If you buy more GPUs, you get correct spelling too.
deepimagination.cc/eDiffi/ #AI #ML
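To make "more denoisers (for each stage)" concrete, here is a minimal sketch of routing each sampling step to a stage-specific denoiser. Everything in it is an assumption for illustration: the toy denoisers, the `boundaries` thresholds, and the names `pick_expert` and `sample` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# A denoiser maps (current sample, noise level) -> cleaner sample.
Denoiser = Callable[[List[float], float], List[float]]


def make_toy_denoiser(strength: float) -> Denoiser:
    """Hypothetical stand-in for a trained network: shrinks values by `strength`."""
    def denoise(x: List[float], noise_level: float) -> List[float]:
        return [v * (1.0 - strength) for v in x]
    return denoise


def pick_expert(noise_level: float, boundaries: Sequence[float]) -> int:
    """Return the index of the denoiser responsible for this noise level.

    `boundaries` are ascending upper limits; levels above the last
    boundary go to the final expert.
    """
    for i, upper in enumerate(boundaries):
        if noise_level <= upper:
            return i
    return len(boundaries)


def sample(
    x: List[float],
    noise_levels: Sequence[float],
    experts: Sequence[Denoiser],
    boundaries: Sequence[float],
) -> Tuple[List[float], List[int]]:
    """Run the denoising loop, switching expert as the noise level drops."""
    used = []
    for level in noise_levels:
        idx = pick_expert(level, boundaries)
        used.append(idx)
        x = experts[idx](x, level)
    return x, used


if __name__ == "__main__":
    experts = [make_toy_denoiser(s) for s in (0.1, 0.3, 0.5)]
    _, used = sample([1.0, 2.0], [1.0, 0.5, 0.1], experts, [0.33, 0.66])
    print(used)  # which expert handled each step
```

The point of the sketch is only the dispatch: each stage of sampling gets its own model, which is why the approach costs more compute to train and serve.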
With so many different labs rushing to research and deploy this kind of technology, this will quickly turn into a race for more efficiency as different providers compete on costs too.
The paper is a bit evasive on the dataset (LAION?) — I presume for legal reasons. But the good news is that it's "only" 1B text/image pairs... although they are highly filtered.

IMHO there's much more room to improve quality with the current datasets.
Note, from the first tweet: the photo equivalent of the "Trending On ArtStation" prompt-engineering hack is "4K DSLR"!

It still can't get fingers right though! (eDiffi is on the right, DALLE-2 in the middle, Stable Diffusion on the left.)

• • •

More from @alexjc

Nov 4
"The Right To Read Is The Right To Mine" was a campaign from ~2012-2015 to convince the public & legislators that machines should bypass copyright for data-mining.

IMHO we're at the next stage of this campaign, now for generative systems — should they act outside copyright?
Articles like this one are at the tail end of the first pro-mining campaign and precursors to this new generative campaign?

It tries to establish that "reading by robots doesn’t count" and "infringement is for humans only".

ilr.law.uiowa.edu/print/volume-1… (via @GradySimon)
Of course, this view obfuscates the situation! Think systemically:

Robots that are mining or learning are operated by companies that operate under copyright (both for benefit and liability).

Robots that are generating are operated by human users also operating under copyright.
Nov 4
If you're working at a generative company, and worried about the lawsuit against GitHub for their generative model, please take some comfort in the fact that I think they made *many* missteps — with either a serious lack of due care, or the intent to break the law.
For instance, Google announced they had a similar code model and they didn't release it. They used it internally & measured a 6% improvement in productivity while they work out the legal and ethical implications.

(Could also be that Google wanted to see others get sued first?)
I will compile my best advice for companies who, understandably, want to continue their work in a promising/competitive field, but also don't want to spend all their money on lawyers!

Stay tuned...
Nov 3
Reading through the GitHub CoPilot litigation as submitted: although it was pulled together quickly, it's a solid piece of work!

My assessment is that the defendants (GitHub, Microsoft and OpenAI) are in a very bad position...
githubcopilotlitigation.com
The documents show how Codex and CoPilot act like databases; they have three different examples of JS code that is recited verbatim — with mistakes — from licensed sources.

Including this debug code below isPrime(n): [image]
The documents then proceed to cast doubt on the claim of fair use: even if it were applicable here, it wouldn't help circumvent (a) the breach of contract, (b) the privacy issues, and (c) the DMCA. [image]
Nov 3
You know how hands & fingers are particularly difficult to generate?

Wouldn't it be funny if people having important conversations online (in the near future) used hand gestures in front of their faces, so both sides know it's not a #DeepFake?

Anchor: I'm sorry to ask, Mr. President, but before this TV interview can proceed, please make a creative gesture with your hands.

Pres: What?

Anchor: Well, in the last election multiple candidates were caught using DeepFakes to make them look & sound smarter than they are.
Bank: Sir, we need to authenticate you by online video because of climate lockdown #37.

Customer: OK, let's do it!

B: Make the Vulcan hand sign, flip it around, turn it into a finger gun, and pretend to shoot in the air.

C: Wait, what!

B: Yes, because #DeepFakes.
Oct 17
When large language models are explicitly trained to use Python and look up Wikipedia, we'll be entering scary territory for #InfoSec #AI!
OpenAI engineers probably did this a few months ago, and are now frantically trying to make sure their Python sandboxed environments are sufficiently safe...
Thread predicting this is the best next direction for LLMs and why it's important (e.g. you don't need to retrain models with new information, just use an API for DB access):
Oct 17
It's amazing how this great paper about prompt engineering from August (arxiv.org/abs/2208.01626) is only really getting widespread attention now that there are good open-source implementations:
- github.com/google/prompt-…
- github.com/bloc97/CrossAt…

Academic Impact: OSS or GTFO?
Prompt-To-Prompt editing allows you to easily change your input text without needing to completely regenerate the image. This makes it much easier to control the diffusion!

Example from bloc97's GitHub, four seasons of the same scene:
Prompt-To-Prompt falls into the category of UX improvements for Stable Diffusion, and speed of iteration is a major competitive factor.

Platforms able to deliver speed, e.g. by caching temporary data about the generation (not just the random seed) have a big advantage! [1/3]
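As a rough illustration of "caching temporary data about the generation (not just the random seed)", here is a minimal sketch of a platform-side cache. All names (`GenerationCache`, `GenerationRecord`, `edit`) are hypothetical, and the stored "intermediates" are placeholder strings standing in for whatever per-step state a real pipeline would keep; this is not how any particular implementation works.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GenerationRecord:
    seed: int
    prompt: str
    # Placeholder for per-step state a real pipeline might retain.
    intermediates: List[str] = field(default_factory=list)


class GenerationCache:
    """Hypothetical cache keyed by (prompt, seed)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, int], GenerationRecord] = {}
        self.full_runs = 0  # counts expensive from-scratch generations

    def generate(self, prompt: str, seed: int, steps: int = 4) -> GenerationRecord:
        key = (prompt, seed)
        if key not in self._records:
            self.full_runs += 1
            record = GenerationRecord(seed, prompt)
            for step in range(steps):
                record.intermediates.append(f"state[{step}] for {prompt!r}")
            self._records[key] = record
        return self._records[key]

    def edit(self, base_prompt: str, new_prompt: str, seed: int) -> GenerationRecord:
        """Derive an edited generation by reusing the base run's cached state."""
        base = self.generate(base_prompt, seed)
        edited = GenerationRecord(seed, new_prompt, list(base.intermediates))
        self._records[(new_prompt, seed)] = edited
        return edited


if __name__ == "__main__":
    cache = GenerationCache()
    cache.generate("a forest in summer", seed=42)
    cache.edit("a forest in summer", "a forest in winter", seed=42)
    print(cache.full_runs)  # only the first call ran from scratch
```

The design choice being illustrated: keeping more than the seed lets an edit start from stored state instead of paying for a full regeneration.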
