https://twitter.com/GalaxyKate/status/1588210859196776449

https://twitter.com/GalaxyKate/status/1588255873557635072

More from @alexjc

Alex J. Champandard

@alexjc

Nov 4

"The Right To Read Is The Right To Mine" was a campaign from ~2012-2015 to convince the public & legislators that machines should bypass copyright for data-mining.

IMHO we're at the next stage of this campaign, now for generative systems — should they act outside copyright?

@GradySimon

Articles like this one are at the tail end of the first pro-mining campaign and precursors to this new generative campaign?

It tries to establish that "reading by robots doesn’t count" and "infringement is for humans only".

ilr.law.uiowa.edu/print/volume-1… (via @GradySimon)

Of course, this view obfuscates the situation! Think systemically:

Robots that are mining or learning are operated by companies that operate under copyright (both for benefit and liability).

Robots that are generating are operated by human users also operating under copyright.

Alex J. Champandard

@alexjc

Nov 4

If you're working at a generative company, and worried about the lawsuit against GitHub for their generative model, please take some comfort in the fact that I think they made *many* missteps — with either a serious lack of due care, or the intent to break the law.

For instance, Google announced they had a similar code model and they didn't release it. They used it internally & measured a 6% improvement in productivity while they work to understand the legal and ethical implications.

(Could also be that Google wanted to see others get sued first?)

I will compile my best advice for companies who, understandably, want to continue their work in a promising/competitive field, but also don't want to spend all their money on lawyers!

Stay tuned...

Alex J. Champandard

@alexjc

Nov 3

Reading through the submitted GitHub Copilot litigation: although it was put together quickly, it's a solid piece of work!

My assessment is that the defendants (GitHub, Microsoft and OpenAI) are in a very bad position...
githubcopilotlitigation.com

The documents show how Codex and Copilot act like databases: they include three different examples of JS code that is recited verbatim, mistakes included, from licensed sources.

Including this debug code below isPrime(n):

The documents then proceed to cast doubt on the claim of fair use, arguing that even if it were applicable here, it wouldn't help circumvent (a) the breach of contract, (b) the privacy issues, and (c) the DMCA.

Alex J. Champandard

@alexjc

Nov 3

In NVIDIA's new paper on #Diffusion Models, they show how more denoisers (one for each stage) and more embeddings (text, image) help with quality!

TL;DR: If you buy more GPUs, you get correct spelling too.
deepimagination.cc/eDiffi/ #AI #ML
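
To make "more denoisers (for each stage)" concrete, here is a minimal sketch of routing each sampling step to a different expert based on the current noise level. Everything in it is an assumption for illustration: the toy denoisers, the 0.5 threshold and the names `pick_expert` and `sample` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

# A denoiser maps (noisy sample, noise level) to a slightly cleaner sample.
Denoiser = Callable[[List[float], float], List[float]]


def make_toy_denoiser(shrink: float) -> Denoiser:
    """Toy stand-in for a trained network: scales every value by (1 - shrink)."""
    def denoise(x: List[float], sigma: float) -> List[float]:
        return [v * (1.0 - shrink) for v in x]
    return denoise


def pick_expert(experts: List[Tuple[float, Denoiser]], sigma: float) -> Denoiser:
    """Return the first expert whose minimum noise level is <= sigma.

    `experts` must be sorted by descending minimum noise level.
    """
    for min_sigma, denoiser in experts:
        if sigma >= min_sigma:
            return denoiser
    raise ValueError(f"no expert covers noise level {sigma}")


def sample(x: List[float], sigmas: List[float],
           experts: List[Tuple[float, Denoiser]]) -> List[float]:
    """Walk the noise schedule, handing each step to the matching expert."""
    for sigma in sigmas:
        x = pick_expert(experts, sigma)(x, sigma)
    return x


# Hypothetical split: one expert for noisy early steps, one for late detail steps.
EXPERTS = [(0.5, make_toy_denoiser(0.5)), (0.0, make_toy_denoiser(0.1))]
result = sample([1.0], [0.9, 0.6, 0.3, 0.1], EXPERTS)
```

Each extra expert is another full network to train and serve, which is where the "buy more GPUs" cost comes from.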

With so many different labs rushing to research and deploy this kind of technology, this will quickly turn into a race for more efficiency as different providers compete on costs too.

The paper is a bit evasive on the dataset (LAION?) — I presume for legal reasons. But the good news is that it's "only" 1B text/image pairs... although they are highly filtered.

IMHO there's much more room to improve quality with the current datasets.

Alex J. Champandard

@alexjc

Oct 17

https://twitter.com/goodside/status/1581805503897735168

When large language models are explicitly trained to use Python and look up Wikipedia, we'll be entering scary territory for #InfoSec ∩ #AI!

OpenAI engineers probably did this a few months ago and are now frantically trying to make sure their sandboxed Python environments are sufficiently safe...

Thread predicting this is the best next direction for LLMs and why it's important (e.g. you don't need to retrain models with new information, just use an API for DB access):

https://twitter.com/alexjc/status/1517422782103105537
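
As a rough illustration of "just use an API for DB access", here is a minimal sketch of a lookup loop. It is all assumption: `toy_model` is a hard-coded stand-in for an LLM, and `FACTS`, `lookup` and the `LOOKUP:` / `RESULT:` / `ANSWER:` line protocol are hypothetical names invented for this example.

```python
import re
from typing import Callable, Dict

# Hypothetical external store; updating it requires no model retraining.
FACTS: Dict[str, str] = {"capital of France": "Paris"}


def lookup(query: str) -> str:
    """Stand-in for an API or database call."""
    return FACTS.get(query, "unknown")


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM: requests a lookup until a result is in the prompt."""
    match = re.search(r"RESULT: (.*)", prompt)
    if match:
        return f"ANSWER: {match.group(1)}"
    question = prompt.splitlines()[0]
    return f"LOOKUP: {question}"


def answer(question: str, model: Callable[[str], str] = toy_model,
           max_steps: int = 3) -> str:
    """Run the model, serving its lookup requests until it gives an answer."""
    prompt = question
    for _ in range(max_steps):
        reply = model(prompt)
        if reply.startswith("LOOKUP: "):
            # The harness, not the model, touches the external store.
            result = lookup(reply[len("LOOKUP: "):])
            prompt += f"\nRESULT: {result}"
        elif reply.startswith("ANSWER: "):
            return reply[len("ANSWER: "):]
        else:
            return reply
    return "no answer"
```

Adding an entry to `FACTS` changes what `answer` returns immediately. The security worry is the same loop with `lookup` replaced by code execution.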

Alex J. Champandard

@alexjc

Oct 17

It's amazing how this great paper about prompt engineering from August (arxiv.org/abs/2208.01626) is only really getting widespread attention now that there are good open-source implementations:
- github.com/google/prompt-…
- github.com/bloc97/CrossAt…

Academic Impact: OSS or GTFO?

Prompt-To-Prompt editing allows you to easily change your input text without needing to completely regenerate the image. This makes it much easier to control the diffusion!

Example from bloc97's GitHub, four seasons of the same scene:

Prompt-To-Prompt falls into the category of UX improvements for Stable Diffusion, and speed of iteration is a major competitive factor.

Platforms able to deliver speed, e.g. by caching temporary data about the generation (not just the random seed) have a big advantage! [1/3]
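
A minimal sketch of what caching more than the random seed could look like, assuming the platform keeps per-step intermediate data keyed by prompt, seed and step count. The class `GenerationCache` and its toy `_run_generation` are hypothetical; a real system would store something like attention maps or latents rather than random floats.

```python
import random
from typing import Dict, List, Tuple


class GenerationCache:
    """Caches per-step intermediate data keyed by (prompt, seed, steps)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, int, int], List[float]] = {}
        self.expensive_runs = 0  # counts how often the full generation ran

    def _run_generation(self, prompt: str, seed: int, steps: int) -> List[float]:
        """Toy stand-in for a diffusion run; returns one value per step."""
        self.expensive_runs += 1
        rng = random.Random(f"{prompt}|{seed}")
        return [rng.random() for _ in range(steps)]

    def intermediates(self, prompt: str, seed: int, steps: int = 4) -> List[float]:
        """Return cached intermediates, running the generation only on a miss."""
        key = (prompt, seed, steps)
        if key not in self._store:
            self._store[key] = self._run_generation(prompt, seed, steps)
        return self._store[key]


cache = GenerationCache()
first = cache.intermediates("a cat in spring", seed=42)
second = cache.intermediates("a cat in spring", seed=42)  # served from cache
```

With only the seed stored, an edit has to replay the whole source generation; with the intermediates stored, it can start from them.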
