@kevin2kelly @glichfield @FlyingTrilobite @neilturkewitz @WIRED @sciam 1/ Mr. Kelly, you are operating on assumptions. You claim that "AI has not harmed actual people," which is entirely speculation, not fact. So let's talk facts, with the evidence you asked for. Let's begin, and please be patient, it's going to be a bit long 🧵👇:
@kevin2kelly @glichfield @FlyingTrilobite @neilturkewitz @WIRED @sciam 2/ A quick intro to who I am: I am a professional artist with over 15 years in the film industry (Marvel, HBO, Universal, etc.), and I have been an advocate for my community, raising alarms about the unethical practices of AI companies and how they harm our community.
@kevin2kelly @glichfield @FlyingTrilobite @neilturkewitz @WIRED @sciam @atg_abhishek 7/ Data Laundering, From Research to Commercial:
Stability AI, Midjourney, and even Google have utilized various datasets from LAION in their commercial models. This is surprising, since LAION is supposed to be a non-commercial research dataset (cont.)
@kevin2kelly @glichfield @FlyingTrilobite @neilturkewitz @WIRED @sciam @atg_abhishek 9/ Our full names used, potential for privacy violations: To generate media through AI/ML models, users input prompts telling the software what to generate. Artists' full names are commonly used, and even encouraged, as part of those prompts (cont.)
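To make the mechanics concrete, here is a minimal sketch of the prompt pattern described above, using the open-source diffusers library. The model ID and the bracketed name are placeholders, not a claim about any particular commercial service.

```python
# Illustrative sketch only: the prompt pattern the thread describes,
# using the open-source diffusers library. The model ID is a well-known
# example and the bracketed name is a placeholder.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# A living artist's full name used as a "style tag" -- exactly the
# practice this thread objects to.
prompt = "a mountain fortress at dawn, in the style of [artist's full name]"
image = pipe(prompt).images[0]
image.save("output.png")
```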
1/ Hi @Oprah. This event will be the first time many people get info on Generative AI. However, it is shaping up to be a misinformed marketing event starring vested interests (some of whom are under a litany of lawsuits) who ignore the harms GenAI inflicts on communities NOW 🧵
2/ With your power and reach there is an opportunity, even a responsibility, to show the world the real ways GenAI harms the public, harms the creatives whose work these models depend on entirely to function at capacity, and more. These harms span from the tech's inception to its output.
3/ Let's start with its inception. GenAI models rely on vast amounts of data (in the billions of samples). This data is acquired by web crawlers that take whatever data they can find.
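As an illustration of what such a crawler does, here is a toy sketch of a scraper collecting (image URL, caption) pairs from a single page, assuming the standard requests and BeautifulSoup libraries. It is not any company's actual pipeline.

```python
# Toy illustration of the scraping the thread describes -- not any
# company's actual pipeline. The URL handling is deliberately minimal.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def scrape_images(page_url: str) -> list[tuple[str, str]]:
    """Fetch one page and collect every (image URL, alt text) pair on it."""
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for img in soup.find_all("img"):
        src = img.get("src")
        if src:
            # (url, caption) pairs like these are the raw material
            # image-text training sets are assembled from
            pairs.append((urljoin(page_url, src), img.get("alt", "")))
    return pairs
```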
We won BIG, as the judge allowed ALL of our copyright infringement claims to proceed, and we historically move forward on the Lanham Act (trade dress) claims! We can now proceed to discovery!
The implications of this order are huge on so many fronts!
2/3 Not only do we proceed on our copyright claims, this order also means companies who utilize SD models and/or LAION-like datasets could now be liable for copyright infringement, among other violations.
The whole thing is worth a read. I love these parts!
3/3 Some claims got fully dismissed, like the DMCA claim (which, respectfully, I don't agree with, but it is what it is). But this is now potentially one of THE biggest copyright infringement and trade dress cases ever! Looking forward to the next stage of our fight! ✊ storage.courtlistener.com/recap/gov.usco…
1/ Apple "Intelligence" is here, and zero questions of "where does the data come from?" to be seen in the press.
Apple is trying to shove onto the public a huge privacy risk and tech that screams "scraped off the internet without consent." So here's a list of potential data sources 🧵
2/ The thing is, Apple has been hinting at what these models could be, and where the data comes from, for quite a while.
For starters, let's take a look at one potential lead on where the data could come from.
3/ DataComp is a project from researchers at various universities, but also, more urgently, from researchers at Apple and LAION (yes, LAION) to build an even bigger version of LAION, using the exact same Common Crawl web scraping practices that LAION used.
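For context, here is a hedged sketch of what a LAION/DataComp-style first pass over Common Crawl looks like: scanning WARC archives for img tags that carry alt text. The file name and filter here are illustrative assumptions, not the projects' actual code.

```python
# Hedged sketch of a LAION/DataComp-style candidate pass over Common
# Crawl -- illustrative assumptions only, not the projects' real code.
from warcio.archiveiterator import ArchiveIterator
from bs4 import BeautifulSoup

candidates = []
with open("CC-MAIN-example.warc.gz", "rb") as stream:  # placeholder file
    for record in ArchiveIterator(stream):
        if record.rec_type != "response":
            continue  # skip request/metadata records
        html = record.content_stream().read()
        for img in BeautifulSoup(html, "html.parser").find_all("img"):
            src, alt = img.get("src"), img.get("alt", "").strip()
            if src and alt:  # keep only captioned images
                candidates.append((src, alt))
# Real pipelines then download each URL and keep pairs whose image-text
# similarity clears a threshold (e.g. a CLIP score), at billions of scale.
```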
1/6 Everything that makes the internet feel wonderful and open, where creatives shared their work freely, is under threat: those works are now being stolen and extracted by tech execs in a rush to train genAI models that mimic the very creators who made them.
2/6 Honestly, I'm not convinced any ToS can really hold, because this scenario is unique in so many ways. For starters, companies want to claim Section 230 protections so they're not responsible for user content, but at the same time they want to own that user content?
3/6 Secondly, the way this tech consumes our creativity to function, and its massive impact on creative industries, is unlike anything that has come before. I really don't believe any contract could have realistically foreseen this particular one-sided exploitation.
1/ I really hope what @Adobe claims is true; if it is, it would be a good step in the right direction. However, after some cursory digging I have serious questions. Let's dig in 🧵
2/ Right off the bat: have Adobe Stock contributors given their explicit, full consent to be a part of this? Did they agree to opt in? Why is there no option for Adobe contributors to opt out? This seems concerning to me.
3/ The Firefly FAQ mentions training data. So far the explanation is that the model is trained on Adobe Stock, openly licensed work, and public domain content whose copyright has expired. What exactly does "openly licensed work" mean? There must be complete transparency here.
1/ This might be the most important oil painting I’ve made:
Musa Victoriosa
The first painting released to the world that utilizes Glaze, a protective tech against unethical AI/ML models, developed by the @UChicago team led by @ravenben. App out now 👇
2/ This painting is a love letter to the efforts of this incredible research team and to the amazing artist community. This transformational tech takes the first of many steps to help us reclaim our agency on the web by making our work harder to exploit. Detail shots:
3/ So how does Glaze work?
@ravenben describes:
"Glaze analyzes your art, and generates a modified version (with barely visible changes). This 'cloaked' image disrupts AI mimicry process." Quote tweet below.
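To give a rough picture of the idea in that quote (and only the idea: Glaze's actual algorithm is more sophisticated and targets style mimicry specifically), here is a conceptual sketch of a "cloak" as a small, bounded perturbation optimized against a surrogate image encoder. The surrogate model, step size, and budget are all assumptions.

```python
# Conceptual sketch ONLY -- NOT Glaze's actual algorithm. Illustrates the
# general idea of a "cloak": a small, bounded perturbation, optimized
# against a surrogate encoder, that shifts what a model "sees" while
# staying barely visible. Model choice, steps, and budget are assumptions.
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF
from PIL import Image

encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()  # use penultimate-layer features
encoder.eval()

img = TF.to_tensor(Image.open("artwork.png").convert("RGB")).unsqueeze(0)
eps = 0.03  # L-infinity budget: caps how far any pixel can move
delta = torch.zeros_like(img, requires_grad=True)

with torch.no_grad():
    target_feats = encoder(img)  # features of the original image

for _ in range(50):  # PGD-style ascent: push features AWAY from original
    feats = encoder(img + delta)
    loss = torch.nn.functional.mse_loss(feats, target_feats)
    loss.backward()
    with torch.no_grad():
        delta += 0.005 * delta.grad.sign()  # gradient ascent step
        delta.clamp_(-eps, eps)             # stay within the budget
        delta.clamp_(-img, 1 - img)         # keep pixels valid in [0, 1]
    delta.grad.zero_()

TF.to_pil_image((img + delta).squeeze(0)).save("artwork_cloaked.png")
```

The key design point is the perturbation budget (eps): it caps how far any pixel can move, which is why the changes stay barely visible to people even as the encoder's view of the image shifts.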