ok, I've now read the full NYT complaint filed this morning vs OpenAI and Microsoft. I'm impressed - it's future-focused around fair value for work vital to democracy. It also contains 220k pages of exhibits although the pages of Ex J stood out to me. more on that in a minute. /1
The complaint is a must-read imho, it's the only way to understand the alleged violations and the extent to which the systems have been designed and tuned in order to generate certain output. It's filed in SDNY and it may well be a landmark case. /2
It's rooted in copyright law and the US Constitution and that's very much where it begins. /3
And as it notes there is a lot of money at stake. But it does well to look towards the future, showing how violations used to create a substitute undermine existing (and future) biz models (including AI licensing) which fund critical and costly journalism around the globe. /4
The complaint makes it clear early on that the goal in negotiations to license its content is to receive fair value and to help shepherd a future world with responsible AI and a healthy news ecosystem. /5
It cites a number of examples as to the human and financial cost that goes into journalism which can span multiple continents and require working through very challenging access limitations. /6
That cost in 2023 includes mass shootings, wars, terrorist attacks, elections, financial infrastructure and natural disasters around the world. There can be no debate as to what is at stake here. /7
And with a clearly tied role for Microsoft, the complaint highlights abuses even in the most recent months. It shows this example of content lifted verbatim from a NYT report and then compares it to the approach taken by a search engine. /8
Here is the search comparison using Microsoft's own search engine. The difference in handling of copy from the content is immediately obvious and impossible to debate. /9
The complaint also steps through the preference and weighting used for sources with claims NYT-sourced content is more valued for training. And that undermining that real investment will undermine the entire market for journalism - including licensing it for future AI. /10
There are a number of examples in the complaint around weighting showing not all brands and content are equal but I found the overweighting of WebText2 a pretty good example of how "high-quality" content is given preference. /11
And Google PageRank as one of the oldest approaches on the web to sorting authority across websites... here it's noted that nearly all of the few entries above NYT are social media so significantly less helpful to training a model. /12
So back to Exhibit J. Unlike the other 220k+ pages of exhibits documenting registered works, this exhibit contains 100 examples of alleged copyright violations with nearly identical content being outputted by ChatGPT. Again, it's impossible to argue with this. /13
Here are four examples. Again, the lawsuit includes one hundred of them. You get the point. I find this exhibit to be an incredibly powerful illustration for a lawsuit that will go before a jury of Americans. Again, it's impossible to argue with this. /14
And finally the lawsuit rips a gigantic hole into the presentation of OpenAI as a benevolent nonprofit. /15
and on top of this, it also systematically walks through Microsoft's role in facilitating and contributing to the alleged copyright violations. Side note from me: Microsoft has gained one trillion dollars in market cap in 2023. /16
Finally a quote from me on all of this that I supplied to news outlets earlier today. /17
Here is a link to the full complaint and all of the exhibits. If I were you, I would start with the 69 page complaint and then skip to Exhibit J if interested. Cheers. /18 courtlistener.com/docket/6811704…
ok, this is HUGE. Late Friday, Penske (PMC) filed a wicked-smart, landmark antitrust lawsuit against Google. I've now read it in full and I'm very impressed. Importantly, it's the first antitrust suit for Google tying its AI-driven products to its adjudicated search monopoly. /1
The core claim: Google is abusing its search monopoly to force pubs to hand over content - not just for traditional search indexing but to feed its AI. Google then repurposes it to substitute them with its own services breaking the fundamental bargain of the open web. /2
Penske says this is not a fair exchange. If it weren't for Google's adjudicated monopoly power (recall Judge Mehta said they get 19x as many queries as next biggest), Google would be paying pubs for these rights or, if it didn't, they would opt out of providing them. /3
OK all ye people depressed Judge Mehta didn't order Google broken into bits this week. I'm here to cheer you up. DOJ has its other remedies trial in 16 days and just posted its PFJ (Proposed Final Judgment) now 60+ pages of brilliant detail. Let me walk you through key terms. /1
This is the 2023 US v Google adtech win - the one DCN and its premium publishers have long been much more deeply focused on. Here’s what it means for publishers of all types - and why it will be a massive win for the open web if Judge Brinkema signs on (I believe she will). /2
First, clear structural remedies. Google must divest AdX, its ad exchange, w/in 2yrs and likely DFP, its publisher ad server. No more vertical ad stack monopoly with interest conflicts. This would finally decouple tools Google can use to rig auctions and suppress pub revenues. /3
All eyes at Google on streaming NFL game tonight but Google Inc and its many monopolies have had quite the week. I’ve been absorbing on this end, some quick Friday thoughts on things missed. Bad news certainly for the public, and also DCN members, in US v Google Search case. /1
Judge Mehta said "no thanks" to helping publishers - because he said no pubs testified. Maybe that’s what retaliation fear looks like??? He also noted the unlawful conduct was about distribution deals, not deals with publishers. More on that in a minute. /2
Despite Mehta finding Google illegally maintained its 95%+ search monopoly with browser deals, he also said it’s OK for Google to keep owning Chrome - the world’s biggest browser - so they can keep paying everyone else and free riding on their own browser. All bad here. /3
Woah. Facebook just settled immediately before board members Andreessen, Thiel, Zuckerberg, Desmond-Hellmann, and Sheryl Sandberg were set to testify as to who knew what and when…depriving public of any accountability and facts in courtroom from board and officer comms. 1/3
Counter to Facebook lawyers framing yesterday, the DC AG suit isn’t dead (awaiting DC Circuit from 1/30 hearing), and NdCal shareholder suit also still alive. This is the closest to courtroom testimony after about $8B+ in settlements. 2/3
Credit to Reuters, Delaware Online who I saw actually showed up to cover. It’s likely why Facebook, Zuckerberg and its board, let this one get so close. But the grid. But today things were likely to get very very hot. 3/3
News cycles. News cycles. What I called the "mother of all lawsuits" for Facebook in 2021 goes to trial TOMORROW. Zuckerberg, Marc Andreessen, Sheryl Sandberg, Peter Thiel, other board members expected to testify live as to who knew what and when in its largest scandal ever. /1
Meanwhile, Zuckerberg and Facebook comms have successfully flooded the zone with AI-hype and exclusive CEO interviews mostly distracting the press away from a trial on how they leveraged, and allegedly abused, personal data to drive a decade of massive growth in mobile share. /2
The case involves allegations the board broke its loyalty to company (and Zuckerberg insider traded on stock) after Facebook had been long violating its FTC consent decree and other privacy laws - all covered up by nearly $8 billion in settlements ($5B alone with the FTC). /3
Woah. Exhibit list just posted for Facebook trial in DE starting in a few weeks. We finally have confirmation Sheryl Sandberg was deposed by the SEC - one week prior to Zuckerberg, which was also kept secret until a lawsuit unsealed it. Sandberg was also sanctioned in this case. /1
This matters as it gets at Who Knew What When at FB ahead of the world finding out its platform was leaking personal data for years. Zuckerberg was dodgy at best under oath to Congress, FB responses to Parliaments focused on 2018 news. But exhibits include Jan 2017 MZ emails. /2
The DE lawsuit claims Facebook's $5 billion record settlement was inflated in order to protect its CEO, Zuckerberg, and also includes (civil) insider trading claims. Zuckerberg was ordered to sit for multiple day depo this year, will have to testify live. /3