What's interesting about this proposal in the context of #generative AI and Copyright: the place with the most creator-friendly legislation can have jurisdiction. Under the Berne Convention, the applicable law is that of the place where infringement occurs, so that could be the most creator-friendly place of your choosing.
That is to say, proposals for "allowing #generative AI research under ethical standards [...]" that rest on Copyright are already covered: under Berne, the applicable jurisdiction is automatically the most human-friendly, pro-creator one involved (e.g. wherever the data is hosted or the service is used).
Thus, we have the tools to make this happen for #generative systems. Just need the guts to enforce them now before the U.S. court system normalizes exploitation that goes against Berne...
Subtweeting @GaryMarcus on this. I'm pretty sure we have the legislation we need, just need #AiEthics to grow a pair and step up.
(Using triggering words on purpose to stimulate action here. 😂 Those "breaking things" move much faster.)
For #generative, I actually expect international Copyright to break down within the next few years with people abandoning Berne implicitly or explicitly.
(Could happen sooner as we move into a multipolar cold-war scenario; we're mostly there already, just waiting for the launch trailer.)
UPDATE: About the Cease & Desist sent to @StabilityAI with a deadline of March 1st: I got confirmation of receipt but no formal reply.
The C&D included many points detailing how training and distribution must be conducted to be EU-compliant, and they had no answer to any of them.
I have proceeded in good faith & assuming the best, but I now believe:
- Their involvement in SD 1.x, training of 2.x and likely 3.x is non-compliant.
- They know this is the case and are doing their best to cover up.
- Everything you hear from them is carefully crafted PR only.
Stability believes it can raise another $250 million (rumoured valuation: $4b). Even if they get caught in 2 years, by the time all the court cases and appeals are done they will have unethically cornered the market and destroyed communities without consent.
Looking more closely, I think Success Kid also *ate* sand and still had that defiant look — like he's staring down the universe. "I'd do it again."
NOTE: Fingers are not required to have the look of success. In fact, if you have mashed-up AI fingers and still have the look of success, it would enhance the effect.
⚖️ Rule #1 ⚖️
If there's a paid BigTech #AI API already offering the feature or model you want to open-source, then it's ethical to open-source yours! 💯
Rationale: Bad actors are already using the API and BigTech is already profiting from that.
⚖️ Rule #2 ⚖️
If your model is less than 10% better on average for a variety of commonly used benchmarks compared to existing open-source models, then it's ethical to open-source yours! 💯
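For what it's worth, here's a rough sketch of how I'd check Rule #2's threshold; the benchmark names and scores are made-up placeholders, and I'm simply averaging relative improvement:

```python
# Sanity check for Rule #2: is the new model less than 10% better, on
# average, than existing open-source models across common benchmarks?
# Benchmark names and scores below are made-up placeholders.

def mean_relative_improvement(new_scores, open_scores):
    """Average relative improvement of new_scores over open_scores."""
    gains = [(new_scores[b] - open_scores[b]) / open_scores[b] for b in new_scores]
    return sum(gains) / len(gains)

new_model   = {"benchmark_a": 71.0, "benchmark_b": 55.2, "benchmark_c": 80.1}
open_source = {"benchmark_a": 68.5, "benchmark_b": 54.0, "benchmark_c": 77.0}

gain = mean_relative_improvement(new_model, open_source)
print(f"average improvement: {gain:.1%}")
print("under 10% -> Rule #2 says open-sourcing is fine" if gain < 0.10
      else "over 10% -> Rule #2 does not apply")
```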
⚖️ Rule #3 ⚖️
If a dedicated team of best-in-class professionals could reproduce your system in less than a week, then it's ethical to open-source yours! 💯
That's a wrap, folks, show's over! Diffusion models as *lossy* databases that can regenerate their training data:
"Diffusion models are less private than prior generative models [...] mitigating these vulnerabilities may require new advances in privacy-preserving training."
I have successfully compiled and run GLM-130B on a local machine! It's now running in `int4` quantization mode and answering my queries.
I'll explain the installation below; if you have any questions, feel free to ask! github.com/THUDM/GLM-130B
130B parameters on 4x 3090s is impressive. GPT-3, for reference, is 175B parameters, but it may well be over capacity for the data & compute it was trained on...
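Back-of-the-envelope math on why int4 is what makes 130B fit on 4x 3090s (weights only, my own rough numbers):

```python
# Rough memory math for running GLM-130B on 4x RTX 3090.
# Weights-only estimate; activations, KV cache and overhead not counted.
PARAMS = 130e9
GPU_MEM_GB = 24          # per RTX 3090
N_GPUS = 4

for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    weights_gb = PARAMS * bytes_per_param / 1e9
    fits = weights_gb <= GPU_MEM_GB * N_GPUS
    print(f"{name}: ~{weights_gb:.0f} GB of weights "
          f"({'fits' if fits else 'does not fit'} in {GPU_MEM_GB * N_GPUS} GB total)")
# fp16: ~260 GB -> does not fit
# int8: ~130 GB -> does not fit (only 96 GB total)
# int4:  ~65 GB -> fits, with room left for activations
```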
I feel like a #mlops hacker having got this to work! (Though it should be much easier than it was.)
To get GLM to work, the hardest part was building the FasterTransformer fork with CMake. I'm not a fan of CMake; I don't think anyone is.
I had to install the cuDNN libraries manually into my conda environment, then hack CMakeCache.txt to point to them...
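For anyone repeating this, here's the gist of that CMakeCache.txt hack as a sketch; the cache variable names (CUDNN_INCLUDE_DIR, CUDNN_LIBRARY) and paths are my guess at what the fork's build wants, so check your own cache file for the exact keys:

```python
# Sketch of the CMakeCache.txt hack: point the cuDNN cache entries at the
# libraries installed inside the active conda environment.
# The variable names below are assumptions; inspect your own CMakeCache.txt
# for the exact keys the FasterTransformer fork's CMake scripts use.
import os
from pathlib import Path

cache = Path("build/CMakeCache.txt")        # adjust to your build directory
prefix = Path(os.environ["CONDA_PREFIX"])   # active conda environment

overrides = {
    "CUDNN_INCLUDE_DIR": str(prefix / "include"),
    "CUDNN_LIBRARY": str(prefix / "lib" / "libcudnn.so"),
}

lines = []
for line in cache.read_text().splitlines():
    key = line.split(":", 1)[0]
    if key in overrides:
        # Cache entries look like NAME:TYPE=VALUE; keep the type, swap the value.
        name_and_type, _, _ = line.partition("=")
        line = f"{name_and_type}={overrides[key]}"
    lines.append(line)

cache.write_text("\n".join(lines) + "\n")
print("Patched", cache)
```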
Write to GitHub and ask for the contact details of their Data Protection Officer. If they refuse, explain that providing those contact details is mandated by the GDPR. Then ask the DPO what their policy is on personally identifying information under the GDPR. Post the response!
If you go via Support, you'll probably have to ask for the DPO contact three times, because frontline Google Support is not GDPR-aware and will refuse a few times to see if you're serious.
Not sure if that's intentional or incompetence!
Remember that the DPO is "protected from internal interference" from the organization itself (GitHub, Google), so if they give you 'internal policy' as an excuse, remind them that their role mandates that they prioritize data & privacy issues first and foremost.