I've done a deep dive into SB 1047 over the last few weeks, and here's what you need to know:
*Nobody* should be supporting this bill in its current state. It will *not* actually cover the largest models, nor will it actually protect open source.
But it can be easily fixed!🧵
This is important, so don't just read this thread; read the 6000+ word article I just published.
In the article I explain how AI *actually* works, and why these details totally break legislation like SB 1047. Policymakers *need* to know this: answer.ai/posts/2024-06-…
SB 1047 does not cover "base models". But these are the models where >99% of compute is used. By not covering them, the bill will probably not cover any models at all.
(There are also dozens of trivial workarounds for anyone wanting to train uncovered models.)
If the "influence physical or virtual environments" constraint were removed, the effect would be to make it impossible to develop open source AI models larger than the covered threshold.
However, the stated aims of the bill are to ensure open source developers *can* comply.
Thankfully, the issues in SB 1047 can all easily be fixed by legislating the deployment of “AI Systems” and not legislating the release of “AI Models”.
Regulating the deployment of services, instead of the release of models, would not impact big tech at all, since they rarely (if ever) release large models.
So the big tech companies would be just as covered as before, and open source would be protected.
If we can't fine-tune open sourced models, then we'll be stuck with whatever values and aims the model creators had. Chinese propaganda is a very real current example of this issue (and remember that the best current open source models are Chinese).
I don't propose that we exempt AI from regulation. However, we should be careful to regulate with an understanding of the delicate balance between control and centralization, vs transparency and access, as we've done with other technologies throughout history.
Instead of "p(doom)", let's consider "p(salvation)" too, and bring a new concept to the AI safety discussion:
“Human Existential Enhancement Factor” (HEEF): the degree to which AI enhances our ability to overcome existential threats and ensure our long-term well-being.
If you care about open source AI model development, then submit your views here, where they will be sent to the authors and appear on the public record: calegislation.lc.ca.gov/Advocates/
It replaced usages of `sleep`. However, `sleep` is a POSIX standard command. GitHub Actions already assumes the existence of a great many commands, including non-POSIX ones, so replacing it with a script is an odd choice. github.com/actions/runner…
It was implemented in a way that, very obviously to nearly anyone at first glance, uses 100% CPU all the time, and will run forever unless the task happens to check the time during the correct second. github.com/actions/runner…
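To make the failure mode concrete, here's a minimal Python sketch of the pattern described above. It is not the actual script from the runner repo. The function names, the `max_polls` escape hatch, and the fake clock are all hypothetical, added only so the example terminates and runs deterministically.

```python
import itertools
import time


def spin_until_equal(seconds, clock=time.monotonic, max_polls=None):
    """Busy-wait that exits only if the elapsed whole second EQUALS the target.

    Illustrates the bug: it polls continuously (pinning a CPU core) and, if no
    poll lands inside the target second, the condition never becomes true.
    """
    start = clock()
    polls = 0
    while int(clock() - start) != seconds:
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False  # hypothetical escape hatch so the demo terminates
    return True


def spin_until_reached(seconds, clock=time.monotonic, max_polls=None):
    """Same loop, but exits once the target has been reached OR passed."""
    start = clock()
    polls = 0
    while clock() - start < seconds:
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
    return True


def skipping_clock():
    """Hypothetical clock that jumps over second 1, as if the process had been
    paused by a busy scheduler for a moment."""
    ticks = itertools.chain([0.0, 0.4, 0.9, 2.3], itertools.count(3.0, 0.5))
    return lambda: next(ticks)


# The equality check misses its window and would otherwise spin forever.
missed = spin_until_equal(1, clock=skipping_clock(), max_polls=10)
# The inequality check still exits, although it is still a busy loop.
reached = spin_until_reached(1, clock=skipping_clock(), max_polls=10)
```

Even the second version burns CPU while it waits. In ordinary code, a plain `time.sleep(seconds)` (or the `sleep` command in a shell) blocks without spinning at all.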
It's a strange time to be a programmer—easier than ever to get started, but easier to let AI steer you into frustration. We've got an antidote that we've been using ourselves with 1000 preview users for the last year: "solveit"
Today we're launching a 5 week course, including access to the new solveit platform, starting Oct 20th. If you want to join us or learn more, go here: solve.it.com
A year ago we ran a small trial titled "How To Solve It With Code". The response was so overwhelming that we closed signups after one day. We explored using our approach for very small iterations with constant feedback for web development, AI, business (with @ericries)…
For folks wondering what's happening here technically, an explainer:
When there's lots of training data with a particular style, using a similar style in your prompt will trigger the LLM to respond in that style. In this case, there's LOADS of fanfic: scp-wiki.wikidot.com/scp-series 🧵 x.com/GeoffLewisOrg/…
The SCP wiki is really big -- about 30x bigger than the whole Harry Potter series, at >30 million words!
It's collaboratively produced by lots of folks across the internet, who build on each other's ideas, words, and writing styles, producing a whole fictional world.
Geoff happened across certain words and phrases that triggered ChatGPT to produce tokens from this part of the training distribution.
And the tokens it produced triggered Geoff in turn. That's not a coincidence: the collaboratively produced fanfic is meant to be compelling!
I'm glad @levelsio checked this, but sad our contrib has been erased by later big tech co's. Alec Radford said ULMFiT inspired GPT. ULMFiT's first demo predated BERT.
Today's 3-stage LLM approach of general corpus pretraining and 2 stages of fine-tuning was pioneered by ULMFiT.
There have been many other important contributions, including attention (Bahdanau et al.), transformers, RLHF, etc.
But before all this, basically everyone in NLP assumed that each new domain needed a new model. ULMFiT showed that a large pretrained model was actually the key.
I got push-back from pretty much everyone about this. My claim that fine-tuning that model was the critical step to achieving success in NLP was not something people were ready to hear at that time.
I gave many talks trying to convince academics to pursue this direction.
Announcing fasttransform: a Python lib that makes data transformations reversible/extensible. No more writing inverse functions to see what your model sees. Debug pipelines by actually looking at your data.
We took the `Transform` class out of fastcore, replaced the custom type dispatch system with @ikwess's plum-dispatch, mixed it all together, and voila: fasttransform! :D
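For anyone wondering what "reversible" means in practice, here's a rough plain-Python sketch of the idea. It does not use fasttransform and is not its API. `Normalize`, `Scale`, and `ReversiblePipeline` are hypothetical names made up for illustration. Each step knows how to undo itself, so a pipeline can decode by running the steps in reverse order.

```python
class Normalize:
    """Hypothetical step: standardize values on encode, undo it on decode."""

    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def encode(self, xs):
        return [(x - self.mean) / self.std for x in xs]

    def decode(self, xs):
        return [x * self.std + self.mean for x in xs]


class Scale:
    """Hypothetical step: multiply on encode, divide on decode."""

    def __init__(self, factor):
        self.factor = factor

    def encode(self, xs):
        return [x * self.factor for x in xs]

    def decode(self, xs):
        return [x / self.factor for x in xs]


class ReversiblePipeline:
    """Apply steps in order to encode; undo them in reverse order to decode."""

    def __init__(self, *steps):
        self.steps = steps

    def encode(self, xs):
        for step in self.steps:
            xs = step.encode(xs)
        return xs

    def decode(self, xs):
        for step in reversed(self.steps):
            xs = step.decode(xs)
        return xs


pipe = ReversiblePipeline(Normalize(mean=20.0, std=10.0), Scale(2.0))
model_input = pipe.encode([10.0, 20.0, 30.0])  # what the model would see
recovered = pipe.decode(model_input)           # back to human-readable values
```

Because the inverse lives next to the forward step, you never write a separate "undo everything" function by hand. You just call `decode` on whatever the model sees.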