AndriyMulyar
Apr 11, 4 tweets, 1 min read
GPT4All does not support or subvert specific political ideologies or choose winners.

google.com/amp/s/news.yah…

open source the data
open source the models
#gpt4all.
As governments realize this foundational technology challenges their power, we can expect more of these types of rulings.

On the research side, this indicates that agendas centered around controllable LLMs will explode. Controlling LLMs is about controlling their training data.
What happens in a world where Pepsi pays companies to upsample instances of Pepsi over Coke in pretraining/fine-tuning data, building hidden but corporately useful biases into the LLM?

GPT4All is a foil against this world.
This type of BYOB (build your own bias) future can only be prevented by careful data work and open models.

More from @andriy_mulyar

Apr 13
Announcing GPT4All-J: The First Apache-2 Licensed Chatbot That Runs Locally on Your Machine💥
github.com/nomic-ai/gpt4a…

Large Language Models must be democratized and decentralized.
We improve on GPT4All by:
- increasing the number of clean training data points
- removing the GPL-licensed LLaMA from the stack
- releasing easy installers for OSX/Windows/Ubuntu
Details in the technical report: s3.amazonaws.com/static.nomic.a…
GPT4All-J is packaged in an easy-to-use installer. You are a few clicks away from a locally running large language model that can
- answer questions about the world
- write poems and stories
- draft emails and copy
all without the need for internet access.
gpt4all.io
Mar 28
I'm excited to announce the release of GPT4All, a 7B param language model finetuned from a curated set of 400k GPT-3.5-Turbo assistant-style generations.
We release 💰800k data samples💰 for anyone to build upon and a model you can run on your laptop!
Real-time Sampling on M1 Mac
Inspired by learnings from Alpaca, we carefully curated ~800k prompt-response samples to produce 430k high-quality assistant-style prompt/generation training pairs including code, dialogue, and stories.

Detailed procedure for replication and data: github.com/nomic-ai/gpt4a…
Some samples (out of training set)
Valid Python generation with markdown