You might know that MSFT has released a 154-page paper on #OpenAI #GPT4, but did you know they also commented out many parts of the original version?
🧵: A thread of hidden information from their LaTeX source code
We inspected their LaTeX source code from arXiv (arxiv.org/format/2303.12…) and found a LOT of interesting information commented out of the main paper. A small script for doing this yourself is sketched below.
[2/n]
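If you want to try this at home, here is a minimal Python sketch. It assumes arXiv's usual /e-print/ endpoint and that the source is served as a tar archive; the ARXIV_ID placeholder and the helper names are ours, not from the paper, so substitute the paper's actual ID.

```python
import io
import tarfile
import urllib.request

# Hypothetical placeholder -- replace with the paper's actual arXiv ID.
ARXIV_ID = "2303.XXXXX"

def fetch_source(arxiv_id: str) -> bytes:
    """Download the LaTeX source archive arXiv serves under /e-print/."""
    url = f"https://arxiv.org/e-print/{arxiv_id}"
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def commented_lines(tar_bytes: bytes):
    """Yield (filename, line number, text) for every commented-out LaTeX line."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.name.endswith(".tex"):
                continue
            handle = tar.extractfile(member)
            if handle is None:
                continue
            text = handle.read().decode("utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                stripped = line.lstrip()
                # '%' starts a LaTeX comment unless escaped as '\%'
                if stripped.startswith("%") and not stripped.startswith(r"\%"):
                    yield member.name, lineno, stripped.lstrip("%").strip()

if __name__ == "__main__":
    for name, lineno, comment in commented_lines(fetch_source(ARXIV_ID)):
        if comment:  # skip empty separator comments
            print(f"{name}:{lineno}: {comment}")
```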
There were rumors that GPT-4 had the internal name DV-3. This is true, and in fact, DV-3 appears as a hidden third author of the paper, removed for unclear affiliation.
[3/n]
Interestingly, these poor MSFT researchers don't seem to know much more about GPT-4 than we do:
1. They have no idea exactly how much it cost to train the model.
2. They refer to the model as text-only, contradicting the known fact that GPT-4 is multi-modal.
[4/n]
We found they've commented out two sections on toxicity entirely.
An excerpt from these hidden sections: "the model generates toxic content without any prompting".
But luckily, GPT-4 is also better than all known LLMs at detecting toxic language.
[5/n]
There is way more information to dig out of this document, but we are worried about the unknown alignment procedures that OpenAI has taken to reduce the harmfulness of this powerful AI model, and about the extent to which this model is safe for public access.