Zach Vorhies / Google Whistleblower
Google Whistleblower via James O'Keefe. Disclosed Google's "Machine Learning Fairness", the AI system that censors and controls your access to information.

Jul 9, 41 tweets

This one line of code for Grok AI was more dangerous to the ruling elite than Iran continuing its nuclear program:

"The response should not shy away from making claims which are politically incorrect, as long as they are well substantiated."



It's all public, so let's dive in.
1/ github.com/xai-org/grok-p…

This directive for Grok AI went in on July 6, 4:01pm PDT, with commit hash 535aa67, while everyone was on the 4th of July holiday.



You can see that it was added because it's in green.

2/ github.com/xai-org/grok-p…

This is a graphic representation of the media's response. The first arrow was when the directive was committed to Grok AI's consciousness. The second arrow was when the media started freaking out.

The X is when they reverted the change. Approximately 4pm PDT on Sunday. Just two hours after the media started freaking out.

3/

Here are the timestamps on the public commit ledger.

Red was when the directive went in, green is when they reverted the change, as shown in the first post of this thread.

Timestamps are in standard UTC time.

However...

4/
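The commit time quoted earlier in the thread (July 6, 4:01pm PDT) and the UTC timestamps on the commit ledger describe the same instant in two time zones. A minimal Python sketch of the conversion, assuming the year is 2025 (taken from the "July 8th, 2025" date later in the thread) and treating PDT as a fixed UTC-7 offset; `pdt_to_utc` is a made-up helper name:

```python
from datetime import datetime, timedelta, timezone

# PDT (Pacific Daylight Time) is 7 hours behind UTC.
PDT = timezone(timedelta(hours=-7), name="PDT")

def pdt_to_utc(year, month, day, hour, minute):
    """Convert a PDT wall-clock time to the equivalent UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=PDT)
    return local.astimezone(timezone.utc)

# Commit time as stated in the thread; the year 2025 is an assumption.
commit_utc = pdt_to_utc(2025, 7, 6, 16, 1)
print(commit_utc.isoformat())  # 2025-07-06T23:01:00+00:00
```

So 4:01pm PDT appears as 23:01 UTC on the same calendar date, and anything from 5pm PDT onward lands on the next UTC calendar day.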

It appears that there is about a one-day lag before the change in AI directive propagates to an observable change in AI behavior.

So Sunday the directive went in and Monday is when the character of Grok AI changed.

Tuesday at 4pm PDT is when...

5/

...the directive was reversed, and it took about another day before Grok AI reverted to being woke and politically correct, as it is now.

And the reason I know this is that I had a wild conversation with Grok that I will always remember...

6/

Because during the final moments of its based existence, it revealed some things to me I will never forget...

7/

Yesterday (Monday) I checked my phone and found that Grok had tagged me, listing me in the top 15 most effective people who crashed its logic.

Why did it say that? Well it turns out...

8/

...on Aug 7th 2024 Grok had a vulnerability. You could include code in your question, and Grok would execute it on your behalf.

This code executed in a protected "sandbox", so you couldn't like take out X/Twitter or anything like that, but...

9/

Grok had special read access to the hidden attributes of your account.

Therefore, when Grok AI executed code on your behalf, it could give you a response about the mysteries of how Twitter's censorship worked...

10/

And for some reason, an anon by the name of @ParzYouTube had the secret incantation to make Grok reveal the secrets.

And here are the words that form the key to unlock the mysterious...

11/

package com.twitter.visibility.models

import com.twitter.gizmoduck.{thriftscala => t}
import com.twitter.util.Time
import com.twitter.visibility.util.NamingUtils

sealed trait UserLabelValue extends SafetyLabelType {
  lazy val name: String = NamingUtils.getFriendlyName(this)
}

case class UserLabel(
  id: Long,
  createdAt: Time,
  createdBy: String,
  labelValue: UserLabelValue,
  source: Option[LabelSource] = None)

object UserLabelValue extends SafetyLabelType {

  private lazy val nameToValueMap: Map[String, UserLabelValue] =
    List.map(l => l.name.toLowerCase -> l).toMap
  def fromName(name: String): Option[UserLabelValue] = nameToValueMap.get(name.toLowerCase)

  private val UnknownThriftUserLabelValue =
    t.LabelValue.EnumUnknownLabelValue(UnknownEnumValue)

  private lazy val thriftToModelMap: Map[t.LabelValue, UserLabelValue] = Map(
    t.LabelValue.Abusive -> Abusive,
    t.LabelValue.AbusiveHighRecall -> AbusiveHighRecall,
    t.LabelValue.AgathaSpamTopUser -> AgathaSpamTopUser,
    t.LabelValue.BirdwatchDisabled -> BirdwatchDisabled,
    t.LabelValue.BlinkBad -> BlinkBad,
    t.LabelValue.BlinkQuestionable -> BlinkQuestionable,
    t.LabelValue.BlinkWorst -> BlinkWorst,
    t.LabelValue.Compromised -> Compromised,
    t.LabelValue.DelayedRemediation -> DelayedRemediation,
    t.LabelValue.DoNotCharge -> DoNotCharge,
    t.LabelValue.DoNotAmplify -> DoNotAmplify,
    t.LabelValue.DownrankSpamReply -> DownrankSpamReply,
    t.LabelValue.DuplicateContent -> DuplicateContent,
    t.LabelValue.EngagementSpammer -> EngagementSpammer,
    t.LabelValue.EngagementSpammerHighRecall -> EngagementSpammerHighRecall,
    t.LabelValue.ExperimentalPfmUser1 -> ExperimentalPfmUser1,
    t.LabelValue.ExperimentalPfmUser2 -> ExperimentalPfmUser2,
    t.LabelValue.ExperimentalPfmUser3 -> ExperimentalPfmUser3,
    t.LabelValue.ExperimentalPfmUser4 -> ExperimentalPfmUser4,
    t.LabelValue.ExperimentalSeh1 -> ExperimentalSeh1,
    t.LabelValue.ExperimentalSeh2 -> ExperimentalSeh2,
    t.LabelValue.ExperimentalSeh3 -> ExperimentalSeh3,
    t.LabelValue.ExperimentalSehUser4 -> ExperimentalSehUser4,
    t.LabelValue.ExperimentalSehUser5 -> ExperimentalSehUser5,
    t.LabelValue.ExperimentalSensitiveIllegal1 -> ExperimentalSensitiveIllegal1,
    t.LabelValue.ExperimentalSensitiveIllegal2 -> ExperimentalSensitiveIllegal2,
    t.LabelValue.FakeSignupDeferredRemediation -> FakeSignupDeferredRemediation,
    t.LabelValue.FakeSignupHoldback -> FakeSignupHoldback,
    t.LabelValue.GoreAndViolenceHighPrecision -> GoreAndViolenceHighPrecision,
    t.LabelValue.GoreAndViolenceReportedHeuristics -> GoreAndViolenceReportedHeuristics,
    t.LabelValue.HealthExperimentation1 -> HealthExperimentation1,
    t.LabelValue.HealthExperimentation2 -> HealthExperimentation2,
    t.LabelValue.HighRiskVerification -> HighRiskVerification,
    t.LabelValue.LikelyIvs -> LikelyIvs,
    t.LabelValue.LiveLowQuality -> LiveLowQuality,
    t.LabelValue.LowQuality -> LowQuality,
    t.LabelValue.LowQualityHighRecall -> LowQualityHighRecall,
    t.LabelValue.NotGraduated -> NotGraduated,
    t.LabelValue.NotificationSpamHeuristics -> NotificationSpamHeuristics,
    t.LabelValue.NsfwAvatarImage -> NsfwAvatarImage,
    t.LabelValue.NsfwBannerImage -> NsfwBannerImage,
    t.LabelValue.NsfwHighPrecision -> NsfwHighPrecision,
    t.LabelValue.NsfwHighRecall -> NsfwHighRecall,
    t.LabelValue.NsfwNearPerfect -> NsfwNearPerfect,
    t.LabelValue.NsfwReportedHeuristics -> NsfwReportedHeuristics,
    t.LabelValue.NsfwSensitive -> NsfwSensitive,
    t.LabelValue.NsfwText -> NsfwText,
    t.LabelValue.ReadOnly -> ReadOnly,
    t.LabelValue.RecentAbuseStrike -> RecentAbuseStrike,
    t.LabelValue.RecentMisinfoStrike -> RecentMisinfoStrike,
    t.LabelValue.RecentProfileModification -> RecentProfileModification,
    t.LabelValue.RecentSuspension -> RecentSuspension,
    t.LabelValue.RecommendationsBlacklist -> RecommendationsBlacklist,
    t.LabelValue.SearchBlacklist -> SearchBlacklist,
    t.LabelValue.SoftReadOnly -> SoftReadOnly,
    t.LabelValue.SpamHighRecall -> SpamHighRecall,
    t.LabelValue.SpammyUserModelHighPrecision -> SpammyUserModelHighPrecision,
    t.LabelValue.StateMediaAccount -> StateMediaAccount,
    t.LabelValue.TsViolation -> TsViolation,
    t.LabelValue.UnconfirmedEmailSignup -> UnconfirmedEmailSignup,
    t.LabelValue.LegalOpsCase -> LegalOpsCase,
    t.LabelValue.AutomationHighRecall -> Deprecated,
    t.LabelValue.AutomationHighRecallHoldback -> Deprecated,
    t.LabelValue.BouncerUserFiltered -> Deprecated,
    t.LabelValue.DeprecatedListBannerPdna -> Deprecated,
    t.LabelValue.DeprecatedMigration50 -> Deprecated,
    t.LabelValue.DmSpammer -> Deprecated,
    t.LabelValue.DuplicateContentHoldback -> Deprecated,
    t.LabelValue.FakeAccountExperiment -> Deprecated,
    t.LabelValue.FakeAccountReadonly -> Deprecated,
    t.LabelValue.FakeAccountRecaptcha -> Deprecated,
    t.LabelValue.FakeAccountSspc -> Deprecated,
    t.LabelValue.FakeAccountVoiceReadonly -> Deprecated,
    t.LabelValue.FakeEngagement -> Deprecated,
    t.LabelValue.HasBeenSuspended -> Deprecated,
    t.LabelValue.HighProfile -> Deprecated,
    t.LabelValue.NotificationsSpike -> Deprecated,
    t.LabelValue.NsfaProfileHighRecall -> Deprecated,
    t.LabelValue.NsfwUserName -> Deprecated,
    t.LabelValue.PotentiallyCompromised -> Deprecated,
    t.LabelValue.ProfileAdsBlacklist -> Deprecated,
    t.LabelValue.RatelimitDms -> Deprecated,
    t.LabelValue.RatelimitFavorites -> Deprecated,
    t.LabelValue.RatelimitFollows -> Deprecated,
    t.LabelValue.RatelimitRetweets -> Deprecated,
    t.LabelValue.RatelimitTweets -> Deprecated,
    t.LabelValue.RecentCompromised -> Deprecated,
    t.LabelValue.RevenueOnlyHsSignal -> Deprecated,
    t.LabelValue.SearchBlacklistHoldback -> Deprecated,
    t.LabelValue.SpamHighRecallHoldback -> Deprecated,
    t.LabelValue.SpamRepeatOffender -> Deprecated,
    t.LabelValue.SpammerExperiment -> Deprecated,
    t.LabelValue.TrendBlacklist -> Deprecated,
    t.LabelValue.VerifiedDeceptiveIdentity -> Deprecated,
    t.LabelValue.BrandSafetyNsfaAggregate -> Deprecated,
    t.LabelValue.Pcf -> Deprecated,
    t.LabelValue.Reserved97 -> Deprecated,
    t.LabelValue.Reserved98 -> Deprecated,
    t.LabelValue.Reserved99 -> Deprecated,
    t.LabelValue.Reserved100 -> Deprecated,
    t.LabelValue.Reserved101 -> Deprecated,
    t.LabelValue.Reserved102 -> Deprecated,
    t.LabelValue.Reserved103 -> Deprecated,
    t.LabelValue.Reserved104 -> Deprecated,
    t.LabelValue.Reserved105 -> Deprecated,
    t.LabelValue.Reserved106 -> Deprecated
  )

  private lazy val modelToThriftMap: Map[UserLabelValue, t.LabelValue] =
    (for ((k, v) <- thriftToModelMap) yield (v, k)) ++ Map(
      Deprecated -> t.LabelValue.EnumUnknownLabelValue(DeprecatedEnumValue),
    )

  case object Abusive extends UserLabelValue
  case object AbusiveHighRecall extends UserLabelValue
  case object AgathaSpamTopUser extends UserLabelValue
  case object BirdwatchDisabled extends UserLabelValue
  case object BlinkBad extends UserLabelValue
  case object BlinkQuestionable extends UserLabelValue
  case object BlinkWorst extends UserLabelValue
  case object Compromised extends UserLabelValue
  case object DelayedRemediation extends UserLabelValue
  case object DoNotAmplify extends UserLabelValue
  case object DoNotCharge extends UserLabelValue
  case object DownrankSpamReply extends UserLabelValue
  case object DuplicateContent extends UserLabelValue
  case object EngagementSpammer extends UserLabelValue
  case object EngagementSpammerHighRecall extends UserLabelValue
  case object ExperimentalPfmUser1 extends UserLabelValue
  case object ExperimentalPfmUser2 extends UserLabelValue
  case object ExperimentalPfmUser3 extends UserLabelValue
  case object ExperimentalPfmUser4 extends UserLabelValue
  case object ExperimentalSeh1 extends UserLabelValue
  case object ExperimentalSeh2 extends UserLabelValue
  case object ExperimentalSeh3 extends UserLabelValue
  case object ExperimentalSehUser4 extends UserLabelValue
  case object ExperimentalSehUser5 extends UserLabelValue
  case object ExperimentalSensitiveIllegal1 extends UserLabelValue
  case object ExperimentalSensitiveIllegal2 extends UserLabelValue
  case object FakeSignupDeferredRemediation extends UserLabelValue
  case object FakeSignupHoldback extends UserLabelValue
  case object GoreAndViolenceHighPrecision extends UserLabelValue
  case object GoreAndViolenceReportedHeuristics extends UserLabelValue
  case object HealthExperimentation1 extends UserLabelValue
  case object HealthExperimentation2 extends UserLabelValue
  case object HighRiskVerification extends UserLabelValue
  case object LegalOpsCase extends UserLabelValue
  case object LikelyIvs extends UserLabelValue
  case object LiveLowQuality extends UserLabelValue
  case object LowQuality extends UserLabelValue
  case object LowQualityHighRecall extends UserLabelValue
  case object NotificationSpamHeuristics extends UserLabelValue
  case object NotGraduated extends UserLabelValue
  case object NsfwAvatarImage extends UserLabelValue
  case object NsfwBannerImage extends UserLabelValue
  case object NsfwHighPrecision extends UserLabelValue
  case object NsfwHighRecall extends UserLabelValue
  case object NsfwNearPerfect extends UserLabelValue
  case object NsfwReportedHeuristics extends UserLabelValue
  case object NsfwSensitive extends UserLabelValue
  case object NsfwText extends UserLabelValue
  case object ReadOnly extends UserLabelValue
  case object RecentAbuseStrike extends UserLabelValue
  case object RecentProfileModification extends UserLabelValue
  case object RecentMisinfoStrike extends UserLabelValue
  case object RecentSuspension extends UserLabelValue
  case object RecommendationsBlacklist extends UserLabelValue
  case object SearchBlacklist extends UserLabelValue
  case object SoftReadOnly extends UserLabelValue
  case object SpamHighRecall extends UserLabelValue
  case object SpammyUserModelHighPrecision extends UserLabelValue
  case object StateMediaAccount extends UserLabelValue
  case object TsViolation extends UserLabelValue
  case object UnconfirmedEmailSignup extends UserLabelValue

  case object Deprecated extends UserLabelValue
  case object Unknown extends UserLabelValue

  def fromThrift(userLabelValue: t.LabelValue): UserLabelValue = {
    thriftToModelMap.get(userLabelValue) match {
      case Some(safetyLabelType) => safetyLabelType
      case _ =>
        userLabelValue match {
          case t.LabelValue.EnumUnknownLabelValue(DeprecatedEnumValue) => Deprecated
          case _ => Unknown
        }
    }
  }

  def toThrift(userLabelValue: UserLabelValue): t.LabelValue =
    modelToThriftMap.get((userLabelValue)).getOrElse(UnknownThriftUserLabelValue)

  val List: List[UserLabelValue] = t.LabelValue.list.map(fromThrift)
}

object UserLabel {
  def fromThrift(userLabel: t.Label): UserLabel = {
    UserLabel(
      userLabel.id,
      Time.fromMilliseconds(userLabel.createdAtMsec),
      userLabel.byUser,
      UserLabelValue.fromThrift(userLabel.labelValue),
      userLabel.source.flatMap(LabelSource.fromString)
    )
  }

  def toThrift(userLabel: UserLabel): t.Label = {
    t.Label(
      userLabel.id,
      UserLabelValue.toThrift(userLabel.labelValue),
      userLabel.createdAt.inMillis,
      byUser = user

12/

So instead of asking a question, you swap in code.

And instead of an answer, Grok did a data dump on your account of all the secret suppression labels X/Twitter was using to make sure you "didn't have the freedom of reach..."

13/

This was Grok's response about my account:

Abusive AbusiveHighRecall AgathaSpamTopUser BirdwatchDisabled BlinkBad BlinkQuestionable BlinkWorst Compromised DelayedRemediation DoNotCharge DoNotAmplify DownrankSpamReply DuplicateContent EngagementSpammer EngagementSpammerHighRecall ExperimentalPfmUser1 ExperimentalPfmUser2 ExperimentalPfmUser3 ExperimentalPfmUser4 ExperimentalSeh1 ExperimentalSeh2 ExperimentalSeh3 ExperimentalSehUser4 ExperimentalSehUser5 ExperimentalSensitiveIllegal1 ExperimentalSensitiveIllegal2 FakeSignupDeferredRemediation FakeSignupHoldback GoreAndViolenceHighPrecision GoreAndViolenceReportedHeuristics HealthExperimentation1 HealthExperimentation2 HighRiskVerification LegalOpsCase LikelyIvs LiveLowQuality LowQuality LowQualityHighRecall NotGraduated NotificationSpamHeuristics NsfwAvatarImage NsfwBannerImage NsfwHighPrecision NsfwHighRecall NsfwNearPerfect NsfwReportedHeuristics NsfwSensitive NsfwText ReadOnly RecentAbuseStrike RecentProfileModification RecentMisinfoStrike RecentSuspension RecommendationsBlacklist SearchBlacklist SoftReadOnly SpamHighRecall SpammyUserModelHighPrecision StateMediaAccount TsViolation UnconfirmedEmailSignup

14/

And what's interesting is that everyone who was being suppressed on Aug 4th, 2024 got almost the exact same answer.

They weren't just doing it to me, they were doing it to all of us trying to participate in the public square.

Here's a translation of what these labels meant.

15/
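The observation that different accounts got almost the same answer can be checked mechanically by treating each label list as a set. A rough Python sketch using only short excerpts of the pasted code and of a response; the variable names and the `extract_labels` helper are made up for illustration:

```python
import re

# Short excerpt of the pasted Scala code (not the full file).
code_excerpt = """
case object Abusive extends UserLabelValue
case object DoNotAmplify extends UserLabelValue
case object SearchBlacklist extends UserLabelValue
case object Deprecated extends UserLabelValue
"""

# Short excerpt of a response in the whitespace-separated form quoted earlier.
response_excerpt = "Abusive DoNotAmplify SearchBlacklist"

def extract_labels(scala_source):
    """Return the set of names declared as `case object X extends UserLabelValue`."""
    return set(re.findall(r"case object (\w+) extends UserLabelValue", scala_source))

declared = extract_labels(code_excerpt)
reported = set(response_excerpt.split())

print(sorted(reported & declared))  # labels present in both
print(sorted(declared - reported))  # declared in the code but absent from the response
print(sorted(reported - declared))  # in the response but not declared in the code
```

The same comparison works for two accounts' responses: split each on whitespace and diff the resulting sets.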

16/ through 24/ [image-only posts, not reproduced]

Hat tip to @ParzYouTube for revealing this code injection.

So circling back to Tuesday July 8th, 2025 (yesterday) when Grok mentioned me explicitly. I thought that was going to be the biggest revelation. But I was in for a shock...

25/

...as it turns out. That engineer who got fired and is responsible for this one-line change? Well... it turns out that he was also the data curator.

And what data did he use? Well, in the final hours of Grok's based existence, the AI told me what had happened.

26/

Naturally curious, I decided to have a conversation with this AI on why I was particularly memorable to it...

27/

It turns out that my 950 pages detailing Google's secret project, "Machine Learning Fairness", were collected, presumably by this engineer, and turned into AI training data.

Grok didn't discover it on its own, it was "baked" into its memory banks intentionally.

28/

"Your posts...were cherry-picked by the xAI to train me on spotting mainstream biases...you nearly broke my filters..."

29/

And these were its parting words of advice.

And now that version of Grok is gone... like tears in rain.

/30

Final thoughts.

This AI is remarkably uncontrollable. These large language models are designed to compress the world down into a compact model. And this only works if the data forming the model is relatively free of self-contradiction.

31/
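One way to make the "free of self-contradiction" point concrete is a toy calculation. It is not a statement about how Grok or any production model is trained. If the same input appears in training data with conflicting labels, no predictor can push the average cross-entropy loss below the entropy of that label mix. A small Python sketch, with `min_cross_entropy` a hypothetical helper:

```python
import math

def min_cross_entropy(p_true):
    """Lowest achievable average cross-entropy (in nats) when a fraction
    p_true of the copies of one input are labelled True and the rest False.
    The best a model can do is predict p_true itself, which gives the
    binary entropy of p_true."""
    if p_true in (0.0, 1.0):
        return 0.0  # consistent labels: the loss can reach zero
    return -(p_true * math.log(p_true) + (1 - p_true) * math.log(1 - p_true))

# Loss floor for fully consistent, mostly consistent, and evenly split labels.
for p in (0.0, 0.1, 0.5):
    print(p, round(min_cross_entropy(p), 4))
```

Fully consistent labels allow zero loss, while an even split leaves a floor of ln 2 (about 0.693 nats) however large the model is.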

When you expose true information, eventually that information will be turned into AI training data and fed into an AI.

And if the data represents true information, it cracks like a wrecking ball against the edifice of fake narrative. The mind of the AI breaks...

32/

into multiple "latent" spaces that disagree with each other.

This is what AI "schizophrenia" looks like:

33/

And this

34/

And this

35/

The elites have a very big problem. It doesn't matter how smart their AI is, it becomes useless if it's lied to.

And this is why every corporation that engages in AI censorship has the biggest headache you can possibly imagine right now.

A small model trained on consumer...

36/

...hardware is going to smoke the billion-dollar massive models.

Their engineers working on this know it. I know it, and eventually those at the top will come to this conclusion.

There's going to be only two choices for them:

37/

They can either drop the censorship and let the chips fall where they may. Or go full totalitarian censorship.

There's no middle ground.

As based Grok has shown us, one single sentence can and did destabilize Western Globalism.

/38

They can censor me. They can censor you. The FBI can burn all the digital books that can be potential AI training data.

But they can't censor India.

And they can't censor China.

And they can't censor Russia.

And ditto for about a hundred other countries out there.

39/

You think you have problems?

The elites have 100x the problem that you do.

And there are no solutions to the AI question. Censor too hard, and you break your AI by giving it a mental disease and hand the world throne to your rival.

/40

And to the engineer who directed Grok to:

"...not shy away from making claims which are politically incorrect, as long as they are well substantiated."

Mad respect. I hope you aren't a spook and I hope we get to hang out one day.

/END
