We recently partnered with @Sprinklr for an independent assessment of hate speech on Twitter, a topic we’ve been sharing data on publicly for several months.
Sprinklr’s AI-powered model found that the reach of hate speech on Twitter is even lower than our own model quantified 🧵
What’s driving the difference? The context of conversation and how we determine toxicity.
Sprinklr defines hate speech more narrowly by evaluating slurs in the nuanced context of their use. Twitter has, to this point, taken a broader view of the potential toxicity of slur usage.
To quantify hate speech, Twitter & Sprinklr start with 300 of the most common English-language slurs. We count not only how often they’re tweeted but also how often they’re seen (impressions).
Our models score slur Tweets on “toxicity,” the likelihood that they constitute hate speech.
Our focal metric is hate speech impressions, not the number of Tweets containing slurs.
Most slur usage is not hate speech, but when it is, we work to reduce its reach. Sprinklr’s analysis found that hate speech receives 67% fewer impressions per Tweet than non-toxic slur Tweets.
No model is ever perfect, and this work is never done. We’ll continue to combat hate speech by incorporating other languages, new terms, and more precise methodologies, all while increasing transparency.
We’re moving faster than ever to make Twitter safer and keep child sexual exploitation (CSE) material off our platform. Here’s an update on our work:
Our approach is now more aggressive: we proactively and severely limit the reach of any content we detect may contain CSE material. This includes moving swiftly to remove the content and suspend the bad actor(s) involved.
In January, we suspended ~404k accounts that created, distributed, or engaged with this content, a 112% increase in CSE suspensions compared to November.
As we shared earlier, we have been proactively reinstating previously suspended accounts. Starting February 1, anyone can appeal an account suspension and be evaluated under our new criteria for reinstatement.
We did not reinstate accounts that engaged in illegal activity, threats of harm or violence, or large-scale spam and platform manipulation, or accounts with no recent appeal for reinstatement.
Going forward, we will take less severe actions, such as limiting the reach of policy-violating Tweets or asking you to remove Tweets before you can continue using your account. Account suspension will be reserved for severe or ongoing, repeat violations of our policies.
We’ve heard from some of you that it’s not always clear which behaviors create spam for other users and can result in your account being enforced against for platform manipulation. Here are some things to avoid:
Don’t post the same (or almost the same) content or links over and over again, especially in threads where the topic is not directly related to the content or links you are posting.
Don’t mention accounts an excessive number of times, especially with the same type of content and/or links. This could also lead to users blocking you and/or reporting your behavior as targeted harassment, particularly when it includes hateful conduct.
We’re committed to helping everyone on Twitter keep their accounts safe and secure, and that means helping owners of compromised accounts regain access and control.
Here’s what to do if your account has been compromised:
If you can still log in, start by changing your password. Make sure your email address is secure, review third-party apps that can access your account, and consider enabling two-factor authentication (2FA).
We’ve updated our Private Information policy to prohibit sharing someone else’s live location in most cases. Here’s what changed and why. 🧵
When someone shares an individual’s live location on Twitter, there is an increased risk of physical harm. Moving forward, we’ll remove Tweets that share this information, and accounts dedicated to sharing someone else’s live location will be suspended.
You can still share your own live location on Twitter. Tweets that share someone else’s historical (not same-day) location information are also not prohibited by this policy.
Here’s an update on our recent efforts to reduce the reach of hateful speech on Twitter. 🧵
Counting the number of Tweets that contain a specific slur is not an accurate way to measure hateful speech. Context matters, and not all occurrences of slur words are used in a hateful way. Slur words may be used in counterspeech, reclaimed phrases, and song lyrics, for example.
People will still see slur words in Tweets when they follow an account that uses them. However, we will not amplify Tweets containing slurs or hate speech, and we will not serve ads adjacent to those Tweets.