Today we’re recognizing #SaferInternetDay and encouraging conversations around a healthy, safe, and open internet.
This month, we're leading education workshops for youth and families, amplifying our partners' campaigns, and working with safety industry leaders around the world.
Learn more about the organizations we’re working with on our blog.
To support #SaferInternetDay, here are a few ways to control your own experience on Twitter.
For words or topics you don’t want to see, go to Settings and mute them.
Choose who views your Tweets. Block an account so they can’t see your Tweets or DM you.
We use technology to detect Tweets that may break our rules, so we can review them faster and you won’t have to report them.
But if you see something before we do, please report it.
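The detection flow described above, where model-flagged Tweets are surfaced for faster review alongside user reports, can be sketched roughly as a score-ordered review queue. This is a hypothetical illustration; the field names, scores, and scoring model are invented, not Twitter's actual system.

```python
# Hypothetical sketch: order a human-review queue by model risk score,
# so likely rule-breaking Tweets are reviewed first without waiting
# for user reports. All data and fields here are illustrative.
import heapq

def build_review_queue(tweets):
    """Return Tweet ids ordered by descending model risk score."""
    ranked = heapq.nlargest(len(tweets), tweets,
                            key=lambda t: t["risk_score"])
    return [t["id"] for t in ranked]

detected = [
    {"id": 1, "risk_score": 0.2},
    {"id": 2, "risk_score": 0.9},  # flagged proactively by the model
    {"id": 3, "risk_score": 0.6},  # user-reported, medium risk
]
print(build_review_queue(detected))  # [2, 3, 1]
```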
We know that it’s critical to work with both you and our partners in shaping new policies.
We recently used your feedback to develop a rule on synthetic and manipulated media. Starting March 5, we will label Tweets with deceptively altered media to provide context.
Throughout 2020, we will continue supporting our safety partners and working openly with you to make Twitter a safer place for conversations. #SaferInternetDay
Posting a reminder on how to appeal a suspended account, along with a few tips. We are processing appeals in less than 48 hours in most cases, but we do require that users appeal their suspensions directly (you can't appeal on someone else's behalf). Some tips for a successful appeal…
Explain the policy violation that led to the suspension, or if you're not sure, ask why your account was suspended. Soon the suspension reason will be displayed on the account page, but we will continue to send email notifications as well.
Let us know whether you believe the suspension reason is an error, or whether you acknowledge the policy violation and are taking steps to avoid similar violations in the future. Appeals that do not include this information are likely to be denied.
We’re adding more transparency to the enforcement actions we take on Tweets. As a first step, you’ll soon start to see labels on some Tweets identified as potentially violating our rules around Hateful Conduct, letting you know that we’ve limited their visibility. 🧵
These actions will be taken at the Tweet level only and will not affect a user’s account. Restricting the reach of Tweets helps us move beyond binary “leave up versus take down” content moderation decisions and supports our freedom of speech, not freedom of reach approach.
We may get it wrong occasionally, so authors will be able to submit feedback on the label if they think we incorrectly limited their content’s visibility. In the future, we plan to allow authors to appeal our decision to limit a Tweet’s visibility.
Sharing information about media that we have determined is not allowed under our policies. This thread covers the most common questions being asked.
We don't immediately detect every violating image on our platform the minute it is posted. It may be detected proactively by our models and algorithms, or through user reports. New events and information can result in more severe treatment of content.
Once we determine media is not allowed, we run automated processes that find and restrict Tweets containing it; this is the only way we can remove it fast and at scale. Rules are applied to everyone, and reviewing context is not possible at thousands of Tweets per hour.
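One common way an automated "find and restrict" pass works at scale is fingerprint matching against media already ruled out. The sketch below is a hypothetical, simplified illustration using exact SHA-256 hashes; the function names and enforcement flow are assumptions, and production systems typically use perceptual hashes (e.g. PhotoDNA-style) to also catch re-encoded copies.

```python
# Hypothetical sketch of hash-based media matching for automated
# enforcement. All names and logic are illustrative, not Twitter's
# actual implementation.
import hashlib

# Fingerprints of media already determined to violate policy.
blocked_media_hashes = set()

def register_violating_media(media_bytes: bytes) -> None:
    """Record a fingerprint of media that has been ruled out."""
    blocked_media_hashes.add(hashlib.sha256(media_bytes).hexdigest())

def should_restrict(media_bytes: bytes) -> bool:
    """Exact-match check against known violating media.

    Real systems use perceptual hashing so resized or re-encoded
    copies of the same image still match.
    """
    return hashlib.sha256(media_bytes).hexdigest() in blocked_media_hashes

register_violating_media(b"example-violating-image-bytes")
print(should_restrict(b"example-violating-image-bytes"))  # True
print(should_restrict(b"different-image-bytes"))          # False
```

Because the check is a set lookup, it runs in constant time per Tweet, which is what makes this kind of rule enforceable at thousands of Tweets per hour without per-item context review.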
We recently partnered with @Sprinklr for an independent assessment of hate speech on Twitter, a topic we’ve been sharing data on publicly for several months.
Sprinklr’s AI-powered model found that the reach of hate speech on Twitter is even lower than our own model quantified 🧵
What’s driving the difference? The context of conversation and how we determine toxicity.
Sprinklr defines hate speech more narrowly by evaluating slurs in the nuanced context of their use. Twitter has, to this point, taken a broader view of the potential toxicity of slur usage.
To quantify hate speech, Twitter & Sprinklr start with 300 of the most common English-language slurs. We count not only how often they’re tweeted but how often they’re seen (impressions).
Our models score slur Tweets on “toxicity,” the likelihood that they constitute hate speech.
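The measurement described above, counting impressions rather than just Tweet volume, amounts to an impression-weighted reach metric. Here is a hypothetical sketch of that calculation; the toxicity scores, threshold, and sample data are invented, and neither Twitter's nor Sprinklr's actual models are shown.

```python
# Hypothetical sketch of an impression-weighted "reach of hate speech"
# metric. All numbers below are illustrative.

def hate_speech_reach(tweets, toxicity_threshold=0.8):
    """Share of all impressions from Tweets whose toxicity score
    meets or exceeds the threshold."""
    total = sum(t["impressions"] for t in tweets)
    toxic = sum(t["impressions"] for t in tweets
                if t["toxicity"] >= toxicity_threshold)
    return toxic / total if total else 0.0

sample = [
    {"toxicity": 0.95, "impressions": 120},   # likely hate speech
    {"toxicity": 0.30, "impressions": 5000},  # slur in a benign/reclaimed context
    {"toxicity": 0.85, "impressions": 80},
]
print(hate_speech_reach(sample))  # 200 / 5200, i.e. under 4% of impressions
```

Note how a narrower definition of hate speech, modeled here by raising the threshold or lowering scores for contextually benign slur use, yields a lower measured reach, which mirrors why Sprinklr's more context-sensitive model reported a lower figure than Twitter's broader one.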
We’re moving faster than ever to make Twitter safer and keep child sexual exploitation (CSE) material off our platform. Here’s an update on our work:
Our recent approach is more aggressive: we proactively and severely limit the reach of any content we detect may contain CSE material, moving swiftly to remove the content and suspend the bad actor(s) involved.
In January, we suspended ~404k accounts that created, distributed, or engaged with this content, which represents a 112% increase in CSE suspensions since November.
As we shared earlier, we have been proactively reinstating previously suspended accounts. Starting February 1, anyone can appeal an account suspension and be evaluated under our new criteria for reinstatement.
We did not reinstate accounts that engaged in illegal activity, threats of harm or violence, large-scale spam and platform manipulation, or when there was no recent appeal to have the account reinstated.
Going forward, we will take less severe actions, such as limiting the reach of policy-violating Tweets or asking you to remove Tweets before you can continue using your account. Account suspension will be reserved for severe or ongoing, repeat violations of our policies.