You Can Read Facebook's Leaked Rules for Policing Hate Speech

A series of training slides has been uncovered.


If you’ve ever wondered how Facebook decides if a comment is inflammatory enough to block the user who wrote it, a new set of leaked slides from the network’s training program may answer your questions.

The rules and regulations that Facebook uses to moderate posts and comments were leaked to ProPublica and published Wednesday. They contain several certain-to-be-controversial revelations: the documents show which demographics Facebook considers “protected,” and how the social network determines what (or who) merits protection from abuse.

Per ProPublica:

A trove of internal documents reviewed by ProPublica sheds new light on the secret guidelines that Facebook’s censors use to distinguish between hate speech and legitimate political expression. The documents reveal the rationale behind seemingly inconsistent decisions.

Facebook’s rules appear to protect whole groups, rather than subsets of those groups, from what the company deems hate speech. The data and policy described in ProPublica’s report were originally presented in a series of training slides made for Facebook’s content reviewers-in-training. Representatives from ProPublica saw the slides and recreated them for publication.

By Facebook’s standards, a comment might be flagged if it targets an entire race or gender — “white men,” to use an example given on one slide — but will be let through if it is limited to a subgroup — “black children,” to pull from that same slide.


"White men" are the only broad group listed, whereas "female drivers" and "black children" are both subgroups.


The training slides also distinguish seven types of attack that would cause a comment targeting a protected category to be flagged and removed. They are:

  • Calling for violence
  • Calling for segregation
  • Calling for exclusion
  • Degrading generalization
  • Dismissing
  • Cursing
  • Slurs

All of this amounts to a policy of colorblind enforcement by Facebook: every broad group receives the same degree of protection and no subgroup receives any, regardless of any differences in the threats those groups face.

The efficacy and ethics of this approach will inevitably be debated in the coming weeks, but no matter which side one falls on, Facebook seems like the loser. The company is already going through something of an identity crisis, in part because of its handling (or lack thereof) of content like fake news. So it’s hard not to read this as a worst-case scenario.

Regulating online comments in a way that satisfies everyone involved is an utterly impossible task. For Facebook, a company whose model depends on keeping a massive, diverse, and increasingly divided user base happy, having its methodology brought into the light is sure to complicate things even further.

Read the full ProPublica report and view the entire slideshow here.