Facebook is cracking down on hate speech – but to what effect?

A critical article on Facebook’s use of moderators and AI as a method for countering hate speech

Moderator looking at screen

The push for social platforms to remove hate speech and illegal speech has risen dramatically with the global pandemic, as some online communities search for a scapegoat during this difficult time. Many are “urging governments to ‘act now to strengthen the immunity of our societies against the virus of hate’” (Peters, 2020). Online platforms are more obligated than ever to enforce Europe’s 2016 Code of Conduct on Countering Illegal Hate Speech Online.

Over the years Facebook has been under relentless fire for its inaction on hate speech, with a large proportion of the blame falling on its content moderators and its loose and unspecific guidelines (Sharma, 2020; Effron, 2020; Sonnemaker, 2020). Recently, however, it has become more involved in the fight to diminish hate speech online, introducing AI to better detect it (Dansby et al., 2020). In its guidelines, Facebook states that there is no tolerance for any “direct attack on people based on what we call protected characteristics — race, ethnicity, national origin, religious affiliation, sexual orientation, caste, sex, gender, gender identity, and serious disease or disability.” But has the introduction of this AI aided the moderation of the platform, or is it just making matters worse?

Rule ‘benders’

“Content moderation is hard. This should be obvious, but it is easily forgotten. Moderation is hard because it is resource intensive and relentless; because it requires making difficult and often untenable distinctions; because it is wholly unclear what the standards should be; and because one failure can incur enough public outrage to overshadow a million quiet successes.” (Gillespie, 2018)

With the introduction of AI that is able to flag content more easily using keywords and phrases, it has become easier for posts and comments to be marked as potential hate speech for moderators to examine. But with the increase in moderation, people who were not originally informed about what could be considered ‘hate speech’ are learning from their mistakes and finding ways to navigate around the rules. These people, who may originally have known little about how the platform moderates hate speech, are now users who “know the rules because they’re determined to break them.” (Gillespie, 2018) In the example below, we see users navigating the space of racial hate speech by using codewords such as “yt people” as a replacement for “white people” in order to prevent their comments being flagged by moderators. Although in the context of this post there is no ill intention about race, the practice of using codewords to refer to race, gender, sexual orientation and so on has spread through many Facebook groups with various agendas, making it more difficult for AI to flag possible hate speech.

Screenshot of Facebook Comments
Facebook comments describing the use of alternative descriptors to avoid being flagged by the Facebook hate speech algorithm. Retrieved from: https://www.facebook.com/groups/imaginenot/permalink/3335479496568485/

People seem to be constantly finding ways around moderation. Although AI may make it easier to flag posts, it may also make offending posts harder to find in the first place as people devise new ways around it, rendering this new, improved form of moderation ineffective.
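The evasion described in this section can be illustrated with a toy sketch. It assumes a purely phrase-based flagger, which is a deliberate simplification and not how Facebook's actual detection system works; the watch list, the function name flag_for_review and the sample comments are all hypothetical, with the substitution itself taken from the screenshot above.

```python
import re

# Hypothetical watch list for illustration only. The phrase comes from the
# example in this article; a real system would not be a simple phrase list.
WATCHED_PHRASES = {"white people"}


def flag_for_review(comment, watched=WATCHED_PHRASES):
    """Return True if the comment contains any watched phrase.

    Matching is case-insensitive and collapses repeated whitespace, but it
    has no notion of codewords or context.
    """
    normalized = re.sub(r"\s+", " ", comment.lower())
    return any(phrase in normalized for phrase in watched)


# Two invented comments that differ only in the descriptor used.
original = "White people in this group keep getting flagged"
coded = "yt people in this group keep getting flagged"

print(flag_for_review(original))  # True: the watched phrase is present
print(flag_for_review(coded))     # False: the codeword is not on the list
```

Adding “yt people” to the list would catch this one substitution, but users could simply move to another codeword, which is the cycle this section describes.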

Increase in anonymity and toxic techno-cultures

Retrieved from: https://www.pxfuel.com/en/free-photo-jcaiv

Aside from the fact that many communities in the centre of political engagement on Facebook are finding ways to bend the rules – rendering new moderation technologies ineffective, moderation of content is also having adverse effects on the community by indirectly promoting toxic techno-cultures.

According to Facebook’s “Enforcing Community Standards”, when a single profile or page has produced many posts or comments that have required a moderator’s intervention or removal, the profile may be temporarily blocked or suspended. Although on the surface this may seem a step in the right direction, within political communities on Facebook it has become standard to create multiple anonymous profiles in order to avoid being silenced and to sustain the racial discourse (Farkas, 2018).

“Anonymity, or the condition of being unknown (nameless) to others, is considered a major determinant of disinhibitive behavior.” (Lapidot-Lefler & Barak, 2012) This increase in anonymous profiles as a result of moderator intervention only exacerbates the problem: as more anonymous profiles are created, the people behind them shed their inhibitions and contribute harsher and more problematic content to online discourse.

Being moderated as a ‘Badge of Honour’

“Some in the right wing considered the label a badge of honor, … It is being rolled into the larger narrative about conservatives being silenced by social media companies.” (Donovan, 2020)

Although for many people being blocked or suspended would be an inconvenience and something we would much rather avoid, some users see it as a “badge of honor” that supports their theory that they are being silenced or censored for speaking the ‘truth’ and are part of some larger resistance. The image below is a prime example of this, taken from a member of the Facebook group “Americans against blm”.

Facebook Group Post
Retrieved from: https://www.facebook.com/groups/404285083888462/permalink/423426365307667/
This narrative has prompted users to continue their output of hate speech on social media, as they believe that they are martyrs for doing so, and it encourages others within these toxic techno-cultures to do the same.
I suggest that by silencing these groups for their hate speech, moderation is affirming their beliefs and motivating them to push their agenda more aggressively than before, fanning the flames of these toxic techno-cultures.

Screenshot taken from ‘Americans against blm’ Facebook group

Relevance to other platforms

Facebook is currently one of the most influential platforms, and many other social media platforms seem to follow its lead. If Facebook is falling short in its management of hate speech, we can be sure that this is an issue that will be faced by the majority of platforms that allow users to express themselves through language or visuals. “It matters how Facebook sets and enforces rules, even if you’re not on Facebook.” (Gillespie, 2018)

What are the implications?

Despite the implicit need to prevent hate speech online, moderators and AI programs seem to be exacerbating the problem. Although it is an imperfect system, the alternative of completely unmoderated self-expression would be much worse, as we have previously seen in the history of the internet before the introduction of moderators. Moderators have become necessary in this day and age (Bengani et al., 2018), but we must also consider the possibility of a less flawed alternative. Even setting aside the possible issues of human error and bias in platform moderation, an exploration of the adverse effects that silencing hate speech can have points to the need for a different system, as it is evident that the current one has many faults.

Although I personally have no solution to this, and do continue to believe hate speech has no place in this world, I can only suggest that attempting to silence people is the wrong approach.

Bengani, P., Ananny, M., & Bell, E. (2018). Controlling the Conversation: The Ethics of Social Platforms and Content Moderation. Retrieved from https://academiccommons.columbia.edu/doi/10.7916/D84F3751

Code of Conduct on Countering Illegal Hate Speech Online. (2016). Retrieved from http://ec.europa.eu/justice/fundamental-rights/files/hate_speech_code_of_conduct_en.pdf

Dansby, R., Fang, H., Ma, H., Moghbel, C., Ozertem, U., & Peng, X. (2020). AI advances to better detect hate speech. Retrieved from https://ai.facebook.com/blog/ai-advances-to-better-detect-hate-speech/

Donovan, J. (2020). Interview for NBC News [In person]. https://www.nbcnews.com/tech/social-media/facebook-twitter-put-warning-label-edited-video-biden-n1153506.

Effron, O. (2020). Facebook will ban Holocaust denial posts under hate speech policy. Retrieved from https://edition.cnn.com/2020/10/12/tech/facebook-holocaust-denial-hate-speech/index.html

Enforcing Our Community Standards – About Facebook. (2020). Retrieved from https://about.fb.com/news/2018/08/enforcing-our-community-standards/

Gillespie, T. (2018). Custodians of the internet: platforms, content moderation, and the hidden decisions that shape social media. Yale University Press.

Guynn, J. (2019). Facebook while black: Users call it getting ‘Zucked,’ say talking about racism is censored as hate speech. Retrieved from https://www.usatoday.com/story/news/2019/04/24/facebook-while-black-zucked-users-say-they-get-blocked-racism-discussion/2859593002/

Lapidot-Lefler, N., & Barak, A. (2012). Effects of anonymity, invisibility, and lack of eye-contact on toxic online disinhibition. Retrieved from https://www.sciencedirect.com/science/article/pii/S0747563211002317?casa_token=gl10_IDoKv4AAAAA:dFD0v9x-zYBVhFfrNwEk0N12SqD7AXaLwhHpxRw4o6vCeBoPmUPsO02AklOx7gzak0DXFi0

Peters, M. (2020). Limiting the capacity for hate: Hate speech, hate groups and the philosophy of hate. Retrieved from https://www.tandfonline.com/doi/full/10.1080/00131857.2020.1802818

Seth, S. (2020). Protected by online anonymity, hate speech becomes an online mainstay. Retrieved from https://edition.cnn.com/2010/LIVING/08/16/online.anonymity/index.html

Sharma, N. (2020). Allegations of favouritism in India couldn’t have come at a worse time for Facebook. Retrieved from https://qz.com/india/1893001/zuckerbergs-facebook-under-fire-over-hate-speech-modis-bjp/

Sonnemaker, T. (2020). Facebook ‘did nothing’ about violent militia and hate groups for 5 years despite being warned at least 10 times, advocacy group says. Retrieved from https://www.businessinsider.com.au/facebook-ignored-warnings-violent-anti-muslim-militia-hate-groups-2015-2020-9?r=US&IR=T