The Problem of Automation & Content Moderation

Can automating the process of content moderation solve platforms' governance woes?

"Illustration of content moderation assembly line" by Paru Ramesh, all rights reserved


This web essay argues that while automation is an “appealing idea”, it is not a foolproof solution and “may be contrary to the principles of governance that platforms should be pursuing” (Gillespie, 2020, p. 2).

The first part of the essay will provide a brief overview of the genesis of automated content moderation, set in the context of wider communication trends. This will be followed by an assessment of who benefits from Artificial Intelligence (AI) moderating content across platforms, and who does not. Finally, there will be an examination of the implications of automated content moderation for the everyday internet user.

So, what is automated content moderation?

In the context of social media platforms, the term “automated content moderation” refers to the machine learning tools that are used to make governance decisions for user-generated content (Gorwa, Binns & Katzenbach, 2020). They are platforms' first line of defence when it comes to detecting and preventing violations of community guidelines, such as hate speech, child abuse and organised crime (Gorwa et al., 2020).

There are two types of machine learning tools used by platforms to moderate content:

  1. The most common is “pattern matching,” which involves comparing content to known blacklisted examples for similarity (Gillespie, 2020).
  2. The other is “prediction” based, and involves algorithms “inducing generalisations” about content from known features of categories (Gorwa et al., 2020, p.5).
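To make the first approach concrete, here is a minimal sketch of hash-based pattern matching, using a hypothetical blacklist of fingerprints. Production systems such as PhotoDNA use perceptual hashes that survive re-encoding and cropping; an exact SHA-256 hash is used here purely for illustration.

```python
import hashlib

# Hypothetical blacklist of fingerprints of known violating content.
BLACKLIST = {hashlib.sha256(b"known violating image bytes").hexdigest()}

def matches_blacklist(content: bytes) -> bool:
    """Return True if this content's fingerprint is on the blacklist."""
    return hashlib.sha256(content).hexdigest() in BLACKLIST

print(matches_blacklist(b"known violating image bytes"))  # True
print(matches_blacklist(b"harmless holiday photo"))       # False
```

An exact match like this is cheap and has essentially no false positives, which is why it is the most common approach; the trade-off, discussed later, is that even tiny edits to the content defeat it.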

It is important to note that human moderators still constitute part of the moderation process. According to Cambridge Consultants (2019), they become involved for three reasons:

  1. If the machine learning tools are uncertain if the content is harmful
  2. If published content is identified as harmful by users
  3. If users make an appeal following the removal of content

The process of automated content moderation and the involvement of human moderators is summarised in the flowchart below.

“Flow chart of the content moderation process” by Cambridge Consultants, all rights reserved
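The escalation logic in the flowchart can be sketched as a simple routing function. The confidence thresholds below are hypothetical; real platforms tune them per policy area and combine many signals.

```python
# A minimal sketch of the escalation logic, with hypothetical thresholds.
def route(harm_score: float, user_reported: bool = False,
          appealed: bool = False) -> str:
    """Decide the fate of content given a classifier's harm score (0-1)."""
    if user_reported or appealed:
        return "human review"          # reasons 2 and 3 above
    if harm_score >= 0.9:
        return "remove automatically"  # high confidence: act without a human
    if harm_score <= 0.1:
        return "publish"               # high confidence the content is benign
    return "human review"              # reason 1: the model is uncertain

print(route(0.95))                 # remove automatically
print(route(0.50))                 # human review
print(route(0.02))                 # publish
print(route(0.02, appealed=True))  # human review
```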

The genesis of automated content moderation

Content moderation has been around since the beginning of online communication (Gorwa et al., 2020). However, it became automated as online forums grew in scale, making manual moderation impractical (Gillespie, 2020).

Automated content moderation was first used on USENET in 1993, with the development of a program called “Automated Retroactive Minimal Moderation,” which was designed to tackle abuse on USENET's online bulletins (Gorwa et al., 2020).

Automated content moderation was later adopted by Wikipedia in 2001, with the development of “automated bots” to address the “vandalism” of Wikipedia pages (Gorwa et al., 2020, p. 3). Wikipedia (2020) regards vandalism as “any editing” that interferes with Wikipedia's mission to create a “free” and accurate “encyclopaedia” for the world, as illustrated in the example below.

“Example of Wikipedia Vandalism on Charlie Sheen’s Wikipedia page” by boredpanda, all rights reserved

Like USENET and Wikipedia, social media platforms have had to adopt automated content moderation practices due to exponential growth in the number of users, and the amount and variety of content published over the past decade (Gillespie, 2020), as depicted in the graph below.

“Graph showing number of people using social media platforms, 2004 to 2018” by Statista and TNW, all rights reserved

The development of Germany's Network Enforcement Act (NetzDG) in 2018 and the European Union's code of conduct on hate speech in 2016 have also prompted the use of AI in moderation, due to the short time frames platforms are given to take down prohibited content (Gorwa et al., 2020).

“GIPHY of handshake in front of the European Union Flag,” by European Commission, all rights reserved

Most recently, in April 2020, automated systems almost exclusively moderated content, as human moderators were sent home due to COVID-19 (Gillespie, 2020). Click to see Twitter's, Facebook's and YouTube's statements regarding the announcement.

Who are the winners & losers of automated content moderation?

The winners

Arguably, human moderators benefit from the use of AI in the content moderation process. Moderation work is taxing on moderators' mental health, with many experiencing issues like PTSD and anxiety (Gillespie, 2020). By spotting copies of violating material and predicting instances of abuse, automation can reduce the caseload of moderators and ease their mental burden.

Have a look at this video for a deeper insight into the trauma caused by moderation work, which necessitates the use of automated content moderation services.

Platforms also benefit from the use of AI technology in content moderation, as it helps them enforce their community guidelines and meet their obligations under NetzDG and the EU code of conduct (Gorwa et al., 2020). The graphs below demonstrate how Facebook and Instagram have become better able to detect and remove content containing hate speech as automation has improved and been extended to other languages (Facebook, 2020).

“Graph depicting Facebook’s proactive rate at detecting violating content on Facebook,” by Facebook, all rights reserved
“Graph depicting Facebook’s proactive rate at detecting violating content on Instagram,” by Facebook, all rights reserved

Automated content moderation services also protect copyright holders, as pattern-matching tools allow copyrighted material to be detected easily and removed quickly (Gorwa et al., 2020). This practice is seen on YouTube, where copyright holders can choose to remove copyrighted material or receive a portion of the advertising revenue for its use (Gorwa et al., 2020). For more information on how automation is used to detect copyrighted material on YouTube, check out the quick explainer below.

The Losers

Unfortunately, when platforms use AI in content moderation, marginalised groups are often negatively affected, as the software is not advanced enough to reflect “cultural nuances” (Srinivasan, 2020). Most recently, Facebook blocked photos of a cultural ceremony in Vanuatu because they did not meet its community guidelines; the photos are depicted below (Srinivasan, 2020).

“Photo of traditional Vanuatu ceremony that was blocked on Facebook,” by Witnol Benko, all rights reserved

The censoring of the images by Facebook illustrates the bias of automated tools towards Western cultures, and demonstrates how these tools can promote and entrench discrimination against marginalised groups (Srinivasan, 2020).

Machine learning tools also negatively impact journalists and online activists, as the tools are unable to account for differences in context (Gillespie, 2020). This was shown when algorithms blocked journalists who had shared information on terrorist groups in Syria for “glorifying terrorism” (Scott & Kayali, 2020). Similar problems have occurred for articles about COVID-19 deniers, as articulated in the tweet below.

Automated moderation systems cannot account for the complexity of copyright and cannot make fair-use judgments, meaning content is often over-blocked (Gorwa et al., 2020).

Automated systems will also never be able to fully replace human moderators (Mack, 2019). This means the poor working conditions and mental health of moderators remain a huge issue that platforms must address (Mack, 2019).

To learn more about why automation is not the sole solution to platforms' moderation problems, listen to this conversation between UCLA academic Sarah Roberts and Verge editor Nilay Patel.

How does automated content moderation affect you (the ordinary internet user)?

As platforms increasingly rely on AI to make moderation decisions, more mistakes will be made (Gorwa et al., 2020), whether that be blocking sources of news information that are in line with community guidelines (Gorwa et al., 2020), or failing to remove content that violates them, as was seen in France, where Twitter's AI failed to remove hate speech on anti-racist activists' accounts (Scott & Kayali, 2020).

The shift towards machine learning tools will also see an increase in people “gaming systems” (Mack, 2019). This involves editing photos, videos and other content in a way that makes it hard for machine learning tools to detect abhorrent material (Matamoros-Fernandez & Kaye, 2020). This was recently the case when a suicide video was mixed with “harmless cat videos” and distributed across TikTok and Instagram for a number of days without successful detection or prevention by AI software (Matamoros-Fernandez & Kaye, 2020). With more people evading moderation, the safety of online platforms may become compromised (Mack, 2019).
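The mechanics of this evasion are easy to demonstrate against exact pattern matching: a trivial edit to a file produces an entirely different cryptographic fingerprint, so a known-bad video that has been spliced or re-encoded no longer matches the blacklist. The byte strings below are stand-ins for real media.

```python
import hashlib

original = b"frames of a known violating video"
edited   = b"frames of a known violating video."  # a single appended byte

h_original = hashlib.sha256(original).hexdigest()
h_edited = hashlib.sha256(edited).hexdigest()

# The two fingerprints share nothing, so an exact-match blacklist is defeated.
print(h_original == h_edited)  # False
```

This is why platforms favour perceptual hashes and prediction-based classifiers for adversarial content, though, as the TikTok case shows, determined editing can evade those too.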

Automated content moderation will also have a disproportionate impact on minority groups, as discussed previously (Gorwa et al., 2020). This will remain the case until platforms build sensitivity to different languages and cultures into their algorithms (Srinivasan, 2020).


Overall, this web essay has demonstrated the automation of content moderation to be an issue fraught with complexity. The size of social media platforms necessitates the use of AI to moderate content. However, its implementation has been shown to be counterproductive in some areas, leading to increased instances of discrimination, over- and under-blocking, and system gaming. Ultimately, while platforms champion AI as the “panacea for the ills of social media”, there is still a critical need for human intelligence (Gillespie, 2020, p. 2). Thus, platforms should focus their efforts on creating a sustainable, efficient and inclusive hybrid model of content moderation that involves both AI and humans.


References

Cambridge Consultants. (2019). Use of AI in Online Content Moderation. Retrieved from Ofcom website:

Facebook. (2020). Community Standards Enforcement Report. Retrieved from Facebook website:

Gillespie, T. (2020). Content moderation, AI, and the question of scale. Big Data & Society, 7(2), 1-5. doi:10.1177/2053951720943234

Gorwa, R., Binns, R., & Katzenbach, C. (2020). Algorithmic content moderation: Technical and political challenges in the automation of platform governance. Big Data & Society, 7, 1-15. doi: 10.1177/2053951719897945.

Mack, Z. (2019, July 2). Why AI can’t fix content moderation. The Verge. Retrieved from

Matamoros-Fernandez, A., & Kaye, D. B. (2020, September 9). TikTok suicide video: It's time social media platforms collaborated to limit disturbing content. ABC News. Retrieved from

Scott, M., & Kayali, L. (2020, October 21). What happened when humans stopped managing social media content. Politico. Retrieved from

Srinivasan, P. (2020, October 23). Facebook blocks user for nudity in photos of Indigenous Vanuatu ceremony. ABC News. Retrieved from

Wikipedia. (2020). Vandalism on Wikipedia. Retrieved from Wikipedia website:

Multimedia References

Benko, W. (2020). This photo of men on the island of Pentecost wearing traditional nambas coverings around their waists was mistakenly banned by Facebook’s automated system [Image]. Retrieved from

Bored Panda. (2017). Wikipedia Vandalism on Charlie Sheen’s Wikipedia page [Screenshot]. Retrieved from

Cambridge Consultants. (2019). The content moderation workflow combines automated systems and human moderators for pre-, post- and reactive moderation [Diagram]. Retrieved from

Derella, M. (2020, April 1). An update on our continuity strategy during COVID-19 [Blog post]. Retrieved from

European Commission. (2019). Stars Peace GIF By European Commission [Giphy]. Retrieved from

Facebook. (2020). Of the violating content we actioned, how much did we find before users reported it? [Graph]. Retrieved from

Facebook. (2020). Of the violating content we actioned, how much did we find before users reported it? [Graph]. Retrieved from

Jin, K. (2020, March 19). Keeping People Safe and Informed About the Coronavirus [Blog post]. Retrieved from

Ortiz-Ospina, E. (2019). Number of people using social media platforms, 2004 to 2018 [Graph]. Retrieved from

Ramesh, P. (2019). People working at separate desks on screens that are fed onto an assembly line, leading to a large screen [Illustration]. Retrieved from

Roberts, S., & Patel, N. (Presenter). (2019, July 2). Why big companies will never get content management right, with UCLA’s Sarah T Roberts [Audio podcast]. Retrieved from

SachaBaronCohen. (2020, October 14). AI criticism [Twitter post]. Retrieved from

The Verge. (2019, June 19). Inside the traumatic life of a Facebook moderator [Video file]. Retrieved from

YouTube Creators. (2018, October 8). Fair Use – Copyright on YouTube [Video file]. Retrieved from

YouTube. (2020, March 16). Coronavirus disease 2019 (COVID-19) updates [Blog post]. Retrieved from

About Greta Salgo
Greta Salgo is a student at the University of Sydney, studying Bachelor of Arts/Bachelor of Advanced Studies, majoring in Digital Cultures and Psychology.