Researchers at Carnegie Mellon University recently published a study seeking to understand why a YouTube chess channel with over one million subscribers had a live interview pulled for "harmful and dangerous content." Their findings affirm a widely held view that moderation by artificial intelligence still has much room for improvement.
Antonio Radić, the host of the YouTube channel, told WIRED that the live-streamed video, a discussion of opening chess moves with grandmaster Hikaru Nakamura, remained offline for 24 hours before it was reinstated. In a statement to WIRED, YouTube said only that removing the video was a mistake.
The situation piqued the interest of CMU's Ashique KhudaBukhsh and Rupak Sarkar, who designed an experiment to see whether YouTube's algorithms were being tripped up by chess terminology like "white" and "black" pieces, defenses, and attacks.
The artificial in AI — The group trained two versions of the BERT language model to classify text as racist or not racist, one on data from the racist site Stormfront and the other on Twitter data. The team then ran the classifiers on comments from 8,000 chess videos and found that while only about one percent of comments were flagged as hate speech, 80 percent of those flags were false positives, meaning the comments weren't actually racist.
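The evaluation arithmetic behind those figures is simple to sketch. The function below is a toy illustration (the name and the round-number inputs are hypothetical, chosen to mirror the reported 1 percent flag rate and 80 percent false-positive share), not the researchers' actual code:

```python
# Toy illustration of the study's evaluation arithmetic (hypothetical names/inputs).
def flagging_summary(total_comments, flagged, false_positives):
    """Return the flag rate and the share of flags that were wrong."""
    flag_rate = flagged / total_comments
    false_positive_share = false_positives / flagged
    return flag_rate, false_positive_share

# Round numbers mirroring the reported result: ~1% of comments flagged,
# ~80% of those flags not actually racist.
rate, fp_share = flagging_summary(total_comments=100_000, flagged=1_000, false_positives=800)
print(f"flagged: {rate:.0%}, false positives among flags: {fp_share:.0%}")
```

The takeaway is that a low flag rate can hide a high error rate: most comments pass through untouched, yet most of the comments that are flagged are flagged wrongly.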
Like an artificial flower, artificial intelligence is a crude imitation of the real thing. At the end of the day, what algorithms do is simple pattern matching based on data they've been fed by humans. But when they encounter an edge case they haven't seen, they're apt to get it wrong. The results of the test suggest that algorithms trained to identify racist terminology haven't been shown enough examples of chess discussions to understand the context of "black" and "attack" appearing in a sentence.
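The failure mode is easy to reproduce with a deliberately crude keyword matcher. This toy detector is not any platform's actual system; it just shows how context-free pattern matching misreads chess talk:

```python
# Toy hate-speech "detector" based on context-free keyword matching.
# An illustration of the failure mode, not any real moderation system.
TRIGGER_WORDS = {"black", "white", "attack", "threat"}

def naive_flag(text):
    """Flag text if it contains two or more trigger words, ignoring context."""
    words = {w.strip(".,!?'").lower() for w in text.split()}
    return len(words & TRIGGER_WORDS) >= 2

chess_comment = "White's attack on black is brutal."
print(naive_flag(chess_comment))  # a harmless chess comment gets flagged
```

A system like this has no way to tell a chess commentary from a threat; only more (and more varied) training data, or additional context, can break the tie.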
WIRED did its own experiment, running this sentence through hate-speech algorithms from Facebook and Google: "White’s attack on black is brutal. White is stomping all over black’s defenses. The black king is gonna fall." Both algorithms judged it more than 60 percent likely to be hate speech.
Algorithmic policing — Yejin Choi, an associate professor at the University of Washington, says that if companies like Google want to rely on algorithms to police their platforms, they'll have to invest more resources and try new methods. Algorithms work better, she says, when they analyze not just a piece of text in isolation but also the user's comment history and the nature of the channel where the comment is posted.
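One way to read Choi's suggestion is as extra inputs to the decision, alongside the text score. The sketch below is purely hypothetical (the scores, weights, and thresholds are invented for illustration), but it shows the shape of a context-aware rule:

```python
# Hypothetical sketch of context-aware moderation: combine a text-only
# toxicity score with channel and user signals before deciding to flag.
def moderate(text_toxicity, channel_topic, user_prior_flags):
    """Return True to flag. text_toxicity is a 0-1 score from a text-only model."""
    score = text_toxicity
    # If the channel's topic plausibly explains the vocabulary (e.g. chess
    # commentary using "white," "black," "attack"), discount the text score.
    if channel_topic in {"chess", "strategy games"}:
        score *= 0.5
    # A history of confirmed violations raises suspicion (capped).
    score += 0.1 * min(user_prior_flags, 3)
    return score >= 0.6

# A comment scored 0.65 by a text-only model, posted on a chess channel
# by a user with no prior flags, would no longer be flagged:
print(moderate(0.65, "chess", 0))
```

The design choice is the point, not the numbers: the same sentence gets a different verdict depending on where and by whom it was posted, which is exactly the context a text-only model throws away.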
Even as algorithms improve, they will likely never be 100 percent bulletproof. Users often find crafty ways to evade detection, such as writing in code or replacing letters with numbers. And algorithms have been found to be biased against certain demographics, such as people of color, because they're trained on data produced and labeled by humans, whose own inherent biases shape it.