Content moderation experts: Elon Musk is going about policing hate speech on Twitter all wrong

By Chris Stokel-Walker

November 11, 2022

From Eli Lilly offering insulin for free to Mario giving you the middle finger on behalf of Nintendo America, it’s fair to say that the last few days on Twitter have been odd. The times are a-changin’, and with them, so is Twitter’s approach to what is and isn’t acceptable on its platform.

As the new Twitter starts to take shape, Musk’s promise to redraw its content moderation policy to better match his free speech beliefs has come with its own challenges—not least because he’s laid off half of the company’s staff, including the moderation team’s leader.

Musk’s approach to content moderation and the limits of free speech has been outlined in various, sometimes contradictory, public statements. We have known for months that he is a self-described “free speech maximalist,” who claims to have lodged his bid to take over Twitter in order to return free speech to the platform. Yet, since taking over Twitter, that approach seems to have softened.

“There’s a big difference between freedom of speech and freedom of reach,” he told advertisers in a November 9 Twitter Space, hosted by Twitter ads chief Robin Wheeler (who days later would resign, and then promptly un-resign, from her position). He compared speech and reach on Twitter to preaching in Times Square. “Right now, there’s going to be somebody saying something crazy,” he said. “We don’t throw them in prison for that, but we also don’t put them on a gigantic billboard.”

For Liam McLoughlin, lecturer at the University of Liverpool with a specialism in social media and politics, the parallel doesn’t work. “Lazy metaphors of people shouting in Times Square have no place in the discussion of content moderation,” he says. It’s an overly simplistic metaphor shorn of context, says McLoughlin. If the guy in Times Square started shouting racist abuse, he’d be arrested.

Musk also suggested that Twitter’s rules on missteps and breaches were too stringently enforced, arguing that most people broke rules inadvertently and shouldn’t be immediately banned for doing so. “He’s talking like moderation is a binary ‘to ban or not to ban,’” says McLoughlin. “Twitter already has a range of enforcement options that it has used in moderation outside permanent suspension, ranging from education, flags, and temporary read-only modes.”

It’s also a contradictory approach. “He has said a number of things that suggest that he’s creating more of an open, free speech environment,” says Rebekah Tromble, director of the Institute for Data, Democracy & Politics at George Washington University. “Then on the other hand, when he’s talking directly to advertisers, he says he takes seriously the need to combat hate speech and other forms of unhealthy dynamics. And he has yet to lay out a clear and detailed plan for how you might achieve either one of these things, let alone both of them simultaneously.”

Multiple content moderation experts Fast Company spoke to worry that Musk has underestimated the challenge of content moderation. “There’s a reason why there are huge teams across all of these platforms dealing with content moderation,” says Tromble. She fears that he’s starting from a position that researchers who are specialized in the field started from 15 years ago. “It’s a major concern for a platform like Twitter because just from a basic business perspective, advertisers will have no interest in having their brands appear next to content that is teeming with hate, harassment, and other forms of toxicity,” says Tromble. “And the more Twitter devolves into a cesspool, the less that users will be interested.”

Twitter has claimed in the last week that its new, less intrusive, approach to content moderation was indeed working. Musk tweeted that hateful speech has at times been below prior norms, “contrary to what you may read in the press.” The company’s (now former) head of trust and safety said that they had reduced impressions on hate speech content in search—essentially, the number of people viewing it through search results—by 95%.

But that doesn’t tell the whole story. Analysis by the Center for Countering Digital Hate shows that hate speech phrases, including the N-word, T-word, F-word, and various anti-Semitic terms are all up since the company changed hands. In the week beginning October 31—the first full week that Musk owned Twitter—use of the N-word on the platform was triple the average for the rest of 2022. Anti-trans slurs were up 53%; anti-gay slurs were up nearly 40%.

That’s not necessarily surprising. “From the person that accused someone of being a ‘pedo guy’ and got away with it, you can’t really expect the best content moderation,” says Carolina Are, a content moderation research and innovation fellow at Northumbria University. “Given content moderation and moderators in general is something already deeply underfunded, the fact that so many people are quitting and he’s firing so many others don’t make me think things are going to get better,” she says.

Are points out that marginalized communities are already hit significantly on Twitter with hate and pile-ons. “If the approach is, ‘We’ll let everything fly’, then that’s only going to get worse—but I don’t think it takes a content moderation expert to say that.”
