New anti-harassment AI program has 90% success rate

And Big Tech needs to notice.

Yahoo has developed an AI algorithm that it says can correctly detect up to 90 percent of abusive comments online, making it outperform other "state-of-the-art" deep-learning-based algorithms, according to a report by the algorithm's developers.

"While automatically detecting abusive language online is an important topic and task, the prior [abuse detection] has not been very unified, thus slowing progress ... [abuse] can have a profound impact on the civility of a community or a user’s experience," the developers wrote.

The algorithm used a mix of machine learning and crowdsourced abuse detection to scan the comment sections of Yahoo News and Finance.



Currently, most abusive language detectors are keyword-based systems. The problem is that abusers can dodge the filters by steering clear of flagged words or coming up with new slang. Additionally, these systems are bad at reading context or sarcasm.
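To make that weakness concrete, here is a minimal sketch of a keyword-based filter. It is only an illustration of the general approach, not any company's actual system, and the `BLOCKLIST` words and the `keyword_flag` function are hypothetical names chosen for the example.

```python
import re

# Hypothetical blocklist for illustration only; real systems use far larger lists.
BLOCKLIST = {"idiot", "moron"}


def keyword_flag(comment: str) -> bool:
    """Flag a comment if any of its words appears on the blocklist."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return any(token in BLOCKLIST for token in tokens)


# A direct insult is caught.
caught = keyword_flag("You are an idiot")

# Swapping one letter for a digit slips past the filter.
evaded = keyword_flag("You are an id1ot")

# Abuse that uses no listed word at all is also missed.
implicit = keyword_flag("Nobody would miss you if you left")
```

In this sketch only the first comment is flagged. The misspelled insult and the implicit one both pass, which is the gap the paragraph above describes.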

Yahoo, on the other hand, went a little deeper and was able to track responses by the length of comments and words and by the number of punctuation marks, URLs and capitalized letters. It also tracked usage of "politeness words," modal verbs (like "could," "would" or "should," which can indicate either hedging or a confident speaker) and "insult and hate blacklist words." All in all, the algorithm outperformed Yahoo's old detectors by about 10 percent.
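The signals listed above can be sketched as a simple feature extractor. This is a rough illustration under stated assumptions, not Yahoo's implementation: the word lists, the URL pattern, the feature names and the `extract_features` function are all hypothetical, and the article does not say how Yahoo computes or weights any of these values.

```python
import re
import string

# Hypothetical word lists for illustration; the article does not publish Yahoo's lists.
POLITENESS_WORDS = {"please", "thanks", "thank", "sorry"}
MODAL_VERBS = {"could", "would", "should"}
BLACKLIST_WORDS = {"idiot", "moron"}

URL_PATTERN = re.compile(r"https?://\S+")
WORD_PATTERN = re.compile(r"[A-Za-z']+")


def extract_features(comment: str) -> dict:
    """Turn one comment into the surface counts described in the article."""
    urls = URL_PATTERN.findall(comment)
    # Remove URLs first so their letters and slashes do not skew the other counts.
    text = URL_PATTERN.sub(" ", comment)
    words = WORD_PATTERN.findall(text)
    lowered = [word.lower() for word in words]
    avg_word_length = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "comment_length": len(comment),
        "num_words": len(words),
        "avg_word_length": avg_word_length,
        "num_punctuation": sum(ch in string.punctuation for ch in text),
        "num_urls": len(urls),
        "num_capital_letters": sum(ch.isupper() for ch in text),
        "num_politeness_words": sum(w in POLITENESS_WORDS for w in lowered),
        "num_modal_verbs": sum(w in MODAL_VERBS for w in lowered),
        "num_blacklist_words": sum(w in BLACKLIST_WORDS for w in lowered),
    }


features = extract_features("Could you PLEASE stop, you idiot!! See http://example.com")
```

A classifier trained on human-labeled comments would then take a dictionary like `features` as input. Unlike a bare keyword match, counts such as capitalization and punctuation still carry a signal when the comment avoids every listed word.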


Specially trained Yahoo employees also looked at the same comments and rated them as abusive or not, which helped to train the algorithm to look for implicit abuse. (The annotated database of what was marked as abusive will soon be available online on Yahoo Webscope.)

Yahoo crowdsourced abuse ratings from Amazon's Mechanical Turk as well, which allows anyone to sign up and filter through images or language. Participants received $0.02 for every comment they tried to categorize as abusive or not. These people, however, had not been trained in abuse detection like Yahoo's employees, and were found to be much worse at it. So even with the AI, human judgement is still vital to the operation.

Ultimately, the program might be severely limited, as abuse is still defined by Yahoo itself and not the user -- unlike Instagram's new anti-harassment measures, which allow users to filter out comments with certain words. The program also may not be able to find fake accounts or filter through abusive pictures or videos tweeted at a user, which happened to Leslie Jones on Twitter.

Some websites also have double standards when it comes to their community guidelines. On Facebook, Australian journalist and feminist activist Clementine Ford pointed out that she'd been blocked for telling a man to "fuck off" while someone who called her a "diseased whore" hadn't been flagged. That post was ultimately removed, but the situation is a good example of how nuanced it can be to determine what is offensive and what isn't.

For instance, critics argue that platforms often police women's bodies unnecessarily, censoring images with women's nipples or period stains.

Still, the team at Yahoo remains optimistic.

"As the amount of online user generated content quickly grows, it is necessary to use accurate, automated methods to flag abusive language," they wrote. "In our work we take a major step forward."
