Spectrum Labs launches world’s first AI solution for identifying and preventing toxic content produced by Generative AI

Spectrum Labs, a leading provider of Text Analysis AI tools for scaling content moderation in games, apps, and online platforms, has launched the world’s first AI content moderation solution to identify and prevent harmful and toxic behavior generated by Generative AI.

The proliferation of Generative AI, including platforms like ChatGPT, Dall-E, Bard, Stable Diffusion, and others, has made it possible for bad actors to create racist images, spread hate speech, promote radicalization, and engage in spamming, scamming, grooming, and harassment at scale, with minimal investment of time.

Spectrum Labs has taken an essential step towards addressing this problem by creating a pioneering moderation tool designed to handle Generative AI content. This tool enables platforms to safeguard their communities against such highly scalable adversarial content automatically.

“Platforms have already been struggling to sift through mountains of user-generated online content produced daily to identify and remove hateful, illegal, and predatory content before Generative AI came along. Now, whether you are a spammer, a child groomer, a bully, or a recruiter for violent organizations, your job just got a lot easier,” said Justin Davis, CEO of Spectrum Labs.

“Fortunately, our existing contextual AI content moderation tools can be adapted to address this new flood of content because they were built to detect intent, not just a list of keywords or specific phrases, which Generative AI can easily avoid.”

Multi-layer, real-time AI moderation of Generative AI content could have far-reaching applications

Multi-layer, real-time AI moderation of Generative AI has several potential future applications. These include identifying copyright infringement and detecting bias in AI-generated content. Such technology could also provide better analytics on the types of content people create and how it is used. For now, however, Spectrum Labs is focused on delivering a core set of tools as quickly as possible to safeguard users and platforms from a potential flood of harmful content.

Because Generative AI can produce fluent, natural-sounding human language, conventional moderation tools that rely on keyword matching are ineffective at detecting hateful intent in content that avoids specific racist terms.
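To illustrate the limitation described above, here is a minimal sketch (not Spectrum Labs' actual system) of a keyword blocklist filter. The blocklist entries are placeholder tokens standing in for real slurs; the point is that an exact-match filter catches only listed words, while a paraphrase carrying the same hostile intent passes untouched.

```python
# Illustrative sketch: why keyword blocklists fail against Generative AI.
# "slur1" and "slur2" are placeholder tokens, not real terms.
BLOCKLIST = {"slur1", "slur2"}

def keyword_filter(text: str) -> bool:
    """Return True if the text contains a blocklisted word."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return not BLOCKLIST.isdisjoint(words)

# An exact keyword match is caught...
print(keyword_filter("you are a slur1"))                     # True
# ...but a paraphrase with the same hostile intent slips through.
print(keyword_filter("people like you do not belong here"))  # False
```

An intent-based contextual model, by contrast, scores the meaning of the whole message rather than matching surface tokens, which is why keyword avoidance does not defeat it.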

Other contextual models that identify sexual, threatening, or toxic content cannot recognize positive behaviors such as encouragement, acknowledgment, and rapport. As a result, these models would censor Generative AI responses related to sensitive topics even when the content was intended to be helpful, supportive, or reassuring.
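The distinction above can be sketched as a multi-label decision rule. This is a hypothetical illustration, not Spectrum Labs' implementation: the label names and thresholds are invented. Instead of a single toxic/not-toxic score, the model emits scores for both harmful and positive behaviors, so supportive content about a sensitive topic is not censored outright.

```python
# Hypothetical multi-label moderation rule. Labels and thresholds are
# invented for illustration; scores are assumed to lie in [0, 1].
def moderate(scores: dict) -> str:
    """Map per-label scores to an action: remove, review, or allow."""
    harmful = max(scores.get(k, 0.0) for k in ("hate", "threat", "sexual"))
    positive = max(scores.get(k, 0.0) for k in ("support", "encouragement"))
    if harmful > 0.8:
        return "remove"          # clearly harmful regardless of context
    if harmful > 0.5 and positive < 0.5:
        return "review"          # ambiguous and lacking positive signal
    return "allow"               # benign, or positive context dominates

# A supportive message touching a sensitive topic is allowed...
print(moderate({"hate": 0.3, "support": 0.9}))  # allow
# ...while overtly hateful content is removed.
print(moderate({"hate": 0.95}))                 # remove
```

A binary classifier collapses both cases into one "sensitive topic" score, which is exactly the failure mode the paragraph above describes.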

“At Spectrum Labs, we’re on a mission to make the internet safer. Trust and safety workers are the unsung heroes in this fight, and we’re honored to support them. We can build a safer digital world, one post at a time,” Davis added.


Team Eela

