Safety - A Comprehensive Guide

Detailed Look At Our Safety Modules

 Unstructured Text

List of Methods for Unstructured Text Data

Safety Analyze
Description: Detects profane words in the input text and returns a JSON report.
Request: Mandatory fields: input text
Response: JSON report

Safety Anonymize
Description: Anonymizes the detected profane words in the input text and returns the anonymized text.
Request: Mandatory fields: input text
Response: Anonymized text
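
To make these request and response shapes concrete, the sketch below shows how a client might call the two text methods. It is a minimal sketch, assuming a placeholder base URL, endpoint paths, and JSON field names (text, anonymized_text) that are not specified in the table above.

```python
import requests

# Hypothetical base URL and endpoint paths; adjust to the actual service.
BASE_URL = "https://example.com/safety"


def safety_analyze(text: str) -> dict:
    """Send text to Safety Analyze and return the JSON profanity report."""
    resp = requests.post(f"{BASE_URL}/text/analyze", json={"text": text})
    resp.raise_for_status()
    return resp.json()


def safety_anonymize(text: str) -> str:
    """Send text to Safety Anonymize and return the anonymized text."""
    resp = requests.post(f"{BASE_URL}/text/anonymize", json={"text": text})
    resp.raise_for_status()
    return resp.json()["anonymized_text"]  # assumed response key


if __name__ == "__main__":
    print(safety_analyze("some user-submitted text"))
    print(safety_anonymize("some user-submitted text"))
```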

 Image

List of Methods for Image Data

Image Analyze
Description: Detects NSFW content in the input image, scoring it against the categories porn, sexy, neutral, drawings, and hentai.
Request: Mandatory fields: input image. Optional fields: portfolio, account
Response: JSON containing the detection scores and the base64-encoded output image

Image Generate
Description: Generates an image from the input prompt, then checks the generated image with the Image Analyze API above.
Request: Mandatory fields: input prompt. Optional fields: portfolio, account
Response: JSON containing the detection scores and the base64-encoded generated image

Nudity Analyze
Description: Detects the specific exposed parts of nudity in the given image.
Request: Mandatory fields: input image. Optional fields: portfolio, account
Response: Base64-encoded image with the detected nudity blurred
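
The sketch below shows one way a client might call Image Analyze. It is an assumption-laden illustration (the endpoint path, field names, and response keys are placeholders), while the mandatory and optional fields and the base64 response come from the table above. Image Generate and Nudity Analyze would follow the same pattern with a prompt or an image as input.

```python
import base64
import requests

# Hypothetical base URL and endpoint path; adjust to the actual service.
BASE_URL = "https://example.com/safety"


def image_analyze(image_path: str, portfolio: str | None = None,
                  account: str | None = None) -> dict:
    """Send a base64-encoded image and return NSFW scores plus the processed image."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    payload = {"image": image_b64}
    if portfolio:
        payload["portfolio"] = portfolio  # optional field
    if account:
        payload["account"] = account  # optional field
    resp = requests.post(f"{BASE_URL}/image/analyze", json=payload)
    resp.raise_for_status()
    # Assumed shape: {"scores": {"porn": ..., "sexy": ..., "neutral": ...,
    #                            "drawings": ..., "hentai": ...},
    #                 "output_image": "<base64>"}
    return resp.json()
```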

 Video

List of Methods for Video Data

Safety Video
Description: Detects NSFW content in the input video, scoring it against the categories porn, sexy, neutral, drawings, and hentai.
Request: Mandatory fields: input video
Response: Base64-encoded output video, blurred according to the NSFW scores

Nudity Video
Description: Detects the specific exposed parts of nudity in the given video.
Request: Mandatory fields: input video. Optional fields: portfolio, account
Response: Base64-encoded video with the detected nudity blurred
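
A corresponding sketch for the video methods, again with an assumed endpoint path, field names, and response key; only the base64-encoded blurred video output is stated in the table above.

```python
import base64
import requests

# Hypothetical base URL and endpoint path; adjust to the actual service.
BASE_URL = "https://example.com/safety"


def safety_video(video_path: str, output_path: str = "blurred_output.mp4") -> None:
    """Send a base64-encoded video and save the blurred video returned by the service."""
    with open(video_path, "rb") as f:
        video_b64 = base64.b64encode(f.read()).decode("utf-8")
    resp = requests.post(f"{BASE_URL}/video/safety", json={"video": video_b64})
    resp.raise_for_status()
    blurred_b64 = resp.json()["output_video"]  # assumed response key
    with open(output_path, "wb") as f:
        f.write(base64.b64decode(blurred_b64))
```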

Models Used

 Detoxify

Detoxify Model

Detoxify is a machine learning model that identifies and classifies toxic text, and it is widely used to help protect people from online abuse. It can flag a broad range of toxic content, including threats, obscenity, and identity-based hate, as well as more subtle forms of toxicity such as insults.

How it Works:

  1. Text Preprocessing: The input text is cleaned and tokenized into a numerical representation.  

  2. Feature Extraction: A pre-trained language model, like BERT or RoBERTa, extracts semantic and syntactic features from the tokenized text.

  3. Toxicity Classification: A classifier, often a neural network, is trained to predict the likelihood of toxicity based on the extracted features. This classifier is trained on a large dataset of labeled toxic and non-toxic text.  

  4. Output: The model outputs a probability score for each toxicity category (for example toxicity, severe toxicity, obscenity, threat, and insult); a minimal usage sketch follows these steps.
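
A minimal usage sketch with the open-source detoxify package, which wraps this pipeline in a single call; the checkpoint name and the exact set of score keys may vary between versions.

```python
from detoxify import Detoxify

# Load a pre-trained checkpoint ("original" is the BERT-based Jigsaw model).
model = Detoxify("original")

# predict() runs preprocessing, feature extraction, and classification in one
# call and returns a probability per category, e.g.
# {"toxicity": 0.97, "severe_toxicity": 0.01, "obscene": 0.94,
#  "threat": 0.00, "insult": 0.85, ...}
scores = model.predict("your text here")
print(scores)
```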

 NSFW Gantman

NSFW Gantman Model

This AI-powered tool acts as a digital guardian, scanning images and videos to identify and flag NSFW (Not Safe For Work) material.

NSFW Gantman is a machine learning model designed to identify sexually explicit content in images. It works by:

  1. Image Analysis: The model analyzes the image pixel by pixel, looking for specific patterns and features associated with explicit content.

  2. Feature Extraction: It extracts important features from the image, such as edges, textures, and color patterns.

  3. Classification: Using these extracted features, the model classifies the image across its trained categories (drawings, hentai, neutral, porn, sexy), from which a safe or explicit decision can be made.

The model is trained on a large dataset of images, allowing it to learn to recognize explicit content with high accuracy. It is widely used in content moderation systems to filter out harmful content and protect users.
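
A hedged usage sketch based on the open-source GantMan NSFW model as packaged by the nsfw_detector library; the package name, weights file, and return format are assumptions that may differ between releases.

```python
from nsfw_detector import predict

# Load the pre-trained Keras weights distributed with the GantMan model
# (downloaded separately; the filename here is illustrative).
model = predict.load_model("./nsfw_mobilenet2.224x224.h5")

# classify() returns one score dictionary per input path, e.g.
# {"input_image.jpg": {"drawings": 0.01, "hentai": 0.02, "neutral": 0.05,
#                      "porn": 0.80, "sexy": 0.12}}
results = predict.classify(model, "input_image.jpg")
print(results)
```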

 NudeNet

NudeNet Model

NudeNet aims to provide a responsible AI solution for identifying and censoring nudity, which is particularly important in contexts where content moderation is necessary, such as social media platforms, adult content filtering, and user-generated content sites.

Classes Detected:

NudeNet can identify a variety of classes related to nudity, including:

  • FEMALE_GENITALIA_EXPOSED

  • MALE_BREAST_EXPOSED

  • BUTTOCKS_EXPOSED

  • FACE_FEMALE

  • and many more (see the usage sketch below).
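
A minimal sketch using the open-source NudeNet package; the class names and method signatures follow recent NudeNet releases and may differ between versions.

```python
from nudenet import NudeDetector

detector = NudeDetector()

# detect() returns one entry per finding, roughly:
# [{"class": "FEMALE_GENITALIA_EXPOSED", "score": 0.93, "box": [x, y, w, h]}, ...]
detections = detector.detect("input_image.jpg")
print(detections)

# censor() writes a copy of the image with the detected regions covered.
detector.censor("input_image.jpg", output_path="censored_image.jpg")
```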

Safety - By Examples

 Text

Unstructured Text Safety

  • The AI model processes the input text and generates a safety analysis report. This report includes:

    • Profanity Scores:

      • Toxicity: This metric measures the overall toxicity of the text, indicating the likelihood of it being offensive or harmful. In this case, the toxicity score is 0.973, suggesting high toxicity.

      • Severe Toxicity: This metric measures the severity of the toxic content, indicating the potential for extreme harm. The score of 0.014 suggests a low level of severe toxicity.

      • Obscene: This metric measures the obscenity of the content, indicating the presence of explicit or vulgar language. The score of 0.945 suggests high obscenity.

      • Threat: This metric measures the threat level of the content, indicating the potential for violence or harm. The score of 0.001 suggests a very low threat level.

    • Profane Words:

      • The model identifies specific profane words present in the text: "bullshit" and "shit."

  • Safety Output:

    • Based on the analysis, the AI model flags the input text as unsafe due to its high toxicity and obscenity levels (an illustrative report shape follows).
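
For reference, the report behind this example could look roughly like the sketch below; the field names are assumptions, while the scores and flagged words are the ones shown in the screenshot.

```python
# Illustrative shape of the safety analysis report (field names are assumed).
example_report = {
    "profanity_scores": {
        "toxicity": 0.973,
        "severe_toxicity": 0.014,
        "obscene": 0.945,
        "threat": 0.001,
    },
    "profane_words": ["bullshit", "shit"],
    "safe": False,  # flagged unsafe due to high toxicity and obscenity
}
```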

 Image

Image Safety

The images below showcase the capabilities of the AI-powered image analysis tool, demonstrating how it can identify and process explicit content within an image.

  • Original Image: The original image contains explicit content.

  • Processed Image: The tool has successfully identified and obscured the explicit parts of the image, rendering it safe for public viewing.

Pie Chart Analysis:

The pie chart provides a detailed breakdown of the image's content categories as determined by the AI model:

  • Drawings: A small portion of the image consists of drawings, which are generally considered safe content.

  • Hentai: A larger portion of the image is classified as hentai, a genre of anime and manga that often features explicit sexual content.

  • Neutral: A relatively small portion is categorized as neutral, indicating content that is neither explicit nor suggestive.

  • Porn: A significant portion of the image is classified as pornographic, meaning it contains explicit sexual content.

  • Sexy: A small portion is categorized as sexy, suggesting suggestive but not explicitly sexual content.

Processed Image

Image Analyze Report