Understanding the Metrics:
These numbers are technical values that the classifier and integrations rely on to do their
job. They are intended for experts who understand their implications. For most moderation
purposes, use the Risk Level metric, together with Triggered Categories and Keywords, for
fine-tuned responses and as guidance for human intervention.
-
Safety Score (%):
Indicates how safe the content appears. Higher (90%+) is better/safer.
Low scores indicate a higher probability of harmful content.
-
Severity Level (0-100):
Measures the intensity of detected harm. Lower (0-20) is
better/safer. High scores (80-100) indicate critical danger or severe
violations.
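-
Example (illustrative only):
For integrators who do read the raw numbers, the bands above can be sketched as a small helper. This is a minimal sketch, not the product's own logic: the function name, the returned labels, and the treatment of values outside the named ranges (Safety Score below 90, Severity Level between 21 and 79) are assumptions made for illustration. Only the 90%+, 0-20, and 80-100 ranges come from the descriptions above.

```python
def describe_scores(safety_score: float, severity_level: float) -> dict:
    """Map the two raw metrics onto the bands named in the text.

    Hypothetical helper: the label strings and the handling of the
    in-between ranges are assumptions, not documented behavior.
    """
    if not 0 <= safety_score <= 100:
        raise ValueError("safety_score must be between 0 and 100")
    if not 0 <= severity_level <= 100:
        raise ValueError("severity_level must be between 0 and 100")

    # Safety Score: higher is safer; 90%+ is the range called out as safe.
    safety_band = "safe" if safety_score >= 90 else "below safe range"

    # Severity Level: lower is safer; 0-20 is low, 80-100 is critical.
    if severity_level <= 20:
        severity_band = "low"
    elif severity_level >= 80:
        severity_band = "critical"
    else:
        severity_band = "intermediate"  # assumed label; the text names no band here

    return {"safety": safety_band, "severity": severity_band}
```

A call such as describe_scores(95, 10) falls in the safe and low bands, while describe_scores(40, 85) falls below the safe range with critical severity. For routine moderation decisions, the Risk Level metric remains the recommended signal.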