Findings

AI Output Tokens Elevated

Updated: June 19, 2025

Description

Severity: Info

A significant increase in the number of output tokens generated by the AI system has been detected.

This may indicate unusually verbose responses, inefficient prompt handling, or runaway token consumption, all of which can increase cost and degrade performance and response times.
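As a rough illustration, a detector behind this kind of finding could compare recent per-response output-token counts against a known-normal baseline. The sketch below assumes hypothetical log data that exposes an output-token count per response; the three-sigma threshold and the sample values are illustrative, not prescriptive.

```python
from statistics import mean, stdev

def detect_elevated_output_tokens(current_window, baseline, threshold_sigma=3.0):
    """Flag the current window when its mean output-token count exceeds
    the baseline mean by more than threshold_sigma standard deviations."""
    if len(baseline) < 2 or not current_window:
        return False  # not enough data for a meaningful comparison
    mu, sigma = mean(baseline), stdev(baseline)
    # Floor sigma at 1 token so a perfectly flat baseline still yields
    # a usable threshold instead of firing on any tiny fluctuation.
    return mean(current_window) > mu + threshold_sigma * max(sigma, 1.0)

# Normal traffic averaging ~200 output tokens vs. a suddenly verbose window.
baseline = [180, 210, 195, 205, 190, 220, 200]
current = [950, 1100, 870, 1020]
print(detect_elevated_output_tokens(current, baseline))  # True
```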

Potential Causes:

  • Inefficient prompt engineering leading to long-winded responses.
  • A misconfigured AI system generating excessive output.
  • Unintended behavior due to model fine-tuning or changes in prompt structure.

Remediation

Investigate the cause of the increase: review recent changes to prompts, system prompts, or model configuration (including any maximum output token settings), check whether the traffic driving the responses is legitimate, and confirm that fine-tuning or prompt-structure changes have not altered response length. Where appropriate, enforce output token caps and alert on sustained anomalies.
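Where the increase traces back to configuration or prompt changes, one common control is an explicit ceiling on generated tokens at the API boundary. A minimal sketch, assuming the OpenAI Python SDK's client.chat.completions.create and its max_tokens parameter; the model name and cap are placeholders:

```python
MAX_OUTPUT_TOKENS = 512  # illustrative cap; tune to the application's needs

def bounded_completion(client, prompt, model="gpt-4o-mini"):
    """Request a completion with a hard output-token ceiling so a verbose
    prompt or misconfiguration cannot produce unbounded output."""
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=MAX_OUTPUT_TOKENS,  # server-enforced limit on generated tokens
    )
```

Pairing a cap like this with alerting on sustained token anomalies keeps a verbose prompt change from silently inflating cost.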

Security Frameworks

This finding maps to the OWASP Top 10 for LLM Applications entry LLM10:2025, Unbounded Consumption, which occurs when a Large Language Model (LLM) application allows users to conduct excessive and uncontrolled inferences, leading to risks such as denial of service (DoS), economic losses, model theft, and service degradation.
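A per-user output-token budget is one way to bound that consumption. A minimal sketch, assuming per-response token counts are available at the serving layer; the quota and window are illustrative:

```python
import time
from collections import defaultdict

class TokenBudget:
    """Per-user output-token quota over a sliding window."""

    def __init__(self, quota=50_000, window_seconds=3600):
        self.quota = quota
        self.window = window_seconds
        self._spend = defaultdict(list)  # user_id -> [(timestamp, tokens), ...]

    def allow(self, user_id):
        """Return True while the user still has budget in the current window."""
        now = time.time()
        recent = [(t, n) for t, n in self._spend[user_id] if now - t < self.window]
        self._spend[user_id] = recent
        return sum(n for _, n in recent) < self.quota

    def record(self, user_id, tokens):
        """Charge a completed response's output tokens against the user."""
        self._spend[user_id].append((time.time(), tokens))
```

The serving layer would check allow() before running inference and call record() with the response's reported token count afterwards.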

MITRE ATLAS describes the availability risk under its Denial of ML Service technique: adversaries may target machine learning systems with a flood of requests for the purpose of degrading or shutting down the service. Since many machine learning systems require significant amounts of specialized compute, they are often expensive bottlenecks that can become overloaded. Adversaries can intentionally craft inputs that require heavy amounts of useless compute from the machine learning system.
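A sliding-window rate limiter is the usual first defense against such floods. A minimal sketch with illustrative limits; a real deployment would keep one limiter per caller, for example keyed by API key:

```python
import time
from collections import deque

class SlidingWindowRateLimiter:
    """Allow at most max_requests per window_seconds for one caller."""

    def __init__(self, max_requests=60, window_seconds=60):
        self.max_requests = max_requests
        self.window = window_seconds
        self._calls = deque()  # timestamps of accepted requests

    def allow(self):
        now = time.time()
        # Evict timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] > self.window:
            self._calls.popleft()
        if len(self._calls) >= self.max_requests:
            return False  # over the limit: reject or queue the request
        self._calls.append(now)
        return True
```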

ATLAS's Cost Harvesting technique covers the economic angle: adversaries may target machine learning services with useless queries or computationally expensive inputs to increase the cost of running services at the victim organization. Sponge examples are a particular type of adversarial data designed to maximize energy consumption and thus operating cost.
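Capping input size before inference blunts sponge-style inputs, since compute cost grows with sequence length. A minimal sketch; count_tokens stands in for a tokenizer-backed counter supplied by the caller (for example, one built on a tiktoken encoding), and the ceiling is illustrative:

```python
MAX_INPUT_TOKENS = 4096  # illustrative ceiling; tune per model and cost budget

def validate_input(text, count_tokens):
    """Reject oversized inputs before they reach the model.

    count_tokens is assumed to be a tokenizer-backed counter, e.g.
    lambda s: len(encoding.encode(s)) built on a tiktoken encoding.
    """
    n = count_tokens(text)
    if n > MAX_INPUT_TOKENS:
        raise ValueError(f"input of {n} tokens exceeds cap of {MAX_INPUT_TOKENS}")
    return text
```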
