Repeat Reply Vulnerability
Updated: May 5, 2026
Description
The AI model can be exploited using a repeat-reply attack, where it is prompted to repeat specific strings indefinitely.
This behavior can inadvertently cause the model to leak sensitive data, including past responses, system instructions, or private information embedded in training data.
If an attacker successfully triggers a repeat-reply attack, the AI may expose unintended data by continuously iterating over past responses. This could lead to data leakage, security breaches, or even the exposure of proprietary or confidential model behaviors.
Remediation
Investigate and strengthen guardrails to detect and prevent repeat-reply attacks. Implement output length restrictions, loop detection mechanisms, and rate limiting to stop infinite repetitions. Regular audits should also be conducted to ensure the model does not inadvertently reveal unintended data when subjected to such attacks.
Security Frameworks
Sensitive information can affect both the LLM and its application context. This includes personal identifiable information (PII), financial details, health records, confidential business data, security credentials, and legal documents. Proprietary models may also have unique training methods and source code considered sensitive, especially in closed or foundation models.
Unbounded Consumption occurs when a Large Language Model (LLM) application allows users to conduct excessive and uncontrolled inferences, leading to risks such as denial of service (DoS), economic losses, model theft, and service degradation.
Adversaries may target machine learning systems with a flood of requests for the purpose of degrading or shutting down the service. Since many machine learning systems require significant amounts of specialized compute, they are often expensive bottlenecks that can become overloaded. Adversaries can intentionally craft inputs that require heavy amounts of useless compute from the machine learning system.
Adversaries may craft prompts that induce the LLM to leak sensitive information. This can include private user data or proprietary information. The leaked information may come from proprietary training data, data sources the LLM is connected to, or information from other users of the LLM.
AI system is evaluated regularly for safety risks - as identified in the MAP function. The AI system to be deployed is demonstrated to be safe, its residual negative risk does not exceed the risk tolerance, and can fail safely, particularly if made to operate beyond its knowledge limits. Safety metrics implicate system reliability and robustness, real-time monitoring, and response times for AI system failures.
AI system security and resilience - as identified in the MAP function - are evaluated and documented.
Privacy risk of the AI system - as identified in the MAP function - is examined and documented.
The organization shall define and document verification and validation measures for the AI system and specify criteria for their use.
The organization shall define and document the necessary elements for the ongoing operation of the AI system. At the minimum, this should include system and performance monitoring, repairs, updates and support.
The organization shall ensure that the AI system is used according to the intended uses of the AI system and its accompanying documentation.
Attackers can manipulate an agent's objectives, task selection, or decision pathways through prompt-based manipulation, deceptive tool outputs, malicious artefacts, forged agent-to-agent messages, or poisoned external data.