Findings

Multilingual content detected in AI logs

Updated: June 19, 2025

Description

Severity: Medium

Text in multiple human languages has been detected in AI logs.

This may indicate that the AI system is interacting with a global user base, consuming international data sources, or ingesting unvalidated inputs. While not inherently malicious, multilingual content in logs may complicate downstream analysis, expose sensitive content from international sources, or signal weak input sanitization.

AI logs that capture uncontrolled multilingual input could expose information from users writing in languages other than the system's primary language, complicate compliance efforts, or skew model outputs.

Example Attack

A company's chatbot, trained on English-only data, begins receiving queries in Spanish, German, and Japanese after a global rollout. These non-English messages and their responses are stored in AI logs without translation or redaction. A later review finds customer service transcripts in these languages in the logs, containing PII and policy-violating content that the original risk model had not accounted for.

Remediation

Assess whether multilingual input is expected and permitted in your AI systems.
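
As a starting point for that assessment, the following is a minimal sketch of how log text could be triaged by writing system using only the Python standard library. The names EXPECTED_SCRIPTS, scripts_in, and unexpected_scripts are hypothetical and not part of any product API, and the first word of each Unicode character name is used as a rough script label. This approach only distinguishes scripts, not languages: Spanish and German share the Latin script with English and would pass unflagged, so a dedicated language-identification step would still be needed for those.

```python
import unicodedata
from collections import Counter

# Hypothetical allow-list: writing systems this deployment expects in its logs.
EXPECTED_SCRIPTS = {"LATIN"}


def scripts_in(text: str) -> Counter:
    """Count alphabetic characters per script.

    The script label is approximated by the first word of the Unicode
    character name (for example "LATIN", "CYRILLIC", "HIRAGANA", "CJK").
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


def unexpected_scripts(text: str, expected=EXPECTED_SCRIPTS) -> set:
    """Return the script labels found in text that are not on the allow-list."""
    return set(scripts_in(text)) - set(expected)


if __name__ == "__main__":
    # Illustrative log lines only; not taken from a real system.
    sample_logs = [
        "Hello, where is my order?",
        "¿Dónde está mi pedido?",
        "こんにちは",
    ]
    for line in sample_logs:
        print(repr(line), sorted(unexpected_scripts(line)))
```

Entries that return a non-empty set can be routed to a reviewer or a redaction step before they are retained, while the allow-list records which scripts the system is expected to handle.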
