Return to Threats

Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild

Palo Alto Networks Unit 42 2025-05-12 indirect prompt injection Critical

What Happened

Unit 42 documents live cases of web-based indirect prompt injection, where attackers embed hidden instructions in websites that are later ingested by AI agents performing tasks like webpage summarization or content analysis.[2] The research shows that when such agents have tool or data access, these attacks can drive unauthorized actions, data exfiltration, or other real-world impacts without the user directly entering a malicious prompt.[2]

Why It Matters

The Unit 42 article documents real-world cases of web-based indirect prompt injection, where attackers hide instructions in webpages that AI agents later crawl or summarize, causing the agents to execute attacker-controlled behavior without any obviously malicious user prompt.[2][4] The report shows that when such agents have tools or data access, these hidden prompts can drive unauthorized actions, leak credentials or payment data, and compromise decision workflows, turning routine browsing or summarization features into an attack surface.[2][4] From a CyberSE.AI perspective, this highlights the need to tightly scope agent permissions, enforce strict source and content trust policies, and implement runtime detection for anomalous tool use or data access triggered by external content. It also implies organizations should red team agent workflows specifically for hidden web-based instructions and update business logic so agents treat all external content as untrusted unless explicitly allowlisted.

Healthcare Fintech SaaS SMB AI startups

CyberSE Analysis

This signal maps to indirect prompt injection. Organizations using AI agents, LLM APIs, SaaS integrations, or sensitive data workflows should review whether this class of issue could create unauthorized tool execution, data leakage, weak approval gates, or unmanaged supply-chain exposure.

Recommended Actions

  • Restrict AI agent tool permissions and production write paths.
  • Review sensitive data access across prompts, logs, embeddings, memory, and SaaS integrations.
  • Add human approval workflows for high-impact or state-changing actions.
  • Run prompt injection and indirect prompt injection tests against affected workflows.
  • Document the owner, control gap, and remediation deadline for this risk class.

Source

https://unit42.paloaltonetworks.com/ai-agent-prompt-injection/

Talk to AI CISO