Return to Threats

Black Hat demo highlights indirect prompt injection attacks against ChatGPT-style systems

Black Hat USA 2023 Briefings 2023-08-09 indirect prompt injection Critical

What Happened

At Black Hat USA 2023, security researchers demonstrated indirect prompt injection attacks that embed malicious instructions in external content, which LLM-based assistants then ingest and execute.[1] The talk showed how these attacks can be used to persuade users to reveal sensitive information or to make unauthorized API calls, underscoring risks for SaaS and agent-based workflows that connect LLMs to internal business tools.[1]

Why It Matters

The report describes a Black Hat USA demonstration of indirect prompt injection, where malicious instructions are embedded in external content and then executed by ChatGPT-style assistants when they ingest that content. The demonstration showed potential outcomes including unauthorized API calls and persuading users to reveal sensitive information, especially in SaaS and agent workflows connected to internal business tools. CyberSE.AI should treat this as a high-priority agent-security issue because any LLM that reads untrusted documents, emails, tickets, or web content can be steered into leaking data or taking unintended actions.

Healthcare Fintech SaaS SMB AI startups

CyberSE Analysis

This signal maps to indirect prompt injection. Organizations using AI agents, LLM APIs, SaaS integrations, or sensitive data workflows should review whether this class of issue could create unauthorized tool execution, data leakage, weak approval gates, or unmanaged supply-chain exposure.

Recommended Actions

  • Restrict AI agent tool permissions and production write paths.
  • Review sensitive data access across prompts, logs, embeddings, memory, and SaaS integrations.
  • Add human approval workflows for high-impact or state-changing actions.
  • Run prompt injection and indirect prompt injection tests against affected workflows.
  • Document the owner, control gap, and remediation deadline for this risk class.

Source

https://www.blackhat.com/us-23/briefings/schedule/index.html#prompt-injection-attacks-against-large-language-models-32541

Talk to AI CISO