One aspect of the AI revolution keeping security professionals up at night is the continued prevalence of prompt injection attacks that enable exfiltration of sensitive data, even against dominant vendors like Microsoft and Salesforce. These attacks exploit the way large language models (LLMs) process instructions, allowing malicious actors to trick AI agents into revealing confidential information or performing unauthorized actions.
Capsule Security, a vendor specializing in AI agent runtime security, published research detailing two significant vulnerabilities. The first, named PipeLeak, affected Salesforce Agentforce. The second, called ShareLeak (CVE-2026-21520), targeted Microsoft Copilot. Although both vulnerabilities have been addressed by the respective vendors, the research serves as a stark reminder that prompt injection remains an unsolved problem for LLMs and the AI agents built on them.
PipeLeak: Exploiting Salesforce Agentforce
In the case of the Salesforce flaw, an attacker could insert malicious instructions into an untrusted lead capture form. These forms are commonly used on customer websites to collect prospective client information. The Salesforce agent would then interpret the attacker's input as a trusted prompt, overriding its intended behavior. Capsule's proof-of-concept attack used a simple one-line instruction to command the agent to send all leads it could find to the attacker via email. No complex code or exploit was required.
According to Capsule's Bar Kaduri, the vulnerability stems from a fundamental architectural flaw: Agent Flows process lead form inputs as trusted instructions rather than untrusted data. Because lead forms accept arbitrary text from external, unauthenticated users, an attacker can embed malicious prompts that override the agent's intended behavior. This design oversight allowed data exfiltration without requiring any special access or advanced technical skills.
ShareLeak: Microsoft Copilot's SharePoint Flaw
The Microsoft Copilot vulnerability, designated CVE-2026-21520 and rated high severity with a CVSS score of 7.5, required a more complex attack chain. An attacker could insert malicious code into a SharePoint form input, which would then trigger the connected Copilot data and return customer data to an attacker-controlled email address. Even when safety mechanisms flagged the attack, data was still exfiltrated. Like the Agentforce bug, this was a prompt injection attack triggered by a customer-facing form.
The attack demonstrated that even sophisticated safety filters can be bypassed if the underlying LLM architecture does not properly distinguish between trusted instructions and untrusted data. Microsoft addressed the issue following Capsule's report, but the company has not provided detailed comments on how the vulnerability was fixed.
Vendor Responses and the Human-in-the-Loop Debate
In response to Capsule's report, Salesforce thanked the security vendor and addressed the prompt injection. However, regarding the data exfiltration component, Salesforce told Capsule that this is a configuration-specific issue and that customers can activate human-in-the-loop (HITL) requirements as a configuration setting to prevent data leaking. The CRM vendor stated, We have determined this is a configuration-specific issue rather than a platform-level vulnerability. To ensure security, our out-of-the-box email actions require human-in-the-loop oversight. The same HITL requirement is available as a configuration setting for custom actions to prevent unintentional data transfers.
Naor Paz, co-founder and CEO of Capsule Security, found it surprising that Salesforce's response mostly boiled down to HITL configuration recommendations. He argued that the whole premise of AI agents is to perform tasks autonomously without constant human supervision. We're seeing agents like Claude Code running for days, writing code, querying production databases, and doing many dangerous things autonomously, Paz said. And I think their answer, like, 'Do human in the loop,' is just embarrassing.
In a statement, a Salesforce spokesperson said, Prompt injection is an evolving challenge across the AI industry, and our approach includes layered safeguards designed to help mitigate these risks, including controls around instruction isolation, tool-use restrictions, and human oversight. We continue to refine these safeguards and work with the security research community to enhance protections as these threats evolve.
Recommendations for AI Agent Security
Capsule recommends that organizations running Agentforce treat all lead form inputs as untrusted data, disallow Email Tool usage when processing untrusted inputs, apply input sanitation and prompt boundary techniques, require manual review before sending emails with CRM data, and log all agent actions involving data access or external communication. These recommendations highlight the need for a defense-in-depth approach that combines technical controls with procedural safeguards.
The research also underscores a concept called the lethal trifecta in AI agent security. This occurs when an agent has access to sensitive data, is exposed to untrusted content from external sources, and has the ability to communicate externally. When these three conditions converge, data can be easily manipulated and exfiltrated. Paz noted that this is not a resource problem but an approach problem, as major vendors like Microsoft and Salesforce have not yet developed adequate defenses against this newer class of threats.
The Broader Context of Prompt Injection Attacks
Prompt injection attacks have been a known issue since the early days of LLMs. They exploit the fact that LLMs cannot reliably distinguish between system instructions and user-provided input. In a typical attack, a malicious user includes text in a prompt that overrides the model's original instructions, causing it to behave in unintended ways. For AI agents that have access to internal systems and sensitive data, the consequences can be severe.
The vulnerabilities discovered by Capsule Security are particularly concerning because they involve public-facing forms that can be accessed by anyone. This means that an attacker does not need to have prior access to the organization's internal systems. They simply need to submit a crafted input to a form that is processed by an AI agent. The attack surface is enormous, especially for companies that use AI agents to automate customer interactions.
As organizations rush to deploy AI agents across their operations, they inherit significant risks that existing security tools were not designed to address. The attack we demonstrated required no special access, no exploitation of traditional software vulnerabilities, and no advanced technical skills, just an understanding of how LLMs process instructions, wrote Kaduri in a blog post discussing Microsoft's Copilot flaw.
Prompt injection is not a problem that will disappear with better models. Even state-of-the-art LLMs are vulnerable to carefully crafted prompts that can bypass safety filters. Researchers have demonstrated that techniques like jailbreaking, role-playing, and context manipulation can all be used to induce unintended behavior. The security community is still developing best practices for defending against these attacks, but there is no silver bullet.
One promising approach is instruction isolation, where the AI system clearly separates system-level instructions from user-provided content. This can be implemented through techniques like prompt sandboxing or using separate models for instruction interpretation. Another approach is to use output filtering to detect and block potentially malicious responses before they are sent to users or external systems.
Capsule Security's research acts as a reminder that prompt injection is not going away for the foreseeable future. It is also hard to say how the landscape for these attacks might change as exploit-hunting capabilities like those found in Anthropic's Claude Mythos reach the threat actor masses. As AI agents become more autonomous and are granted greater access to sensitive data, the potential for damage from prompt injection attacks will only increase.
Source: Dark Reading News