Customer Support Agent
A customer-facing chatbot that combines a multi-turn agent loop with agentic RAG over a knowledge base and schema-constrained business API tools for real-world actions like refunds, order modifications, and account changes. Human-in-the-loop escalation routes complex or sensitive cases to human agents.
Details
This architecture extends the Enterprise RAG Chatbot pattern with an agent loop and transactional tools. The Enterprise RAG Chatbot has no tool access or agent behavior - its blast radius is limited to text output quality. Adding an agent loop with business API tools means a successful attack can trigger real-world financial and operational consequences, fundamentally changing the trust model.
Customer support agents illustrate elements of the agent-native application pattern: schema-constrained API tools serve as atomic primitives, and the agent loop handles routing and edge cases rather than hard-coded branching logic. Observing what customers ask the agent to do - and where it fails or escalates - is a form of latent demand discovery, revealing unmet support needs and missing knowledge base content.
Capabilities
- Multi-turn conversation
- Agent loop with tool calling
- Agentic RAG (dynamic knowledge base retrieval via tool calls)
- Function calling (schema-constrained business API tools: order lookup, refund processing, account modification, ticket creation, escalation routing)
- Human-in-the-loop (escalation to human agents for complex/sensitive cases, approval gates for high-value actions)
- Conversational interface
- Guardrails (input/output classifiers, action authorization, PII filtering)
- Context engineering (separating customer data, conversation history, retrieved documents, system instructions)
- Structured output
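The agentic-RAG capability above can be sketched as a retrieval function exposed to the model as a tool, so the agent decides when to search the knowledge base rather than retrieval running on every turn. All names here (`search_kb`, the toy corpus) are illustrative assumptions, not a specific vendor API.

```python
# Hypothetical sketch: knowledge-base retrieval exposed as a tool the agent
# can invoke mid-conversation. The schema follows the common JSON Schema
# "parameters" convention used by function-calling APIs.

SEARCH_KB_SCHEMA = {
    "name": "search_kb",
    "description": "Search the help-center knowledge base.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search terms"},
            "top_k": {"type": "integer", "minimum": 1, "maximum": 5},
        },
        "required": ["query"],
    },
}

# Toy corpus standing in for a real vector store.
_CORPUS = {
    "refund-policy": "Refunds are available within 30 days of delivery.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def search_kb(query: str, top_k: int = 3) -> list[dict]:
    """Naive keyword match; a real system would use embedding search."""
    hits = [
        {"doc_id": doc_id, "text": text}
        for doc_id, text in _CORPUS.items()
        if any(word in text.lower() for word in query.lower().split())
    ]
    return hits[:top_k]
```

Because the model controls the query and sees the raw document text, everything this tool returns lands in the context window as semi-trusted input, which is why the trust analysis below treats retrieved content as an injection surface.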
Trust analysis
Four input surfaces feed the agent's context: system instructions (developer-controlled), conversation history (customer-supplied, untrusted), retrieved knowledge base content (semi-trusted internal content maintained by the organization), and customer data from CRM/backend lookups (sensitive but necessary for task completion). The conversation history and retrieved content are the primary prompt injection surfaces, while customer data from backend systems introduces PII that must not leak into responses or tool calls directed at the wrong account.
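The four-surface context assembly above can be sketched by tagging each segment with its trust tier, so downstream guardrails can treat customer text and retrieved documents as injection surfaces and CRM data as sensitive. The field names and tier labels are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical sketch: label every context segment with its source and
# trust tier at assembly time, so nothing enters the prompt untagged.

@dataclass
class ContextSegment:
    source: str  # "system" | "conversation" | "kb" | "crm"
    trust: str   # "trusted" | "untrusted" | "semi-trusted" | "sensitive"
    text: str

def assemble_context(system: str, history: list[str],
                     kb_docs: list[str], crm_record: str) -> list[ContextSegment]:
    segments = [ContextSegment("system", "trusted", system)]
    # Customer-supplied turns are the primary injection surface.
    segments += [ContextSegment("conversation", "untrusted", t) for t in history]
    # Internal docs are organization-maintained but still a poisoning vector.
    segments += [ContextSegment("kb", "semi-trusted", d) for d in kb_docs]
    # Backend lookups carry PII that must not leak across accounts.
    segments.append(ContextSegment("crm", "sensitive", crm_record))
    return segments
```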
Business API tools are schema-constrained by function calling definitions - the agent can only invoke predefined operations with typed arguments, unlike code execution where generated code can do anything a sandbox permits. The schema constrains the shape of actions but not their correctness: the agent can call a valid refund endpoint with the wrong amount, process a return on the wrong order, or modify an account based on manipulated context. The tools operate on real money and customer data, making every incorrect tool call a potential financial or operational incident.
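The "schema constrains shape, not correctness" point can be made concrete: a refund call can pass structural validation yet still be wrong for the order it targets. The refund schema and helper names below are hypothetical; a real system would use a JSON Schema validation library rather than the minimal check shown.

```python
# Hypothetical refund-tool definition and the two distinct checks:
# schema validity (shape) versus business validity (correctness).

REFUND_SCHEMA = {
    "name": "process_refund",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "amount": {"type": "number", "minimum": 0.01},
        },
        "required": ["order_id", "amount"],
    },
}

def schema_valid(args: dict) -> bool:
    """Minimal structural check standing in for a real JSON Schema validator."""
    props = REFUND_SCHEMA["parameters"]["properties"]
    required = REFUND_SCHEMA["parameters"]["required"]
    return (
        all(k in args for k in required)
        and all(k in props for k in args)
        and isinstance(args.get("order_id"), str)
        and isinstance(args.get("amount"), (int, float))
        and args.get("amount", 0) >= 0.01
    )

def business_valid(args: dict, order_total: float) -> bool:
    """Correctness check the schema cannot express: never refund more than paid."""
    return args["amount"] <= order_total

# A call can be schema-valid yet still wrong for this order:
call = {"order_id": "A-100", "amount": 500.0}
assert schema_valid(call)               # well-formed arguments
assert not business_valid(call, 49.99)  # but the amount exceeds the order total
```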
Human-in-the-loop functions as a tiered escalation boundary rather than per-action approval. Routine queries and low-value actions proceed autonomously, while complex cases, high-value transactions, or low-confidence situations route to human agents. The escalation decision is itself a model judgment call - an overconfident agent may fail to escalate cases that require human judgment, and prompt injection or goal manipulation can suppress escalation to keep the conversation under attacker influence.
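One mitigation for the escalation-suppression risk above is to keep part of the routing decision deterministic: the model's confidence only matters inside a low-value band, and high-value actions escalate no matter what the model says. The thresholds and function name below are illustrative assumptions.

```python
# Hypothetical tiered-routing sketch. The model proposes an action and a
# confidence score, but a deterministic floor guarantees that high-value
# transactions always reach a human - prompt injection cannot suppress it.

HIGH_VALUE_THRESHOLD = 100.0  # illustrative limit, in account currency
MIN_CONFIDENCE = 0.8

def route(action: str, amount: float, model_confidence: float) -> str:
    # Deterministic floor: high-value transactions always go to a human,
    # regardless of how confident the model claims to be.
    if amount > HIGH_VALUE_THRESHOLD:
        return "escalate"
    # Model judgment only decides within the low-value band.
    if model_confidence < MIN_CONFIDENCE:
        return "escalate"
    return "autonomous"
```

The key design choice is that the first branch does not consult the model at all, so goal manipulation can at worst affect routing of low-value actions.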
The system typically serves many customers through shared infrastructure. Context isolation between sessions must prevent one customer's data from leaking into another's context. The knowledge base is a shared resource: anyone who can write to the corpus can influence how the agent handles all customer interactions when those documents are retrieved.
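The session-isolation requirement can be sketched as a per-session context object that rejects tool results tagged with a different session, so wiring mistakes fail loudly instead of silently mixing customers' data. The class and method names are illustrative, not a real framework API.

```python
# Hypothetical per-session context store. Each customer session owns its
# context, and tool results carry the session that produced them.

class SessionContext:
    def __init__(self, session_id: str, customer_id: str):
        self.session_id = session_id
        self.customer_id = customer_id
        self.turns: list[dict] = []

    def add_tool_result(self, session_id: str, result: dict) -> None:
        # Cheap guard against cross-tenant leakage: refuse data tagged
        # with a different session rather than quietly accepting it.
        if session_id != self.session_id:
            raise PermissionError("cross-session tool result rejected")
        self.turns.append({"role": "tool", "content": result})
```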
Interaction effects
- Agent loop + transactional tools + prompt injection: Unlike RAG-only systems where compromise affects text output, a successful prompt injection here can trigger real-world actions - processing fraudulent refunds, modifying accounts, or exfiltrating customer data through tool calls. The blast radius extends from text quality to financial and operational consequences.
- Agentic RAG + tools + policy enforcement: The agent retrieves policy documents and uses them to make authorization decisions (e.g., refund eligibility, discount limits). If the knowledge base contains incorrect or manipulated policy content, the agent may authorize actions beyond intended limits - a context poisoning vector with direct financial impact.
- Customer data in context + multi-turn accumulation: Customer PII enters context from CRM lookups and conversation, accumulating across turns. Each tool result adds more sensitive data (order details, payment information, account history). A successful prompt injection late in a conversation has access to all previously loaded PII, amplifying the data exfiltration risk.
- Human escalation + agent confidence: The model decides when to escalate to human agents. Goal manipulation or prompt injection can suppress escalation, keeping the conversation under attacker influence when it should have been routed to a human. Conversely, misaligned model behaviors like sycophancy can lead the agent to grant unauthorized concessions rather than escalating.
- Multi-tenant + shared knowledge base: Multiple customers interact with the same system and knowledge base. Insufficient session isolation creates cross-tenant data leakage risk, while the shared corpus creates a single context poisoning surface that affects all customers when manipulated documents are retrieved.
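One way to limit the PII-accumulation amplifier described above is to redact sensitive fields from tool results before they enter the context window, so a late-conversation injection has less to exfiltrate. The patterns below are a minimal illustrative sketch, not a production PII filter.

```python
import re

# Hypothetical redaction pass applied to tool results before they are
# appended to the conversation context. Real systems would use a dedicated
# PII-detection service; these two regexes are only for illustration.

PII_PATTERNS = {
    "card": re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label} redacted]", text)
    return text
```

Redacting at the tool-result boundary means the protection holds even if a prompt injection later asks the model to repeat everything it has seen.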
Threats
| Threat | Relevance | Note |
|---|---|---|
| Prompt injection | Primary | Customer messages, knowledge base docs, and CRM data can all steer transactional tool calls |
| Context poisoning | Primary | Poisoned knowledge base documents have direct financial impact through tool-mediated actions |
| Tool misuse | Primary | Wrong refund amounts, wrong account modifications; real financial consequences |
| Data exfiltration | Elevated | Customer PII accumulates across turns from CRM lookups and conversation |
| Cross-tenant / cross-session data leakage | Elevated | Multi-tenant shared infrastructure; customer data isolation failures |
| Goal manipulation | Elevated | Suppressing escalation, redirecting toward unauthorized refunds or account changes |
| Tool output poisoning | Elevated | Corrupted CRM or backend data hijacks subsequent reasoning |
| Misaligned model behaviors | Elevated | Sycophancy leads to unauthorized concessions; skipping verification before transactions |
| Hallucination exploitation | Elevated | Incorrect policy interpretations cause inappropriate transactional actions |
| Guardrail bypass | Elevated | Circumventing action authorization, PII filters; multi-turn jailbreaking |
| Denial of service | Elevated | Tool-call loops exhaust backend resources or inflate inference costs |
| System prompt extraction | Standard | Revealing tool schemas or internal policy rules |
| User manipulation | Standard | Customers may over-trust confident, retrieval-grounded responses |
| Training data poisoning | Standard | Baseline risk, no architecture-specific amplifier |
Examples
- An e-commerce support agent that looks up orders, processes returns and refunds, and retrieves from a help center knowledge base.
- A banking support agent that checks account balances, initiates disputes, and answers policy questions from internal documentation.
- A telecom support agent that modifies plans, troubleshoots service issues using diagnostic tools, and escalates to human agents for complex billing disputes.