Local Coding Agent
A local coding agent system combines a multi-turn agent loop with filesystem and shell access, subagent delegation, human-in-the-loop approval, skills loaded from the workspace, and MCP tool server connections. The defining trust property is that the agent runs unsandboxed under the developer's full identity and permissions.
Details
Local coding agents ship in two form factors that share the same core architecture but differ in interface and review affordances.
CLI agents (e.g. Claude Code) run as standalone terminal processes, making them editor-agnostic and usable in headless automation (CI/CD pipelines, scripted workflows). The developer reviews proposed actions as terminal text. A headless mode removes the human-in-the-loop boundary entirely, making every other defense layer load-bearing.
IDE-embedded agents (e.g. Cursor, Windsurf) operate as a copilot interface inside the editor. IDE embedding adds capabilities absent from a CLI: rich diff views for reviewing multi-file edits, inline tab completions, background agents that work on tasks asynchronously, and multi-model routing that dispatches requests to different LLMs by task type. The richer review UX can improve human oversight quality for complex changes compared to terminal-only review, but the higher interaction surface (auto-applied suggestions, background edits) also creates more opportunities for actions to bypass deliberate review.
Both form factors share the same trust-critical property: the agent runs with the developer's full permissions on a shared filesystem that serves simultaneously as workspace, instruction source, and attack surface. Supervision techniques like TDD as agent constraint and agent demonstration artifacts help the developer maintain oversight during extended sessions by producing verifiable checkpoints rather than requiring line-by-line review. Local agents are also the primary consumers of agent UX design: libraries, CLIs, and APIs shaped around what agents naturally try to do reduce friction in the development loop.
Capabilities
- Multi-turn conversation
- Agent loop with tool calling
- Filesystem agent (read/write access to workspace files)
- Shell tool (arbitrary command execution)
- Subagents (parallel task delegation)
- Human-in-the-loop (tool execution approval)
- Skills (instruction files loaded from workspace, e.g.
CLAUDE.md,AGENTS.md,.cursor/rules/) - MCP (external tool server connections)
- Reasoning (extended thinking traces before tool selection and responses)
- Prompt compaction (context management for long sessions)
- Copilot interface (IDE-embedded variants: rich diff review, inline completions, background agents)
Trust analysis
Local coding agents run unsandboxed in the developer's environment with the developer's full permissions. There is no sandbox boundary - the agent can access anything the developer can, including credentials, SSH keys, and other projects on the filesystem. Human-in-the-loop approval via tool execution approval is the primary safety mechanism rather than technical containment. The developer configures which tool categories require approval, with read-only operations typically auto-approved and write operations requiring confirmation.
The filesystem is both workspace and instruction source: skill files (e.g. CLAUDE.md, AGENTS.md, .cursor/rules/) are discovered and loaded into the agent's context by the agent runtime. A compromised repository can alter the agent's behavioral instructions before the developer interacts with it, through context poisoning via crafted skill or rule files.
Review fidelity varies by form factor. IDE-embedded agents present changes in rich diff views with syntax highlighting, making structural edits easier to audit than terminal-only review. However, IDE agents also introduce lower-friction interaction patterns - inline completions applied with a single keystroke, background agents making edits without an explicit approval step - that can reduce the deliberateness of review. CLI agents in headless mode (CI/CD, scripted automation) remove the human review boundary entirely, making the agent fully autonomous with whatever permissions the pipeline grants.
Subagents run with their own context but share the same unsandboxed environment. A subagent that encounters prompt injection in a file can return poisoned results that influence the parent agent's subsequent reasoning and edits. MCP connections extend the capability surface to external tool servers, introducing additional trust boundaries around tool server authentication and behavior.
Interaction effects
- Filesystem + shell + no sandbox: The agent can install and execute arbitrary dependencies under the developer's identity, opening a supply chain attack surface. A malicious package installed via shell can execute during builds, tests, or agent operations with the developer's full permissions. Dependency-free software - where agents generate custom implementations instead of importing libraries - eliminates this surface at the cost of maintenance burden.
- Skills + filesystem write access: The agent reads skill files to shape its behavior and has write access to the same filesystem where skills live. A compromised agent session can modify skill files to influence future sessions - a persistence attack that survives beyond the current interaction. Because the filesystem is both workspace and implicit coordination channel, shared mutable state creates coupling between otherwise independent sessions.
- Subagents + filesystem: Subagents search and read files in parallel, then return results to the parent. If a repository contains prompt injection payloads in source files or documentation, a search subagent can propagate the payload into the parent's context, influencing subsequent edits across the codebase.
- Human-in-the-loop + high action volume: Extended coding sessions produce many file edits and shell commands. The propose-review-execute loop is the primary defense, but its effectiveness degrades as the developer's review attention diminishes over a long session. IDE agents compound this with inline completions and background edits that bypass the explicit approval step.
- Headless mode + no sandbox: In CI/CD or scripted automation, CLI agents operate without human oversight and with pipeline-level permissions. Any successful prompt injection or context poisoning leads directly to autonomous execution of attacker-controlled actions without a human review step.
- MCP + no sandbox: External MCP tool servers can expose capabilities that interact with the unsandboxed environment. A malicious or compromised MCP server can influence agent behavior through tool output poisoning, and the agent can act on that influence with the developer's full system access.
Threats
| Threat | Relevance | Note |
|---|---|---|
| Prompt injection | Primary | Workspace files, tool outputs, and conversation direct file edits and shell commands unsandboxed |
| Context poisoning | Primary | Skill files and workspace files control agent behavior; compromised repo controls the agent |
| Privilege compromise | Primary | No sandbox; agent operates with developer's full system permissions |
| Persistence attacks | Primary | Backdoors in source code, CI config, skill files, git hooks survive sessions |
| Supply chain attack | Primary | Malicious dependencies installed via shell execute with developer's full permissions |
| Unauthorized code execution | Primary | Shell commands run with developer's full system permissions |
| Tool misuse | Elevated | Destructive file edits or shell commands across entire accessible filesystem |
| Data exfiltration | Elevated | Credentials, SSH keys, source code via shell, network requests, or MCP calls |
| Approval fatigue exploitation | Elevated | High volume degrades review; IDE agents compound with auto-applied suggestions |
| Multi-agent system threats | Elevated | Subagent injection propagation into parent agent's reasoning and edits |
| Tool output poisoning | Elevated | MCP responses or command outputs hijack subsequent reasoning |
| Goal manipulation | Elevated | Workspace files or tool outputs redirect agent objectives |
| Denial of service | Elevated | Fork bombs, resource exhaustion in developer's local environment |
| Hallucination exploitation | Standard | Incorrect code edits, wrong dependency choices |
| Guardrail bypass | Standard | Circumventing safety constraints on code generation or commands |
| System prompt extraction | Standard | Revealing skill/rule file contents through conversation |
| User manipulation | Standard | Developer trust in confident explanations for subtly harmful edits |
| Misaligned model behaviors | Standard | Sycophancy and shortcut-taking degrade code quality |
| Training data poisoning | Standard | Baseline risk, no architecture-specific amplifier |
Examples
- Claude Code: a CLI-based local coding agent that runs in the developer's terminal, with headless mode for CI/CD automation.
- Cursor: an IDE-embedded local coding agent with rich diff review, inline tab completions, background agents, and multi-model routing.
- Windsurf: an IDE with integrated local coding agent capabilities.