Why AI Security Requires a New Approach
Traditional cybersecurity focuses on network perimeters, access control, and deterministic software. AI agents break these assumptions fundamentally. Language models conflate data and instructions—an agent reading an external document can be manipulated by hidden instructions embedded within it. This creates attack surfaces that firewalls, WAFs, and conventional pen-testing simply cannot address.
The stakes are highest in environments where AI agents have access to proprietary code, trading algorithms, customer data, or production infrastructure. A single prompt injection exploit can turn a helpful coding assistant into a data exfiltration vector.
The Four Layers of AI Defense
We structure every engagement around four complementary security layers:
Identity & Access
Eliminate long-lived API keys in favour of short-lived, scoped credentials. We implement AWS IAM Identity Center (SSO) with OIDC federation, ensuring every agent session is tied to a specific human identity with auditable, time-limited permissions.
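As an illustrative sketch (role ARN, session naming, and the inline policy below are placeholder assumptions, not a prescribed configuration), a short-lived, scoped credential request might look like this:

```python
import json

# Hypothetical inline session policy that scopes the temporary credentials
# down to read-only actions, regardless of what the underlying role allows.
READ_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "logs:GetLogEvents"],
        "Resource": "*",
    }],
}

def scoped_session_request(role_arn: str, user: str, ttl_minutes: int = 15) -> dict:
    """Build arguments for an STS AssumeRole call: short TTL, inline
    session policy, and a session name tied to a specific human identity."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": f"agent-{user}",      # auditable identity link
        "DurationSeconds": ttl_minutes * 60,     # time-limited, not static
        "Policy": json.dumps(READ_ONLY_POLICY),  # further scope-down
    }
```

An agent runtime would then pass these arguments to the STS client (for example, `boto3.client("sts").assume_role(**scoped_session_request(arn, "alice"))`), so every credential in circulation expires quickly and names the human it was issued for.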
Network Isolation
Move AI workloads off public APIs and into your VPC. Using AWS PrivateLink and strict egress filtering, we ensure that model inference traffic never traverses the public internet—and that compromised agents cannot “phone home” to external servers.
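The same deny-by-default posture that VPC egress rules enforce at the network layer can be mirrored in application code. A minimal sketch, with hypothetical allowlist entries:

```python
from urllib.parse import urlparse

# Illustrative egress allowlist: only the endpoints the agent genuinely
# needs. Every other destination is denied by default.
ALLOWED_HOSTS = {
    "bedrock-runtime.us-east-1.amazonaws.com",  # private model inference
    "sts.us-east-1.amazonaws.com",              # credential refresh
}

def egress_allowed(url: str) -> bool:
    """Permit outbound requests only to explicitly allow-listed hosts."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS
```

A compromised agent attempting to "phone home" to an attacker-controlled host would fail both this check and the network-layer filter behind it.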
Execution Governance
Deploy a centralised gateway (AgentCore or MCP Proxy) that intercepts every tool call an agent makes. This enables payload inspection, sensitive variable redaction, deterministic pattern matching, and on-behalf-of identity propagation—ensuring agents can only do what the initiating developer is authorised to do.
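A stripped-down sketch of what such a gateway does per tool call (the secret patterns and tool names are illustrative assumptions, not a production ruleset):

```python
import re

# Deterministic patterns for secrets that must never leave the boundary:
# AWS access key IDs and PEM private key headers, as examples.
SECRET_PATTERN = re.compile(
    r"(AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----)"
)

def inspect(user_tools: set, tool: str, payload: str) -> str:
    """Gateway check for one tool call: deny tools the initiating user
    is not authorised for, then redact secrets from the payload."""
    if tool not in user_tools:
        raise PermissionError(f"{tool} not permitted for this identity")
    return SECRET_PATTERN.sub("[REDACTED]", payload)
```

Because the check runs on-behalf-of the initiating developer's identity, an agent can never exercise a capability its human principal lacks.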
Semantic Defenses
Because agents are non-deterministic, we deploy independent monitoring layers: Bedrock Guardrails to block extraction of proprietary IP, LLM-as-a-Judge evaluators to assess whether agent actions align with security policies, and automated kill switches for anomalous behaviour.
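The kill-switch layer can be as simple as an independent monitor that consumes per-action verdicts (from a guardrail or an LLM-as-a-Judge evaluator) and halts the session past a threshold. A minimal sketch, with an assumed threshold of three violations:

```python
class KillSwitch:
    """Independent monitor: counts policy violations per agent session
    and permanently halts the session once a threshold is crossed."""

    def __init__(self, max_violations: int = 3):
        self.max_violations = max_violations
        self.violations = 0
        self.halted = False

    def record(self, action_ok: bool) -> bool:
        """Feed in each judged action; returns False once halted."""
        if not action_ok:
            self.violations += 1
        if self.violations >= self.max_violations:
            self.halted = True
        return not self.halted
```

Keeping this monitor outside the agent's own process matters: a manipulated agent cannot talk its way past a counter it cannot see.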
Claude Code Hardening
For organisations using Claude Code as a development tool, we provide specific hardening measures:
- Bedrock Migration: Shift from public Anthropic API to AWS Bedrock, keeping all code and conversation data within your VPC.
- Sandbox Enforcement: Harden the bubblewrap-based sandbox with immutable configuration, restricted file access, and locked-down SessionStart hooks.
- Configuration Management: Deploy a centrally managed .claude/settings.json across all workstations, preventing ad-hoc permission escalation.
- Credential Rotation: Automate credential retrieval via awsAuthRefresh with temporary session tokens and no static keys.
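For illustration, a centrally managed settings.json might pin Bedrock usage, the credential refresh command, and permission denials in one place. The profile name and deny rules below are placeholders, not a recommended baseline:

```json
{
  "env": { "CLAUDE_CODE_USE_BEDROCK": "1" },
  "awsAuthRefresh": "aws sso login --profile dev-profile",
  "permissions": {
    "deny": ["WebFetch", "Bash(curl:*)"]
  }
}
```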
Agent Runtime Security
For autonomous agents built on frameworks like LangChain, LangGraph, or custom orchestration layers:
- Ephemeral Environments: Run agent containers in isolated, disposable environments where outbound internet access is denied by default.
- Least Privilege: Strip agents of all unnecessary capabilities: an analysis agent should hold no write or delete credentials at all.
- Semantic Filtering: Deploy independent filters that scrub sensitive variables from outgoing payloads and block known injection patterns in inbound data.
- Human-in-the-Loop: Implement supervised agency for state-changing actions, requiring explicit human authorisation before an agent can execute trades, push code, or modify infrastructure.
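The human-in-the-loop pattern above can be sketched as a thin gate in front of tool execution. The tool names are illustrative, and the approve callback is an assumption; in practice it might page an on-call reviewer or open an approval ticket:

```python
# Tools that change state and therefore require explicit human sign-off.
STATE_CHANGING = {"push_code", "execute_trade", "modify_infra"}

def run_tool(tool: str, args: dict, execute, approve):
    """Execute read-only tools directly; gate state-changing tools on an
    explicit human approval verdict before they run."""
    if tool in STATE_CHANGING and not approve(tool, args):
        raise PermissionError(f"human approval denied for {tool}")
    return execute(tool, args)
```

Read paths stay fast while every irreversible action acquires a named human approver in the audit trail.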