Maneesh Chaturvedi

Pillar 2 — Platform & Infrastructure

AI Agents Need Kernel-Grade Containment

An AI agent with tools is a process with authority. It should be contained like one.

May 29, 2026

The first question to ask about an AI agent is not how smart it is. It’s what it can touch.

A chatbot that answers questions is mostly an information-risk problem. An agent that can read files, call APIs, execute code, send messages, modify infrastructure, query production data, open pull requests, or update business systems is different.

It is a process with authority. If that process consumes untrusted input, maintains state, invokes tools, and acts faster than a human can inspect, the right mental model is not “digital coworker.” The right mental model is workload containment.

Security engineers already know this pattern. A low-privilege shell is not harmless if it can read credentials, reach internal networks, discover writable paths, access cloud metadata, or find a service account with broader permissions. The initial foothold matters less than the graph of authority around it. Agentic AI creates the same graph. It just hides it behind natural language.

An Agent Is Closer to a Process Than a Person

The language around agents is dangerously social. Agents “help.” They “collaborate.” They “understand goals.” They “take action on behalf of users.” That language may be useful for product positioning. It is weak for security architecture.

A production agent has inputs, memory, tools, identity, state, permissions, network access, and side effects. It may run in a browser, desktop app, SaaS workflow, IDE, customer-support platform, cloud automation layer, or internal operations environment.

This looks less like a person and more like a process.

Processes are not trusted because they seem helpful. They are trusted according to what they are allowed to do, what they can reach, which identity they run under, which files they can read, which network paths they can open, which system calls they can make, and what gets logged when they act.

Agent security needs that level of seriousness. If the agent can perform consequential actions, it needs a containment model before it needs a personality model.

Ambient Authority Is the Default Failure

The most common agent-security failure is ambient authority.

The agent runs in an environment full of existing access:

  • browser sessions
  • API keys
  • SaaS tokens
  • repository credentials
  • cloud roles
  • local files
  • shell access
  • environment variables
  • enterprise search permissions
  • messaging access
  • ticketing permissions
  • memory from prior sessions

The agent may not be explicitly granted all of these in a clean design document. It inherits them because it runs where the user works, where the developer codes, where the service account operates, or where the automation environment already has credentials.

This is exactly how small compromises become large ones. Post-exploitation thinking starts with a simple question: once something is compromised, what can it enumerate in the first minute?

For agents, the equivalent questions are direct:

  • What files can the agent read without asking?
  • What environment variables are visible?
  • What credentials are mounted into the runtime?
  • What internal services can it reach?
  • What SaaS APIs are already authenticated?
  • What repositories can it inspect?
  • What data can it retrieve because the user has access?
  • What can it write?
  • What can it send outside the organization?
  • What memory can it modify for future sessions?

If the answer is “whatever the user can access,” the organization has not designed an agent boundary. It has delegated a human access graph to a probabilistic process.
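The first few questions above can be turned into a quick self-audit run from inside the agent’s runtime. What follows is a minimal Python sketch, not a complete scanner. The name patterns, the list of sensitive paths, and the function names are all illustrative assumptions.

```python
import os
import re
from pathlib import Path

# Assumption: variable names containing these fragments often hold credentials.
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE)

# Assumption: a few well-known credential locations on a developer workstation.
SENSITIVE_PATHS = ["~/.ssh", "~/.aws/credentials", "~/.kube/config", "~/.netrc"]


def secret_like_env_names(env):
    """Return the names (never the values) of variables that look like credentials."""
    return sorted(name for name in env if SECRET_NAME.search(name))


def readable_sensitive_paths(paths=SENSITIVE_PATHS):
    """Return the listed paths that exist and that this process can read."""
    found = []
    for raw in paths:
        path = Path(raw).expanduser()
        if path.exists() and os.access(path, os.R_OK):
            found.append(str(path))
    return found
```

Calling secret_like_env_names(os.environ) and readable_sensitive_paths() from the agent’s own process shows what it inherits before any tool is invoked. A non-empty result is ambient authority that nobody granted on purpose.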

Human Approval Is Not a Containment Boundary

“Human in the loop” is useful. It is not a complete security boundary.

Humans approve what they can see and understand. Agent systems often show users a summary, a proposed action, or a nicely formatted explanation. They do not always show the raw tool inputs, hidden context, intermediate reasoning artifacts, retrieved documents, exact API calls, external destinations, or cumulative effect of many small actions.

An agent can chain individually acceptable steps into an unacceptable outcome.

Read this file. Summarize this customer thread. Draft this message. Attach the relevant document. Send it to this address. Update memory so this preference is remembered later. Each step may look reasonable in isolation. The sequence may violate confidentiality, policy, or business intent.

The approval UI can also be shaped by model output. If the same system that proposes the action also writes the explanation for approval, the user may be approving the agent’s framing rather than the actual operation.

Human approval should remain part of the design. But containment must exist underneath it. The system should make dangerous actions impossible, not merely ask the user to notice danger in time.
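One way to keep the approval screen honest is to build it from the raw tool call rather than from the agent’s description of it. The sketch below assumes a hypothetical ToolCall record produced by the runtime, not by the model. The field names and layout are illustrative.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of exactly what the runtime is about to execute."""
    tool: str
    params: dict = field(default_factory=dict)
    external: bool = False  # True if the call leaves the organization


def render_approval(call, model_summary):
    """Build the approval text from the raw call; the model's summary comes last, labeled."""
    lines = [
        f"TOOL: {call.tool}",
        f"EXTERNAL DESTINATION: {'yes' if call.external else 'no'}",
        "PARAMETERS (exact):",
        json.dumps(call.params, indent=2, sort_keys=True),
        "AGENT'S OWN DESCRIPTION (unverified):",
        model_summary,
    ]
    return "\n".join(lines)
```

The user still reads the agent’s explanation, but only after seeing the tool, the destination, and the exact parameters that will be sent. This improves the approval step. It does not replace the containment underneath it.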

Linux Already Solved the Shape of the Problem

Operating systems do not rely on software promising to behave. They isolate processes. Linux gives us a useful vocabulary for agent architecture:

  • users and groups define identity
  • capabilities split root-like power into narrower privileges
  • namespaces isolate process, network, filesystem, and user views
  • containers package those boundaries into deployable environments
  • netfilter and firewall rules control network movement
  • audit systems record sensitive actions
  • seccomp-style controls reduce the syscall surface

The exact primitives may differ in AI platforms, but the design logic transfers.

An agent should not run with broad shared credentials if a narrow per-agent identity will do. It should not see the whole filesystem if it only needs a workspace. It should not reach the internet if it only needs one internal API. It should not call every tool in the catalog if the task requires two. It should not retain persistent memory unless the workflow has a reason to preserve state.

Containment is not one control. It is a stack of boundaries.

Namespaces for Agent Reality

Linux namespaces isolate what a process can see. Agent systems need the same idea.

A file namespace means the agent sees only the working set required for the task, not the user’s home directory, browser profile, SSH keys, cloud config, downloads folder, and old project archives.

A network namespace means the agent has explicit egress rules. It can reach approved services and nothing else. External communication is not a default privilege.

A process namespace means the agent cannot inspect or interfere with unrelated local processes.

A user namespace means the agent may appear powerful inside its own environment while mapping to an unprivileged identity outside it.

A memory namespace has no direct Linux counterpart, but the analogy holds: session state, long-term memory, and retrieved context are scoped by purpose rather than pooled into one global assistant brain.
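Scoped memory can be sketched in a few lines. The class and scope names below are hypothetical. In a real system the scoping would be enforced by the storage layer, not by in-process Python objects.

```python
class ScopedMemory:
    """Hypothetical memory store keyed by (scope, key) so purposes never share state."""

    def __init__(self):
        self._store = {}

    def view(self, scope):
        """Return a handle that reads and writes only within one scope."""
        return _MemoryView(self._store, scope)


class _MemoryView:
    def __init__(self, store, scope):
        self._store = store
        self._scope = scope

    def put(self, key, value):
        self._store[(self._scope, key)] = value

    def get(self, key, default=None):
        return self._store.get((self._scope, key), default)
```

A workflow that receives memory.view("support:ticket-triage") can remember a customer’s preferred tone, and a finance workflow holding a different view never sees that entry through its own handle.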

The point is not to reproduce Linux literally in every AI product. The point is to stop treating agent context as an unlimited operating environment. Every boundary should answer a practical question: what should this agent not be able to see or do even if its prompt is compromised?
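For the file boundary, an application-level version of that question looks like the sketch below. The function and exception names are hypothetical. A check like this complements a real mount namespace or container rather than replacing it, since a path can still change between the check and the later open.

```python
from pathlib import Path


class WorkspaceViolation(Exception):
    """Raised when a requested path falls outside the agent's workspace."""


def resolve_in_workspace(workspace, requested):
    """Resolve `requested` against `workspace` and refuse anything that escapes it."""
    root = Path(workspace).resolve()
    # Joining an absolute path discards the root, and resolve() collapses ".."
    # and follows symlinks, so the containment check sees the real target.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise WorkspaceViolation(f"{requested!r} is outside {root}")
    return target
```

A file tool that calls resolve_in_workspace before every open rejects "../outside.txt" and "/etc/passwd" no matter what the prompt says, because the decision depends on the resolved path and not on the model’s intent.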

Capabilities Beat Roles

Traditional role-based access often becomes too coarse for agents.

“Support agent.” “Developer assistant.” “Finance assistant.” “Operations agent.”

These labels describe intent. They do not define safe authority. Capabilities are sharper.

Can it read? Can it write? Can it send? Can it delete? Can it create external connections? Can it modify memory? Can it execute code? Can it approve? Can it query regulated data? Can it act without confirmation? Can it combine data from two domains?

The safe unit of design is the action.

An agent may need to read a ticket but not email the customer. It may need to draft a pull request but not merge it. It may need to inspect a log but not query secrets. It may need to summarize a contract but not upload it to an external endpoint. It may need to recommend a refund but not issue one.

Agents should receive capabilities per workflow and per action class. Permanent broad access is the agent equivalent of running every service as root because one endpoint occasionally needs privilege.
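Per-workflow capabilities can be expressed as a small deny-by-default table. This is a sketch under assumptions: the capability names and workflow names are invented for illustration, and a real system would enforce the check in the tool gateway rather than trust the agent to call it.

```python
from enum import Flag, auto


class Capability(Flag):
    NONE = 0
    READ_TICKET = auto()
    DRAFT_REPLY = auto()
    SEND_EMAIL = auto()
    ISSUE_REFUND = auto()


# Hypothetical grants: each workflow lists only the actions it needs.
WORKFLOW_GRANTS = {
    "ticket-triage": Capability.READ_TICKET | Capability.DRAFT_REPLY,
    "refund-recommendation": Capability.READ_TICKET,
}


def is_allowed(workflow, action):
    """Deny by default: unknown workflows get no capabilities at all."""
    granted = WORKFLOW_GRANTS.get(workflow, Capability.NONE)
    return action in granted
```

Here the triage workflow can draft a reply but not send it, and the refund workflow can read the ticket but not issue the refund. Neither limit depends on what the agent is called.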

Egress Control Is Non-Negotiable

The ability to communicate outward is one of the most important parts of the risk model.

An agent that can read private data but cannot send data anywhere is risky, but the risk is bounded.

An agent that can read private data and make arbitrary outbound calls is a different class of system.

Network egress control is familiar in server security. A workload should not be able to reach every destination on the internet just because it runs in production. The same rule applies to agents.

If an agent can fetch arbitrary URLs, call webhooks, send email, post to chat, upload files, or interact with third-party APIs, those destinations need policy.

Otherwise untrusted content can attempt to turn the agent into an exfiltration path.
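A destination policy can start as an exact-match host allowlist checked before any outbound request. The sketch below uses hypothetical hostnames. It does not handle redirects or DNS tricks, so it belongs in front of, not instead of, enforcement at a proxy or firewall.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: exact hostnames this agent may contact.
ALLOWED_HOSTS = frozenset({"tickets.internal.example.com", "api.internal.example.com"})


def egress_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Allow only HTTPS requests to exactly matching hosts; deny everything else."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts
```

Exact matching matters. A lookalike such as tickets.internal.example.com.attacker.example.net fails the check, and so does a URL that hides the real host behind an "@" in the authority section.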

The design should distinguish:

  • internal reads
  • external reads
  • internal writes
  • external writes
  • user-visible communication
  • silent machine-to-machine communication
  • persistent memory writes

Those are not equivalent actions and they should not share the same boundary.
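The distinction can be made explicit by mapping each action class to its own control. The assignment below is illustrative only. Which classes are allowed, confirmed, or denied depends on the workflow, and the enum names are assumptions made for this sketch.

```python
from enum import Enum


class ActionClass(Enum):
    INTERNAL_READ = "internal_read"
    EXTERNAL_READ = "external_read"
    INTERNAL_WRITE = "internal_write"
    EXTERNAL_WRITE = "external_write"
    USER_VISIBLE_MESSAGE = "user_visible_message"
    SILENT_MACHINE_CALL = "silent_machine_call"
    MEMORY_WRITE = "memory_write"


class Control(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    DENY = "deny"


# Illustrative policy for one workflow; classes not listed are denied.
POLICY = {
    ActionClass.INTERNAL_READ: Control.ALLOW,
    ActionClass.EXTERNAL_READ: Control.CONFIRM,
    ActionClass.INTERNAL_WRITE: Control.CONFIRM,
    ActionClass.USER_VISIBLE_MESSAGE: Control.CONFIRM,
    ActionClass.MEMORY_WRITE: Control.CONFIRM,
}


def control_for(action_class):
    """Return the control for a class; here external writes and silent calls fall to DENY."""
    return POLICY.get(action_class, Control.DENY)
```

Leaving a class out of the table is a decision, not an oversight. Anything the workflow’s designers did not think about is denied until someone adds it deliberately.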

Audit Logs Must Be Designed for Reconstruction

Logging agent activity is harder than logging ordinary application events. It is not enough to know that an API call happened.

A useful audit trail needs to reconstruct why it happened, what context influenced it, which user or service identity was used, which tool was invoked, what data sources were consulted, what approval was shown, what the exact action parameters were, and whether the action was read-only, reversible, external, or consequential. Without that, incident response becomes guesswork.

When something goes wrong, the organization needs to answer:

  • Was the agent following user intent?
  • Did untrusted content influence the action?
  • Which permissions were exercised?
  • Which data was exposed?
  • Which systems were changed?
  • Was the action approved by a human?
  • What did the human actually see?
  • Did the agent update memory or state afterward?

Agent logs should be append-only and tamper-evident so they can support investigation.

They should also be understandable enough for governance, security, engineering, and business owners to reason about the event. If the log is only a transcript, it is incomplete. If it is only API telemetry, it is incomplete. Agent behavior crosses both layers.
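One common way to make a log tamper-evident is to chain each entry to the hash of the one before it. The sketch below is a minimal in-memory version. The record fields in the usage example are assumptions drawn from the list above, and a production log would also need protected storage, since the chain alone cannot reveal that the most recent entries were dropped.

```python
import hashlib
import json


def _digest(body):
    """Hash the canonical JSON form of an entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log, record):
    """Append `record` to `log`, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"prev_hash": prev_hash, "record": record}
    log.append({**body, "hash": _digest(body)})


def verify_chain(log):
    """Return True if no remaining entry was altered, reordered, or cut from the start or middle."""
    prev_hash = "0" * 64
    for entry in log:
        body = {"prev_hash": entry["prev_hash"], "record": entry["record"]}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

A record might carry the identity used, the tool, the exact parameters, the approval text the human saw, and the action class. Because each entry commits to the one before it, editing an old record after the fact breaks verification for the whole chain.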

The Secure Default Is Untrusted Workload

The secure default for agents should be “untrusted workload with bounded capability.” That sounds severe only because the product language has made agents feel human. The security reality is simpler. An agent is software interpreting mixed-trust input and taking actions under an identity. That is exactly the kind of thing containment was invented for.

Start with narrow identity. Add task-specific capabilities. Restrict filesystem scope. Restrict network egress. Separate memory. Log privileged actions. Require confirmation where judgment is genuinely needed. Prevent the agent from obtaining new authority through its own outputs.

Then make the experience useful inside those boundaries. The future of agent security will not be built on better prompts alone. It will be built on old security principles applied without sentimentality.

Least privilege. Isolation. Explicit capability. Network control. Auditability. Blast-radius reduction. Agents do not need trust. They need containment.