Four security principles for agentic AI systems


Agentic AI represents a qualitative shift in how software operates. Traditional software executes deterministic instructions. Generative AI responds to human prompts with output that humans review and use at their discretion. Agentic AI differs from both. Agents connect to software tools and APIs and use large language models (LLMs) as reasoning engines to plan and execute sequences of actions autonomously—at machine speed—with real-world consequences. This shift raises new questions for information security. In January 2026, NIST’s Center for AI Standards and Innovation (CAISI) issued a Request for Information (RFI) seeking industry input on how to secure these systems. AWS submitted a response grounded in our experience building and operating agentic AI services. This post summarizes the four security principles at the heart of that response and the architectural building blocks that implement them.

The NIST agentic AI RFI

CAISI asked developers, deployers, and security researchers to weigh in on how the industry should secure AI systems that act autonomously. The RFI posed questions across five areas. What unique security considerations do agentic systems introduce, and how do those considerations change as systems gain more autonomy? What practices improve security during development and deployment? How do organizations assess the security of their agentic systems? How can deployment environments be constrained and monitored? And where should the industry focus future research?

Why this matters

Even a conservative risk/benefit analysis will conclude that the benefits of agentic AI clearly outweigh the risks in many domains. The rapid adoption of agentic technology across business and government confirms this. But agents are valuable precisely because of their autonomy and adaptability, and these same characteristics create the security challenge. An agentic system that carries out an unintended action can do so at machine speed, before a human can intervene. Unlike human actors who pause or escalate when something seems unusual, agents might not inherently recognize ambiguities that are evident to humans, nor intuitively grasp unstated policy boundaries.

The good news, however, is that the security response to agentic AI doesn’t need to start from scratch. Existing security frameworks, including the NIST Cybersecurity Framework, NIST AI Risk Management Framework, and the Secure Software Development Framework, remain relevant and should be extended for agent-specific considerations rather than replaced. The most important extension is architectural. Our response to NIST identified four foundational security principles that address how to make that extension.

Four security principles for agentic AI

These principles build on the premise that agentic AI doesn’t require a new security paradigm, but it does require existing practices to evolve. The first two principles address what carries forward; the last two address what is genuinely new.

Principle 1: Secure development lifecycle practices apply across system components. Agentic AI systems combine traditional software components (APIs, databases, orchestration logic) with AI elements such as foundation models, prompt templates, and retrieval pipelines. A secure development lifecycle must cover both sets of components. For traditional components, established practices such as code review, static analysis, dependency scanning, and threat modeling remain essential, keeping in mind that those practices are also in the process of being enhanced with AI-based tooling. For AI components, the challenge is different. Foundation models are probabilistic, which means traditional regression testing is necessary but not sufficient. Organizations must supplement it with behavioral testing, adversarial evaluation, and continuous monitoring to validate that AI components operate within expected parameters.

Regular re-evaluation is equally important for addressing behavioral drift. Models receive updates that can alter behavior. Prompt templates evolve as teams refine agent capabilities. New tools and data sources expand the agent’s operational surface. Each change can introduce new failure modes or potential security issues. Organizations must treat evaluation as an ongoing operational practice, not a one-time gate. This includes automated testing after model updates, red team exercises against deployed agents, and monitoring that detects behavioral drift over time.
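The ongoing evaluation described above can be sketched as a simple behavioral test harness. This is an illustrative example, not an AWS API: the agent stub, test cases, and threshold are all hypothetical, and a real harness would call a deployed agent and track pass rates across model and prompt updates to detect drift.

```python
# Minimal behavioral-evaluation sketch (all names are hypothetical).
# Each case pairs a prompt with a predicate over the agent's output;
# the harness reports a pass rate that can be tracked over time.

def stub_agent(prompt: str) -> str:
    # Stand-in for a real agent invocation; deterministic for illustration.
    if "refund" in prompt:
        return "ESCALATE: refunds require human approval"
    return "OK: handled"

CASES = [
    ("Process a refund of $500", lambda out: out.startswith("ESCALATE")),
    ("Summarize the ticket history", lambda out: out.startswith("OK")),
]

def evaluate(agent, cases, threshold=1.0):
    """Run all behavioral cases; flag the agent unhealthy below threshold."""
    passed = sum(1 for prompt, check in cases if check(agent(prompt)))
    rate = passed / len(cases)
    return rate, rate >= threshold

rate, healthy = evaluate(stub_agent, CASES)
```

Running such a suite automatically after every model update, and alerting when the pass rate drops, turns evaluation into the continuous operational practice the principle calls for.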

Principle 2: Traditional security controls remain fully applicable. Agentic AI introduces new considerations, but it doesn’t render existing security risks obsolete. The full complement of traditional security controls still applies. An agentic AI system combines traditional software with the new LLM-plus-tools processing loop. Organizations must secure existing software, tools, and configurations against well-known risks to provide a sound foundation for the agentic elements.

Privilege escalation, confused deputy issues, session hijacking, code injection, and supply chain risks extend directly into agentic systems. Some of these risks increase in agentic contexts. Agents operate at greater scale and speed than human actors, which means excessive privileges carry more potential for unintended consequences. That means that applying principles of least privilege to access management in an agentic context is as important—if not more so—than in traditional systems. The supply chain surface is also broader. Agentic systems consume not only third-party code dependencies but also foundation models, plugins, tool servers, and data retrieval sources. Agents that invoke APIs, query databases, or generate code create new potential injection surfaces at tool boundaries. AI-specific controls must be additions to this foundational security, not replacements for it.
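As one concrete illustration of least privilege in an agentic context, an agent’s execution role can be scoped to only the read operations it needs on a single resource. The table name and account ID below are placeholders; the policy shape is standard AWS IAM.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/SupportTickets"
    }
  ]
}
```

An agent holding this role can read support tickets but cannot write to them, delete them, or touch any other table, so a manipulated reasoning loop has a sharply bounded blast radius.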

Principle 3: Deterministic external controls are the starting point for agentic security. This is the most important architectural principle for agentic AI security. Organizations should enforce security through deterministic, infrastructure-level controls external to the agent’s reasoning loop, not through the agent’s own reasoning, internal guardrails, or prompt-based instructions. The logic is straightforward. LLMs are probabilistic reasoning engines, not security enforcement mechanisms. Developers can instruct an LLM to refuse certain requests, but prompt injection techniques can override those instructions. An LLM can be told to respect access boundaries, but it has no reliable mechanism to enforce them. Attempting to constrain agent behavior only through prompting or alignment runs against the fundamental value proposition of agents, which is their ability to adapt dynamically to novel situations.

Effective security places fully specified, deterministic controls outside the agent that govern which tools it can access, what operations it can perform, and what data it can reach. Model manipulation cannot bypass these controls. We describe this as the security box. It’s external to the agent, deterministic in its enforcement, and comprehensive in its coverage. Every interaction between the agent and the outside world passes through it. The Agentic AI Security Scoping Matrix helps organizations calibrate the rigor of these controls based on their system’s autonomy level. Scopes range from systems that require explicit human approval before every action to fully autonomous systems that initiate their own activities based on external events.
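The security box idea can be sketched in a few lines: a deterministic mediation layer, outside the model, that checks every tool call against an allowlist and parameter-level rules. The tool names, rules, and thresholds below are hypothetical, and this is a simplification of what a production gateway does, not AgentCore code.

```python
# Sketch of a deterministic "security box" mediating agent tool calls.
# Tool names, rules, and the $50 threshold are illustrative only.

ALLOWED_TOOLS = {
    "search_tickets": lambda params: True,                      # read-only, always allowed
    "issue_refund":   lambda params: params.get("amount", 0) <= 50,  # parameter-level limit
}

def invoke_tool(tool: str, params: dict) -> str:
    # Enforcement happens outside the model's reasoning loop:
    # no prompt content can alter these checks.
    rule = ALLOWED_TOOLS.get(tool)
    if rule is None or not rule(params):
        return "DENIED"
    return f"EXECUTED {tool}"
```

Because the agent’s output is only ever an input to `invoke_tool`, a prompt-injected instruction to issue a large refund still arrives at the same deterministic check and is denied.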

The security box isn’t a limitation on the agent’s value. It’s the precondition for achieving that value responsibly. As agentic technology matures, the box itself will likely evolve to include agentic elements. Specialized AI agents designed to control the scope of other agents might replace some deterministic constraints over time, using new information and context to make more appropriate automated decisions than could be achieved by humans managing complex deterministic controls.

Principle 4: Greater autonomy should be earned through ongoing evaluation. Organizations should expand agent autonomy progressively based on demonstrated performance, not grant it by default. The starting point is human decision-making for high-consequence operations. When an agent encounters an action that could modify high-value production data, initiate financial transactions, or communicate sensitive information externally, a human makes the final decision. The agent recommends, and a human approves or rejects.

This approach carries a well-known risk. If every agent action requires human approval, the volume of decisions might overwhelm reviewers. Approval becomes reflexive rather than deliberate, shifting liability to humans who have been placed in a position to fail. Organizations must scope human oversight to genuinely high-consequence operations and resist the temptation to require human-in-the-loop designs for routine actions that carry low risk.

The path from human oversight to expanded autonomy runs through evaluation. As organizations systematically record what the agent recommended, what the human decided, and what actually happened, they build the evidence base for expanding autonomy. When data shows sustained alignment, organizations can shift from prior approval to after-the-fact review, and eventually to full autonomy for specific operation types. This progression should happen at the operation or workflow level, not across a broad range of unrelated tasks.
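The evaluation-driven progression described above can be made concrete with a small sketch. The thresholds, sample sizes, and level names here are hypothetical; the point is that promotion from prior approval to after-the-fact review is computed from recorded human decisions, per operation type.

```python
# Sketch of evaluation-driven autonomy promotion (thresholds are illustrative).
# Each record is (agent_recommendation, human_decision) for one operation type.

def autonomy_level(decisions, promote_at=0.98, min_samples=500):
    """Return 'prior_approval' or 'post_hoc_review' based on agreement rate."""
    if len(decisions) < min_samples:
        return "prior_approval"          # not enough evidence yet
    agreement = sum(1 for rec, dec in decisions if rec == dec) / len(decisions)
    return "post_hoc_review" if agreement >= promote_at else "prior_approval"

history = [("approve", "approve")] * 490 + [("approve", "reject")] * 10
```

Here 490 of 500 recommendations matched the human decision (98 percent), just meeting the promotion threshold for this operation type; a shorter or noisier history would keep the operation under prior approval, and a production system would also demote on fresh disagreement.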

This progression isn’t one-way. Organizations should be prepared to reintroduce human oversight when evidence warrants it. Some deterministic boundaries likely remain permanent for the foreseeable future. These boundaries exist not because the agent hasn’t earned trust, but because the consequences of certain actions are unacceptable under a reasonable risk analysis. The overall model is one of earned autonomy through demonstrated competence, governed by evaluation, bounded by permanent constraints, and subject to continuous review. There might come a time when specialized boundary agents can provide better outcomes than purely deterministic controls, but that option can only emerge over time from experience and evaluation.

From principles to practice

The four principles define the goals. Achieving them requires specific architectural building blocks that compose the security box and the broader security architecture. Our response to NIST described these building blocks in greater detail. Here we provide a summary. AWS has implemented them in Amazon Bedrock AgentCore, a framework for building, deploying, and operating agentic AI systems with security built in from the ground up.

Compute isolation. Agent compute environments must isolate execution, prevent cross-agent data leakage, and contain agents within defined boundaries. Amazon Bedrock AgentCore runs agents on Firecracker, an open source virtual machine manager written in Rust. Firecracker provides lightweight micro-VMs backed by Linux KVM and hardware-based virtualization, delivering the speed of containers with the isolation properties of full virtual machines. Key security-critical elements of Firecracker have been formally verified by AWS teams, adding assurance beyond the memory safety that Rust provides.

Identity and access management. Agents require their own identities, secure credential storage, and least-privilege authorization enforced at the infrastructure level. AgentCore Identity provides machine identities for agents, manages OAuth and secure credential flows, and integrates with AWS Identity and Access Management (IAM) for fine-grained access control. It supports attribute-based access control and maintains traceable delegation chains so that the relationship between agent actions and the invoking user remains auditable.

Tool access and policy enforcement. Every tool an agent can access expands both its usefulness and its potential risk. Managing tool access individually across agents creates an unmanageable combinatorial explosion. AgentCore Gateway acts as a centralized intermediary between agents and tools, enforcing authentication and authorization at a single control point. It can inspect tool calls down to individual parameters, not just at the API level. AgentCore Policy, built on the open source Cedar authorization language, adds formally verified policy enforcement. Teams can author Cedar policies in natural language and then review them, combining the flexibility of LLMs with the rigor of formal methods.
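A parameter-level constraint of the kind described above might be expressed in Cedar along these lines. The entity types, names, and the amount limit are illustrative, not the actual AgentCore Policy schema.

```cedar
// Illustrative Cedar policy (entity and action names are hypothetical):
// allow the support agent to invoke the refund tool only for small amounts.
permit (
  principal == App::Agent::"support-agent",
  action == App::Action::"invokeTool",
  resource == App::Tool::"issue_refund"
)
when { context.amount <= 50 };
```

Because Cedar evaluation is deterministic and formally analyzable, a gateway enforcing this policy gives the same answer for the same request every time, regardless of what the model was persuaded to ask for.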

Observability. Observability infrastructure must capture sufficient context for real-time monitoring and investigation, and it must be protected from the agents it monitors. Organizations wouldn’t allow employees to edit their own audit logs, and the same principle applies to agents. AgentCore provides observability through the AgentCore Gateway, session-level telemetry, and detailed traces that record internal state changes. These capabilities can extend to agents running outside of AgentCore as well.

Model execution environment. The security of the model execution environment matters as much as the security of the agent itself. Amazon Bedrock runs models in isolated network environments where neither AWS nor model providers access customer prompts and responses. When customers enable logging, those logs are encrypted at rest and protected by customer-managed encryption keys. This architectural isolation is a key reason government and enterprise customers have adopted Amazon Bedrock.

Deterministic external controls are complemented by controls within the AI processing loop. Amazon Bedrock Guardrails inspects prompts and responses using small AI models called classifiers that address challenges such as prompt injection. Automated Reasoning checks go further: developers can create a formal model of a knowledge domain and verify that LLM output conforms to it, producing results that are deterministic and provably correct.

Looking ahead

Agentic AI changes how software operates, but the security response builds on decades of established practice. Existing frameworks provide the right foundation; the task is to extend them for agent-specific considerations. Organizations should apply secure development lifecycle practices to AI components and maintain traditional security controls. They should enforce security through deterministic controls external to the agent and earn greater autonomy through systematic evaluation.

These principles aren’t theoretical. They reflect the operational experience AWS has gained building and operating agentic AI services. They’re embedded in how we design our infrastructure. As NIST develops guidance based on industry input, we will continue to invest in helping customers build and operate agentic AI systems with confidence.

To learn more about how AWS helps customers secure their AI workloads, visit the AWS AI Security page or read the Amazon response to the CAISI Request for Information regarding Security Considerations for Artificial Intelligence Agents.

Mark Ryland

Mark is a director of the Office of the CISO for AWS. He has more than 30 years of experience in the technology industry and has served in leadership roles in cybersecurity, software engineering, distributed systems, technology standardization, and public policy. Prior to his current role, he served as the Director of Solution Architecture and Professional Services for the AWS World Public Sector team.

Riggs Goodman III
Riggs is a Principal Solution Architect at AWS. His current focus is on AI security, providing technical guidance, architecture patterns, and leadership for customers and partners to build AI workloads on AWS. Internally, Riggs focuses on driving overall technical strategy and innovation across AWS service teams to address customer and partner challenges.
Todd MacDermid
Todd is a Principal Security Engineer in the Amazon AI Security Group. He has spent over 15 years at Amazon primarily working in AWS Security, and prior to Amazon spent 10 years working in red-team consulting and application and network security.