GenAI Security Principles

Overview

This page sets out the provider-neutral security principles for managed generative AI and large language model (LLM) APIs. Scope is limited to cloud-hosted managed model APIs: AWS Bedrock, Azure OpenAI Service, GCP Vertex AI (Gemini API), and OCI Generative AI Service. Self-hosted models, on-premises inference servers, and cloud training platforms (SageMaker, Azure ML, Vertex AI Training) are out of scope.

This page does not contain provider-specific controls. For controls, see the provider pages: Azure OpenAI Service Hardening (live, the Phase 13 pilot). AWS Bedrock Hardening, GCP Vertex AI Hardening, and OCI Generative AI Hardening arrive in Phase 14. The principles on this page apply uniformly across all four providers; the provider pages translate each principle into auditable configurations, CLI commands, and IaC.

Before reading the provider GenAI pages, security engineers should be familiar with three cross-cutting domains: Identity and Access Management (workload identities, least-privilege access), Network Security (private endpoints, egress controls), and Logging and Monitoring (audit trail, anomaly detection). GenAI hardening extends these foundations; it does not replace them. A misconfigured IAM role or a missing private endpoint is still critical even when content-safety guardrails are enabled.

Threat model

LLM-based systems present a different threat surface than traditional cloud workloads. In a conventional data store or compute deployment, the developer authors and deploys the application logic, and the attack surface is the API boundary, the network, and the credentials. In an LLM-based system, the model itself becomes a dynamic execution environment: it interprets natural language instructions from both the developer (system prompt) and the user (completion request), and in agentic configurations it issues tool calls that interact with real services. An attacker who can influence model inputs can influence model outputs and, in agentic systems, downstream actions. That is what makes the LLM threat surface new.

LLM threat taxonomy mapped to OWASP LLM Top 10:2025
Threat	OWASP LLM Top 10:2025 ID	Description
Prompt injection (direct)	LLM01:2025	Attacker-controlled user input contains instructions that override or supplement the developer-set system prompt, causing the model to behave contrary to its intended design (e.g., bypass restrictions, leak data, execute unauthorised tool calls).
Prompt injection (indirect / RAG)	LLM01:2025	Malicious instructions are embedded in documents, web pages, or database records retrieved during a RAG (Retrieval-Augmented Generation) lookup. The model treats retrieved content as trusted context and executes the embedded instructions. The attacker never sends a direct request to the model.
Sensitive data leakage in completions	LLM02:2025	The model reproduces PII, credentials, or other sensitive information verbatim in its output, either from training-data memorisation or because sensitive data was included in the prompt context without redaction. The completion response itself becomes an exfiltration channel.
Training-data poisoning	LLM04:2025	Malicious data injected into a fine-tuning corpus corrupts model behaviour for specific input patterns (backdoor attacks), causes the model to produce systematically wrong outputs, or embeds extractable PII that can be recovered by later completion queries.
RAG / vector-store poisoning	LLM08:2025	Malicious content ingested into the vector store during the knowledge-base build phase creates adversarial embeddings. When retrieved, these embeddings inject hostile instructions into the prompt context at inference time. Unlike training-data poisoning, this attack targets the retrieval pipeline rather than model weights.
System-prompt leakage	LLM07:2025	The model discloses the contents of the developer-controlled system prompt in response to adversarial user queries. System prompts often contain proprietary instructions, API keys embedded as configuration, or business logic that was intended to remain confidential.
Excessive agency	LLM06:2025	An AI agent granted broad tool permissions (e.g., storage read/write, email send, code execution) can be triggered by a prompt injection or jailbreak to perform destructive, exfiltrating, or irreversible actions using those permissions. The blast radius equals the breadth of the tool permissions granted.
Model DoS / unbounded consumption	LLM10:2025	An attacker sends crafted inputs that trigger extremely long completions, repeated model calls, or resource-intensive reasoning chains. The result is quota exhaustion, runaway costs, or denial of service for legitimate users. Token-farming attacks (generating large outputs for re-sale) exploit the same surface.

RAG pipelines combine two of these threats in a compound attack chain. LLM01:2025 (indirect prompt injection) and LLM08:2025 (vector and embedding weaknesses) interact when an attacker controls content that enters the knowledge base. The attacker poisons the retrieval index during ingestion (LLM08:2025), and those poisoned chunks are later retrieved and injected as adversarial instructions into the model context at inference time (LLM01:2025). The defence must address both stages: access-controlled ingestion pipelines with content provenance checks, and a differential trust treatment of retrieved content at inference time, where retrieved chunks are treated as untrusted user input rather than trusted system-level instructions.

Cross-cutting principles

Nine architecture-level principles apply regardless of which managed model API you deploy. Each principle is stated once here and referenced from provider pages. Provider pages translate these principles into concrete configurations, CLI commands, and IaC; they do not redefine the principles themselves.

1. Input filtering and validation: Validate and sanitise all user-supplied text before it reaches the model. Apply a content-safety classifier or prompt injection detector at the application layer before model invocation. Do not rely solely on provider-managed safety filters at model inference time. An application-layer check provides an independent, earlier defence that catches attacks before they consume model tokens or trigger harmful model outputs.
2. Output filtering: Apply content safety checks and PII redaction to model responses before returning them to the caller. Model outputs are untrusted data: they may reproduce memorised PII, generate harmful content that bypassed inference-time filters, or contain injection payloads designed to be executed by a downstream component (LLM05:2025). Treat model output as you would user-supplied input when passing it to other system components.
3. System-prompt isolation: Treat the system prompt as trusted configuration, not as a user-addressable surface. Never include secrets (API keys, tokens, connection strings) in the system prompt, because model extraction attacks can surface them. Enforce system-prompt isolation through provider-level controls (e.g., role separation in the OpenAI message format, Bedrock system-prompt role, Vertex AI system instruction field) rather than trusting the model to protect prompt confidentiality. Assume the system prompt will eventually be extracted, and design it to be safe to disclose.
4. Content-safety guardrails: Configure harm-category safety filters explicitly at recommended severity thresholds for your workload. Do not rely on provider defaults, which may be permissive or change without notice. Multiple independent layers are required: provider-managed inference-time filters are one layer, and application-layer input and output checks are additional layers. Content filters are not a complete prompt-injection defence; they reduce the attack surface but cannot eliminate it.
5. Tool-use authorisation: Scope agent tool permissions to the minimum specific resources and actions required for the task. Validate all tool invocations server-side before execution; do not let the LLM's tool-call output run without an authorisation check at the application layer. Treat every tool invocation as if an untrusted caller had initiated it. This is the primary control against excessive agency (LLM06:2025) and the principal mitigation for the agentic AI blast-radius failure mode.
6. Rate limiting and quota management: Apply per-user or per-application token and request quotas to prevent unbounded consumption (LLM10:2025). Instrument token usage per caller and alert on abnormal consumption patterns that suggest token-farming, automated abuse, or runaway inference loops. Rate limits at the application API gateway layer provide an additional check independent of provider-level quotas, which are typically per-deployment rather than per-caller.
7. Prompt and completion logging with PII redaction: Log all model invocations with caller identity, timestamp, and request metadata for audit and anomaly detection. Redact PII from prompts and completions as a gate before any log write, not as a later post-processing step. Raw unredacted prompts in logs create a secondary exfiltration surface: a log storage misconfiguration or over-privileged analyst can access every user input sent to the model. The logging pipeline and the model invocation pipeline carry equal data-sensitivity risk.
8. Data-residency for embeddings: Confirm that the vector store and embedding compute are processed in the same geographic region as required by your primary data classification for the source documents. RAG pipelines move data across two additional processing stages (embedding generation and vector storage) beyond the model invocation itself, and each stage must satisfy the data-residency requirements that govern the source documents. Providers offer region-locked embedding endpoints; verify the configuration rather than accepting defaults.
9. PII redaction before model invocation: Strip or tokenise PII from user input before sending it to the model. This prevents PII from appearing in completions (LLM02:2025) and in prompt logs, and limits the data-sensitivity of the inference request itself. Reversible tokenisation (replacing PII with stable tokens before the prompt and substituting back in the response) allows PII-bearing applications to use managed model APIs without exposing PII to the model provider's inference infrastructure.

Common misconfigurations

These five patterns appear protective but weaken your GenAI security posture. Each has been observed in production environments.

Misconfiguration 1: Raw unredacted prompts in default logs

Enabling model invocation logging without a PII filter routes raw unredacted prompts to CloudWatch Logs, Log Analytics, or Cloud Audit Logs. Every user message, including any PII or sensitive context the user typed, is written verbatim to the log destination. Prompt logs become a secondary exfiltration surface with different access controls and retention policies than the primary application. A log-storage misconfiguration, an over-privileged analyst account, or a log-forwarding misconfiguration to a SIEM can expose user conversations at scale.

Remediation: Apply PII redaction before log storage, as a pre-write gate rather than a post-processing step. Configure the logging pipeline to tokenise or mask PII fields before any log record is written to a durable destination.

Misconfiguration 2: BLOCK_NONE safety filters

Setting harm-category safety filters to BLOCK_NONE to reduce false positives eliminates the provider-managed output moderation layer entirely. Any jailbreak, adversarial prompt, or unintentional harmful completion produces unfiltered output that is returned to the caller. Operators frequently set BLOCK_NONE during development to reduce iteration friction, then leave it in place in production. A single misconfigured deployment with BLOCK_NONE becomes the entry point for adversarial users who test filters systematically.

Remediation: Tune thresholds rather than disabling. BLOCK_MEDIUM_AND_ABOVE is the minimum recommended setting for regulated contexts. Maintain separate deployment configurations for development and production environments with different filter thresholds.

Misconfiguration 3: Shared API key or service account across environments

Using a single shared API key or shared service account across development, staging, and production environments means a compromised development credential carries production blast radius. Audit logs cannot distinguish per-workload or per-environment activity, breaking forensic traceability. A developer workstation with the shared API key stored in a dotfile or IDE config is a direct path to production model access.

Remediation: Use per-workload, per-environment managed identities or IAM roles, never shared credentials. Each environment (development, staging, production) must have distinct identities with distinct audit trails and distinct permission scopes.

Misconfiguration 4: Wildcard tool permissions on agents

Granting an AI agent wildcard tool permissions (s3:*, lambda:*, Contributor role) "for flexibility" enables any successful prompt injection on the agent's tool-use execution path to perform arbitrary destructive or exfiltrating actions. The agent is the attack amplifier: a single injected instruction turns the agent's granted permissions into the attacker's effective permissions. Configuring wildcard tool permissions on agents is the primary agentic AI failure mode identified in OWASP LLM Top 10:2025. Every penetration test of an agentic AI system with broad permissions finds this path exploitable.

Remediation: Scope execution roles to the minimum specific resources required for each tool. Validate tool invocations server-side before execution. Treat the LLM's tool-call output as untrusted input that must pass an authorisation check before any action is taken.

Misconfiguration 5: Disabling abuse monitoring human review without compensating controls

Applying for the Limited Access exemption to disable Microsoft's human review of flagged completions (citing privacy or latency concerns) without alternative detection controls removes a compensating layer that catches attack patterns automated classifiers miss. Abuse monitoring human review exists specifically to identify novel jailbreaks, prompt-injection campaigns, and policy-violation patterns before they are formalised into automated detectors. Removing it creates a detection gap during the interval between novel attack emergence and detector update.

Remediation: Keep default abuse monitoring enabled. If disabling human review is a documented regulatory requirement, implement Defender for Cloud AI workload alerts and a structured incident-review process as compensating controls, and document the risk acceptance formally. Do not disable without a compensating control in place.

OWASP LLM Top 10:2025 taxonomy

The OWASP LLM Top 10:2025 (published November 2024) supersedes the 2023 edition (v1.1). Provider pages in this guide map controls to stable LLMxx:2025 IDs. LLM07:2025 (System Prompt Leakage) and LLM08:2025 (Vector and Embedding Weaknesses) are new entries in the 2025 edition; they do not exist in the 2023 list. The 2023 entry for position 7 was "Insecure Plugin Design", a concept now subsumed by LLM06:2025 Excessive Agency, so that 2023-edition mapping is incorrect when applied to the 2025 edition. Always verify which edition a mapping cites before using it as an audit reference.

OWASP Top 10 for LLM Applications 2025: complete taxonomy
ID	Name	Brief description
LLM01:2025	Prompt Injection	Direct and indirect prompt injection attacks overriding or supplementing model instructions via user input or retrieved context.
LLM02:2025	Sensitive Information Disclosure	Model reproduces sensitive data, PII, credentials, or proprietary information from training memorisation or prompt context.
LLM03:2025	Supply Chain	Vulnerabilities introduced via third-party models, datasets, plugins, or dependencies in the model delivery and deployment chain.
LLM04:2025	Data and Model Poisoning	Malicious injection into training data, fine-tuning datasets, or model weights that corrupts behaviour for specific input patterns.
LLM05:2025	Improper Output Handling	Downstream components processing LLM output without adequate validation, enabling XSS, SSRF, code injection, or command execution.
LLM06:2025	Excessive Agency	LLM agents granted excessive permissions or autonomy executing unintended, destructive, or exfiltrating actions via tool calls.
LLM07:2025	System Prompt Leakage	Disclosure of confidential system prompt contents to users via adversarial extraction queries. New entry in 2025 edition.
LLM08:2025	Vector and Embedding Weaknesses	RAG pipeline manipulation via poisoned embeddings, adversarial retrieval, or vector-store access control failures. New entry in 2025 edition.
LLM09:2025	Misinformation	LLM generates plausible but factually false or misleading information with downstream security or compliance consequences.
LLM10:2025	Unbounded Consumption	Excessive resource use, denial of service, or cost-exhaustion attacks via token farming, runaway inference chains, or quota abuse.

Each provider page in this guide maps its controls to these IDs in the compliance table column labelled "OWASP LLM Top 10:2025". Source: OWASP Top 10 for LLM Applications 2025 (accessed 2026-05).

EU AI Act: provider vs. deployer obligations

The EU AI Act (Regulation (EU) 2024/1689) creates distinct obligations for cloud providers acting as general-purpose AI (GPAI) model providers and for enterprises that deploy AI APIs in their applications as deployers of high-risk AI systems. The enforcement timeline is staggered: the articles most relevant to cloud security hardening entered force at different dates spanning 2025 to 2026.

EU AI Act enforcement timeline: obligations relevant to managed GenAI API hardening
Obligation	Article	Who it applies to	In force date
GPAI model provider transparency: technical documentation, training-data summary, copyright policy	Art. 53 (in force 2025-08-02)	Cloud providers placing GPAI models on the EU market (AWS, Azure, GCP, OCI)	2025-08-02
GPAI systemic-risk provider controls: adversarial testing, systemic risk assessment, serious incident reporting, cybersecurity measures for models trained with >10²⁵ FLOPs	Art. 55 (in force 2025-08-02)	Major cloud providers whose foundation models qualify as systemic-risk GPAI	2025-08-02
High-risk AI deployer risk management: use per provider instructions, human oversight, input data management, impact assessment, incident monitoring, log retention ≥ 6 months	Art. 26 (in force 2026-08-02)	Enterprises deploying managed AI APIs in high-risk use cases (as defined in Annex III)	2026-08-02
High-risk AI system transparency to users: instructions for use enabling deployers to understand capabilities and limitations	Art. 13 (in force 2026-08-02)	Enterprises deploying in high-risk contexts; also obligations on providers to supply documentation	2026-08-02

Temporal qualification (important for audit use): As of this writing (2026-05), Art. 26 deployer obligations are NOT yet enforceable; the in force date is 2026-08-02. Art. 55 GPAI provider obligations ARE currently in force (since 2025-08-02). Do not cite Art. 26 as a current audit requirement; it is a future obligation. Provider compliance tables in this guide use the pattern Art. 55 (in force 2025-08-02) for controls where the cloud provider holds the Art. 55 obligation, and Art. 26 (in force 2026-08-02) where the enterprise deployer holds the obligation.

Provider page compliance table cells follow the pattern: Art. 55 (in force 2025-08-02) for controls where the cloud provider holds the Art. 55 GPAI obligation, and Art. 26 (in force 2026-08-02) where the deployer holds the obligation under Art. 26. Every Art. reference in compliance cells includes the enforcement-date qualifier. A bare "Art. 26" without a date qualifier does not satisfy the audit-precision standard used throughout this guide.

Reading the provider pages

The provider GenAI pages translate the principles on this page into auditable, provider-specific controls. Each control article identifies the threat it addresses (by LLMxx:2025 code), the severity of the gap it closes, and the remediation steps including CLI commands and IaC. Compliance table columns on provider pages follow the 10-column GenAI schema: the four CIS Foundations benchmarks (marked n/a (no dedicated CIS GenAI benchmark) as no CIS benchmark exists for managed LLM APIs as of 2026-05), NIST SP 800-53 rev5, ISO/IEC 27001:2022, ISO/IEC 27017:2015, OWASP LLM Top 10:2025, NIST AI 600-1 (Jul 2024), and EU AI Act (2024/1689).

Azure OpenAI Service Hardening is the Phase 13 pilot page and is currently live. It covers nine controls addressing all AZOPENAI-01 through AZOPENAI-09 requirements, including Entra ID authentication enforcement, Azure AI Content Safety Prompt Shields, content filter baseline configuration, private endpoint, RBAC least-privilege, diagnostic logging, customer-managed key encryption, quota and token rate limiting, and abuse monitoring configuration.

AWS Bedrock Hardening, GCP Vertex AI Hardening, and OCI Generative AI Hardening are forthcoming in Phase 14. Phase 14 will follow the same 10-column compliance table schema validated by the Azure pilot and will add cross-provider equivalence links between all four provider GenAI pages. During Phase 13, equivalence link placeholders appear as HTML comments in the provider page source; live hrefs will be added in the Phase 14 sealing wave once all anchor targets exist.

Each control on the provider pages carries an equivalence callout noting the analogous control on sibling provider pages. These callouts are populated progressively as provider pages are authored, and Phase 14 completes the full cross-provider equivalence map.