This page covers the Azure OpenAI Service API (also surfaced as "Azure OpenAI in Azure AI Foundry Models" in the portal). The Foundry developer portal experience is not in scope — this page addresses the API service authentication, content safety, network, and observability controls. API version referenced in examples: api-version=2024-10-01.
Controls are ordered severity-descending: two CRITICAL controls (authentication and prompt injection defence) appear first, followed by five HIGH controls, then two MEDIUM controls. Equivalence links to AWS Bedrock, GCP Vertex AI, and OCI Generative AI will be added in Phase 14 when those pages are authored.
Enforce Entra ID (managed identity) authentication and disable local API key authentication via disableLocalAuth: true. API keys are long-lived credentials susceptible to leakage in code repositories, CI pipelines, and application logs. Disabling them forces all callers to present an Entra ID token — enabling full caller attribution in audit logs, per-identity token rate-limiting, and Conditional Access enforcement. See azure-iam-06 — managed identity for the prerequisite managed-identity setup.
Remediation — Azure CLI
# Azure CLI 2.x
# Audit: find Azure OpenAI resources with local auth enabled
az cognitiveservices account list \
--query "[?kind=='OpenAI' && properties.disableLocalAuth!=true].{name:name, rg:resourceGroup}" \
--output table
# Remediate: disable local API key authentication
az cognitiveservices account update \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--custom-domain "${AOAI_ACCOUNT}" \
--api-properties '{"DisableLocalAuth": true}'
import * as pulumi from "@pulumi/pulumi";
import * as cs from "@pulumi/azure-native/cognitiveservices";
import * as resources from "@pulumi/azure-native/resources";
const rg = new resources.ResourceGroup("aoai-rg");
new cs.Account("aoai", {
resourceGroupName: rg.name,
kind: "OpenAI",
sku: { name: "S0" },
identity: { type: cs.ResourceIdentityType.SystemAssigned },
properties: {
customSubDomainName: "aoai-hardened",
disableLocalAuth: true, // Entra ID only — disables shared keys
publicNetworkAccess: "Disabled",
networkAcls: { defaultAction: "Deny" },
},
});
Compliance mapping
CIS AWS Foundations v3.0.0
CIS Microsoft Azure Foundations v3.0.0
CIS GCP Foundation v4.0.0
CIS OCI Foundation v2.0.0
NIST SP 800-53 rev5
ISO/IEC 27001:2022
ISO/IEC 27017:2015
OWASP LLM Top 10:2025
NIST AI 600-1 (Jul 2024)
EU AI Act (2024/1689)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
IA-2; IA-3; AC-17
A.5.17; A.8.5
CLD.6.3.1
LLM06:2025
Information Security
Art. 55 (in force 2025-08-02)
Log signals
AzureActivity Microsoft.CognitiveServices/accounts/write where the request body sets properties.disableLocalAuth = false — re-enables the long-lived API-key auth path on an Azure OpenAI account.
AzureActivity Microsoft.CognitiveServices/accounts/regenerateKey/action issued from a principal outside the documented operator group — key-issuance event without a matching ticket.
AzureDiagnostics ResourceProvider = "MICROSOFT.COGNITIVESERVICES" Category Audit showing authMethod = "ApiKey" on completions calls — runtime regression even when the account flag did not change.
Query
AzureActivity
| where ResourceId has "Microsoft.CognitiveServices/accounts"
| where OperationNameValue in ("Microsoft.CognitiveServices/accounts/write", "Microsoft.CognitiveServices/accounts/regenerateKey/action")
| extend body = tostring(parse_json(Properties).requestbody)
| where body has "\"disableLocalAuth\":false" or OperationNameValue endswith "/regenerateKey/action"
| project TimeGenerated, Caller, ResourceId, OperationNameValue, body
| order by TimeGenerated desc
| take 200
Run as a KQL query in Log Analytics. The API-key path provides no audit attribution for completions calls; persist as a Sentinel analytics rule with severity High and require a governance ticket for every key regenerate event.
Alert threshold
Any flip of disableLocalAuth to false on a production Azure OpenAI account — page on first occurrence.
Key regeneration by a principal outside the operator group — page; treat as preparation for credential theft.
Initial response
Reapply disableLocalAuth=true via the IaC baseline; rotate the regenerated key via az cognitiveservices account keys regenerate to invalidate any copy that left the operator session.
Walk AzureDiagnostics Audit log for completions calls during the exposure window — any call with authMethod = "ApiKey" is candidate unattributed usage and should be charged back to whichever workload should have used managed identity.
Escalate per general/ir.html — confirm Azure Policy Cognitive Services accounts should have local authentication methods disabled remains in deny mode.
Enable Azure AI Content Safety Prompt Shields for both direct user-prompt injection (jailbreak attempts) and document/RAG indirect injection (malicious instructions embedded in retrieved documents). Prompt Shields is a SEPARATE service from the Azure OpenAI content filter — it is an API endpoint in Azure AI Content Safety that detects injection attacks before the prompt reaches the model. Configure for both userPromptAnalysis (direct injection) and documentsAnalysis (indirect/RAG injection). Prompt Shields went GA in 2024.
Important architectural distinction: Prompt Shields (this control) and the Azure OpenAI content filter (azure-genai-02) are architecturally distinct: Prompt Shields detects injection attacks at the input layer; content filters moderate harm categories at the output layer. Both are required. Using only the content filter does not protect against prompt injection; using only Prompt Shields does not moderate harmful output.
Audit — Azure CLI
# Azure CLI 2.x
# Check current Content Safety configuration on the Azure OpenAI resource
az cognitiveservices account show \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--query "properties.contentSafetyConfig"
Submit via POST https://{endpoint}/contentsafety/text:shieldPrompt?api-version=2024-09-01 with your prompt and retrieved documents as the request body. Integrate Prompt Shields as a pre-flight check in your application before forwarding to the Azure OpenAI inference endpoint.
AzureActivity edits removing the jailbreak or indirect_attack Prompt Shield setting from an RAI policy assignment — disarms the prompt-injection defence layer.
AzureDiagnostics Category RequestResponse showing promptShieldResult.detected = true on inbound prompts followed by completions that nevertheless returned action-bearing tokens — possible bypass via prompt structure.
AzureDiagnostics contentFilterResults.jailbreak field flipping from filtered=true historical baseline to filtered=false on the same prompt signatures — coverage regression.
Query
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES" and Category == "RequestResponse"
| extend filterJB = tostring(parse_json(properties_s).contentFilterResults.jailbreak.filtered)
| extend filterIA = tostring(parse_json(properties_s).contentFilterResults.indirect_attack.filtered)
| where filterJB == "false" or filterIA == "false"
| project TimeGenerated, Resource, identity_claim_appid_g, filterJB, filterIA, properties_s
| order by TimeGenerated desc
| take 200
Run as a KQL query in Log Analytics. Prompt-shield disablement is rare and easily reverted; the more important signal is the runtime stream showing whether the shield actually fired on adversary-style prompts. Persist as a Sentinel analytics rule.
Alert threshold
Any RAI-policy edit that disables Prompt Shields — page on first occurrence.
Spike in jailbreak.filtered=false entries above the 30-day baseline — page; adversary may have discovered a bypass pattern.
Initial response
Reapply the Prompt Shields setting via the IaC pipeline; confirm the next RequestResponse batch shows the jailbreak shield active on adversary-style prompts.
Walk the prompt/response pairs that bypassed the shield — feed them to the content-safety adversarial-prompt corpus and rerun the Defender for AI evaluation suite.
Escalate per general/ir.html — confirm Microsoft Defender for AI workloads remains enabled on the subscription and that the Prompt Shields telemetry is routed into Sentinel.
Configure the Azure OpenAI content filter (RAI policy) with non-default severity thresholds for the four harm categories — Hate, Sexual, Self-harm, and Violence — applied to both prompt input and completion output. The default content filter is not an acceptable sole control; it must be explicitly configured at recommended or stricter thresholds and attached to each model deployment. RAI policy threshold configuration requires the Azure OpenAI REST API; az cognitiveservices account update does not support RAI policy threshold configuration (known CLI limitation — use REST API or Terraform azurerm_cognitive_account_rai_policy).
Anti-pattern: Setting any harm category to "annotate only" or disabling filters for "better response quality" is equivalent to BLOCK_NONE — the second of five common misconfigurations documented in General GenAI — Common Misconfigurations. All four harm categories must be set to block at recommended or higher thresholds for both input and output.
Audit — Azure CLI
# Azure CLI 2.x
# Locate the resource custom domain (needed for REST API calls)
az cognitiveservices account show \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--query "properties.customSubDomainName"
# Then retrieve current RAI policies via REST API:
# GET https://{endpoint}/openai/rai/policies?api-version=2024-10-01
Configuration — Azure OpenAI REST API (RAI policy)
Submit via POST /openai/rai/policies/{policyName}?api-version=2024-10-01. Then attach the policy to each deployment: PATCH /openai/deployments/{deploymentName}?api-version=2024-10-01 with body {"rai_policy_name": "recommended-baseline"}.
AzureActivity Microsoft.CognitiveServices/accounts/raiPolicies/write where the request body lowers a content-filter severity threshold (Hate, Sexual, SelfHarm, Violence) below the org baseline.
AzureActivity Microsoft.CognitiveServices/accounts/raiPolicies/delete on a policy that is the active assignment for a production deployment — falls back to the Microsoft default which is more permissive than the org floor.
AzureDiagnostics Category RequestResponse showing contentFilterResults where filtered=false on categories that the org baseline marks as block — downstream confirmation the change took effect.
Query
AzureActivity
| where OperationNameValue startswith "Microsoft.CognitiveServices/accounts/raiPolicies/"
| extend body = tostring(parse_json(Properties).requestbody)
| where OperationNameValue endswith "/delete" or body has "\"allowedContentLevel\":\"high\"" or body has "\"allowedContentLevel\":\"medium\""
| project TimeGenerated, Caller, ResourceId, OperationNameValue, body
| order by TimeGenerated desc
| take 200
Run as a KQL query in Log Analytics. RAI-policy edits should be ticket-bound and four-eyes-reviewed; persist as a Sentinel analytics rule with severity Medium and require the change ticket to be attached at the time of edit.
Alert threshold
Any RAI policy edit that loosens a severity threshold below the org baseline — page on first occurrence.
RAI policy delete that leaves a production deployment relying on the platform default — page; coverage of the content-filter floor has regressed.
Initial response
Reapply the org-baseline RAI policy via the IaC pipeline; confirm the next RequestResponse batch shows filtered=true on the baseline-blocked categories.
Walk the AzureDiagnostics RequestResponse stream for the exposure window for prompt/response pairs that match the loosened categories — high-volume completions on those categories warrant review.
Escalate per general/ir.html — confirm Azure Policy enforcing minimum RAI-policy severity floors remains assigned at the resource provider scope.
Deploy a Private Endpoint via Azure Private Link for the Azure OpenAI resource and disable public network access. Without this control, inference traffic (including prompts and completions) traverses the public internet. See azure-net-04 — private endpoint pattern for the general pattern; this control applies it specifically to the Azure OpenAI Cognitive Services resource.
Remediation — Azure CLI
# Azure CLI 2.x
# Disable public network access on the Azure OpenAI resource
az cognitiveservices account update \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--custom-domain "${AOAI_ACCOUNT}" \
--public-network-access Disabled
# Verify: confirm public access is disabled
az cognitiveservices account show \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--query "properties.publicNetworkAccess"
targetScope = 'resourceGroup'
@description('Azure OpenAI account resource ID to wrap with a private endpoint.')
param aoaiResourceId string
@description('Subnet resource ID hosting the private endpoint NIC.')
param subnetId string
param location string = resourceGroup().location
resource pe 'Microsoft.Network/privateEndpoints@2024-03-01' = {
name: 'pe-aoai'
location: location
properties: {
subnet: { id: subnetId }
privateLinkServiceConnections: [
{
name: 'aoai-link'
properties: {
privateLinkServiceId: aoaiResourceId
groupIds: ['account']
}
}
]
}
}
Compliance mapping
CIS AWS Foundations v3.0.0
CIS Microsoft Azure Foundations v3.0.0
CIS GCP Foundation v4.0.0
CIS OCI Foundation v2.0.0
NIST SP 800-53 rev5
ISO/IEC 27001:2022
ISO/IEC 27017:2015
OWASP LLM Top 10:2025
NIST AI 600-1 (Jul 2024)
EU AI Act (2024/1689)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
SC-7; AC-17
A.8.20; A.8.22
CLD.13.1.4
LLM10:2025
Information Security
Art. 55 (in force 2025-08-02)
Log signals
AzureActivity Microsoft.CognitiveServices/accounts/write where the request body sets publicNetworkAccess = "Enabled" on an Azure OpenAI account that was previously private-endpoint-only.
AzureActivity Microsoft.Network/privateEndpoints/delete on a Private Endpoint whose target is an Azure OpenAI account — silent fallback to public reachability if combined with the flag flip.
AzureDiagnostics RequestResponse showing client source-IP from public CIDRs — runtime confirmation that public reachability is being used.
Query
AzureActivity
| where ResourceId has "Microsoft.CognitiveServices/accounts"
| where OperationNameValue == "Microsoft.CognitiveServices/accounts/write"
| extend body = tostring(parse_json(Properties).requestbody)
| where body has "\"publicNetworkAccess\":\"Enabled\""
| project TimeGenerated, Caller, ResourceId, body
| order by TimeGenerated desc
| take 100
Run as a KQL query in Log Analytics. Pair with a Sentinel analytics rule that joins the AzureDiagnostics RequestResponse stream against the corporate egress CIDR list to flag completions calls from public networks.
Alert threshold
Any publicNetworkAccess flip to Enabled on a production Azure OpenAI account — page on first occurrence.
Completions call with a source-IP outside the corporate egress CIDR list — page; treat as adversary reaching the model from outside the trust boundary.
Initial response
Flip publicNetworkAccess back to Disabled via the IaC baseline; if the Private Endpoint was deleted, recreate it and confirm DNS resolution returns the Private Link address.
Walk the RequestResponse stream for the exposure window for completions issued from public-IP clients — any such call is candidate unauthorised inference and should be charged back to a documented workload or treated as compromise.
Escalate per general/ir.html — confirm Azure Policy Cognitive Services accounts should disable public network access remains in deny mode at the management-group root.
Assign the Cognitive Services OpenAI User role per-resource (not at subscription or resource-group scope) for application service identities. Use Cognitive Services OpenAI Contributor only for deployment management operations. Never assign the generic Contributor role at the resource scope for data-plane access — it grants management-plane rights that far exceed what inference workloads require. See azure-iam-03 — Privileged Identity Management for the general RBAC least-privilege pattern.
Audit — Azure CLI
# Azure CLI 2.x
# Find overly broad Cognitive Services OpenAI User assignments at subscription scope
az role assignment list \
--scope "/subscriptions/${SUBSCRIPTION_ID}" \
--query "[?roleDefinitionName=='Cognitive Services OpenAI User']" \
--output table
# Check correct per-resource assignments
az role assignment list \
--scope "/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RG}/providers/Microsoft.CognitiveServices/accounts/${AOAI_ACCOUNT}" \
--query "[?roleDefinitionName=='Cognitive Services OpenAI User']" \
--output table
Remediation — Terraform
# Terraform AzureRM provider ~> 3.0
# Assign Cognitive Services OpenAI User at the specific resource scope only
resource "azurerm_role_assignment" "aoai_user" {
scope = azurerm_cognitive_account.aoai.id
role_definition_name = "Cognitive Services OpenAI User"
principal_id = var.app_service_principal_id
}
Remediation — Bicep
targetScope = 'resourceGroup'
@description('Azure OpenAI account resource ID.')
param aoaiResourceId string
@description('Application principal that should call the model.')
param appPrincipalId string
// Cognitive Services OpenAI User (data-plane, no key management)
var openAiUserRoleId = '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd'
resource assign 'Microsoft.Authorization/roleAssignments@2024-04-01' = {
name: guid(aoaiResourceId, appPrincipalId, openAiUserRoleId)
scope: resourceGroup()
properties: {
principalId: appPrincipalId
principalType: 'ServicePrincipal'
roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', openAiUserRoleId)
}
}
Compliance mapping
CIS AWS Foundations v3.0.0
CIS Microsoft Azure Foundations v3.0.0
CIS GCP Foundation v4.0.0
CIS OCI Foundation v2.0.0
NIST SP 800-53 rev5
ISO/IEC 27001:2022
ISO/IEC 27017:2015
OWASP LLM Top 10:2025
NIST AI 600-1 (Jul 2024)
EU AI Act (2024/1689)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
AC-2; AC-6; IA-2
A.5.15; A.5.18
CLD.12.1.5
LLM06:2025; LLM08:2025
Information Security
Art. 55 (in force 2025-08-02)
Log signals
AzureActivity Microsoft.Authorization/roleAssignments/write granting Cognitive Services Contributor or Owner on an Azure OpenAI resource to a principal outside the documented platform-engineering group.
AzureActivity scope expansion that binds an existing operator role at a higher scope (subscription rather than resource group) — coverage creep on the inference plane.
AzureDiagnostics Category Audit showing data-plane Completions calls from a service principal that just acquired an elevated role binding — adversary exercising fresh privilege.
Query
AzureActivity
| where OperationNameValue == "Microsoft.Authorization/roleAssignments/write"
| where ResourceId has "Microsoft.CognitiveServices/accounts"
| extend body = tostring(parse_json(Properties).requestbody)
| where body has "Cognitive Services Contributor" or body has "Owner" or body has "OpenAI Contributor"
| project TimeGenerated, Caller, ResourceId, body
| order by TimeGenerated desc
| take 200
Run as a KQL query in Log Analytics. Role bindings on cognitive services accounts should be ticket-bound and four-eyes-reviewed; persist as a Sentinel analytics rule and pair with PIM-eligibility enforcement on the privileged roles.
Alert threshold
Privileged role-binding write to a non-platform-engineering principal — page on first occurrence.
Subscription-scope binding of Cognitive Services OpenAI User when the documented pattern is resource-group scope — page; treat as scope creep.
Initial response
Reverse the role assignment via the IaC baseline; capture the AzureActivity Caller and the requested role as the ledger.
Walk Audit-category data-plane logs for the new principal during the exposure window — any unattributed completions call is candidate misuse.
Escalate per general/ir.html — confirm Entra PIM eligibility configuration on the privileged Cognitive Services roles still requires MFA and ticket-bound activation.
Configure Diagnostic Settings to forward Azure OpenAI resource logs to a Log Analytics workspace. Enable the Audit and RequestResponse log categories. Resource logs are not enabled by default — explicit configuration is required. Apply PII redaction before log storage; do not store raw unredacted prompts (anti-feature #1 in General GenAI — Common Misconfigurations). See Azure Logging for the general diagnostic settings pattern.
AzureActivity Microsoft.Insights/diagnosticSettings/delete on a setting that exports the RequestResponse and Audit categories from an Azure OpenAI account — silences the prompt audit trail.
AzureActivity diagnostic-settings write events where the RequestResponse category is removed while only Audit remains — partial coverage erosion.
AzureDiagnostics ingestion gap exceeding 60 minutes on the RequestResponse category for an account with a steady baseline — absence-of-signal indicator.
Query
AzureActivity
| where ResourceId has "Microsoft.CognitiveServices/accounts"
| where OperationNameValue startswith "Microsoft.Insights/diagnosticSettings/"
| extend body = tostring(parse_json(Properties).requestbody)
| project TimeGenerated, Caller, ResourceId, OperationNameValue, body
| order by TimeGenerated desc
| take 200
Run as a KQL query in Log Analytics. Pair with a Heartbeat-style watchdog on the RequestResponse category — silence on a previously active account is itself the alert. Persist as a Sentinel analytics rule with severity High.
Alert threshold
Delete or RequestResponse-category drop on a diagnostic setting for a production Azure OpenAI account — page on first occurrence.
60-minute RequestResponse ingestion gap on an account with steady 30-day baseline — page; the prompt-audit truth source is dark.
Initial response
Reapply the diagnostic setting via Bicep/Terraform; confirm the next RequestResponse batch shows up in Log Analytics within 10 minutes.
If a Storage Account archive is also configured as destination, replay any RequestResponse records from the gap window for downstream Sentinel analytics correlation.
Escalate per general/ir.html — confirm Azure Policy Diagnostic logs in Cognitive Services accounts should be enabled remains in DeployIfNotExists mode at the management-group root.
Keep Azure OpenAI abuse monitoring enabled. Default-enabled is the secure baseline; the secure action is to not disable it. Microsoft's abuse monitoring applies automated classifiers and human review to flagged prompts and completions; disabling human review degrades detection accuracy for novel attack patterns and coordinated abuse campaigns. Applying for the Limited Access exemption to disable human review without formal risk acceptance and documented compensating controls is anti-feature #5 from General GenAI — Common Misconfigurations.
Anti-pattern: Disabling human review abuse monitoring without compensating controls is a known anti-pattern. If human review exemption is required for regulatory reasons (e.g., sensitive personal data in prompts), implement all three compensating controls: (1) Microsoft Defender for Cloud AI workload protection enabled; (2) extended log retention with SIEM ingestion of RequestResponse logs; (3) formal risk acceptance documented in your risk register with management sign-off.
Audit — Azure CLI
# Azure CLI 2.x
# Confirm abuse monitoring status — default is enabled
# There is no CLI command to "enable" abuse monitoring; it is the default state.
# This command confirms it has not been modified to a non-default configuration.
az cognitiveservices account show \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--query "properties.abuseMonitoring"
Note: If properties.abuseMonitoring returns null or is absent from the output, abuse monitoring is at its default-enabled state. A non-null value indicating human review is disabled is the misconfiguration to remediate.
Remediation — Bicep
targetScope = 'resourceGroup'
@description('Azure OpenAI account name. Abuse monitoring is ON by default; do NOT request the modified-abuse-monitoring exception unless legally required.')
param accountName string
resource aoai 'Microsoft.CognitiveServices/accounts@2024-10-01' existing = {
name: accountName
}
// No template knob disables abuse monitoring — its absence here is the control.
// Gate policy: alert if Microsoft.CognitiveServices/accounts/properties.userOwnedStorage is set
// (a signal that a modified-abuse-monitoring exception was filed without security review).
output abuseMonitoringEnforced bool = true
Compliance mapping
CIS AWS Foundations v3.0.0
CIS Microsoft Azure Foundations v3.0.0
CIS GCP Foundation v4.0.0
CIS OCI Foundation v2.0.0
NIST SP 800-53 rev5
ISO/IEC 27001:2022
ISO/IEC 27017:2015
OWASP LLM Top 10:2025
NIST AI 600-1 (Jul 2024)
EU AI Act (2024/1689)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
n/a (no dedicated CIS GenAI benchmark)
SI-4; AU-6
A.8.16; A.5.34
n/a
LLM10:2025
Information Integrity
Art. 55 (in force 2025-08-02)
Log signals
AzureActivity write events on the account where the org-applied abuse-monitoring-modification request (properties.abuseMonitoringDisabled = true via Microsoft engagement) is silently reapplied without a matching internal ticket — would silently disable Microsoft-side abuse review.
AzureDiagnostics Audit-category entries showing absence of Microsoft-internal abuse review timestamps over an extended window — coverage gap signal.
SecurityAlert entries from Defender for AI workloads citing high-volume outlier prompt patterns that should have triggered abuse-monitoring review.
Query
AzureActivity
| where ResourceId has "Microsoft.CognitiveServices/accounts"
| where OperationNameValue == "Microsoft.CognitiveServices/accounts/write"
| extend body = tostring(parse_json(Properties).requestbody)
| where body has "abuseMonitoring" or body has "modifiedAbuseMonitoring"
| project TimeGenerated, Caller, ResourceId, body
| order by TimeGenerated desc
| take 200
Run as a KQL query in Log Analytics. The Microsoft abuse-monitoring modification is exceptional and requires a Microsoft-engagement ticket; persist as a Sentinel analytics rule with severity High and require the engagement reference at the time of the change.
Alert threshold
Any account-level write touching the abuse-monitoring property without a corresponding Microsoft-engagement ticket reference in the change-management system — page on first occurrence.
Defender for AI workloads alert citing outlier prompt patterns on an account that has had abuse monitoring modified — page; treat as platform-layer escalation.
Initial response
Verify the Microsoft-engagement ticket and confirm the abuse-monitoring posture against the engagement scope; if no ticket exists, restore the default abuse-monitoring posture via Microsoft support.
Walk the AzureDiagnostics RequestResponse stream for the prior 30 days for any prompt pattern that would have prompted Microsoft-internal review under the default posture — flag for internal abuse-review.
Escalate per general/ir.html — confirm the abuse-monitoring posture is documented in the org's AI-services responsible-AI register.
Enable Customer-Managed Key (CMK) encryption via Azure Key Vault for the Azure OpenAI resource. CMK gives control over the encryption lifecycle — rotation, revocation, and independent audit of key usage — that Microsoft-managed keys do not provide. CMK is not the default; it requires explicit Key Vault reference configuration. Note regional and SKU availability constraints: verify availability for your deployment region and SKU tier at deployment time, as CMK support is not uniform across all Azure regions.
Remediation — Azure CLI
# Azure CLI 2.x
# Step 1: Ensure the resource has a system-assigned identity (required for Key Vault access)
az cognitiveservices account identity assign \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}"
# Step 2: Grant Key Vault access to the managed identity
# (Key Vault must have soft-delete and purge protection enabled)
az keyvault set-policy \
--name "${KEY_VAULT_NAME}" \
--object-id "${AOAI_IDENTITY_PRINCIPAL_ID}" \
--key-permissions get wrapKey unwrapKey
AzureActivity Microsoft.CognitiveServices/accounts/write where encryption.keySource flips from Microsoft.KeyVault back to Microsoft.CognitiveServices — silently regresses the CMK envelope for stored prompts and fine-tuning data.
AzureActivity Microsoft.KeyVault/vaults/keys/delete on the CMK referenced by the OpenAI account — downstream rewrap operations will fail and disrupt the model deployment.
AzureDiagnostics Key Vault AuditEvent showing absence of unwrap operations from the OpenAI account's managed identity over the prior 24h — signal that the key is no longer being exercised.
Query
AzureActivity
| where OperationNameValue in ("Microsoft.CognitiveServices/accounts/write", "Microsoft.KeyVault/vaults/keys/delete")
| extend body = tostring(parse_json(Properties).requestbody)
| where body has "\"keySource\":\"Microsoft.CognitiveServices\"" or OperationNameValue endswith "keys/delete"
| project TimeGenerated, Caller, ResourceId, OperationNameValue, body
| order by TimeGenerated desc
| take 100
Run as a KQL query in Log Analytics. CMK regressions on cognitive services accounts are rare and intentional; persist as a Sentinel analytics rule with severity High and require a governance ticket reference at the time of the change.
Alert threshold
Any flip of keySource back to Microsoft.CognitiveServices on a production account — page on first occurrence.
Key Vault key delete on a key referenced by an OpenAI account — page immediately; model deployment will lose access within minutes.
Initial response
Reapply CMK encryption via the IaC pipeline; if the key was deleted, attempt soft-delete recovery via az keyvault key recover.
Confirm the OpenAI account's managed identity retains Key Vault Crypto User RBAC on the source vault and that the vault firewall admits the account's outbound IP range or service tag.
Escalate per general/ir.html — confirm Azure Policy Cognitive Services accounts should use customer-managed key for encryption remains in deny mode.
Configure per-deployment TPM (tokens-per-minute) and RPM (requests-per-minute) quota limits in Azure OpenAI to bound resource consumption and mitigate model-DoS (LLM10:2025). For multi-tenant or high-volume applications, enforce token consumption limits at the Azure API Management layer using the azure-openai-token-limit policy (also known as llm-token-limit). Quota limits prevent a single deployment or caller from exhausting capacity that is shared across all users and applications on the Azure OpenAI resource.
Audit — Azure CLI
# Azure CLI 2.x
# Check current quota and rate limit settings for a deployment
az cognitiveservices account deployment show \
--name "${AOAI_ACCOUNT}" \
--resource-group "${RG}" \
--deployment-name "${DEPLOYMENT}" \
--query "{tpm:properties.rateLimits[?key=='token'].count | [0], rpm:properties.rateLimits[?key=='request'].count | [0]}"
# Update deployment quota (requires REST API for full TPM/RPM configuration)
# GET /openai/deployments/{deploymentName}?api-version=2024-10-01
# PATCH with {"sku": {"capacity": N}} where N = desired TPM / 1000
AzureActivity Microsoft.CognitiveServices/accounts/deployments/write where the request body raises scaleSettings.capacity beyond the documented business baseline — proxy for abuse-or-runaway-cost setup.
AzureDiagnostics Category RequestResponse showing a sustained rate of completions calls from a single appId exceeding the 30-day per-app rolling p99 baseline.
SecurityAlert entries from Defender for AI workloads citing quota saturation or runaway token consumption on a deployment.
Query
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES" and Category == "RequestResponse"
| extend appId = tostring(identity_claim_appid_g)
| summarize tokens=sum(toint(properties_s.responseMetadata.totalTokens)) by bin(TimeGenerated, 5m), appId, Resource
| join kind=leftouter (
AzureDiagnostics
| where Category == "RequestResponse"
| extend appId2 = tostring(identity_claim_appid_g)
| summarize p99=percentile(toint(properties_s.responseMetadata.totalTokens), 99) by appId2
) on $left.appId == $right.appId2
| where tokens > p99 * 3
| order by tokens desc
| take 200
Run as a KQL query in Log Analytics. Pair with a Sentinel analytics rule that fires on sustained 3x-p99 spikes — a single hot client is the canonical runaway-cost signal as well as a data-exfiltration-via-inference indicator.
Alert threshold
Sustained 3x-p99 token consumption from a single appId over 15 minutes — page; rate-limit the client at the Front Door/APIM layer before the budget alarm fires.
Deployment capacity raise beyond business baseline without a matching change ticket — page on the management-plane write itself.
Initial response
Apply rate-limit policy at the APIM facade or Front Door route for the offending appId; reduce deployment capacity back to baseline via az cognitiveservices account deployment update.
Walk the prompt content for the high-volume client — repetitive identical prompts suggest a runaway loop; varied prompts spanning the full document corpus suggest model-mediated exfiltration.
Escalate per general/ir.html — confirm Azure Policy enforcing maximum deployment capacity per environment remains in deny mode and that Defender for AI workloads remains enabled on the subscription.