This page covers Google Cloud Platform incident response — the surfaces, services, and pre-positioned controls that decide whether the organisation can detect, contain, investigate, and recover from a cloud security incident before the attacker achieves their objective. Scope is the commercial GCP regions; GCP Sovereign Cloud (formerly Assured Workloads and the Google Cloud Air-Gapped offering) inherits the same controls but exposes a different region table and constrains some services — re-verify region availability before applying any of the IaC below to a sovereign or air-gapped deployment. The IR lifecycle on this page is the one codified in NIST SP 800-61 Rev 3 — April 2025 release (accessed 2026-05), which restates the lifecycle as a CSF 2.0 community profile (Govern · Identify · Protect · Detect · Respond · Recover); the canonical lifecycle, evidence-handling, communications, and recovery framing live on the General Incident Response page (lifecycle, preparation, containment, forensics, communication, recovery / post-incident, tabletops). This page maps that lifecycle to the GCP surfaces an IR responder actually touches.
The GCP IR plane is the product of an organization (the root policy boundary where Cloud Identity tenants attach and where Security Command Center is activated), Cloud Identity (the directory plane and the identity-provider boundary that must remain reachable when the on-prem IdP federation is compromised — this is the architectural reason break-glass accounts are Cloud-Identity-only), Security Command Center (the posture, threat-detection, and finding-aggregation plane — Premium tier ships Event Threat Detection, Container Threat Detection, VM Threat Detection, and Anomaly Detection; Enterprise tier upgrades to multi-cloud CNAPP with Mandiant threat intelligence and case management), Pub/Sub (the asynchronous message bus that SCC notifications fan out to and that Cloud Functions / Eventarc subscribe to for playbook automation), Cloud Functions Gen 2 and Eventarc (the serverless automation surfaces that execute containment playbooks), Cloud Storage with Bucket Lock (the immutable evidence-preservation surface; LOCKED retention is the only retention mode that survives a compromised storage admin), BigQuery audit-log sinks (the analytical surface for SQL-based forensic queries against the corpus of Admin Activity and Data Access logs), Compute Engine snapshots (the disk-image preservation primitive for VM forensics), and the Workspace Admin SDK (the OAuth-token revocation and session-enumeration surface for compromised user-identity response). Severity is assigned from the methodology severity rubric; equivalence callouts at the bottom of each control point at the matching control on the AWS, Azure, and OCI sibling pages.
Three anti-conflation callouts up front, because each gets conflated in audit reports and architecture reviews and the distinction is load-bearing for how the corresponding control is designed.
First: break-glass (gcp-ir-01) is PREVENTIVE, not RESPONSIVE. The control is the pre-positioning that makes response possible — 2-4 emergency-access Cloud Identity accounts created on a quiet day, hardened with FIDO2 hardware security keys, excluded from Context-Aware Access and Workforce Identity Federation, monitored via SCC and log-based metric alerts on every sign-in, and access-tested quarterly. Creating break-glass during the incident that took out the Workforce Identity Federation or the on-prem IdP is structurally impossible — the entire reason break-glass exists is that the normal sign-in path has failed. This typing mirrors the equivalent decision on Phase 6 aws-ir-01-break-glass-account and Phase 7 azure-ir-01-emergency-access and is locked across all three providers.
Second: forensic-evidence storage (gcp-ir-03) uses Cloud Storage Bucket Lock with is_locked = true. A LOCKED retention policy cannot be reduced or removed, even by organization admins. Without Bucket Lock, an attacker who compromised the storage admin role on the security project would also have the authority to shorten or remove the retention policy and then overwrite or delete the evidence. The exact analog of this decision is Phase 6 aws-ir-03 using S3 Object Lock in Compliance mode (not Governance: Governance can be bypassed with s3:BypassGovernanceRetention, which a sufficiently privileged attacker acquires) and Phase 7 azure-ir-03 using Immutable Blob storage in Locked mode (not Unlocked: Unlocked is subscription-owner-bypassable). The control across all three providers is "the retention policy survives the same attacker who compromised the storage admin"; for Cloud Storage Bucket Lock that means a retention period of 31557600 seconds (365.25 days) with is_locked = true, written in Terraform as retention_policy { is_locked = true, retention_period = 31557600 }, applied at bucket-creation time and locked once verified.
Third: tabletop exercises (gcp-ir-07) are PREVENTIVE, not RESPONSIVE. The value of a quarterly tabletop is preventing runbook decay before the next incident — runbooks written and never re-exercised are, in practice, runbooks that do not work when they are needed (the modal failure of all written IR procedures). Each exercise that surfaces a wrong, missing, or unexecutable runbook step is tracked as a finding against the runbook repository and remediated before the next quarter. The PREVENTIVE typing is locked across Phases 6 (aws-ir-07), 7 (azure-ir-07), and 8 (this control) per the methodology rubric and PITFALL B-14 (preventive controls stop bad states from arising; tabletops stop runbook decay).
Order matters. Control 01 is the pre-positioned identity that survives a compromised IdP. Control 02 is the automation pipeline that contains in seconds rather than the minutes a human on-call would take. Control 03 is the evidence-preservation surface that survives the storage-admin compromise. Control 04 is the SQL-driven forensic-query workflow that lets an analyst pivot across hundreds of millions of audit events. Controls 05–06 are the playbook runbooks for the two most common single-resource compromise scenarios (a VM and a service-account credential). Control 07 is the anti-decay loop that keeps every prior runbook executable. Cross-link to General IR — preparation for the lifecycle framing this ordering reflects.
gcp-ir-01-break-glass · CRITICAL · PREVENTIVE
Provision two to four break-glass Cloud Identity super-admin accounts that exist outside the normal Workforce Identity Federation / Cloud Identity-Google-Workspace synchronisation path. These accounts are created directly in the Cloud Identity tenant (not synced from an on-prem IdP via Google Cloud Directory Sync), excluded from every Context-Aware Access binding and Workforce Identity Federation pool, hardened with FIDO2 hardware security keys (no SMS, no TOTP authenticator apps — Google's 2024 Advanced Protection Program guidance and the broader phishing-resistant-MFA consensus), stored in dual-control physical safes, and instrumented with Security Command Center notifications plus a Cloud Logging log-based metric on every sign-in event (Google Cloud — Best practices for planning accounts and organizations (accessed 2026-05)). The accounts must be access-tested quarterly: every quarter, one named responder signs in, demonstrates the credential still works, and documents the test in the IR runbook repository. This is PREVENTIVE not RESPONSIVE because the control is the pre-positioning that makes response possible — break-glass cannot be created during the incident that took out the IdP. Cross-link to General IR — preparation and gcp-iam-02 for the Phase 5 zero-tolerance baseline on long-lived credentials that this control deliberately exempts itself from.
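The quarterly access-test requirement above can be checked mechanically. The following is a minimal sketch, assuming the test dates are recorded somewhere machine-readable in the IR runbook repository; the function name, the dictionary layout, and the 92-day reading of "quarterly" are illustrative assumptions, not part of any Google API.

```python
from datetime import date

# Assumption: "quarterly" is read as "no more than 92 days between tests".
MAX_TEST_AGE_DAYS = 92


def overdue_accounts(last_tested: dict, today: date) -> list:
    """Return break-glass accounts whose last documented test is too old."""
    return sorted(
        account
        for account, tested_on in last_tested.items()
        if (today - tested_on).days > MAX_TEST_AGE_DAYS
    )


if __name__ == "__main__":
    # Hypothetical test log, as it might be recorded in the runbook repository.
    log = {
        "breakglass-01@example.com": date(2026, 1, 1),
        "breakglass-02@example.com": date(2026, 4, 1),
    }
    print(overdue_accounts(log, date(2026, 5, 1)))
```

A check like this could run in CI against the runbook repository so that a missed quarterly test surfaces as a failing job rather than being discovered during an incident.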
Remediation — gcloud CLI
# gcloud CLI (latest stable); user creation itself needs the Workspace Admin Console or Directory API
# Step 1: create the break-glass super-admin user directly in Cloud Identity.
# gcloud cannot create Cloud Identity users, so a Workspace super-admin does this
# in the Admin Console UI or through the Directory API (users.insert).
# Afterwards, confirm the new accounts are members of the break-glass group:
gcloud identity groups memberships list \
  --group-email=breakglass-admins@example.com \
  --format='value(preferredMemberKey.id)'
# Step 2: assign Organization Administrator role to the break-glass account.
gcloud organizations add-iam-policy-binding ORG_ID \
--member='user:breakglass-01@example.com' \
--role='roles/resourcemanager.organizationAdmin'
# Step 3: enforce 2-Step Verification with security keys only for the
# break-glass OU. Done via Workspace Admin Console (Security > 2-Step
# Verification > Enforcement > Security Keys Only).
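# Step 4: create the log-based metric that fires on break-glass activity.
# This filter matches Admin Activity audit entries written by the accounts. Interactive
# sign-ins are recorded in the Workspace login audit log, which reaches Cloud Logging
# only when Workspace data sharing with Google Cloud is enabled.
gcloud logging metrics create breakglass-signin \
  --description='Sign-in event for any break-glass Cloud Identity account' \
  --log-filter='logName="organizations/ORG_ID/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.authenticationInfo.principalEmail=("breakglass-01@example.com" OR "breakglass-02@example.com")'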
# Step 5: route the metric to a Cloud Monitoring alert policy that pages the
# on-call (Pub/Sub topic subscribed by the PagerDuty integration).
gcloud alpha monitoring policies create \
  --notification-channels=projects/PROJECT_ID/notificationChannels/PD_CHANNEL_ID \
  --display-name='Break-glass account sign-in detected' \
  --combiner=OR \
  --condition-display-name='Any break-glass sign-in' \
  --condition-filter='metric.type="logging.googleapis.com/user/breakglass-signin" AND resource.type="global"' \
  --if='> 0' \
  --duration=0s
Remediation — Terraform
# Terraform Google provider ~> 5.0
# Source: Google Cloud break-glass + Cloud Identity docs (accessed 2026-05)
# Note: Cloud Identity user creation is not directly supported by the google
# provider; the user object itself is created via the Workspace Admin Console.
# Terraform manages the IAM bindings, log-based metric, and alert policy.
resource "google_organization_iam_member" "breakglass_01_org_admin" {
org_id = var.org_id
role = "roles/resourcemanager.organizationAdmin"
member = "user:breakglass-01@example.com"
}
resource "google_organization_iam_member" "breakglass_02_org_admin" {
org_id = var.org_id
role = "roles/resourcemanager.organizationAdmin"
member = "user:breakglass-02@example.com"
}
resource "google_logging_metric" "breakglass_signin" {
name = "breakglass-signin"
description = "Sign-in event for any break-glass Cloud Identity account"
filter = <<-EOT
logName="organizations/${var.org_id}/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.authenticationInfo.principalEmail=("breakglass-01@example.com" OR "breakglass-02@example.com")
EOT
metric_descriptor {
metric_kind = "DELTA"
value_type = "INT64"
}
}
resource "google_monitoring_alert_policy" "breakglass_signin_alert" {
display_name = "Break-glass account sign-in detected"
combiner = "OR"
notification_channels = [var.pagerduty_channel_id]
conditions {
display_name = "Any break-glass sign-in"
condition_threshold {
filter = "metric.type=\"logging.googleapis.com/user/${google_logging_metric.breakglass_signin.name}\" AND resource.type=\"global\""
comparison = "COMPARISON_GT"
threshold_value = 0
duration = "0s"
}
}
}
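// Pulumi (TypeScript), @pulumi/gcp: the same org-level binding as the Terraform above.
import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";
// orgId is read from stack configuration (for example: pulumi config set orgId ORG_ID).
const orgId = new pulumi.Config().require("orgId");
// One binding per break-glass account. Each account belongs to a single named human,
// is MFA-mandatory, and sits outside the day-to-day admin path.
// Alert on EVERY use of this binding via Cloud Logging.
const breakGlass = new gcp.organizations.IAMMember("break-glass-org-admin", {
    orgId: orgId,
    role: "roles/resourcemanager.organizationAdmin",
    member: "user:breakglass-01@example.com",
});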
Compliance mapping
CIS AWS Foundations v3.0.0: n/a
CIS Microsoft Azure Foundations v3.0.0: n/a
CIS GCP Foundation v4.0.0: (Cloud Identity emergency-access docs)
CIS OCI Foundation v2.0.0: n/a
NIST SP 800-53 rev5: IR-4; AC-2(8); AC-6
ISO/IEC 27001:2022: A.5.24; A.5.26
ISO/IEC 27017:2015: CLD.9.5.1
Log signals
Cloud Audit Logs SetIamPolicy events binding a break-glass principal (breakglass-01@, breakglass-02@) to roles/owner or roles/iam.securityAdmin on any project or folder.
Workspace sign-in audit feed showing sign-ins to the break-glass account from any location — the account should sit dormant outside declared incidents.
Cloud Identity password-reset / 2SV-enrol events on the break-glass user; both indicate someone is actively preparing to use the account.
Query
logName=~"organizations/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND ((protoPayload.methodName="SetIamPolicy"
  AND protoPayload.serviceData.policyDelta.bindingDeltas.member=~"user:breakglass-.*")
OR (protoPayload.serviceName="admin.googleapis.com"
  AND protoPayload.authenticationInfo.principalEmail=~"breakglass-.*"))
Pair this Cloud Logging filter with a Cloud Monitoring alert that routes to multiple notification channels (SMS, voice, on-call manager email) so a single channel failure cannot suppress the page; the break-glass account is a high-confidence signal.
Alert threshold
Page immediately on any sign-in to the break-glass account or any IAM binding involving its principal; there is no acceptable rate of background use.
Page on any Workspace admin event mutating the break-glass account's 2SV or password posture.
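The zero-tolerance threshold above amounts to a simple predicate over audit-log entries. The sketch below assumes entries have already been parsed into dictionaries shaped like Cloud Audit Logs JSON and that the accounts follow the breakglass-NN@example.com naming used in the remediation snippets; the function name is hypothetical.

```python
import re

# Assumption: break-glass accounts follow the breakglass-NN@example.com naming
# used in the remediation snippets on this page.
BREAKGLASS = re.compile(r"^breakglass-\d+@example\.com$")


def should_page(entry: dict) -> bool:
    """True if an audit-log entry (as a dict) involves a break-glass account."""
    payload = entry.get("protoPayload", {})
    actor = payload.get("authenticationInfo", {}).get("principalEmail", "")
    if BREAKGLASS.match(actor):
        return True  # the account itself made the call
    deltas = (
        payload.get("serviceData", {})
        .get("policyDelta", {})
        .get("bindingDeltas", [])
    )
    # Any IAM binding change that names the account as member also pages.
    return any(
        BREAKGLASS.match(delta.get("member", "").removeprefix("user:"))
        for delta in deltas
    )
```

There is deliberately no rate or count in the predicate: a single matching entry is enough to page.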
Initial response
Confirm the on-call engineer initiated the use via the documented incident channel; if not, suspend the account in Workspace and revoke all OAuth tokens via the Directory API.
Audit every API call made under the break-glass principal during the active window; the principal should produce a tightly bounded action set documented in the incident timeline.
Re-seal the break-glass account post-incident: rotate the password into the offline sealed envelope, re-enrol the FIDO2 key, and revoke any IAM bindings created during the window.
Configure Security Command Center to export findings to a dedicated Pub/Sub topic in the security-operations project; subscribe a Cloud Functions (Gen 2) function — or an Eventarc-triggered Cloud Run service — that executes the auto-containment playbook for high-severity finding categories (cryptomining detection, privilege-escalation detection, exfiltration detection, malware-on-VM detection). The playbook canonical steps: (1) snapshot the implicated disks via gcloud compute snapshots create; (2) swap the VM's firewall tags so it lands in the pre-deployed quarantine network policy (deny-all egress, ingress only from named IR analyst IPs); (3) disable the implicated service account via gcloud iam service-accounts disable; (4) emit a structured event to the IR PagerDuty Pub/Sub topic with the finding payload attached for the human on-call (Google Cloud — SCC notifications documentation (accessed 2026-05)). Pub/Sub subscription filter narrows the playbook scope to the categories that have well-tested auto-containment recipes (category=("CRYPTOMINING" OR "PRIVILEGE_ESCALATION" OR "EXFILTRATION")); other categories page the on-call without auto-containment. Same-phase STRICT pair-control: SCC threat-detection itself (the enablement of Event Threat Detection, Container Threat Detection, VM Threat Detection) is owned by gcp-log-04-scc-premium — this IR control covers the response pipeline that consumes those findings.
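The routing decision described above (auto-contain a short list of categories, page a human for everything else) can be sketched as a pure function. This is an illustration only: it assumes the Pub/Sub message data is base64-encoded JSON with the finding under a "finding" key, it uses the category names exactly as the text writes them, and the step names are placeholders for real implementations. The helper encode exists purely for local experiments.

```python
import base64
import json

# Categories with a tested auto-containment recipe, as named in the text above.
AUTO_CONTAIN = {"CRYPTOMINING", "PRIVILEGE_ESCALATION", "EXFILTRATION"}

# Ordered playbook steps; the names are placeholders for real implementations.
CONTAINMENT_STEPS = [
    "snapshot_disks",
    "apply_quarantine_tags",
    "disable_service_account",
    "page_on_call",
]


def plan_response(pubsub_data_b64: str) -> list:
    """Decode one finding notification and return the steps to run, in order."""
    message = json.loads(base64.b64decode(pubsub_data_b64))
    finding = message.get("finding", {})
    if finding.get("state") == "ACTIVE" and finding.get("category") in AUTO_CONTAIN:
        return list(CONTAINMENT_STEPS)
    # Everything else goes to a human without automated containment.
    return ["page_on_call"]


def encode(finding: dict) -> str:
    """Local helper: wrap a finding the way the function would receive it."""
    return base64.b64encode(json.dumps({"finding": finding}).encode()).decode()
```

Keeping the decision separate from the side effects makes the playbook testable without touching any project: the function handler would call plan_response and then dispatch each step.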
Remediation — gcloud CLI
# gcloud CLI (latest stable)
# Step 1: create the SCC notification config that exports findings to Pub/Sub.
gcloud pubsub topics create scc-findings-prod \
--project=security-ops-prod
gcloud scc notifications create scc-notif-cryptomining \
--organization=ORG_ID \
--pubsub-topic=projects/security-ops-prod/topics/scc-findings-prod \
--filter='state="ACTIVE" AND severity="CRITICAL" AND category="Mining: Bitcoin Pool"'
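# Step 2: create the Pub/Sub subscription that filters categories with playbooks.
# Pub/Sub filters match message attributes only, never the message body, so this
# assumes each published message carries a category attribute.
gcloud pubsub subscriptions create scc-findings-playbook-sub \
  --project=security-ops-prod \
  --topic=projects/security-ops-prod/topics/scc-findings-prod \
  --message-filter='attributes.category = "CRYPTOMINING" OR attributes.category = "PRIVILEGE_ESCALATION" OR attributes.category = "EXFILTRATION"'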
# Step 3: deploy the Gen 2 Cloud Function that runs the containment playbook.
gcloud functions deploy scc-auto-containment \
--project=security-ops-prod \
--region=europe-west1 \
--gen2 \
--runtime=python312 \
--source=./containment-playbook \
--entry-point=handle_finding \
--trigger-topic=scc-findings-prod \
--service-account=scc-containment-sa@security-ops-prod.iam.gserviceaccount.com
# Step 4: grant the containment SA the precise IAM needed in target projects.
gcloud organizations add-iam-policy-binding ORG_ID \
--member='serviceAccount:scc-containment-sa@security-ops-prod.iam.gserviceaccount.com' \
--role='roles/iam.serviceAccountAdmin' \
--condition='expression=resource.name.startsWith("projects/svc-"),title=svc-projects-only'
Log signals
Cloud Audit Logs on cloudfunctions.googleapis.com for functions.delete targeting the IR-automation function bound to the SCC findings Pub/Sub topic.
Function-update events changing the entry point or the Pub/Sub trigger topic to a non-SCC source — silent re-pointing of the responder.
Pub/Sub subscription IAM mutations removing the function's roles/pubsub.subscriber binding — disconnects the fanout without deleting either side.
Query
logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND ((protoPayload.serviceName="cloudfunctions.googleapis.com"
  AND protoPayload.methodName=~"(Delete|Update)Function$"
  AND protoPayload.resourceName=~".*scc-auto-containment.*")
OR (protoPayload.serviceName="pubsub.googleapis.com"
  AND protoPayload.methodName=~"SetIamPolicy$"
  AND protoPayload.resourceName=~".*scc-findings.*"))
This Cloud Logging filter watches the responder function's lifecycle and the Pub/Sub binding that ties it to SCC findings; pair with a Cloud Monitoring synthetic check that publishes a test finding every hour to verify end-to-end responder activation.
Alert threshold
Page on any delete or update of the IR-responder function or any IAM mutation on its Pub/Sub subscription.
Page on the synthetic finding failing to invoke the responder for two consecutive hourly checks.
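The "two consecutive hourly checks" rule is easy to get subtly wrong (for example by paging on any two failures in a day). A minimal sketch of the intended logic, with a hypothetical function name and a plain list of booleans standing in for the stored check history:

```python
def should_page_on_synthetic(results: list, required_failures: int = 2) -> bool:
    """results holds hourly outcomes, oldest first; True means the responder ran."""
    if len(results) < required_failures:
        return False  # not enough history to call it an outage
    # Page only when the most recent N checks all failed.
    return not any(results[-required_failures:])
```

With this rule a single missed invocation is tolerated, while two misses in a row page the on-call.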
Initial response
Restore the function from the captured Terraform state via terraform apply; re-bind the Pub/Sub subscriber role; verify the next synthetic finding invokes the responder.
Backfill any SCC findings raised during the responder outage by republishing them to the scc-findings-prod topic via gcloud pubsub topics publish once the subscription is recovered.
Pin responder code + IAM bindings in Terraform; add a Cloud Asset Inventory feed on the function and topic so future delete or unbind events fire via an independent channel.
Provision a dedicated Cloud Storage bucket in a dedicated forensic-evidence-prod project (sibling to the security-operations project, with its own IAM perimeter) for incident-evidence preservation. The bucket carries a Cloud Storage Bucket Lock retention policy of 31557600 seconds, which is one year counted as 365.25 days (in Terraform: retention_policy { is_locked = true, retention_period = 31557600 }). The policy is set when the bucket is created and locked as soon as it has been verified; from then on the period can be increased but never reduced or removed for the lifetime of the bucket (Google Cloud, Bucket Lock documentation (accessed 2026-05)). LOCKED retention is the only retention mode that survives a compromised storage admin: once the policy is locked, no principal, including organization admins and the project owner, can reduce or remove it; the only way to free the objects is to wait out the retention period. This is the precise analog of aws-ir-03 using S3 Object Lock in Compliance mode (Compliance, not Governance: Governance can be bypassed with s3:BypassGovernanceRetention, which a sufficiently privileged attacker acquires) and azure-ir-03 using Immutable Blob storage in Locked mode (Locked, not Unlocked: Unlocked is subscription-owner-bypassable). Layer customer-managed encryption keys (CMEK) via Cloud KMS on the bucket so evidence-at-rest is bound to the same key-management perimeter as the production-data CMEK chain; tag every uploaded object with chain-of-custody metadata (incident ID, uploader principal, SHA-256 hash, ingest timestamp). The CRITICAL rating reflects that evidence destroyed during the incident is irrecoverable: no after-the-fact compensating control exists.
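The chain-of-custody metadata named above (incident ID, uploader principal, SHA-256 hash, ingest timestamp) can be computed before upload. The following is a sketch under assumptions: the metadata key names and the function name are illustrative choices, and the caller is expected to pass a timezone-aware timestamp or let the function take the current UTC time.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def custody_metadata(evidence: bytes, incident_id: str, uploader: str,
                     ingested_at: Optional[datetime] = None) -> dict:
    """Build the chain-of-custody fields for one evidence object."""
    # ingested_at should be timezone-aware; default to "now" in UTC.
    ts = ingested_at or datetime.now(timezone.utc)
    return {
        "incident-id": incident_id.lower(),
        "uploader": uploader,
        "sha256": hashlib.sha256(evidence).hexdigest(),
        "ingested-at": ts.astimezone(timezone.utc).isoformat(),
    }
```

The resulting key/value pairs could then be attached as custom object metadata at upload time, so the hash recorded at ingest can later be compared against a fresh hash of the stored object.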
Remediation — gcloud CLI
# gcloud CLI (latest stable)
# Step 1: create the dedicated forensic-evidence project under the security folder.
gcloud projects create forensic-evidence-prod \
--folder=FOLDER_ID_SECURITY \
--name='Forensic Evidence (production)'
# Step 2: create the bucket with CMEK + uniform bucket-level access.
gcloud storage buckets create gs://forensic-evidence-prod \
--project=forensic-evidence-prod \
--location=europe-west1 \
--uniform-bucket-level-access \
--default-encryption-key='projects/security-kms-prod/locations/europe-west1/keyRings/forensic/cryptoKeys/forensic-cmek' \
--public-access-prevention
# Step 3: set the retention policy to 1 year (31_557_600 seconds).
gcloud storage buckets update gs://forensic-evidence-prod \
--retention-period=31557600s
# Step 4: lock the retention policy. THIS IS IRREVERSIBLE.
# Once locked, the retention period can only be INCREASED, never reduced or removed.
gcloud storage buckets update gs://forensic-evidence-prod \
--lock-retention-period
# Step 5: bind the IR team service account at object-admin scope.
gcloud storage buckets add-iam-policy-binding gs://forensic-evidence-prod \
--member='serviceAccount:ir-team-sa@security-ops-prod.iam.gserviceaccount.com' \
--role='roles/storage.objectAdmin'
# Step 6: bind external IR partners at object-viewer scope for read-only handoff.
gcloud storage buckets add-iam-policy-binding gs://forensic-evidence-prod \
--member='group:external-ir-partners@example.com' \
--role='roles/storage.objectViewer'
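Two details of the steps above are worth pinning down: where 31557600 comes from, and what "irreversible" permits afterwards. The sketch below models only the rule stated in the comments (a locked period can be increased, never reduced or removed); the function name is hypothetical and the code does not call any Cloud Storage API.

```python
from typing import Optional

SECONDS_PER_DAY = 86_400
# One year counted as 365.25 days, the value used throughout this page.
ONE_YEAR_SECONDS = int(365.25 * SECONDS_PER_DAY)


def retention_change_allowed(current: int, proposed: Optional[int],
                             locked: bool) -> bool:
    """Model the lock rule: proposed=None means removing the policy."""
    if not locked:
        return True  # unlocked policies can be changed freely
    if proposed is None:
        return False  # a locked policy cannot be removed
    return proposed >= current  # a locked period can only grow
```

A pre-flight check like this in change tooling makes the asymmetry explicit: lengthening the period is a normal change, while any shortening request on the locked bucket should be treated as a signal rather than a ticket.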
Log signals
Cloud Audit Logs on storage.googleapis.com for storage.buckets.update reducing the forensic bucket's retentionPolicy.retentionPeriod or disabling Object Versioning.
Bucket-lock state transitions: the forensic bucket's retention policy should be locked, and any buckets.lockRetentionPolicy reversal attempt produces a denial that is still worth reviewing.
Bucket-IAM mutations on the forensic bucket adding any storage.objects.delete-capable role to a non-incident principal.
Query
logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.serviceName="storage.googleapis.com"
AND protoPayload.resourceName=~".*buckets/forensic-evidence.*"
AND (protoPayload.methodName="storage.buckets.update"
OR protoPayload.methodName="storage.setIamPermissions")
Run this Cloud Logging filter at project scope on the forensic-bucket project; pair with a Cloud Asset Inventory feed so retention-policy + IAM-state drift surface in real time, independent of audit-log delivery.
Alert threshold
Page on any mutation to the forensic bucket's retention policy, versioning, or IAM bindings.
Page on any object-delete attempt against the forensic bucket; with the retention lock the delete should be denied, but the attempt itself is signal.
Initial response
If the retention policy is not yet locked, restore the original retention period and apply the lock via gcloud storage buckets update --lock-retention-period; locked policies cannot be shortened.
Revoke unauthorised IAM bindings; if any object was deleted but is still present in object-versioning history, restore the noncurrent version by copying it back over the live object with gcloud storage cp gs://bucket/object#GENERATION gs://bucket/object (the legacy gsutil cp accepts the same generation-qualified URL).
Audit the principal that issued the mutation for additional forensic-scope tampering attempts; treat as candidate post-breach cover-up activity.
Use the BigQuery audit-log dataset created by the aggregated Cloud Logging sink to run SQL-based forensic queries across the corpus of Admin Activity, Data Access, System Event, and Policy Denied logs. The sink itself (partitioned table layout, 2-year retention via partition expiration, CMEK encryption through Cloud KMS) is owned by gcp-log-08-bigquery-audit-sink; this IR control covers the forensic-query workflow that consumes the dataset. Maintain a saved-query library, implemented as google_bigquery_routine resources, covering canonical hunts: "all IAM role grants in the last 30 days", "all service-account-key creation events in the last 90 days", "all Cloud Storage storage.objects.list on the forensic-evidence bucket", "all Pub/Sub pubsub.subscriptions.create on the SCC findings topic", "all setIamPolicy calls by principals outside the security operations group" (Google Cloud, Cloud Audit Logs best practices (accessed 2026-05)). Partitioning by timestamp day and clustering by protoPayload.serviceName keeps the canonical hunts to scans of a few GB; analysts pivot from one finding to the next without leaving the BigQuery console. Same-phase STRICT pair-control: the sink configuration itself lives at gcp-log-08-bigquery-audit-sink, so author the sink there and the saved-query library here.
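Once a hunt such as "all IAM role grants in the last 30 days" returns rows, the first pivot is usually a per-actor summary. The sketch below assumes each row has been loaded as a dictionary with an actor field and a deltas list whose items carry action, role, and member, mirroring the shape of IAM binding deltas; the function name and row layout are illustrative.

```python
from collections import Counter


def summarize_grants(rows: list) -> Counter:
    """Count (actor, role) pairs for binding deltas that ADD a member."""
    counts = Counter()
    for row in rows:
        for delta in row.get("deltas") or []:
            if delta.get("action") == "ADD":
                counts[(row.get("actor"), delta.get("role"))] += 1
    return counts
```

Sorting the result with most_common() puts the actors who granted the most roles at the top, which is typically where an analyst looks first.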
Remediation — gcloud CLI
# gcloud CLI + bq (latest stable)
# Step 1: verify the audit-log dataset exists (created by gcp-log-08 sink).
bq ls --project_id=security-logs-prod | grep cloudaudit_googleapis_com_
# Step 2: run a canonical forensic query: all IAM role grants in the last 30 days
# by a specific principal across every project.
# In the BigQuery export the protoPayload column is named protopayload_auditlog,
# and the partitioned sink writes a single table partitioned by day on timestamp.
bq query --use_legacy_sql=false --project_id=security-logs-prod \
  --max_rows=10000 \
  'SELECT
     timestamp,
     protopayload_auditlog.authenticationInfo.principalEmail AS actor,
     resource.labels.project_id AS project,
     protopayload_auditlog.methodName AS method,
     protopayload_auditlog.servicedata_v1_iam.policyDelta.bindingDeltas AS deltas
   FROM
     `security-logs-prod.cloudaudit_googleapis_com_activity.cloudaudit_googleapis_com_activity`
   WHERE
     timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
     AND protopayload_auditlog.methodName = "SetIamPolicy"
     AND protopayload_auditlog.authenticationInfo.principalEmail = "suspect-actor@example.com"
   ORDER BY timestamp DESC;'
# Step 3: persist the query as a stored procedure for reuse.
# BigQuery exports name the protoPayload column protopayload_auditlog.
bq query --use_legacy_sql=false --project_id=security-logs-prod \
  'CREATE OR REPLACE PROCEDURE
   `security-logs-prod.forensic_hunts.iam_grants_by_actor`(
     actor STRING, lookback_days INT64
   )
   BEGIN
     SELECT timestamp, resource.labels.project_id, protopayload_auditlog.methodName
     FROM `security-logs-prod.cloudaudit_googleapis_com_activity.cloudaudit_googleapis_com_activity`
     WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL lookback_days DAY)
       AND protopayload_auditlog.methodName IN ("SetIamPolicy", "google.iam.admin.v1.SetIamPolicy")
       AND protopayload_auditlog.authenticationInfo.principalEmail = actor
     ORDER BY timestamp DESC;
   END;'
Remediation — Terraform
# Terraform Google provider ~> 5.0
# Source: Google Cloud Logging-to-BigQuery + bigquery_routine docs (accessed 2026-05)
# The dataset itself is authored by gcp-log-08-bigquery-audit-sink in gcp/logging.html.
# This snippet adds the saved-query library on top of that dataset.
resource "google_bigquery_dataset" "forensic_hunts" {
project = "security-logs-prod"
dataset_id = "forensic_hunts"
location = "europe-west1"
default_encryption_configuration {
kms_key_name = var.logs_cmek_id
}
}
resource "google_bigquery_routine" "iam_grants_by_actor" {
project = "security-logs-prod"
dataset_id = google_bigquery_dataset.forensic_hunts.dataset_id
routine_id = "iam_grants_by_actor"
routine_type = "PROCEDURE"
language = "SQL"
arguments {
name = "actor"
data_type = jsonencode({ "typeKind" = "STRING" })
}
arguments {
name = "lookback_days"
data_type = jsonencode({ "typeKind" = "INT64" })
}
  definition_body = <<-SQL
    -- BigQuery exports name the protoPayload column protopayload_auditlog; the
    -- partitioned sink table is filtered on its timestamp partition column.
    SELECT timestamp, resource.labels.project_id, protopayload_auditlog.methodName
    FROM `security-logs-prod.cloudaudit_googleapis_com_activity.cloudaudit_googleapis_com_activity`
    WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL lookback_days DAY)
      AND protopayload_auditlog.methodName IN ("SetIamPolicy", "google.iam.admin.v1.SetIamPolicy")
      AND protopayload_auditlog.authenticationInfo.principalEmail = actor
    ORDER BY timestamp DESC;
  SQL
}
Log signals
Cloud Audit Logs on bigquery.googleapis.com for Dataset.delete, Table.delete, or Dataset.setIamPolicy on the audit-log replication dataset.
BigQuery query history showing scheduled-query failures on the forensic transforms; a stalled scheduled query means the forensic data product is silently no longer updated.
Cloud KMS events on the dataset's CMEK key: if the dataset is CMEK-protected, disabling or destroying the wrapping key version blocks all subsequent reads.
Query
logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.serviceName="bigquery.googleapis.com"
AND (protoPayload.methodName=~".*Dataset.(delete|setIamPolicy)"
OR protoPayload.methodName=~".*Table.delete")
AND protoPayload.resourceName=~".*datasets/forensic_audit_logs.*"
Pair this Cloud Logging filter with the scheduled-query failure stream from BigQuery Data Transfer Service; together the two signals capture both DDL drift and pipeline failure on the forensic audit-log dataset.
Alert threshold
Page on any DDL operation on the forensic audit-log dataset or its scheduled queries.
Page on any IAM mutation adding a non-incident principal to the dataset, or removing the audit-log sink writer-identity binding.
Initial response
Restore the dataset / table via BigQuery time-travel (FOR SYSTEM_TIME AS OF) if within the 7-day window; otherwise restore from snapshot or the parallel Cloud Storage sink destination.
Re-run the scheduled query backfill to close the data-product gap; verify the forensic-query views produce expected row counts for the gap window.
Pin dataset DDL + IAM + scheduled queries in Terraform; gate edits via change-management and require the BigQuery DDL-deny IAM condition to remain in place outside incidents.
Document a runbook for Compute Engine instance isolation that an on-call responder can execute end-to-end in a small number of named commands. The canonical step order: (1) snapshot the boot disk and every attached data disk via gcloud compute snapshots create, using a snapshot name that encodes the incident ID, before any other action that could alter on-disk state; (2) swap the instance's network tags so it falls under the pre-deployed quarantine Hierarchical Firewall Policy / VPC firewall rule (deny-all egress except DNS to the IR-analyst forwarder, deny-all ingress except SSH from the named IR-analyst IP range; firewall rules do not filter traffic to the metadata server, so cutting off metadata-issued credentials relies on step 3); (3) disable the instance's service account via gcloud iam service-accounts disable to revoke the SA's downstream IAM at platform scope; (4) attach an incident-id=INC-YYYY-NNN label to the instance for cross-referencing in BigQuery audit-log forensic queries; (5) extract the instance's Cloud Logging slice for the relevant time window via gcloud logging read and persist it to the forensic-evidence bucket from gcp-ir-03 with chain-of-custody metadata (Google Cloud, Create and manage disk snapshots (accessed 2026-05)). The quarantine network policy and the snapshot schedule are pre-deployed via google_compute_resource_policy and google_compute_firewall_policy so the runbook runs on infrastructure that already exists; the responder never has to author firewall rules under stress.
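The step order above matters more than any single command, so it can help to generate the commands from one function that fixes the order. The sketch below covers steps 1 to 4 only and prints nothing to the cloud: it returns argument lists for review or for subprocess.run. The function name, the snap- naming scheme, and the current_tag parameter are assumptions for illustration; the gcloud subcommands and flags are the ones used in the runbook on this page.

```python
import re

# Compute Engine resource names: lowercase letters, digits, hyphens, max 63 chars.
NAME_RE = re.compile(r"^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$")


def isolation_plan(incident_id, instance, zone, project, disks,
                   service_account, current_tag):
    """Return the ordered gcloud commands (as argv lists) for steps 1 to 4."""
    incident = incident_id.lower()  # names and labels must be lowercase
    scope = [f"--zone={zone}", f"--project={project}"]
    plan = []
    # Step 1: snapshot every disk before anything can alter on-disk state.
    for disk in disks:
        name = f"snap-{incident}-{disk}"
        if not NAME_RE.match(name):
            raise ValueError(f"invalid snapshot name: {name}")
        plan.append(["gcloud", "compute", "snapshots", "create", name,
                     f"--source-disk={disk}", f"--source-disk-zone={zone}",
                     f"--project={project}"])
    # Step 2: swap network tags so the quarantine policy applies.
    plan.append(["gcloud", "compute", "instances", "remove-tags", instance,
                 *scope, f"--tags={current_tag}"])
    plan.append(["gcloud", "compute", "instances", "add-tags", instance,
                 *scope, "--tags=quarantine"])
    # Step 3: disable the service account at platform scope.
    plan.append(["gcloud", "iam", "service-accounts", "disable",
                 service_account, f"--project={project}"])
    # Step 4: label the instance for cross-referencing.
    plan.append(["gcloud", "compute", "instances", "add-labels", instance,
                 *scope, f"--labels=incident-id={incident}"])
    return plan
```

Generating the plan first and executing it second gives the responder a chance to read the exact commands aloud in the incident channel before anything changes, and it keeps the snapshot-first ordering from depending on memory.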
Remediation — gcloud CLI
# gcloud CLI (latest stable) — Compute Engine isolation runbook
INCIDENT_ID='INC-2026-0042'
INSTANCE='vm-app-prod-7f3a'
ZONE='europe-west1-b'
PROJECT='svc-app-prod'
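# Step 1: snapshot boot + data disks BEFORE any other action.
# Snapshot names must be lowercase, so the incident ID is lowercased with ${VAR,,} (bash 4+).
for DISK in $(gcloud compute instances describe "${INSTANCE}" \
    --zone="${ZONE}" --project="${PROJECT}" \
    --flatten='disks[]' --format='value(disks.source.basename())'); do
  gcloud compute snapshots create "snap-${INCIDENT_ID,,}-${DISK}" \
    --source-disk="${DISK}" \
    --source-disk-zone="${ZONE}" \
    --project="${PROJECT}" \
    --storage-location=eu \
    --labels="incident-id=${INCIDENT_ID,,},purpose=forensic"
done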
# Step 2: swap network tags so the VM falls under the quarantine firewall policy.
gcloud compute instances remove-tags "${INSTANCE}" \
--zone="${ZONE}" --project="${PROJECT}" \
--tags=app-prod
gcloud compute instances add-tags "${INSTANCE}" \
--zone="${ZONE}" --project="${PROJECT}" \
--tags=quarantine
# Step 3: disable the instance service account at platform scope.
INSTANCE_SA=$(gcloud compute instances describe "${INSTANCE}" \
--zone="${ZONE}" --project="${PROJECT}" \
--format='value(serviceAccounts[0].email)')
gcloud iam service-accounts disable "${INSTANCE_SA}" \
--project="${PROJECT}"
# Step 4: attach the incident-id label.
gcloud compute instances add-labels "${INSTANCE}" \
--zone="${ZONE}" --project="${PROJECT}" \
--labels="incident-id=${INCIDENT_ID,,}"
# Step 5: extract the instance's Cloud Logging slice and persist it to the forensic bucket.
# Set WINDOW_START to the beginning of the incident window (RFC 3339, UTC).
WINDOW_START='2026-05-22T00:00:00Z'
INSTANCE_ID=$(gcloud compute instances describe "${INSTANCE}" \
--zone="${ZONE}" --project="${PROJECT}" --format='value(id)')
gcloud logging read \
"resource.labels.instance_id=\"${INSTANCE_ID}\" AND timestamp>=\"${WINDOW_START}\"" \
--project="${PROJECT}" \
--format=json > "/tmp/${INCIDENT_ID}-${INSTANCE}-logs.json"
gcloud storage cp "/tmp/${INCIDENT_ID}-${INSTANCE}-logs.json" \
"gs://forensic-evidence-prod/${INCIDENT_ID}/${INSTANCE}/logs.json"
Log signals
Cloud Audit Logs on compute.googleapis.com for v1.compute.instances.setTags applying the documented quarantine network tag — emitted by the responder when isolating a candidate-compromised VM.
VPC firewall-rule events: firewalls.insert creating a deny-all egress rule targeting the quarantine tag — the rule should pre-exist; create-events indicate the responder built the rule on demand.
VM disk-snapshot events tagged forensic-snapshot: compute.snapshots.insert with the documented label set — ties snapshot creation to incident provenance.
Query
logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.serviceName="compute.googleapis.com"
AND ((protoPayload.methodName="v1.compute.instances.setTags"
AND protoPayload.request.items="quarantine")
OR (protoPayload.methodName="v1.compute.snapshots.insert"
AND protoPayload.request.labels.purpose="forensic-snapshot"))
Stream this Cloud Logging filter into the incident management Pub/Sub topic so isolation actions become part of the canonical incident timeline; the join key on incident IDs is the snapshot's incident-id label (labels.incident-id).
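One way to stream the filter is a Cloud Logging sink whose destination is the Pub/Sub topic. The sketch below is a minimal illustration under stated assumptions: the sink name, the topic name, and the choice of security-ops-prod as the topic's project are hypothetical, and the filter is the Query above copied verbatim. Setting RUN=echo prints the command instead of executing it, so the block can be reviewed without touching a project.

```shell
# Hypothetical names: adjust the sink, topic, and projects to the real estate.
SINK_NAME='ir-isolation-timeline'
TOPIC_PROJECT='security-ops-prod'
TOPIC='incident-timeline'
SOURCE_PROJECT='svc-app-prod'

# The isolation filter from the Query above, unchanged.
ISOLATION_FILTER='logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.serviceName="compute.googleapis.com"
AND ((protoPayload.methodName="v1.compute.instances.setTags"
AND protoPayload.request.items="quarantine")
OR (protoPayload.methodName="v1.compute.snapshots.insert"
AND protoPayload.request.labels.purpose="forensic-snapshot"))'

create_isolation_sink() {
  # With RUN=echo this only prints the gcloud command (dry run).
  ${RUN:-} gcloud logging sinks create "${SINK_NAME}" \
    "pubsub.googleapis.com/projects/${TOPIC_PROJECT}/topics/${TOPIC}" \
    --project="${SOURCE_PROJECT}" \
    --log-filter="${ISOLATION_FILTER}"
}

# Dry run: show what would be created.
RUN=echo create_isolation_sink
```

After the sink exists, its writer identity still needs the Pub/Sub Publisher role on the topic before entries flow. An organisation-wide sink would replace --project with --organization and --include-children; the project-level form is used here only to keep the sketch small.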
Alert threshold
Notify (do not page) on every quarantine-tag application; these are expected responder actions and belong in the incident timeline.
Page if the quarantine-egress firewall rule is missing from the project; it should pre-exist and be enforced as a configuration-as-code invariant.
Initial response
Verify the quarantine tag took effect by inspecting VPC Flow Logs for the affected VM; egress other than the permitted DNS path should drop to zero within seconds of the tag applying.
Snapshot the boot disk + any attached data disks with --labels=purpose=forensic-snapshot,incident-id=…; transfer the snapshots to the forensic-evidence project for offline analysis.
If the firewall rule is missing, recreate it from the Terraform baseline and re-apply the quarantine tag to ensure isolation actually engages.
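Both the runbook's first step and the snapshot step above depend on a snapshot name that encodes the incident ID, and Compute Engine only accepts lowercase letters, digits, and hyphens in resource names, up to 63 characters. The helper below is a hypothetical sketch (it is not part of gcloud) that derives such a name from an incident ID and a disk name; the second disk name in the loop is invented for illustration.

```shell
# Hypothetical helper: build a Compute Engine-safe snapshot name that encodes
# the incident ID and the source disk.
forensic_snapshot_name() {
  # $1 = incident ID, $2 = disk name
  printf 'snap-%s-%s' "$1" "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -c 'a-z0-9-' '-' \
    | cut -c1-63 \
    | sed 's/-*$//'   # names may not end in a hyphen
}

# One name per disk; the "-data" disk is an illustrative assumption.
for disk in vm-app-prod-7f3a vm-app-prod-7f3a-data; do
  forensic_snapshot_name 'INC-2026-0042' "${disk}"
done
```

The output of the helper can be passed straight to gcloud compute snapshots create as the snapshot name, with --source-disk set to the matching disk.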
Document a runbook for compromised-identity response covering both user identities and service-account credentials. The canonical step order: (1) for a user identity, suspend the user via the Workspace Admin Console or the Admin SDK Directory API (setting suspended to true on the user), which immediately blocks future sign-in; (2) end the user's active sessions via the Admin SDK users.signOut API and revoke the OAuth grants held by each authorised client via tokens.delete, so that already-issued sessions and refresh tokens cannot keep being used until they expire; (3) for a compromised service account, delete every long-lived JSON key via gcloud iam service-accounts keys delete and disable the SA via gcloud iam service-accounts disable; (4) enumerate active sessions and recent API calls in BigQuery (cross-link to gcp-ir-04); (5) rotate any dependent secrets the identity had access to in Secret Manager via gcloud secrets versions add + gcloud secrets versions disable of the previous version; (6) re-issue scoped IAM bindings on a fresh user / service-account that does not inherit the compromised identity's role accumulation (Google Cloud — Create and delete service account keys (accessed 2026-05)). The zero-tolerance baseline that makes service-account-key compromise rare in the first place is owned by gcp-iam-02-no-sa-keys; this control covers the response path for the residual exceptions and for the user-identity branch.
Remediation — gcloud CLI
# gcloud CLI + Admin SDK (latest stable) — credential revocation runbook
INCIDENT_ID='INC-2026-0042'
SA_EMAIL='compromised-sa@svc-app-prod.iam.gserviceaccount.com'
USER_EMAIL='compromised-user@example.com'
# Branch A: compromised service account.
# Step A.1: list every long-lived JSON key on the SA.
gcloud iam service-accounts keys list \
--iam-account="${SA_EMAIL}" \
--filter='keyType=USER_MANAGED' \
--format='value(name)'
# Step A.2: delete every USER_MANAGED key (Google-managed keys cannot/should not be deleted).
for key in $(gcloud iam service-accounts keys list \
--iam-account="${SA_EMAIL}" --filter='keyType=USER_MANAGED' \
--format='value(name.basename())'); do
gcloud iam service-accounts keys delete "${key}" \
--iam-account="${SA_EMAIL}" --quiet
done
# Step A.3: disable the SA so any in-flight access tokens are rejected at next refresh.
gcloud iam service-accounts disable "${SA_EMAIL}"
# Branch B: compromised user identity.
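# Step B.1: suspend the Workspace user (blocks future sign-in).
# Done via the Workspace Admin Console OR the Directory API users.patch call below.
# The bearer token needs the admin.directory.user scope, which a default gcloud
# login typically does not carry; a suitably scoped admin credential is assumed.
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-d '{"suspended": true}' \
"https://admin.googleapis.com/admin/directory/v1/users/${USER_EMAIL}"
# Step B.2: end the user's active web and device sessions (Directory API users.signOut).
# OAuth grants to individual clients are revoked separately with tokens.delete:
# DELETE .../admin/directory/v1/users/${USER_EMAIL}/tokens/{clientId}
curl -X POST -d '' \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://admin.googleapis.com/admin/directory/v1/users/${USER_EMAIL}/signOut"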
# Step B.3: enumerate the user's recent API calls in BigQuery (uses gcp-ir-04 routine).
bq query --use_legacy_sql=false --project_id=security-logs-prod \
"CALL \`security-logs-prod.forensic_hunts.iam_grants_by_actor\`('${USER_EMAIL}', 90);"
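# Step C (both branches): rotate only the secrets the identity could read.
# AFFECTED_SECRETS is a responder-curated list; each new value is generated
# upstream and staged per secret, so no two secrets ever share a value.
AFFECTED_SECRETS='app-db-password'
for secret in ${AFFECTED_SECRETS}; do
PREVIOUS=$(gcloud secrets versions list "${secret}" --project=svc-app-prod \
--filter='state=ENABLED' --sort-by='~createTime' --limit=1 \
--format='value(name.basename())')
gcloud secrets versions add "${secret}" \
--project=svc-app-prod \
--data-file="/tmp/${secret}.new"
# Disable (not destroy) the previous version so it can be re-enabled if needed.
[ -n "${PREVIOUS}" ] && gcloud secrets versions disable "${PREVIOUS}" \
--secret="${secret}" --project=svc-app-prod
done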
Remediation — Terraform
# Terraform Google provider ~> 5.0
# Source: Google Cloud IAM SA-key + Secret Manager docs (accessed 2026-05)
# Pre-positioned: a Secret Manager secret with a 90-day rotation schedule and
# org-level audit logging for IAM, so the runbook above acts on resources that
# already exist. SA key creation itself is governed
# by gcp-iam-02-no-sa-keys (zero-tolerance baseline).
resource "google_secret_manager_secret" "app_db_password" {
project = var.app_project
secret_id = "app-db-password"
replication {
user_managed {
replicas {
location = "europe-west1"
customer_managed_encryption {
kms_key_name = var.app_cmek_id
}
}
}
}
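# Rotation sends a Pub/Sub notification rather than changing the secret itself,
# so at least one topic must be configured for the rotation block to be accepted.
# var.rotation_topic_id is a hypothetical variable holding the topic's resource ID.
topics {
name = var.rotation_topic_id
}
rotation {
next_rotation_time = "2026-08-01T00:00:00Z"
rotation_period = "7776000s" # 90 days
}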
}
resource "google_organization_iam_audit_config" "sa_key_events" {
org_id = var.org_id
service = "iam.googleapis.com"
audit_log_config {
log_type = "ADMIN_READ"
}
audit_log_config {
log_type = "DATA_WRITE"
}
}
Log signals
Cloud Audit Logs on iam.googleapis.com for DisableServiceAccount, DisableServiceAccountKey, and DeleteServiceAccountKey emitted by the responder; these are the canonical credential-revocation actions during an incident.
Service-account impersonation events post-revoke (iamcredentials.googleapis.com/GenerateAccessToken denied) — confirms the revocation took effect.
OAuth-token revocation events on user accounts via the Workspace Directory API.
Query
logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND protoPayload.serviceName="iam.googleapis.com"
AND (protoPayload.methodName="google.iam.admin.v1.DisableServiceAccount"
OR protoPayload.methodName="google.iam.admin.v1.DisableServiceAccountKey"
OR protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccountKey")
This Cloud Logging filter feeds the incident-timeline Pub/Sub topic; pair with a denied-impersonation query (protoPayload.status.code=7 on iamcredentials) so the post-revoke validation is automatic.
Alert threshold
Notify on every responder-issued key disable / delete (expected during incidents).
Page if a previously revoked key produces a successful GenerateAccessToken response within 24 hours of revocation — indicates a residual valid token or cache window.
Initial response
Verify the disable/delete via gcloud iam service-accounts keys list; the key should report disabled: true or be absent entirely.
Force-rotate any downstream consumer that still references the revoked key path; if the consumer uses Workload Identity Federation, no rotation is needed and the responder action is sufficient.
Document the revoked-key fingerprint in the incident record and add the fingerprint to a deny-list checked by the responder so future re-issuance attempts of the same key material are surfaced.
Run quarterly tabletop exercises against three to five documented GCP-specific scenarios drawn from the threat-model corpus: (a) a Cloud Storage bucket discovered to be public with PII inside; (b) a service-account JSON key leaked to a public GitHub repository (secret-scanning webhook fires); (c) an SCC CRITICAL finding from Event Threat Detection for Mining: Bitcoin Pool against an unused project's quota; (d) a mass BigQuery export of a regulated dataset to an unfamiliar Cloud Storage bucket; (e) a Cloud Identity sign-in from a country outside the organisation's footprint with anomalous user-agent. For each exercise, run the relevant runbook end-to-end against a tabletop project (separate from production), time the steps, and track mean time-to-contain (MTTC) as the primary metric quarter-over-quarter (NIST SP 800-61 Rev 3 — CSF 2.0 Community Profile, April 2025 release (accessed 2026-05)). The engagement contract with Google Cloud Security (the Mandiant team, now operating under the Google Cloud Security brand following the 2022 acquisition) is itself tested annually by initiating a no-incident engagement: the control is the proof that the engagement works, not the existence of the contract, because a contract that has never been exercised may not work when it is needed (Google Cloud — Mandiant Incident Response services (accessed 2026-05)). Exercises that surface a wrong, missing, or unexecutable runbook step are tracked as findings against the runbook repository and remediated before the next quarter. The control is typed PREVENTIVE (anti-decay rationale per PITFALL B-14; mirrors aws-ir-07 + azure-ir-07) because its value lies in preventing the runbook decay that always happens to documentation nobody runs; runbooks written and never tested are, in practice, runbooks that do not work.
Remediation — gcloud CLI
# gcloud CLI (latest stable) — tabletop driver commands
# Tabletop exercises are facilitated workshops; the GCP surface they touch is
# the tabletop project. Representative driver: stand up a deliberately
# misconfigured Cloud Storage bucket so responders can practice scenario (a).
PROJECT='ir-tabletop-2026q2'
BUCKET='ir-tabletop-public-pii-202605'
# Step 1: create the tabletop bucket in the tabletop project.
gcloud storage buckets create "gs://${BUCKET}" \
--project="${PROJECT}" \
--location=europe-west1
# Step 2: deliberately DISABLE public-access-prevention for the exercise.
# Production buckets must NEVER have this configuration; this is exercise-only.
gcloud storage buckets update "gs://${BUCKET}" \
--no-public-access-prevention
# Step 3: drop a synthetic PII file so the responder has something to remediate.
echo "name,ssn" > /tmp/tabletop-pii.csv
echo "Jane Doe,000-00-0000" >> /tmp/tabletop-pii.csv
gcloud storage cp /tmp/tabletop-pii.csv "gs://${BUCKET}/" --project="${PROJECT}"
# Step 4: record the MTTC for the exercise — start timestamp at SCC finding.
# End timestamp when the bucket is back to public-access-prevention=enforced
# AND the PII file has been deleted from the bucket.
gcloud logging write ir-tabletop-mttc \
"{\"incident_id\":\"TABLETOP-2026Q2-A\",\"event\":\"start\",\"timestamp\":\"$(date -u +%FT%TZ)\"}" \
--payload-type=json --project="${PROJECT}"
Remediation — Terraform
# Terraform Google provider ~> 5.0
# Source: NIST SP 800-61 Rev 3 + Google Cloud Mandiant IR services (accessed 2026-05)
# Tabletop-project-only resources. Apply only to the dedicated tabletop project.
resource "google_storage_bucket" "tabletop_scenario_a" {
project = "ir-tabletop-2026q2"
name = "ir-tabletop-public-pii-202605"
location = "europe-west1"
uniform_bucket_level_access = true
public_access_prevention = "inherited" # exercise-only; prod = "enforced"
labels = {
purpose = "ir-tabletop"
quarter = "2026q2"
}
}
# BigQuery dataset that holds the MTTC time-series across quarters.
resource "google_bigquery_dataset" "ir_tabletop_metrics" {
project = "security-ops-prod"
dataset_id = "ir_tabletop_metrics"
location = "europe-west1"
}
resource "google_bigquery_table" "mttc_history" {
project = "security-ops-prod"
dataset_id = google_bigquery_dataset.ir_tabletop_metrics.dataset_id
table_id = "mttc_history"
deletion_protection = true
schema = jsonencode([
{ name = "incident_id", type = "STRING", mode = "REQUIRED" },
{ name = "scenario", type = "STRING", mode = "REQUIRED" },
{ name = "quarter", type = "STRING", mode = "REQUIRED" },
{ name = "start_timestamp", type = "TIMESTAMP", mode = "REQUIRED" },
{ name = "contained_timestamp", type = "TIMESTAMP", mode = "REQUIRED" },
{ name = "mttc_seconds", type = "INT64", mode = "REQUIRED" }
])
}
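The mttc_seconds column in the schema above is derived from the two timestamps, so it is worth computing it in one place instead of by hand after each exercise. The following is a minimal sketch, assuming GNU date (coreutils) for ISO 8601 parsing; mttc_seconds and mttc_row are hypothetical helper names, and the row layout simply mirrors the mttc_history schema.

```shell
# Hypothetical helpers; assume GNU date for parsing ISO 8601 timestamps.
mttc_seconds() {
  # $1 = start timestamp, $2 = contained timestamp (both RFC 3339, UTC)
  start_epoch=$(date -u -d "$1" +%s) || return 1
  end_epoch=$(date -u -d "$2" +%s) || return 1
  if [ "${end_epoch}" -lt "${start_epoch}" ]; then
    echo "contained timestamp precedes start timestamp" >&2
    return 1
  fi
  echo $(( end_epoch - start_epoch ))
}

mttc_row() {
  # $1 incident_id, $2 scenario, $3 quarter, $4 start, $5 contained
  secs=$(mttc_seconds "$4" "$5") || return 1
  printf '{"incident_id":"%s","scenario":"%s","quarter":"%s","start_timestamp":"%s","contained_timestamp":"%s","mttc_seconds":%s}\n' \
    "$1" "$2" "$3" "$4" "$5" "${secs}"
}

# Example: a 90-minute exercise yields mttc_seconds = 5400.
mttc_row 'TABLETOP-2026Q2-A' 'a' '2026q2' \
  '2026-05-22T09:00:00Z' '2026-05-22T10:30:00Z'
```

Redirecting the output to a file gives newline-delimited JSON that bq load --source_format=NEWLINE_DELIMITED_JSON can append to security-ops-prod:ir_tabletop_metrics.mttc_history.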
Compliance mapping
CIS AWS Foundations v3.0.0: (best-practices)
CIS Microsoft Azure Foundations v3.0.0: n/a
CIS GCP Foundation v4.0.0: (NIST SP 800-61 rev 3 + Google IR engagement)
CIS OCI Foundation v2.0.0: n/a
NIST SP 800-53 rev5: IR-2; IR-3; IR-3(2)
ISO/IEC 27001:2022: A.5.24; A.5.27
ISO/IEC 27017:2015: n/a
Log signals
Cloud Scheduler audit entries for the documented quarterly tabletop-exercise job: cloudscheduler.googleapis.com/Jobs.run on the scheduler resource named tabletop-quarterly.
Cloud Workflows audit events on the incident-response-tabletop workflow execution: workflowexecutions.googleapis.com/Executions.create ties exercise runs to a single execution-id.
Drift on the responder synthetic-injection function: Cloud Functions audit on the function bound to the tabletop topic should fire on every exercise; absence is signal.
Query
logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND ((protoPayload.serviceName="cloudscheduler.googleapis.com"
AND protoPayload.resourceName=~".*jobs/tabletop-quarterly")
OR (protoPayload.serviceName="workflowexecutions.googleapis.com"
AND protoPayload.resourceName=~".*workflows/incident-response-tabletop"))
Pin this Cloud Logging filter to a quarterly Cloud Monitoring scheduled report; the cadence is rare enough that the report doubles as a control-evidence artefact for ISO 27017 CLD.6.3.1 review.
Alert threshold
Page if the tabletop job has not executed within the documented quarterly window (90 days plus a 15-day grace).
Page on any deletion of the tabletop scheduler or workflow resource.
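The first threshold above reduces to a date comparison: page when more than 105 days (90 plus the 15-day grace) have passed since the last tabletop run. A minimal sketch of that check follows, assuming GNU date; tabletop_overdue is a hypothetical helper, and the timestamp of the last run is assumed to come from the newest entry matching the Query above.

```shell
# Hypothetical helper; assumes GNU date. Exit status 0 means "overdue, page".
tabletop_overdue() {
  # $1 = timestamp of the last tabletop run, $2 = optional "now" override
  window_days=105   # 90-day quarter plus the 15-day grace period
  last_epoch=$(date -u -d "$1" +%s) || return 2
  now_epoch=$(date -u -d "${2:-now}" +%s) || return 2
  [ $(( now_epoch - last_epoch )) -gt $(( window_days * 86400 )) ]
}

# Example with fixed dates: 106 days after the last run, so the check pages.
if tabletop_overdue '2026-01-01T00:00:00Z' '2026-04-17T00:00:00Z'; then
  echo 'PAGE: tabletop exercise overdue'
fi
```

In practice the first argument could be taken from gcloud logging read with the filter above, --limit=1, a --freshness wide enough to cover the window, and --format='value(timestamp)'; an empty result should itself be treated as overdue.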
Initial response
Run the tabletop workflow on demand via gcloud workflows execute incident-response-tabletop; collect responder timings and the participant attendance list for the exercise record.
Compare measured responder latencies against the documented SLO targets; any divergence becomes a backlog ticket on the IR team's queue.
Re-run the exercise with the SLO-divergent step recorded as the focus area; the next quarterly run should show measurable improvement against the same step.