GCP Incident Response Hardening

Overview

This page covers Google Cloud Platform incident response — the surfaces, services, and pre-positioned controls that decide whether the organisation can detect, contain, investigate, and recover from a cloud security incident before the attacker achieves their objective. Scope is the commercial GCP regions; GCP Sovereign Cloud (formerly Assured Workloads and the Google Cloud Air-Gapped offering) inherits the same controls but exposes a different region table and constrains some services — re-verify region availability before applying any of the IaC below to a sovereign or air-gapped deployment. The IR lifecycle on this page is the one codified in NIST SP 800-61 Rev 3 — April 2025 release (accessed 2026-05), which restates the lifecycle as a CSF 2.0 community profile (Govern · Identify · Protect · Detect · Respond · Recover); the canonical lifecycle, evidence-handling, communications, and recovery framing live on the General Incident Response page (lifecycle, preparation, containment, forensics, communication, recovery / post-incident, tabletops). This page maps that lifecycle to the GCP surfaces an IR responder actually touches.

The GCP IR plane is the product of an organization (the root policy boundary where Cloud Identity tenants attach and where Security Command Center is activated), Cloud Identity (the directory plane and the identity-provider boundary that must remain reachable when the on-prem IdP federation is compromised — this is the architectural reason break-glass accounts are Cloud-Identity-only), Security Command Center (the posture, threat-detection, and finding-aggregation plane — Premium tier ships Event Threat Detection, Container Threat Detection, VM Threat Detection, and Anomaly Detection; Enterprise tier upgrades to multi-cloud CNAPP with Mandiant threat intelligence and case management), Pub/Sub (the asynchronous message bus that SCC notifications fan out to and that Cloud Functions / Eventarc subscribe to for playbook automation), Cloud Functions Gen 2 and Eventarc (the serverless automation surfaces that execute containment playbooks), Cloud Storage with Bucket Lock (the immutable evidence-preservation surface; LOCKED retention is the only retention mode that survives a compromised storage admin), BigQuery audit-log sinks (the analytical surface for SQL-based forensic queries against the corpus of Admin Activity and Data Access logs), Compute Engine snapshots (the disk-image preservation primitive for VM forensics), and the Workspace Admin SDK (the OAuth-token revocation and session-enumeration surface for compromised user-identity response). Severity is assigned from the methodology severity rubric; equivalence callouts at the bottom of each control point at the matching control on the AWS, Azure, and OCI sibling pages.

Three anti-conflation callouts up front, because each gets conflated in audit reports and architecture reviews and the distinction is load-bearing for how the corresponding control is designed.

First: break-glass (gcp-ir-01) is PREVENTIVE, not RESPONSIVE. The control is the pre-positioning that makes response possible — 2-4 emergency-access Cloud Identity accounts created on a quiet day, hardened with FIDO2 hardware security keys, excluded from Context-Aware Access and Workforce Identity Federation, monitored via SCC and log-based metric alerts on every sign-in, and access-tested quarterly. Creating break-glass during the incident that took out the Workforce Identity Federation or the on-prem IdP is structurally impossible — the entire reason break-glass exists is that the normal sign-in path has failed. This typing mirrors the equivalent decision on Phase 6 aws-ir-01-break-glass-account and Phase 7 azure-ir-01-emergency-access and is locked across all three providers.

Second: forensic-evidence storage (gcp-ir-03) uses Cloud Storage Bucket Lock with is_locked = true; a LOCKED retention policy cannot be shortened or removed, even by organization admins. Without Bucket Lock, an attacker who compromised the storage admin role on the security project would also have the authority to shorten or remove the retention policy and then overwrite or delete the evidence. The exact analog of this decision is Phase 6 aws-ir-03 using S3 Object Lock in Compliance mode (not Governance: Governance has s3:BypassGovernanceRetention, which a sufficiently privileged attacker acquires) and Phase 7 azure-ir-03 using Immutable Blob storage in Locked mode (not Unlocked: Unlocked is subscription-owner-bypassable). The control across all three providers is "the retention policy survives the same attacker who compromised the storage admin"; for Cloud Storage Bucket Lock that means retention_policy { is_locked = true, retention_period = 31557600 } in Terraform terms (31557600 seconds is 365.25 days), applied at bucket-creation time and locked once verified.

Third: tabletop exercises (gcp-ir-07) are PREVENTIVE, not RESPONSIVE. The value of a quarterly tabletop is preventing runbook decay before the next incident — runbooks written and never re-exercised are, in practice, runbooks that do not work when they are needed (the modal failure of all written IR procedures). Each exercise that surfaces a wrong, missing, or unexecutable runbook step is tracked as a finding against the runbook repository and remediated before the next quarter. The PREVENTIVE typing is locked across Phases 6 (aws-ir-07), 7 (azure-ir-07), and 8 (this control) per the methodology rubric and PITFALL B-14 (preventive controls stop bad states from arising; tabletops stop runbook decay).

Order matters. Control 01 is the pre-positioned identity that survives a compromised IdP. Control 02 is the automation pipeline that contains in seconds rather than the minutes a human on-call would take. Control 03 is the evidence-preservation surface that survives the storage-admin compromise. Control 04 is the SQL-driven forensic-query workflow that lets an analyst pivot across hundreds of millions of audit events. Controls 05–06 are the playbook runbooks for the two most common single-resource compromise scenarios (a VM and a service-account credential). Control 07 is the anti-decay loop that keeps every prior runbook executable. Cross-link to General IR — preparation for the lifecycle framing this ordering reflects.

gcp-ir-01-break-glass ! CRITICAL PREVENTIVE

Provision two to four break-glass Cloud Identity super-admin accounts that exist outside the normal Workforce Identity Federation / Cloud Identity-Google-Workspace synchronisation path. These accounts are created directly in the Cloud Identity tenant (not synced from an on-prem IdP via Google Cloud Directory Sync), excluded from every Context-Aware Access binding and Workforce Identity Federation pool, hardened with FIDO2 hardware security keys (no SMS, no TOTP authenticator apps — Google's 2024 Advanced Protection Program guidance and the broader phishing-resistant-MFA consensus), stored in dual-control physical safes, and instrumented with Security Command Center notifications plus a Cloud Logging log-based metric on every sign-in event (Google Cloud — Best practices for planning accounts and organizations (accessed 2026-05)). The accounts must be access-tested quarterly: every quarter, one named responder signs in, demonstrates the credential still works, and documents the test in the IR runbook repository. This is PREVENTIVE not RESPONSIVE because the control is the pre-positioning that makes response possible — break-glass cannot be created during the incident that took out the IdP. Cross-link to General IR — preparation and gcp-iam-02 for the Phase 5 zero-tolerance baseline on long-lived credentials that this control deliberately exempts itself from.

MITIGATES: Lockout of all administrators following a compromise of the federated identity provider (Okta, Azure AD / Microsoft Entra ID, on-prem Active Directory federated to Cloud Identity), a misconfigured Context-Aware Access binding that excludes all current admins, a Google Cloud Directory Sync misconfiguration that deletes the wrong organizational-unit's accounts, or a malicious-insider scenario where a Cloud Identity super-admin attempts to remove the recovery path before exfiltration. Compounds when the federated IdP is itself the entry point for the original compromise.

ATTACK VECTOR: The on-prem AD-FS server federated to Cloud Identity is compromised in a separate incident; all Cloud Identity sign-ins go through that federation. The attacker pivots into the Cloud Identity tenant, removes the federated-IdP binding, and the organisation can no longer sign in to recover the situation. Without break-glass: a Google Workspace support ticket and a multi-day identity-proofing escalation. With break-glass: a named responder retrieves the hardware key from the physical safe and signs in within minutes.

BLAST RADIUS: Without break-glass: the entire Cloud Identity tenant and every Google Cloud organization, folder, and project bound to it, for the duration of the Workspace support escalation (typically multi-day under contractual-SLA paths). With break-glass: bounded to the time from incident declaration to the responder reaching the safe.
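
The quarterly access test described above is easy to let slip, so a small check against the runbook repository's test records can flag it. The following is a minimal sketch only: the 92-day window, the record layout, and the function name are assumptions for illustration, not part of any Google tooling.

```python
from datetime import date, timedelta

# Assumption: "quarterly" is interpreted as at most 92 days between access tests.
MAX_TEST_AGE = timedelta(days=92)


def overdue_accounts(last_tested: dict[str, date], today: date) -> list[str]:
    """Return break-glass accounts whose last access test is older than MAX_TEST_AGE."""
    return sorted(
        account
        for account, tested_on in last_tested.items()
        if today - tested_on > MAX_TEST_AGE
    )


if __name__ == "__main__":
    # Hypothetical test records, as they might be kept in the IR runbook repository.
    records = {
        "breakglass-01@example.com": date(2026, 1, 15),
        "breakglass-02@example.com": date(2026, 4, 20),
    }
    print(overdue_accounts(records, today=date(2026, 5, 1)))
```

Run on a schedule, a non-empty result could be raised as a finding against the runbook repository in the same way as a failed tabletop step.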

Remediation — gcloud CLI

# gcloud CLI (latest stable); user creation goes through the Workspace Admin Console or Directory API.
# Step 1: create the break-glass super-admin user directly in Cloud Identity.
# Done via the Workspace Admin Console UI OR Directory API; gcloud has limited
# coverage, so a Workspace super-admin performs the creation itself outside gcloud.

# After the users exist, list the members of the break-glass group to confirm
# that only the intended accounts are present (this command creates nothing).
gcloud identity groups memberships list \
  --group-email=breakglass-admins@example.com \
  --format='value(preferredMemberKey.id)'

# Step 2: assign Organization Administrator role to the break-glass account.
gcloud organizations add-iam-policy-binding ORG_ID \
  --member='user:breakglass-01@example.com' \
  --role='roles/resourcemanager.organizationAdmin'

# Step 3: enforce 2-Step Verification with security keys only for the
# break-glass OU. Done via Workspace Admin Console (Security > 2-Step
# Verification > Enforcement > Security Keys Only).

# Step 4: create the log-based metric for break-glass activity.
# Note: this filter matches Admin Activity audit entries attributed to the
# break-glass principals, not the sign-in event itself. Sign-in events come from
# the Workspace login audit logs, which must be shared with Google Cloud before a
# filter can see them. Log-based metrics are project-scoped, so organization-level
# entries must be routed into the project that holds the metric.
gcloud logging metrics create breakglass-signin \
  --description='Sign-in event for any break-glass Cloud Identity account' \
  --log-filter='logName="organizations/ORG_ID/logs/cloudaudit.googleapis.com%2Factivity"
                AND protoPayload.authenticationInfo.principalEmail=("breakglass-01@example.com" OR "breakglass-02@example.com")'

# Step 5: route the metric to a Cloud Monitoring alert policy that pages the
# on-call (Pub/Sub topic subscribed by the PagerDuty integration).
# The threshold is expressed with --if and --duration.
gcloud alpha monitoring policies create \
  --notification-channels=projects/PROJECT_ID/notificationChannels/PD_CHANNEL_ID \
  --display-name='Break-glass account sign-in detected' \
  --condition-display-name='Any break-glass sign-in' \
  --condition-filter='metric.type="logging.googleapis.com/user/breakglass-signin" AND resource.type="global"' \
  --if='> 0' \
  --duration=0s

Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: Google Cloud break-glass + Cloud Identity docs (accessed 2026-05)
# Note: Cloud Identity user creation is not directly supported by the google
# provider; the user object itself is created via the Workspace Admin Console.
# Terraform manages the IAM bindings, log-based metric, and alert policy.

resource "google_organization_iam_member" "breakglass_01_org_admin" {
  org_id = var.org_id
  role   = "roles/resourcemanager.organizationAdmin"
  member = "user:breakglass-01@example.com"
}

resource "google_organization_iam_member" "breakglass_02_org_admin" {
  org_id = var.org_id
  role   = "roles/resourcemanager.organizationAdmin"
  member = "user:breakglass-02@example.com"
}

resource "google_logging_metric" "breakglass_signin" {
  name        = "breakglass-signin"
  description = "Sign-in event for any break-glass Cloud Identity account"
  filter      = <<-EOT
    logName="organizations/${var.org_id}/logs/cloudaudit.googleapis.com%2Factivity"
    AND protoPayload.authenticationInfo.principalEmail=("breakglass-01@example.com" OR "breakglass-02@example.com")
  EOT

  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}

resource "google_monitoring_alert_policy" "breakglass_signin_alert" {
  display_name = "Break-glass account sign-in detected"
  combiner     = "OR"
  notification_channels = [var.pagerduty_channel_id]

  conditions {
    display_name = "Any break-glass sign-in"
    condition_threshold {
      filter          = "metric.type=\"logging.googleapis.com/user/${google_logging_metric.breakglass_signin.name}\" AND resource.type=\"global\""
      comparison      = "COMPARISON_GT"
      threshold_value = 0
      duration        = "0s"
    }
  }
}

Remediation — Config Connector

apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: break-glass-org-admin
  namespace: config-control
spec:
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Organization
    external: "organizations/ORG_ID"
  role: roles/resourcemanager.organizationAdmin
  member: "user:breakglass-ir@example.com"

Remediation — Pulumi (TypeScript)

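import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";

// Organization ID read from stack config (pulumi config set orgId <ORG_ID>).
const orgId = new pulumi.Config().require("orgId");

// Break-glass IR account: one named human per account, security-key MFA mandatory,
// separate from the day-to-day admin path.
// Alert on EVERY use of this binding via Cloud Logging.
const breakGlass = new gcp.organizations.IAMMember("break-glass-org-admin", {
    orgId: orgId,
    role: "roles/resourcemanager.organizationAdmin",
    member: "user:breakglass-ir@example.com",
});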

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
n/a	n/a	(Cloud Identity emergency-access docs)	n/a	IR-4; AC-2(8); AC-6	A.5.24; A.5.26	CLD.9.5.1

Log signals

Cloud Audit Logs SetIamPolicy events binding a break-glass principal (for example breakglass-01@example.com) to roles/owner or roles/iam.securityAdmin on any project or folder.
Workspace sign-in audit feed showing sign-ins to the break-glass account from any location — the account should sit dormant outside declared incidents.
Cloud Identity password-reset / 2SV-enrol events on the break-glass user; both indicate someone is actively preparing to use the account.

Query

logName=~"organizations/.*/logs/cloudaudit.googleapis.com%2Factivity"
          AND ((protoPayload.methodName="SetIamPolicy"
                AND protoPayload.serviceData.policyDelta.bindingDeltas.member=~"user:breakglass-.*")
               OR (protoPayload.serviceName="admin.googleapis.com"
                   AND protoPayload.authenticationInfo.principalEmail=~"breakglass-.*"))

Pair this Cloud Logging filter with a Cloud Monitoring alert that routes to multiple notification channels (SMS, voice, on-call manager email) so a single channel failure cannot suppress the page; the break-glass account is a high-confidence signal.
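
For unit-testing the detection logic outside Cloud Logging, the two branches of the filter can be mirrored in a few lines of Python. This is a sketch under stated assumptions: log entries are plain dicts shaped like exported audit-log JSON, and the accounts follow the breakglass-NN@example.com naming used in the remediation examples. The function name is hypothetical.

```python
import re

# Assumption: break-glass accounts are named breakglass-<suffix>@example.com.
BREAKGLASS_EMAIL = re.compile(r"^breakglass-[^@]+@example\.com$")
BREAKGLASS_MEMBER = re.compile(r"^user:breakglass-[^@]+@example\.com$")


def is_breakglass_signal(entry: dict) -> bool:
    """Return True if an audit-log entry matches either branch of the filter.

    Branch 1: a SetIamPolicy call whose binding deltas name a break-glass member.
    Branch 2: an admin.googleapis.com call made by a break-glass principal.
    """
    payload = entry.get("protoPayload", {})

    if payload.get("methodName") == "SetIamPolicy":
        deltas = (
            payload.get("serviceData", {})
            .get("policyDelta", {})
            .get("bindingDeltas", [])
        )
        if any(BREAKGLASS_MEMBER.match(d.get("member", "")) for d in deltas):
            return True

    caller = payload.get("authenticationInfo", {}).get("principalEmail", "")
    return (
        payload.get("serviceName") == "admin.googleapis.com"
        and BREAKGLASS_EMAIL.match(caller) is not None
    )
```

Keeping a handful of recorded log entries as fixtures for a function like this gives the quarterly access test a cheap regression check on the alert logic.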

Alert threshold

Page immediately on any sign-in to the break-glass account or any IAM binding involving its principal; there is no acceptable rate of background use.
Page on any Workspace admin event mutating the break-glass account's 2SV or password posture.

Initial response

Confirm the on-call engineer initiated the use via the documented incident channel; if not, suspend the account in Workspace and revoke all OAuth tokens via the Directory API.
Audit every API call made under the break-glass principal during the active window; the principal should produce a tightly bounded action set documented in the incident timeline.
Re-seal the break-glass account post-incident: rotate the password into the offline sealed envelope, re-enrol the FIDO2 key, and revoke any IAM bindings created during the window.

References

Google Cloud — Identity best practices (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI

gcp-ir-02-scc-pub-sub-function ! HIGH RESPONSIVE

Configure Security Command Center to export findings to a dedicated Pub/Sub topic in the security-operations project; subscribe a Cloud Functions (Gen 2) function, or an Eventarc-triggered Cloud Run service, that executes the auto-containment playbook for high-severity finding categories (cryptomining detection, privilege-escalation detection, exfiltration detection, malware-on-VM detection). The canonical playbook steps are: (1) snapshot the implicated disks via gcloud compute snapshots create; (2) swap the VM's network tags so it lands in the pre-deployed quarantine network policy (deny-all egress, ingress only from named IR analyst IPs); (3) disable the implicated service account via gcloud iam service-accounts disable; (4) emit a structured event to the IR PagerDuty Pub/Sub topic with the finding payload attached for the human on-call (Google Cloud — SCC notifications documentation (accessed 2026-05)). A Pub/Sub subscription filter narrows the playbook scope to the categories that have well-tested auto-containment recipes (category=("CRYPTOMINING" OR "PRIVILEGE_ESCALATION" OR "EXFILTRATION")); other categories page the on-call without auto-containment. Same-phase STRICT pair-control: SCC threat-detection itself (the enablement of Event Threat Detection, Container Threat Detection, VM Threat Detection) is owned by gcp-log-04-scc-premium; this IR control covers the response pipeline that consumes those findings.

MITIGATES: Attacker dwell time between SCC finding emission and human responder action — typically a non-trivial fraction of total dwell when the incident lands outside business hours, when the on-call is paged but the initial assessment takes ten or more minutes, or when the contain-by-hand workflow requires console clicks across multiple projects. Compounds when the attacker is a cryptominer who can spin up dozens of GPU instances before manual containment lands.

ATTACK VECTOR: A service-account credential leaks (committed to a public GitHub repo by mistake). An attacker uses it to enumerate IAM permissions and launches GPU-instance cryptomining workloads in unused regions. SCC's Event Threat Detection emits the Mining: Bitcoin Pool finding within minutes; without auto-containment the on-call responder spends 10-30 minutes assessing, identifying the compromised SA, and disabling it manually. With auto-containment: the Cloud Function disables the SA and quarantines the implicated VMs within seconds of finding emission.

BLAST RADIUS: Without auto-containment: every workload reachable by the compromised SA during the manual-response window. With auto-containment: bounded to the workloads that were already provisioned before the playbook fired (typically within tens of seconds).
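
The routing decision at the head of the playbook (auto-contain or page only) can be sketched as a pure function, which keeps it testable without any cloud credentials. This is a minimal illustration, not the deployed handler: it assumes the Pub/Sub message body is the SCC notification JSON with a top-level "finding" object, it uses the category labels from the text above rather than real SCC category strings, and the step names are hypothetical placeholders for the real API calls.

```python
import base64
import json

# Categories with tested auto-containment recipes, as listed in the control text.
AUTO_CONTAIN = {"CRYPTOMINING", "PRIVILEGE_ESCALATION", "EXFILTRATION"}

# Ordered playbook steps; names are placeholders for the real snapshot,
# retag, and service-account-disable calls.
CONTAINMENT_STEPS = [
    "snapshot_disks",
    "apply_quarantine_tags",
    "disable_service_account",
    "page_on_call",
]


def plan_response(pubsub_data_b64: str) -> list[str]:
    """Decode a base64 Pub/Sub payload and return the ordered steps to run.

    Findings outside AUTO_CONTAIN only page the on-call; nothing is executed here.
    """
    message = json.loads(base64.b64decode(pubsub_data_b64))
    category = message.get("finding", {}).get("category")
    if category in AUTO_CONTAIN:
        return list(CONTAINMENT_STEPS)
    return ["page_on_call"]
```

In a Gen 2 function the base64 string would typically come from the CloudEvent's message data field; that wiring, and the idempotency handling needed because Pub/Sub can redeliver, are left out of this sketch.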

Remediation — gcloud CLI

# gcloud CLI (latest stable)
# Step 1: create the SCC notification config that exports findings to Pub/Sub.
gcloud pubsub topics create scc-findings-prod \
  --project=security-ops-prod

gcloud scc notifications create scc-notif-cryptomining \
  --organization=ORG_ID \
  --pubsub-topic=projects/security-ops-prod/topics/scc-findings-prod \
  --filter='state="ACTIVE" AND severity="CRITICAL" AND category="Mining: Bitcoin Pool"'

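# Step 2: create the Pub/Sub subscription that filters categories with playbooks.
# Pub/Sub filters match message attributes only, never the JSON payload, so this
# assumes the publisher sets a "category" attribute on each message.
gcloud pubsub subscriptions create scc-findings-playbook-sub \
  --project=security-ops-prod \
  --topic=projects/security-ops-prod/topics/scc-findings-prod \
  --message-filter='attributes.category = "CRYPTOMINING" OR attributes.category = "PRIVILEGE_ESCALATION" OR attributes.category = "EXFILTRATION"'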

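# Step 3: deploy the Gen 2 Cloud Function that runs the containment playbook.
# --trigger-topic creates its own Eventarc-managed subscription on the topic, so the
# function receives every message published there; re-check the category in the handler.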
gcloud functions deploy scc-auto-containment \
  --project=security-ops-prod \
  --region=europe-west1 \
  --gen2 \
  --runtime=python312 \
  --source=./containment-playbook \
  --entry-point=handle_finding \
  --trigger-topic=scc-findings-prod \
  --service-account=scc-containment-sa@security-ops-prod.iam.gserviceaccount.com

# Step 4: grant the containment SA the precise IAM needed in target projects.
gcloud organizations add-iam-policy-binding ORG_ID \
  --member='serviceAccount:scc-containment-sa@security-ops-prod.iam.gserviceaccount.com' \
  --role='roles/iam.serviceAccountAdmin' \
  --condition='expression=resource.name.startsWith("projects/svc-"),title=svc-projects-only'

Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: Google Cloud SCC + Pub/Sub + Cloud Functions Gen 2 docs (accessed 2026-05)
resource "google_pubsub_topic" "scc_findings" {
  project = var.security_ops_project
  name    = "scc-findings-prod"
}

resource "google_scc_notification_config" "cryptomining" {
  config_id    = "scc-notif-cryptomining"
  organization = var.org_id
  description  = "Active CRITICAL cryptomining findings"
  pubsub_topic = google_pubsub_topic.scc_findings.id

  streaming_config {
    filter = "state=\"ACTIVE\" AND severity=\"CRITICAL\" AND category=\"Mining: Bitcoin Pool\""
  }
}

resource "google_pubsub_subscription" "playbook_sub" {
  project = var.security_ops_project
  name    = "scc-findings-playbook-sub"
  topic   = google_pubsub_topic.scc_findings.id

  filter = "attributes.category = \"CRYPTOMINING\" OR attributes.category = \"PRIVILEGE_ESCALATION\" OR attributes.category = \"EXFILTRATION\""

  ack_deadline_seconds = 60
}

resource "google_cloudfunctions2_function" "containment" {
  project  = var.security_ops_project
  name     = "scc-auto-containment"
  location = "europe-west1"

  build_config {
    runtime     = "python312"
    entry_point = "handle_finding"
    source {
      storage_source {
        bucket = var.fn_source_bucket
        object = "containment-playbook.zip"
      }
    }
  }

  service_config {
    service_account_email = google_service_account.containment_sa.email
    available_memory      = "512M"
    timeout_seconds       = 120
  }

  event_trigger {
    trigger_region = "europe-west1"
    event_type     = "google.cloud.pubsub.topic.v1.messagePublished"
    pubsub_topic   = google_pubsub_topic.scc_findings.id
    retry_policy   = "RETRY_POLICY_RETRY"
  }
}

resource "google_service_account" "containment_sa" {
  project    = var.security_ops_project
  account_id = "scc-containment-sa"
}

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
n/a	n/a	(SCC notifications + Eventarc docs)	n/a	IR-4(1); IR-4(7); SI-4(7)	A.5.26	CLD.12.4.5

Log signals

Cloud Audit Logs on cloudfunctions.googleapis.com for functions.delete targeting the IR-automation function bound to the SCC findings Pub/Sub topic.
Function-update events changing the entry point or the Pub/Sub trigger topic to a non-SCC source — silent re-pointing of the responder.
Pub/Sub subscription IAM mutations removing the function's roles/pubsub.subscriber binding — disconnects the fanout without deleting either side.

Query

logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
          AND ((protoPayload.serviceName="cloudfunctions.googleapis.com"
                AND protoPayload.methodName=~".*(DeleteFunction|UpdateFunction)$"
                AND protoPayload.resourceName=~".*scc-auto-containment.*")
               OR (protoPayload.serviceName="pubsub.googleapis.com"
                   AND protoPayload.methodName=~".*SetIamPolicy$"
                   AND protoPayload.resourceName=~".*scc-findings.*"))

This Cloud Logging filter watches the responder function's lifecycle and the Pub/Sub binding that ties it to SCC findings; pair with a Cloud Monitoring synthetic check that publishes a test finding every hour to verify end-to-end responder activation.

Alert threshold

Page on any delete or update of the IR-responder function or any IAM mutation on its Pub/Sub subscription.
Page on the synthetic finding failing to invoke the responder for two consecutive hourly checks.
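
The "two consecutive hourly checks" rule is simple enough to state as code, which removes ambiguity about whether an isolated failure should page. A minimal sketch, assuming check results are recorded oldest-first as booleans (True meaning the responder was invoked); the function name is hypothetical.

```python
def should_page(check_results: list[bool], consecutive_failures: int = 2) -> bool:
    """Return True when the most recent `consecutive_failures` checks all failed.

    `check_results` is ordered oldest-first; True means the synthetic finding
    invoked the responder, False means it did not.
    """
    if consecutive_failures <= 0:
        raise ValueError("consecutive_failures must be positive")
    if len(check_results) < consecutive_failures:
        return False  # not enough history to meet the threshold
    return not any(check_results[-consecutive_failures:])
```

A single missed check followed by a success does not page under this rule, which tolerates one transient Pub/Sub or cold-start delay per window.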

Initial response

Restore the function from the captured Terraform state via terraform apply; re-bind the Pub/Sub subscriber role; verify the next synthetic finding invokes the responder.
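Backfill any SCC findings raised during the responder outage: if message retention covers the outage window, rewind the recovered subscription with gcloud pubsub subscriptions seek; otherwise republish the missed findings to the topic with gcloud pubsub topics publish.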
Pin responder code + IAM bindings in Terraform; add a Cloud Asset Inventory feed on the function and topic so future delete or unbind events fire via an independent channel.

References

Google Cloud — SCC notification responder pattern (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI · Pair-control: gcp-log-04-scc-premium

gcp-ir-03-forensic-bucket ! CRITICAL RESPONSIVE

Provision a dedicated Cloud Storage bucket in a dedicated forensic-evidence-prod project (sibling to the security-operations project, with its own IAM perimeter) for incident-evidence preservation. The bucket carries Cloud Storage Bucket Lock with retention_policy { is_locked = true, retention_period = 31557600 } in Terraform terms: 31557600 seconds (365.25 days, roughly one year), set at bucket-creation time and, once locked, impossible to shorten or remove for the lifetime of the bucket (Google Cloud — Bucket Lock documentation (accessed 2026-05)). LOCKED retention is the only retention mode that survives a compromised storage admin: once the policy is locked, no principal, including organization admins and the project owner, can reduce or remove it (it can only be extended); the only way to free the objects is to wait out the retention period. This is the precise analog of aws-ir-03 using S3 Object Lock in Compliance mode (Compliance, not Governance: Governance has s3:BypassGovernanceRetention, which a sufficiently privileged attacker acquires) and azure-ir-03 using Immutable Blob storage in Locked mode (Locked, not Unlocked: Unlocked is subscription-owner-bypassable). Layer customer-managed encryption keys (CMEK) via Cloud KMS on the bucket so evidence-at-rest is bound to the same key-management perimeter as the production-data CMEK chain; tag every uploaded object with chain-of-custody metadata (incident ID, uploader principal, SHA-256 hash, ingest timestamp). The CRITICAL rating reflects that evidence destroyed during the incident is irrecoverable; no after-the-fact compensating control exists.

MITIGATES: An attacker (or a privileged-insider scenario) deletes or overwrites evidence after compromising the storage admin role on the security project, defeating any subsequent forensic analysis, regulator notification, or law-enforcement engagement. Also mitigates accidental deletion by a responder under stress and lifecycle-rule misconfiguration that silently expires evidence before the post-incident review.

ATTACK VECTOR: The incident under investigation involves a compromised organisation-admin credential. The attacker, anticipating forensic preservation, enumerates Cloud Storage buckets across the organisation and identifies the evidence bucket. Without Bucket Lock: the attacker is one gcloud storage rm --recursive away from destroying every artifact. Without LOCKED retention (just an unlocked retention policy): the attacker removes the retention policy first, then deletes. With LOCKED retention: no principal can reduce the policy; the evidence survives for the retention window regardless of any IAM compromise.

BLAST RADIUS: Without Bucket Lock: all evidence collected to date. With unlocked retention: same — the policy is bypassable. With LOCKED retention: zero — the policy is non-bypassable for the retention period.
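
The chain-of-custody metadata mentioned above (incident ID, uploader principal, SHA-256 hash, ingest timestamp) can be computed locally before the upload. The following is a sketch only: the metadata key names and function names are assumptions, and attaching the result as custom object metadata during upload is left to whichever client performs the transfer.

```python
import hashlib
from datetime import datetime, timezone
from typing import BinaryIO, Optional


def sha256_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large disk images never sit fully in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def custody_metadata(
    stream: BinaryIO,
    incident_id: str,
    uploader: str,
    now: Optional[datetime] = None,
) -> dict[str, str]:
    """Build the chain-of-custody key/value pairs for one evidence object.

    Key names are hypothetical; use whatever the evidence-handling procedure fixes.
    """
    ingest_time = now or datetime.now(timezone.utc)
    return {
        "incident-id": incident_id,
        "uploader-principal": uploader,
        "sha256": sha256_stream(stream),
        "ingest-timestamp": ingest_time.isoformat(),
    }
```

Computing the hash before upload, rather than trusting a value derived afterwards, means the recorded digest describes the artifact as collected, which is the property a later integrity check relies on.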

Remediation — gcloud CLI

# gcloud CLI (latest stable)
# Step 1: create the dedicated forensic-evidence project under the security folder.
gcloud projects create forensic-evidence-prod \
  --folder=FOLDER_ID_SECURITY \
  --name='Forensic Evidence (production)'

# Step 2: create the bucket with CMEK + uniform bucket-level access.
gcloud storage buckets create gs://forensic-evidence-prod \
  --project=forensic-evidence-prod \
  --location=europe-west1 \
  --uniform-bucket-level-access \
  --default-encryption-key='projects/security-kms-prod/locations/europe-west1/keyRings/forensic/cryptoKeys/forensic-cmek' \
  --public-access-prevention

# Step 3: set the retention policy to 1 year (31_557_600 seconds).
gcloud storage buckets update gs://forensic-evidence-prod \
  --retention-period=31557600s

# Step 4: lock the retention policy. THIS IS IRREVERSIBLE.
# Once locked, the retention period can only be INCREASED, never reduced or removed.
gcloud storage buckets update gs://forensic-evidence-prod \
  --lock-retention-period

# Step 5: bind the IR team service account at object-admin scope.
gcloud storage buckets add-iam-policy-binding gs://forensic-evidence-prod \
  --member='serviceAccount:ir-team-sa@security-ops-prod.iam.gserviceaccount.com' \
  --role='roles/storage.objectAdmin'

# Step 6: bind external IR partners at object-viewer scope for read-only handoff.
gcloud storage buckets add-iam-policy-binding gs://forensic-evidence-prod \
  --member='group:external-ir-partners@example.com' \
  --role='roles/storage.objectViewer'

Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: Google Cloud Bucket Lock + retention policy docs (accessed 2026-05)
resource "google_storage_bucket" "forensic_evidence" {
  project                     = "forensic-evidence-prod"
  name                        = "forensic-evidence-prod"
  location                    = "europe-west1"
  uniform_bucket_level_access = true
  public_access_prevention    = "enforced"

  encryption {
    default_kms_key_name = var.forensic_cmek_id
  }

  retention_policy {
    is_locked        = true
    retention_period = 31557600  # 1 year, in seconds
  }

  versioning {
    enabled = true
  }

  lifecycle {
    prevent_destroy = true
  }
}

resource "google_storage_bucket_iam_member" "ir_team_admin" {
  bucket = google_storage_bucket.forensic_evidence.name
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:ir-team-sa@security-ops-prod.iam.gserviceaccount.com"
}

resource "google_storage_bucket_iam_member" "external_ir_viewer" {
  bucket = google_storage_bucket.forensic_evidence.name
  role   = "roles/storage.objectViewer"
  member = "group:external-ir-partners@example.com"
}

Remediation — Config Connector

apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  name: forensic-evidence
  namespace: config-control
spec:
  location: us-central1
  uniformBucketLevelAccess: true
  publicAccessPrevention: enforced
  versioning:
    enabled: true
  retentionPolicy:
    retentionPeriod: 31557600  # 365.25 days, the 1-year period used elsewhere on this page
    isLocked: true
  logging:
    logBucketRef:
      external: "log-sink-bucket"

Remediation — Pulumi (TypeScript)

import * as gcp from "@pulumi/gcp";

// Immutable forensic-evidence bucket: locked retention policy + versioning (access logging is not configured here).
const forensicBucket = new gcp.storage.Bucket("forensic-evidence", {
    name: "forensic-evidence",
    location: "US-CENTRAL1",
    uniformBucketLevelAccess: true,
    publicAccessPrevention: "enforced",
    versioning: { enabled: true },
    retentionPolicy: {
        retentionPeriod: 31557600,  // 365.25 days, the 1-year period used elsewhere on this page
        isLocked: true,
    },
    forceDestroy: false,
});

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
n/a	n/a	(Bucket Lock docs)	n/a	AU-11; IR-4(7); SI-7	A.5.28; A.8.13	CLD.12.4.5

Log signals

Cloud Audit Logs on storage.googleapis.com for storage.buckets.update reducing the forensic bucket's retentionPolicy.retentionPeriod or disabling Object Versioning.
Bucket-lock state transitions: the forensic bucket's retention policy should be locked, and any buckets.lockRetentionPolicy reversal attempt produces a denial that is still worth reviewing.
Bucket-IAM mutations on the forensic bucket adding any storage.objects.delete-capable role to a non-incident principal.

Query

logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
          AND protoPayload.serviceName="storage.googleapis.com"
          AND protoPayload.resourceName=~".*buckets/forensic-evidence.*"
          AND (protoPayload.methodName="storage.buckets.update"
               OR protoPayload.methodName="storage.setIamPermissions")

Run this Cloud Logging filter at project scope on the forensic-bucket project; pair with a Cloud Asset Inventory feed so retention-policy + IAM-state drift surface in real time, independent of audit-log delivery.

Alert threshold

Page on any mutation to the forensic bucket's retention policy, versioning, or IAM bindings.
Page on any object-delete attempt against the forensic bucket; with the retention lock the delete should be denied, but the attempt itself is signal.

Initial response

If the retention policy is not yet locked, restore the original retention period and apply the lock via gcloud storage buckets update gs://BUCKET --lock-retention-period; once locked, the policy cannot be shortened or removed.
Revoke unauthorised IAM bindings; if a live object was deleted or overwritten while Object Versioning was enabled, restore the noncurrent generation by copying it back over the live name with gcloud storage cp gs://bucket/object#GENERATION gs://bucket/object (gsutil cp accepts the same syntax, but gsutil is legacy).
Audit the principal that issued the mutation for additional forensic-scope tampering attempts; treat as candidate post-breach cover-up activity.

References

Google Cloud — Bucket Lock for forensic retention (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI

gcp-ir-04-bigquery-audit-logs ! HIGH RESPONSIVE

Use the BigQuery audit-log dataset created by the aggregated Cloud Logging sink to run SQL-based forensic queries across the corpus of Admin Activity, Data Access, System Event, and Policy Denied logs. The sink itself — partitioned table layout, 2-year retention via partition expiration, KMS-CMK encryption — is owned by gcp-log-08-bigquery-audit-sink; this IR control covers the forensic-query workflow that consumes the dataset. Maintain a saved-query library — implemented as google_bigquery_routine resources — covering canonical hunts: "all IAM role grants in the last 30 days", "all service-account-key creation events in the last 90 days", "all Cloud Storage storage.objects.list on the forensic-evidence bucket", "all Pub/Sub pubsub.subscriptions.create on the SCC findings topic", "all setIamPolicy calls by principals outside the security operations group" (Google Cloud — Cloud Audit Logs best practices (accessed 2026-05)). Partitioning by timestamp day and clustering by protoPayload.serviceName keeps the canonical hunts under a few-GB scan; analysts pivot from one finding to the next without leaving the BigQuery console. Same-phase STRICT pair-control: the sink configuration itself lives at gcp-log-08-bigquery-audit-sink — author the sink there; author the saved-query library here.

MITIGATES: Inability to answer time-bounded forensic questions across hundreds of millions of audit events during an active incident. Without SQL-driven hunts, the responder relies on the Cloud Logging console's free-text search which is rate-limited, hard to share, and not joinable across log types. Compounds when the incident spans weeks of historical events and crosses multiple audit-log categories.

ATTACK VECTOR: Not a direct attack vector — this control mitigates failure-modes of detection and investigation. The scenario is: an analyst discovers an unauthorised IAM role grant during the incident triage. With the saved-query library, "show every setIamPolicy call by this principal across every project in the last 90 days" is one SQL query returning in seconds. Without it: ad-hoc Logs Explorer queries that time out at the 30-day retention boundary or exceed Cloud Logging quota.

BLAST RADIUS: Without forensic-query workflow: investigation time inflates by a factor proportional to incident scope; analysts cannot share queries, cannot version them in a runbook repo, and re-author the same hunts each incident. With saved-query library: investigation steps are reproducible, versioned, and runnable by any analyst with BigQuery dataset reader role.

Remediation — gcloud CLI

# gcloud CLI + bq (latest stable)
# Step 1: verify the audit-log dataset exists (created by gcp-log-08 sink).
bq ls --project_id=security-logs-prod | grep cloudaudit_googleapis_com_

# Step 2: run a canonical forensic query: all IAM role grants in the last 30 days
# by a specific principal across every project.
# In the BigQuery export schema the audit payload column is protopayload_auditlog
# (not protoPayload), and IAM policy deltas sit under servicedata_v1_iam.
# The wildcard + _TABLE_SUFFIX form assumes date-sharded tables; if the gcp-log-08
# sink writes partitioned tables, query the unsharded table and filter on timestamp.
bq query --use_legacy_sql=false --project_id=security-logs-prod \
  --max_rows=10000 \
'SELECT
  timestamp,
  protopayload_auditlog.authenticationInfo.principalEmail AS actor,
  resource.labels.project_id AS project,
  protopayload_auditlog.methodName AS method,
  protopayload_auditlog.servicedata_v1_iam.policyDelta.bindingDeltas AS deltas
FROM
  `security-logs-prod.cloudaudit_googleapis_com_activity.cloudaudit_googleapis_com_activity_*`
WHERE
  _TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
                   AND FORMAT_DATE("%Y%m%d", CURRENT_DATE())
  AND protopayload_auditlog.methodName = "SetIamPolicy"
  AND protopayload_auditlog.authenticationInfo.principalEmail = "suspect-actor@example.com"
ORDER BY timestamp DESC;'

# Step 3: persist the query as a saved BigQuery routine for reuse.
bq query --use_legacy_sql=false --project_id=security-logs-prod \
'CREATE OR REPLACE PROCEDURE
  `security-logs-prod.forensic_hunts.iam_grants_by_actor`(
    actor STRING, lookback_days INT64
  )
BEGIN
  SELECT timestamp, resource.labels.project_id, protopayload_auditlog.methodName
  FROM `security-logs-prod.cloudaudit_googleapis_com_activity.cloudaudit_googleapis_com_activity_*`
  WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL lookback_days DAY))
                         AND FORMAT_DATE("%Y%m%d", CURRENT_DATE())
    AND protopayload_auditlog.methodName IN ("SetIamPolicy", "google.iam.admin.v1.SetIamPolicy")
    AND protopayload_auditlog.authenticationInfo.principalEmail = actor
  ORDER BY timestamp DESC;
END;'
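
Because the routine's arguments come from incident data, it can help to validate them before building the CALL statement. The following is a minimal bash sketch, not part of the control itself: build_hunt_call is a hypothetical helper, the routine name is taken from Step 3, and the email pattern is deliberately strict rather than RFC-complete.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a CALL statement for the iam_grants_by_actor
# routine after validating both arguments. It only prints the SQL; pass the
# output to `bq query --use_legacy_sql=false` to run it.
build_hunt_call() {
  local actor="$1" lookback_days="$2"
  local email_re='^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
  local days_re='^[1-9][0-9]{0,3}$'

  # Reject anything that is not a plain email address (no quotes, no spaces).
  if [[ ! "${actor}" =~ ${email_re} ]]; then
    echo "invalid actor: ${actor}" >&2
    return 1
  fi
  # Reject a non-numeric or zero lookback window.
  if [[ ! "${lookback_days}" =~ ${days_re} ]]; then
    echo "invalid lookback_days: ${lookback_days}" >&2
    return 1
  fi

  printf 'CALL `security-logs-prod.forensic_hunts.iam_grants_by_actor`("%s", %s);\n' \
    "${actor}" "${lookback_days}"
}
```

A responder would then run bq query --use_legacy_sql=false --project_id=security-logs-prod "$(build_hunt_call "${USER_EMAIL}" 90)", so a malformed principal string fails fast instead of reaching BigQuery.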

Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: Google Cloud Logging-to-BigQuery + bigquery_routine docs (accessed 2026-05)
# The dataset itself is authored by gcp-log-08-bigquery-audit-sink in gcp/logging.html.
# This snippet adds the saved-query library on top of that dataset.

resource "google_bigquery_dataset" "forensic_hunts" {
  project    = "security-logs-prod"
  dataset_id = "forensic_hunts"
  location   = "europe-west1"

  default_encryption_configuration {
    kms_key_name = var.logs_cmek_id
  }
}

resource "google_bigquery_routine" "iam_grants_by_actor" {
  project         = "security-logs-prod"
  dataset_id      = google_bigquery_dataset.forensic_hunts.dataset_id
  routine_id      = "iam_grants_by_actor"
  routine_type    = "PROCEDURE"
  language        = "SQL"

  arguments {
    name      = "actor"
    data_type = jsonencode({ "typeKind" = "STRING" })
  }
  arguments {
    name      = "lookback_days"
    data_type = jsonencode({ "typeKind" = "INT64" })
  }

  definition_body = <<-SQL
    SELECT timestamp, resource.labels.project_id, protopayload_auditlog.methodName
    FROM `security-logs-prod.cloudaudit_googleapis_com_activity.cloudaudit_googleapis_com_activity_*`
    WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL lookback_days DAY))
                           AND FORMAT_DATE("%Y%m%d", CURRENT_DATE())
      AND protopayload_auditlog.methodName IN ("SetIamPolicy", "google.iam.admin.v1.SetIamPolicy")
      AND protopayload_auditlog.authenticationInfo.principalEmail = actor
    ORDER BY timestamp DESC;
  SQL
}

Remediation — Config Connector

apiVersion: logging.cnrm.cloud.google.com/v1beta1
kind: LoggingLogSink
metadata:
  name: audit-to-bigquery
  namespace: config-control
spec:
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Organization
    external: "organizations/ORG_ID"
  destination:
    bigQueryDatasetRef:
      external: "projects/PROJECT_ID/datasets/audit_archive"
  filter: 'logName:"cloudaudit.googleapis.com"'
  includeChildren: true

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
n/a	n/a	(Cloud Audit Logs + BigQuery docs)	n/a	AU-11; IR-4(7)	A.5.28	CLD.12.4.5

Log signals

Cloud Audit Logs on bigquery.googleapis.com for Dataset.delete, Table.delete, or Dataset.setIamPolicy on the audit-log replication dataset.
BigQuery query history showing scheduled-query failures on the forensic transforms — a stalled scheduled query means the forensic data product is silently no longer updated.
BigQuery dataset CMEK key-state events: if the dataset is CMEK-protected, disabling or destroying the wrapping key version blocks all subsequent reads.

Query

logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
          AND protoPayload.serviceName="bigquery.googleapis.com"
          AND protoPayload.methodName=~"(?i)(datasetservice.delete|tableservice.delete|setiampolicy)"
          AND protoPayload.resourceName=~".*datasets/forensic_audit_logs.*"

Pair this Cloud Logging filter with the scheduled-query failure stream from BigQuery Data Transfer Service; a log filter cannot join the two, so alert on each separately. Together they cover DDL drift and pipeline failure on the forensic audit-log dataset.

Alert threshold

Page on any DDL operation on the forensic audit-log dataset or its scheduled queries.
Page on any IAM mutation adding a non-incident principal to the dataset, or removing the audit-log sink writer-identity binding.

Initial response

Restore the dataset / table via BigQuery time-travel (FOR SYSTEM_TIME AS OF) if within the 7-day window; otherwise restore from snapshot or the parallel Cloud Storage sink destination.
Re-run the scheduled query backfill to close the data-product gap; verify the forensic-query views produce expected row counts for the gap window.
Pin dataset DDL + IAM + scheduled queries in Terraform; gate edits via change-management and require the BigQuery DDL-deny IAM condition to remain in place outside incidents.

References

Google Cloud — BigQuery audit logs (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI · Pair-control: gcp-log-08-bigquery-audit-sink

gcp-ir-05-gce-isolation ! HIGH RESPONSIVE

Document a runbook for Compute Engine instance isolation that an on-call responder can execute end-to-end in a small number of named commands. The canonical step order: (1) snapshot the boot disk and every attached data disk via gcloud compute snapshots create with a lowercase snapshot name that encodes the incident ID, before any other action that could alter on-disk state; (2) swap the instance's network tags so it falls under the pre-deployed quarantine Hierarchical Firewall Policy / VPC firewall rule (deny-all egress except DNS to the IR-analyst forwarder, deny-all ingress except SSH from the named IR-analyst IP range; firewall rules cannot block the metadata server, so credential containment relies on step 3); (3) disable the instance's service account via gcloud iam service-accounts disable to revoke the SA's downstream IAM at platform scope; (4) attach an incident-id=inc-yyyy-nnn label to the instance (label values must be lowercase) for cross-referencing in BigQuery audit-log forensic queries; (5) extract the instance's Cloud Logging slice for the relevant time window via gcloud logging read and persist it to the forensic-evidence bucket from gcp-ir-03 with chain-of-custody metadata (Google Cloud — Create and manage disk snapshots (accessed 2026-05)). The quarantine network policy and the snapshot schedule are pre-deployed via google_compute_resource_policy and google_compute_firewall (or google_compute_firewall_policy for the hierarchical variant) so the runbook runs on infrastructure that already exists; the responder never has to author firewall rules under stress.

MITIGATES: Loss of forensic state from premature instance termination, ongoing attacker activity from a compromised VM during the response window, lateral movement from the implicated VM to peers in the same VPC, and exfiltration over outbound connections opened by the attacker before containment. Also mitigates the inverse failure where a stressed responder shuts the VM down before snapshotting and loses volatile memory and on-disk state.

ATTACK VECTOR: A web application running on a Compute Engine VM is compromised via an unpatched RCE in an application dependency. The attacker installs a reverse-shell, enumerates the metadata server, and uses the instance service account to enumerate Cloud Storage buckets across the project. SCC emits an Event Threat Detection finding for the metadata-server access pattern; the on-call responder is paged. Without a runbook: terminate the instance, lose volatile state, lose any open files the attacker had not yet flushed, lose the responder's ability to answer "what did the attacker actually do during the dwell window".

BLAST RADIUS: Without runbook: every VM the SA could reach during the post-detection window, plus the volatile-state evidence on the VM itself. With runbook: bounded to the snapshot point in time; the VM is network-isolated within seconds of step 2, and the SA is disabled within seconds of step 3.
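
Compute Engine resource names and label values must be lowercase, so an incident ID such as INC-2026-0042 has to be normalised before it is embedded in a snapshot name. The sketch below shows one way to do that and to check the result against the resource-name rule before calling gcloud; snapshot_name is a hypothetical helper and the snap- prefix is an illustrative convention, not something the platform requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a Compute Engine snapshot name from an incident ID
# and a disk name, then check it against the resource-name rule
# (lowercase letter first, then lowercase letters, digits, or hyphens; 63 max).
# Requires bash 4+ for the ${var,,} lowercase expansion.
snapshot_name() {
  local incident_id="$1" disk="$2"
  local name="snap-${incident_id,,}-${disk,,}"
  local name_re='^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$'
  if [[ ! "${name}" =~ ${name_re} ]]; then
    echo "invalid snapshot name: ${name}" >&2
    return 1
  fi
  echo "${name}"
}
```

Calling snapshot_name "${INCIDENT_ID}" "${INSTANCE}" before step 1 surfaces a bad name locally rather than as an API error in the middle of containment.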

Remediation — gcloud CLI

# gcloud CLI (latest stable) — Compute Engine isolation runbook
INCIDENT_ID='INC-2026-0042'
INSTANCE='vm-app-prod-7f3a'
ZONE='europe-west1-b'
PROJECT='svc-app-prod'

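# Step 1: snapshot boot + data disks BEFORE any other action.
# Snapshot names must be lowercase, hence ${INCIDENT_ID,,}. This assumes the boot
# disk shares the instance name (the default); repeat for each attached data disk.
gcloud compute snapshots create "boot-snap-${INCIDENT_ID,,}" \
  --source-disk="${INSTANCE}" \
  --source-disk-zone="${ZONE}" \
  --project="${PROJECT}" \
  --storage-location=eu \
  --labels="incident-id=${INCIDENT_ID,,},purpose=forensic-snapshot"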

# Step 2: swap network tags so the VM falls under the quarantine firewall policy.
gcloud compute instances remove-tags "${INSTANCE}" \
  --zone="${ZONE}" --project="${PROJECT}" \
  --tags=app-prod

gcloud compute instances add-tags "${INSTANCE}" \
  --zone="${ZONE}" --project="${PROJECT}" \
  --tags=quarantine

# Step 3: disable the instance service account at platform scope.
# Assumes a single attached service account that is not shared with healthy workloads.
INSTANCE_SA=$(gcloud compute instances describe "${INSTANCE}" \
  --zone="${ZONE}" --project="${PROJECT}" \
  --format='value(serviceAccounts[0].email)')

gcloud iam service-accounts disable "${INSTANCE_SA}" \
  --project="${PROJECT}"

# Step 4: attach the incident-id label.
gcloud compute instances add-labels "${INSTANCE}" \
  --zone="${ZONE}" --project="${PROJECT}" \
  --labels="incident-id=${INCIDENT_ID,,}"

# Step 5: extract the instance's audit-log slice and persist to forensic bucket.
INSTANCE_ID=$(gcloud compute instances describe "${INSTANCE}" \
  --zone="${ZONE}" --project="${PROJECT}" --format='value(id)')
WINDOW_START='2026-05-22T00:00:00Z'  # start of the suspected dwell window

gcloud logging read \
  "resource.labels.instance_id=\"${INSTANCE_ID}\" AND timestamp>=\"${WINDOW_START}\"" \
  --project="${PROJECT}" \
  --format=json > "/tmp/${INCIDENT_ID}-${INSTANCE}-logs.json"

gcloud storage cp "/tmp/${INCIDENT_ID}-${INSTANCE}-logs.json" \
  "gs://forensic-evidence-prod/${INCIDENT_ID}/${INSTANCE}/logs.json"

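Step 5 calls for chain-of-custody metadata alongside the exported logs. One possible shape for that record is sketched below: a one-line JSON document holding the file's SHA-256 digest, size, collector, and collection time. custody_record and its field names are hypothetical, and the sketch assumes sha256sum from GNU coreutils is available on the responder workstation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit a one-line JSON custody record for an evidence file.
# Field names are illustrative. Values are not JSON-escaped, so this assumes
# plain file names, incident IDs, and collector identities.
custody_record() {
  local file="$1" incident_id="$2" collector="$3"
  local digest size
  # Refuse to describe a file that cannot be read.
  [[ -r "${file}" ]] || { echo "unreadable: ${file}" >&2; return 1; }
  digest="$(sha256sum "${file}" | cut -d' ' -f1)"
  size=$(( $(wc -c < "${file}") ))
  printf '{"incident_id":"%s","file":"%s","sha256":"%s","bytes":%s,"collected_by":"%s","collected_at":"%s"}\n' \
    "${incident_id}" "$(basename "${file}")" "${digest}" "${size}" \
    "${collector}" "$(date -u +%FT%TZ)"
}
```

Writing the record to a sidecar file and uploading it next to logs.json means the retention lock on the forensic bucket covers the digest as well as the evidence it describes.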
Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: Google Cloud Compute Engine snapshot + firewall policy docs (accessed 2026-05)
# Pre-deployed infrastructure that makes the gcloud runbook above executable.

resource "google_compute_resource_policy" "boot_snapshot_daily" {
  project = var.app_project
  name    = "boot-snapshot-daily"
  region  = "europe-west1"

  snapshot_schedule_policy {
    schedule {
      daily_schedule {
        days_in_cycle = 1
        start_time    = "02:00"
      }
    }
    retention_policy {
      max_retention_days    = 30
      on_source_disk_delete = "KEEP_AUTO_SNAPSHOTS"
    }
    snapshot_properties {
      labels            = { purpose = "automated-backup" }
      storage_locations = ["eu"]
    }
  }
}

# Pre-deployed quarantine VPC firewall rules: deny all egress, allow SSH ingress only from IR analyst IPs.
resource "google_compute_firewall" "quarantine_deny_egress" {
  project   = var.app_project
  name      = "quarantine-deny-egress"
  network   = var.vpc_id
  direction = "EGRESS"
  priority  = 100

  target_tags        = ["quarantine"]
  destination_ranges = ["0.0.0.0/0"]

  deny {
    protocol = "all"
  }
}

resource "google_compute_firewall" "quarantine_allow_ir_ssh" {
  project   = var.app_project
  name      = "quarantine-allow-ir-ssh"
  network   = var.vpc_id
  direction = "INGRESS"
  priority  = 90  # Lower number = higher precedence; wins over a priority-100 ingress deny on the same tag.

  target_tags   = ["quarantine"]
  source_ranges = var.ir_analyst_ip_ranges

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }
}
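
The priorities above matter because VPC firewall rules are evaluated lowest number first, and on a priority tie a deny rule beats an allow rule. The bash sketch below is a simplified model of that evaluation for rules that already match the packet; winning_action is a hypothetical helper, and the model leaves out hierarchical firewall policies and rule matching on tags, ranges, and ports.

```shell
#!/usr/bin/env bash
# Simplified model of VPC firewall evaluation for one direction of traffic.
# Each rule is "PRIORITY DIRECTION ACTION" and is assumed to already match the
# packet. Hierarchical firewall policies are not modelled.
winning_action() {
  local direction="$1"; shift
  local best_priority=65536 best_action=""
  local rule priority rule_direction action
  for rule in "$@"; do
    read -r priority rule_direction action <<< "${rule}"
    [[ "${rule_direction}" == "${direction}" ]] || continue
    if (( priority < best_priority )); then
      best_priority="${priority}"; best_action="${action}"
    elif (( priority == best_priority )) && [[ "${action}" == "deny" ]]; then
      best_action="deny"   # On a priority tie, deny beats allow.
    fi
  done
  if [[ -z "${best_action}" ]]; then
    # Implied rules: ingress is denied, egress is allowed.
    [[ "${direction}" == "INGRESS" ]] && best_action="deny" || best_action="allow"
  fi
  echo "${best_action}"
}
```

For example, winning_action INGRESS '90 INGRESS allow' '100 INGRESS deny' prints allow, which is why the IR-analyst SSH rule sits at priority 90.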

Remediation — Config Connector

apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeFirewall
metadata:
  name: ir-isolation-deny-all
  namespace: config-control
spec:
  networkRef:
    name: prod-vpc
  direction: INGRESS
  priority: 100
  denied:
  - protocol: all
  sourceRanges:
  - "0.0.0.0/0"
  targetTags:
  - "quarantine"  # must match the tag applied by the isolation runbook

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
n/a	n/a	(GCE IR + VPC firewall docs)	n/a	IR-4(2); IR-4(7)	A.5.26	CLD.9.5.1

Log signals

Cloud Audit Logs on compute.googleapis.com for v1.compute.instances.setTags applying the documented quarantine network tag — emitted by the responder when isolating a candidate-compromised VM.
VPC firewall-rule events: firewalls.insert creating a deny-all egress rule targeting the quarantine tag — the rule should pre-exist; create-events indicate the responder built the rule on demand.
VM disk-snapshot events tagged forensic-snapshot: compute.snapshots.insert with the documented label set — ties snapshot creation to incident provenance.

Query

logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
          AND protoPayload.serviceName="compute.googleapis.com"
          AND ((protoPayload.methodName="v1.compute.instances.setTags"
                AND protoPayload.request.items="quarantine")
               OR (protoPayload.methodName="v1.compute.snapshots.insert"
                   AND protoPayload.request.labels.purpose="forensic-snapshot"))

Stream this Cloud Logging filter into the incident management Pub/Sub topic so isolation actions become part of the canonical incident timeline; the join key for incident IDs is the snapshot's incident-id label.

Alert threshold

Notify (do not page) on every quarantine-tag application; these are valid responder actions and belong in the incident timeline.
Page if the quarantine-egress firewall rule is missing from the project; it should pre-exist and be enforced as a configuration-as-code invariant.

Initial response

Verify the quarantine tag took effect by inspecting VPC Flow Logs for the affected VM — egress should drop to zero within seconds of the tag applying.
Snapshot the boot disk + any attached data disks with --labels=purpose=forensic-snapshot,incident-id=…; transfer the snapshots to the forensic-evidence project for offline analysis.
If the firewall rule is missing, recreate it from the Terraform baseline and re-apply the quarantine tag to ensure isolation actually engages.

References

Google Cloud — Security foundations: incident isolation (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI

gcp-ir-06-sa-key-revoke ! HIGH RESPONSIVE

Document a runbook for compromised-identity response covering both user identities and service-account credentials. The canonical step order: (1) for a user identity, suspend the user via the Workspace Admin SDK Directory API or the Admin Console, which immediately blocks future sign-in; (2) end the user's active sessions with the Admin SDK users.signOut API and revoke OAuth grants with tokens.delete so that stolen refresh tokens cannot mint new access tokens; (3) for a compromised service account, delete every long-lived JSON key via gcloud iam service-accounts keys delete and disable the SA via gcloud iam service-accounts disable; (4) enumerate active sessions and recent API calls in BigQuery (cross-link to gcp-ir-04); (5) rotate any dependent secrets the identity had access to in Secret Manager via gcloud secrets versions add + gcloud secrets versions disable of the previous version; (6) re-issue scoped IAM bindings on a fresh user / service-account that does not inherit the compromised identity's role accumulation (Google Cloud — Create and delete service account keys (accessed 2026-05)). The zero-tolerance baseline that makes service-account-key compromise rare in the first place is owned by gcp-iam-02-no-sa-keys; this control covers the response path for the residual exceptions and for the user-identity branch.

MITIGATES: Continued unauthorised API access by an attacker holding a leaked service-account key, a stolen OAuth refresh token, or active user sessions on a compromised Workspace account. Compounds when the credential has accumulated wide IAM (Editor on multiple projects), when the credential is referenced from CI/CD configurations, or when the human owner is unaware the credential has been compromised.

ATTACK VECTOR: A service-account JSON key is committed to a public GitHub repository by a contractor under deadline pressure. GitHub secret-scanning emits a webhook to the security-operations Slack within minutes; the on-call responder must revoke the credential before the attacker enumerates its IAM and acts on it. Without a runbook: the responder may delete the key but forget to disable the SA, leaving any in-flight access tokens valid for up to one hour. The complete runbook also disables the SA, rotates dependent Secret Manager secrets, and audits the SA's recent API call history via BigQuery to determine whether the leak has already been exploited.

BLAST RADIUS: Without runbook: every resource the credential could reach during the leak window, plus any access-token TTL remaining after key deletion. With runbook: bounded to the leak window before the runbook executes (typically minutes from secret-scanning detection).

Remediation — gcloud CLI

# gcloud CLI + Admin SDK (latest stable) — credential revocation runbook
INCIDENT_ID='INC-2026-0042'
SA_EMAIL='compromised-sa@svc-app-prod.iam.gserviceaccount.com'
USER_EMAIL='compromised-user@example.com'

# Branch A: compromised service account.

# Step A.1: list every long-lived JSON key on the SA.
gcloud iam service-accounts keys list \
  --iam-account="${SA_EMAIL}" \
  --filter='keyType=USER_MANAGED' \
  --format='value(name)'

# Step A.2: delete every USER_MANAGED key (Google-managed keys cannot/should not be deleted).
for key in $(gcloud iam service-accounts keys list \
  --iam-account="${SA_EMAIL}" --filter='keyType=USER_MANAGED' --format='value(name)'); do
    gcloud iam service-accounts keys delete "${key}" \
      --iam-account="${SA_EMAIL}" --quiet
done

# Step A.3: disable the SA so access tokens already minted for it stop being accepted
# (deleting keys alone does not revoke short-lived credentials).
gcloud iam service-accounts disable "${SA_EMAIL}"

# Branch B: compromised user identity.

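# Step B.1: suspend the Workspace user (blocks future sign-in).
# Done via the Workspace Admin Console OR the Directory API users.patch call below.
# Assumption: ADMIN_TOKEN is an OAuth access token for a Workspace admin that carries
# the admin.directory.user and admin.directory.user.security scopes; the default
# gcloud user token may not include them.
ADMIN_TOKEN="$(gcloud auth print-access-token)"
curl -sS -X PATCH \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{"suspended": true}' \
  "https://admin.googleapis.com/admin/directory/v1/users/${USER_EMAIL}"

# Step B.2: end the user's web and device sessions (users.signOut).
# OAuth grants are revoked separately, per client, with tokens.list + tokens.delete.
curl -sS -X POST \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Length: 0" \
  "https://admin.googleapis.com/admin/directory/v1/users/${USER_EMAIL}/signOut"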

# Step B.3: enumerate the user's recent IAM policy changes in BigQuery (uses gcp-ir-04 routine).
bq query --use_legacy_sql=false --project_id=security-logs-prod \
  "CALL \`security-logs-prod.forensic_hunts.iam_grants_by_actor\`('${USER_EMAIL}', 90);"

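# Step C (both branches): rotate the Secret Manager secrets the identity could read.
# AFFECTED_SECRETS is a placeholder list; build it from the identity's IAM bindings.
# Each secret gets its own new value, staged at /tmp/rotated/<secret-name>.
AFFECTED_SECRETS=('app-db-password')
for secret in "${AFFECTED_SECRETS[@]}"; do
  prev=$(gcloud secrets versions list "${secret}" --project=svc-app-prod \
    --filter='state=ENABLED' --sort-by='~createTime' --limit=1 \
    --format='value(name.basename())')
  gcloud secrets versions add "${secret}" --project=svc-app-prod \
    --data-file="/tmp/rotated/${secret}"
  gcloud secrets versions disable "${prev}" --secret="${secret}" --project=svc-app-prod
done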

Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: Google Cloud IAM SA-key + Secret Manager docs (accessed 2026-05)
# Pre-positioned: a rotation-scheduled secret and org-level IAM audit logging that
# the runbook above relies on. SA key creation itself is governed
# by gcp-iam-02-no-sa-keys (zero-tolerance baseline).

resource "google_secret_manager_secret" "app_db_password" {
  project   = var.app_project
  secret_id = "app-db-password"

  replication {
    user_managed {
      replicas {
        location = "europe-west1"
        customer_managed_encryption {
          kms_key_name = var.app_cmek_id
        }
      }
    }
  }

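  # Rotation schedules require at least one Pub/Sub topic for notifications.
  # var.secret_rotation_topic_id is a placeholder for that topic's resource ID.
  topics {
    name = var.secret_rotation_topic_id
  }

  rotation {
    next_rotation_time = "2026-08-01T00:00:00Z"
    rotation_period    = "7776000s"  # 90 days
  }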
}

resource "google_organization_iam_audit_config" "sa_key_events" {
  org_id  = var.org_id
  service = "iam.googleapis.com"

  audit_log_config {
    log_type = "ADMIN_READ"
  }
  audit_log_config {
    log_type = "DATA_WRITE"
  }
}

Remediation — Config Connector

apiVersion: orgpolicy.cnrm.cloud.google.com/v1beta1
kind: OrgPolicyPolicy
metadata:
  name: sa-key-max-lifetime
  namespace: config-control
spec:
  resourceRef:
    apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
    kind: Organization
    external: "organizations/ORG_ID"
  spec:
    rules:
    - values:
        allowedValues:
        - "168h"  # 7 days; the constraint takes hour values
  name: "organizations/ORG_ID/policies/iam.serviceAccountKeyExpiryHours"

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
n/a	n/a	(IAM SA key + Cloud Identity session-revoke docs)	n/a	IR-4; IA-5; AC-2(13)	A.5.26; A.5.17	CLD.9.5.1

Log signals

Cloud Audit Logs on iam.googleapis.com for DisableServiceAccountKey and DeleteServiceAccountKey emitted by the responder — these are the canonical credential-revocation actions during an incident.
Service-account impersonation events post-revoke (iamcredentials.googleapis.com/GenerateAccessToken denied) — confirms the revocation took effect.
OAuth-token revocation events on user accounts via the Workspace Directory API.

Query

logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
          AND protoPayload.serviceName="iam.googleapis.com"
          AND (protoPayload.methodName="google.iam.admin.v1.DisableServiceAccountKey"
               OR protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccountKey")

This Cloud Logging filter feeds the incident-timeline Pub/Sub topic; pair with a denied-impersonation query (protoPayload.status.code=7 on iamcredentials) so the post-revoke validation is automatic.

Alert threshold

Notify on every responder-issued key disable / delete (expected during incidents).
Page if a previously revoked key produces a successful GenerateAccessToken response within 24 hours of revocation — indicates a residual valid token or cache window.

Initial response

Verify the disable/delete via gcloud iam service-accounts keys list; the key should report disabled: true or be absent entirely.
Force-rotate any downstream consumer that still references the revoked key path; if the consumer uses Workload Identity Federation, no rotation is needed and the responder action is sufficient.
Document the revoked-key fingerprint in the incident record and add the fingerprint to a deny-list checked by the responder so future re-issuance attempts of the same key material are surfaced.

References

Google Cloud — Disabling and enabling service-account keys (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI

gcp-ir-07-tabletop ! MEDIUM PREVENTIVE

Run quarterly tabletop exercises against three to five documented GCP-specific scenarios drawn from the threat-model corpus: (a) a Cloud Storage bucket discovered to be public with PII inside; (b) a service-account JSON key leaked to a public GitHub repository (secret-scanning webhook fires); (c) an SCC CRITICAL finding from Event Threat Detection for Mining: Bitcoin Pool against an unused project's quota; (d) a mass BigQuery export of a regulated dataset to an unfamiliar Cloud Storage bucket; (e) a Cloud Identity sign-in from a country outside the organisation's footprint with anomalous user-agent. For each exercise, run the relevant runbook end-to-end against a tabletop project (separate from production), time the steps, and track mean time-to-contain (MTTC) as the primary metric quarter-over-quarter (NIST SP 800-61 Rev 3 — CSF 2.0 Community Profile, April 2025 release (accessed 2026-05)). Engagement contract with Google Cloud Security (the Mandiant team, now operating under the Google Cloud Security brand following the 2022 acquisition) is itself tested annually by initiating a no-incident engagement — the engagement-works test is the control, not the contract; a contract that has never been exercised is, in practice, a contract that may not work when it is needed (Google Cloud — Mandiant Incident Response services (accessed 2026-05)). Exercises that surface a wrong, missing, or unexecutable runbook step are tracked as findings against the runbook repository and remediated before the next quarter. The control is typed PREVENTIVE (anti-decay rationale per PITFALL B-14; mirrors aws-ir-07 + azure-ir-07) because its value lies in preventing the runbook decay that always happens to documentation nobody runs — runbooks written and never tested are, in practice, runbooks that do not work.

MITIGATES: Runbook decay — runbooks correct when written but no longer reflecting the current GCP service surface (renamed APIs, changed gcloud subcommand syntax, deprecated audit-log field names, new SCC finding categories) by the time they are needed. Also mitigates responder unfamiliarity: a responder who has never run the playbook in anger is slower and more error-prone than one who has, even when the playbook itself is correct. Also mitigates engagement-contract decay where an annual retainer with Google Cloud Security has never been operationally tested.

ATTACK VECTOR: Not a direct attack vector — this control mitigates failure-modes of every other IR control on this page. The canonical scenario: an incident occurs, the responder opens the runbook, the first gcloud command in step 2 returns ERROR: (gcloud.compute) unrecognized arguments because the flag was renamed in a gcloud release eight months ago and nobody noticed. Time-to-contain inflates; attacker dwell time grows; the tabletop would have caught the drift.

BLAST RADIUS: The set of runbooks that have not been exercised. Without scheduled exercises, the set drifts toward "all of them" over time as the GCP surface evolves (gcloud monthly releases, audit-log schema changes, new SCC detector categories). With quarterly exercises, the set is bounded to "anything authored since the last exercise".

Remediation — gcloud CLI

# gcloud CLI (latest stable) — tabletop driver commands
# Tabletop exercises are facilitated workshops; the GCP surface they touch is
# the tabletop project. Representative driver: stand up a deliberately
# misconfigured Cloud Storage bucket so responders can practice scenario (a).

PROJECT='ir-tabletop-2026q2'
BUCKET='ir-tabletop-public-pii-202605'

# Step 1: create the tabletop bucket in the tabletop project.
gcloud storage buckets create "gs://${BUCKET}" \
  --project="${PROJECT}" \
  --location=europe-west1

# Step 2: deliberately DISABLE public-access-prevention for the exercise.
# Production buckets must NEVER have this configuration; this is exercise-only.
gcloud storage buckets update "gs://${BUCKET}" \
  --no-public-access-prevention

# Step 3: drop a synthetic PII file so the responder has something to remediate.
echo "name,ssn" > /tmp/tabletop-pii.csv
echo "Jane Doe,000-00-0000" >> /tmp/tabletop-pii.csv
gcloud storage cp /tmp/tabletop-pii.csv "gs://${BUCKET}/" --project="${PROJECT}"

# Step 4: record the MTTC start marker. Run this at the moment the SCC finding
# fires so the logged timestamp is the start of the measurement.
# The end timestamp is taken when the bucket is back to
# public-access-prevention=enforced AND the PII file has been deleted.
gcloud logging write ir-tabletop-mttc \
  "{\"incident_id\":\"TABLETOP-2026Q2-A\",\"event\":\"start\",\"timestamp\":\"$(date -u +%FT%TZ)\"}" \
  --payload-type=json --project="${PROJECT}"
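The hand-escaped JSON in Step 4 is easy to break when a runbook is edited under pressure. As a minimal sketch, the same gcloud logging write invocation can be assembled from a serialised dictionary instead. The function name mttc_event_argv and the "end" event value are assumptions for illustration; the source only shows the "start" marker.

```python
import json
from datetime import datetime, timezone


def mttc_event_argv(incident_id, event, project, now=None):
    """Build the argv for `gcloud logging write` carrying one MTTC marker.

    `event` is "start" (as in Step 4) or "end" (assumed name for the
    containment marker). `now` must be a timezone-aware datetime.
    """
    if event not in ("start", "end"):
        raise ValueError(f"unexpected event: {event!r}")
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    payload = json.dumps({
        "incident_id": incident_id,
        "event": event,
        # Same shape as `date -u +%FT%TZ` in the shell version.
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    return [
        "gcloud", "logging", "write", "ir-tabletop-mttc", payload,
        "--payload-type=json", f"--project={project}",
    ]
```

The returned list can be handed to subprocess.run(argv, check=True), which avoids shell quoting entirely.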

Remediation — Terraform

# Terraform Google provider ~> 5.0
# Source: NIST SP 800-61 Rev 3 + Google Cloud Mandiant IR services (accessed 2026-05)
# Tabletop-project-only resources. Apply only to the dedicated tabletop project.

resource "google_storage_bucket" "tabletop_scenario_a" {
  project                     = "ir-tabletop-2026q2"
  name                        = "ir-tabletop-public-pii-202605"
  location                    = "europe-west1"
  uniform_bucket_level_access = true
  public_access_prevention    = "inherited"  # exercise-only; prod = "enforced"

  labels = {
    purpose = "ir-tabletop"
    quarter = "2026q2"
  }
}

# BigQuery dataset that holds the MTTC time-series across quarters.
resource "google_bigquery_dataset" "ir_tabletop_metrics" {
  project    = "security-ops-prod"
  dataset_id = "ir_tabletop_metrics"
  location   = "europe-west1"
}

resource "google_bigquery_table" "mttc_history" {
  project             = "security-ops-prod"
  dataset_id          = google_bigquery_dataset.ir_tabletop_metrics.dataset_id
  table_id            = "mttc_history"
  deletion_protection = true

  schema = jsonencode([
    { name = "incident_id",         type = "STRING",    mode = "REQUIRED" },
    { name = "scenario",            type = "STRING",    mode = "REQUIRED" },
    { name = "quarter",             type = "STRING",    mode = "REQUIRED" },
    { name = "start_timestamp",     type = "TIMESTAMP", mode = "REQUIRED" },
    { name = "contained_timestamp", type = "TIMESTAMP", mode = "REQUIRED" },
    { name = "mttc_seconds",        type = "INT64",     mode = "REQUIRED" }
  ])
}
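The mttc_history schema above stores both timestamps and the derived mttc_seconds. A minimal sketch of deriving one row from the start and containment timestamps follows; the helper names (parse_ts, mttc_row) are hypothetical, and the timestamp format is assumed to match the %FT%TZ output used in the gcloud step.

```python
from datetime import datetime, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed: UTC, second precision


def parse_ts(value):
    """Parse a UTC timestamp such as 2026-05-12T09:00:00Z."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)


def mttc_row(incident_id, scenario, quarter, start_ts, contained_ts):
    """Return a dict whose keys match the mttc_history columns."""
    start = parse_ts(start_ts)
    contained = parse_ts(contained_ts)
    if contained < start:
        raise ValueError("contained_timestamp precedes start_timestamp")
    return {
        "incident_id": incident_id,
        "scenario": scenario,
        "quarter": quarter,
        "start_timestamp": start_ts,
        "contained_timestamp": contained_ts,
        "mttc_seconds": int((contained - start).total_seconds()),
    }
```

Rejecting a containment time earlier than the start keeps a transposed pair of timestamps from landing in the time series as a negative MTTC.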

Remediation — Infrastructure Manager

Compliance mapping

CIS AWS Foundations v7.0.0	CIS Microsoft Azure Foundations v6.0.0	CIS GCP Foundation v5.0.0	CIS OCI Foundation v3.1.0	NIST SP 800-53 rev5	ISO/IEC 27001:2022	ISO/IEC 27017:2015
(best-practices)	n/a	(NIST SP 800-61 rev 3 + Google IR engagement)	n/a	IR-2; IR-3; IR-3(2)	A.5.24; A.5.27	n/a

Log signals

Cloud Scheduler audit entries for the documented quarterly tabletop-exercise job: cloudscheduler.googleapis.com Jobs.run on the scheduler resource named tabletop-quarterly.
Cloud Workflows audit events on the incident-response-tabletop workflow execution: workflowexecutions.googleapis.com Executions.create ties exercise runs to a single execution-id.
Drift on the responder synthetic-injection function: the Cloud Functions audit entry for the function bound to the tabletop topic should appear on every exercise; its absence is itself the signal.

Query

logName=~"projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
AND (
  (protoPayload.serviceName="cloudscheduler.googleapis.com"
   AND protoPayload.resourceName=~".*jobs/tabletop-quarterly")
  OR
  (protoPayload.serviceName="workflowexecutions.googleapis.com"
   AND protoPayload.resourceName=~".*workflows/incident-response-tabletop")
)

Pin this Cloud Logging filter to a quarterly Cloud Monitoring scheduled report; the cadence is rare enough that the report doubles as a control-evidence artefact for ISO 27017 CLD.6.3.1 review.

Alert threshold

Page if the tabletop job has not executed within the documented quarterly window (90 days plus a 15-day grace).
Page on any deletion of the tabletop scheduler or workflow resource.
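The first threshold reduces to simple date arithmetic: 90 days plus the 15-day grace gives a maximum age of 105 days for the last tabletop run. A minimal sketch of that check is below; the function name is hypothetical, and how last_run is obtained (for example from the query above) is left out.

```python
from datetime import datetime, timedelta, timezone

QUARTERLY_WINDOW = timedelta(days=90)
GRACE = timedelta(days=15)
MAX_AGE = QUARTERLY_WINDOW + GRACE  # 105 days


def tabletop_overdue(last_run, now):
    """Return True when the tabletop job should page.

    `last_run` is the timezone-aware time of the most recent execution,
    or None when no execution has ever been observed.
    """
    if last_run is None:
        return True
    return (now - last_run) > MAX_AGE
```

Treating "never observed" as overdue means a deleted or never-deployed scheduler job pages rather than passing silently.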

Initial response

Run the tabletop workflow on demand via gcloud workflows execute incident-response-tabletop; collect responder timings and the participant attendance list for the exercise record.
Compare measured responder latencies against the documented SLO targets; any divergence becomes a backlog ticket on the IR team's queue.
Re-run the exercise with the SLO-divergent step recorded as the focus area; the next quarterly run should show measurable improvement against the same step.
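The comparison of measured latencies against SLO targets can be sketched as a small function. The step names and second-based units in the usage note are illustrative assumptions, not values taken from this page; a step with no recorded timing is treated as divergent because it was never completed or never measured.

```python
def slo_divergences(measured, targets):
    """List steps that missed their SLO target.

    `measured` and `targets` map step name -> seconds. Returns
    (step, measured_seconds, target_seconds) tuples in the order the
    steps appear in `targets`; measured_seconds is None when the step
    has no recorded timing.
    """
    divergent = []
    for step, target in targets.items():
        actual = measured.get(step)
        if actual is None or actual > target:
            divergent.append((step, actual, target))
    return divergent
```

For example, with targets of {"detect": 300, "contain": 900, "cleanup": 600} and measurements of {"detect": 240, "contain": 1500}, the function returns the contain step (1500 against 900) and the unrecorded cleanup step, each of which would become a backlog ticket and a focus area for the re-run.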

References

Google Cloud — Security foundations: incident response (accessed 2026-05)
Cross-provider equivalence: AWS · Azure · OCI

Equivalent on: AWS · Azure · OCI

Sources