General Logging & Detection Principles

Overview

Logging is the foundation on which every other detective and responsive control in the cloud rests. Without an authoritative, tamper-evident record of who did what to which resource, incident response is forensically blind: the responder cannot tell whether an attacker reached a database, whether credentials were used after exfiltration, or whether a configuration change was authorised. The corollary is operational rather than philosophical: a control that exists only as a configuration setting, with no corresponding log entry, cannot be audited and therefore cannot be shown to be enforced. This page sets the provider-neutral principles that the four provider logging pages (aws/logging.html, azure/logging.html, gcp/logging.html, oci/logging.html) then instantiate.

Detection is logging plus interpretation. A CloudTrail event, an Azure Activity Log entry, a Google Cloud Audit Log record, or an OCI Audit event is raw material; turning it into a finding requires a detection rule, an analyst who maintains the rule, a tested true-positive case, and an alert pipeline that reaches the on-call responder before the attacker has finished. The MITRE ATT&CK for Cloud matrix provides the canonical taxonomy of attacker techniques that detection engineering aims to cover; coverage is measured by mapping each maintained detection rule to one or more ATT&CK techniques and reporting gap regions to leadership rather than reporting raw rule counts.

The remainder of this page treats the logging-and-detection pipeline as a single discipline. Three log classes (control-plane, data-plane, network) are aggregated to a security-dedicated destination, made tamper-evident, retained to compliance- and forensics-driven floors, ingested by a SIEM, transformed into alerts via maintained detection content, and routed to runbook-equipped responders via the incident response workflow. The pipeline is the control. See general/threat-model.html for the adversary techniques each stage of the pipeline is designed to surface, and general/network.html for VPC and subnet flow-log sourcing.

What to log

Three log classes are mandatory in every cloud-resident environment. Control-plane audit logs record every API call against the provider's management plane: identity, action, target resource, source IP, success or failure, request parameters. AWS CloudTrail, Azure Activity Log (and Microsoft Entra ID audit and sign-in logs), Google Cloud Audit Logs (Admin Activity, Data Access, System Event, Policy Denied streams), and OCI Audit are the four canonical sources. Control-plane logs are the single most important log class because almost every cloud-resident attack chain — credential abuse, role assumption, key disablement, public-resource creation — passes through the control plane at some point.

Data-plane access logs record every read and write against storage and database services: S3 server-access logs, Azure Storage diagnostic logs, GCS data-access logs (subset of Cloud Audit Logs), OCI Object Storage request logs. Data-plane volume is one to three orders of magnitude higher than control-plane volume; the design choice is therefore which buckets, databases, and PaaS endpoints warrant data-plane logging (typically: every Restricted-tagged resource, every public-facing endpoint, every cross-account-accessed resource) rather than whether to enable it globally.
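The selection rule above can be sketched as a simple predicate over a resource inventory. This is a minimal illustration, not a provider API: the `Resource` type, its field names, and the sample inventory are hypothetical, and the three criteria are the ones listed in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical inventory record; field names are assumptions."""
    name: str
    classification: str              # e.g. "Restricted", "Internal", "Public"
    public_facing: bool = False
    cross_account_access: bool = False


def needs_data_plane_logging(resource: Resource) -> bool:
    """True when any of the three selection criteria applies."""
    return (
        resource.classification == "Restricted"
        or resource.public_facing
        or resource.cross_account_access
    )


# Illustrative inventory (made-up resource names).
inventory = [
    Resource("payroll-db", "Restricted"),
    Resource("marketing-site-assets", "Public", public_facing=True),
    Resource("shared-build-cache", "Internal", cross_account_access=True),
    Resource("scratch-bucket", "Internal"),
]

selected = [r.name for r in inventory if needs_data_plane_logging(r)]
print(selected)
```

Driving the decision from classification tags rather than from a hand-maintained list keeps data-plane logging aligned with the inventory as resources are added.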

Network flow logs record connection metadata at the subnet, NIC, or virtual-network level: VPC Flow Logs in AWS, NSG Flow Logs (legacy) and VNet Flow Logs in Azure, VPC Flow Logs in GCP, and VCN Flow Logs in OCI. Flow logs carry no payload, but they do carry source, destination, port, protocol, and action, and they are the primary evidence source for post-compromise lateral-movement analysis. See general/network.html for the segmentation model that flow logs validate.
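As a rough illustration of how flow-log metadata supports lateral-movement analysis, the sketch below filters accepted connections from a known-compromised host to other internal hosts on administrative ports. The record layout (`src`, `dst`, `dst_port`, `action`), the internal CIDR, and the port list are assumptions chosen for the example; real flow-log schemas differ per provider.

```python
import ipaddress

# Assumed internal address space and administrative ports (illustrative only).
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")
ADMIN_PORTS = {22, 445, 3389, 5985}  # SSH, SMB, RDP, WinRM


def is_internal(addr: str) -> bool:
    """True when the address falls inside the assumed internal range."""
    return ipaddress.ip_address(addr) in INTERNAL_NET


def lateral_candidates(flows, compromised_ip):
    """Accepted flows from the compromised host to internal admin ports."""
    return [
        f for f in flows
        if f["src"] == compromised_ip
        and f["action"] == "ACCEPT"
        and is_internal(f["dst"])
        and f["dst_port"] in ADMIN_PORTS
    ]


# Hypothetical, already-parsed flow records.
flows = [
    {"src": "10.0.1.5", "dst": "10.0.2.9", "dst_port": 22, "action": "ACCEPT"},
    {"src": "10.0.1.5", "dst": "10.0.2.10", "dst_port": 3389, "action": "REJECT"},
    {"src": "10.0.1.5", "dst": "8.8.8.8", "dst_port": 22, "action": "ACCEPT"},
    {"src": "10.0.3.7", "dst": "10.0.2.9", "dst_port": 445, "action": "ACCEPT"},
]

candidates = lateral_candidates(flows, "10.0.1.5")
print([(f["dst"], f["dst_port"]) for f in candidates])
```

Rejected flows are excluded here only to keep the example short; in an investigation they are also worth reviewing because they show what the attacker attempted.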

NIST SP 800-92 (Guide to Computer Security Log Management) formalises the discipline of selecting, prioritising, and managing log sources; CIS Control 8 (Audit Log Management) in CIS Controls v8 codifies the operational checklist (enable audit logs, centralise collection, ensure adequate storage, configure detailed audit logging, review logs). The principle each codifies is the same: log selection is a deliberate, classification-driven decision, not a side effect of enabling every available source.

Log integrity

A log that the attacker can edit, delete, or silently halt is not evidence. Log integrity is therefore engineered, not assumed. Three controls combine to provide tamper-evidence. First, cryptographic chaining binds log entries together such that any deletion or modification breaks the chain — AWS CloudTrail log file validation produces a digest file that hashes every delivered log file, signed by AWS; equivalent integrity hashing is available in Azure Monitor diagnostic settings exports and in OCI Audit export pipelines. Second, write-once storage places log archives in an object store configured with object-lock or retention-rule policies (S3 Object Lock in Compliance mode, Azure immutable storage with time-based retention, GCS retention policy with bucket lock, OCI Object Storage retention rules) such that even an account-administrator principal cannot delete logs within the locked retention period. Third, cross-account isolation — covered in the next section — separates the identity that writes logs from the identity that can administer the log store, so that compromise of the workload account does not grant access to retroactively modify the logs that captured the compromise.
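The chaining idea in the first control can be sketched with a generic hash chain. This is an illustration of the principle only; it is not how CloudTrail or any other provider implements its digest files, and the record layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder digest that precedes the first entry


def entry_digest(prev_digest: str, record: dict) -> str:
    """SHA-256 over the previous digest plus a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def build_chain(records):
    """Attach to each record a digest that also covers every earlier entry."""
    chain, prev = [], GENESIS
    for record in records:
        digest = entry_digest(prev, record)
        chain.append({"record": record, "digest": digest})
        prev = digest
    return chain


def first_broken_index(chain):
    """Index of the first entry that fails verification, or None if intact."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry_digest(prev, entry["record"]) != entry["digest"]:
            return i
        prev = entry["digest"]
    return None


# Illustrative records (made-up principals and actions).
records = [
    {"seq": 1, "actor": "alice", "action": "CreateKey"},
    {"seq": 2, "actor": "mallory", "action": "DisableKey"},
    {"seq": 3, "actor": "alice", "action": "ListKeys"},
]
chain = build_chain(records)

# Modify the second record without recomputing its digest.
edited = [dict(e) for e in chain]
edited[1] = {"record": {**records[1], "actor": "alice"}, "digest": chain[1]["digest"]}

# Delete the second entry entirely.
deleted = chain[:1] + chain[2:]

print(first_broken_index(chain), first_broken_index(edited), first_broken_index(deleted))
```

A chain like this detects modification and mid-chain deletion, but truncation from the tail is only detectable if the latest digest is anchored somewhere the attacker cannot write, which is the role the write-once storage and cross-account isolation controls play.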

NIST SP 800-92 §5.4 (Protecting Log Data) defines the integrity model these controls instantiate: logs at rest are protected with the same rigour as the data they describe; logs in transit are encrypted under TLS; access to log infrastructure is restricted to a small, monitored set of administrators. CloudTrail log file validation is verified continuously by an automated job that re-runs the digest check and alerts on any mismatch — a "validation succeeded" check that nobody runs is equivalent to no check at all.

Centralisation

Centralisation is the architectural pattern that turns the integrity rules above into operational reality. The pattern is hub-and-spoke: every workload account, subscription, project, or tenancy compartment ("spokes") emits its logs to a single security-dedicated destination ("hub") whose administration is segregated. Centralisation provides three properties simultaneously: tamper-evidence (the hub's object lock survives the compromise of any spoke), unified detection scope (a SIEM ingests from one location rather than N), and economy of analyst attention (cross-spoke correlation surfaces attacks that touch multiple accounts).

Each provider names the centralisation primitive differently. AWS uses an Organization Trail in CloudTrail (covers every account in the organization with one configuration) delivering to an S3 bucket in the dedicated Log Archive account, with Object Lock enabled and a bucket policy that permits the organization to PutObject but denies any DeleteObject or modification action. Azure uses Diagnostic Settings on every subscription routing Activity Log and resource logs to a central Log Analytics workspace (and/or a storage account for long retention) in a dedicated security subscription, optionally fed into an enterprise Microsoft Sentinel workspace. GCP uses Aggregated Sinks at the organization or folder level (one sink, many source projects) routing to a Cloud Storage bucket, BigQuery dataset, or Pub/Sub topic in a security folder's project. OCI uses Connector Hub to route audit and service logs from every compartment into a centralised destination (Object Storage with retention rules, or directly into Logging Analytics for query).

Cross-account / cross-tenant log routing requires explicit trust configuration. In AWS, the central S3 bucket policy permits the organization principal via aws:PrincipalOrgID; in Azure, diagnostic settings can target a Log Analytics workspace in a different subscription provided the writing identity holds the appropriate role; in GCP, the aggregated sink's writer service account must be granted IAM access on the destination project; in OCI, the Connector Hub identity must hold IAM policies in both source and destination compartments. In every case, the trust is one-way: the spoke can write but cannot read or modify; the hub can read but does not have administrative rights back into the spoke.

Figure 1 — Centralised logging hub-spoke: each workload account or subscription emits control-plane, data-plane, and network flow logs to a dedicated security account's log-archive bucket configured with object lock in Compliance mode. A separate analytics destination (Log Analytics workspace, BigQuery dataset, or Logging Analytics) provides hot-tier queryable storage for the SIEM, while the archive provides immutable cold-tier evidence for the legally required retention period.

Retention

Retention floors are compliance-driven and must be encoded into the centralised log destination's retention configuration rather than left to operator memory. PCI DSS v4.0 requires audit log retention of at least one year with the most recent three months immediately available for analysis. HIPAA's Security Rule (45 CFR §164.316(b)(2)) requires retention of documentation — including audit trails — for six years from the date of creation or last effective date. SOX (Sarbanes-Oxley) financial-controls logs are typically retained seven years. SOC 2 requires retention sufficient to support the audit period (usually one year minimum). The applicable floor for any given log is the maximum of the regulatory floors the underlying data class is subject to.
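The "maximum of the applicable floors" rule is simple arithmetic, sketched below. The figures are the ones quoted above, with a year approximated as 365 days; the SOX and SOC 2 values are the typical periods mentioned in the text rather than fixed statutory numbers, and the table and function names are hypothetical.

```python
# Retention floors in days, taken from the figures quoted in the text.
# Assumption: one year is treated as 365 days for simplicity.
RETENTION_FLOOR_DAYS = {
    "PCI DSS": 365,      # at least one year
    "HIPAA": 6 * 365,    # six years
    "SOX": 7 * 365,      # typically seven years
    "SOC 2": 365,        # usually one year minimum
}


def applicable_floor_days(regimes) -> int:
    """Longest floor among the regimes that apply to a data class."""
    regimes = list(regimes)
    if not regimes:
        raise ValueError("at least one regime is required")
    return max(RETENTION_FLOOR_DAYS[r] for r in regimes)


# A log covering data subject to both PCI DSS and HIPAA.
print(applicable_floor_days(["PCI DSS", "HIPAA"]))
```

Encoding the floors as data lets the retention setting on the central destination be derived and checked in CI rather than set by hand.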

Hot vs cold tiering reconciles retention floors with searchability cost. Hot tier (CloudWatch Logs / Log Analytics workspace / BigQuery / OCI Logging Analytics) is queryable in seconds, expensive per GB-month, and typically holds 30-90 days of data; where a regime sets an immediate-availability floor, such as the three months required by PCI DSS, the hot window must be at least that long. Cold tier (S3 with Glacier transitions / Storage Account with archive tier / GCS Coldline or Archive / OCI Archive Storage) is queryable in hours, cheap per GB-month, and holds the remainder of the retention floor. The transition policy is automated by lifecycle rules: the SIEM ingest pipeline reads from hot; ad-hoc forensic retrieval reads from cold via a rehydration job documented in the incident response runbook.
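The tiering decision a lifecycle rule makes can be sketched as a function of log age. The default values (a 90-day hot window and a 365-day floor) and the tier labels are assumptions for illustration; real lifecycle rules are configured in each provider's storage service rather than in application code.

```python
def storage_tier(age_days: int, hot_days: int = 90, floor_days: int = 365) -> str:
    """Classify a log object by age into hot, cold, or past the retention floor."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if hot_days > floor_days:
        raise ValueError("hot window cannot exceed the retention floor")
    if age_days <= hot_days:
        return "hot"         # queryable in seconds, read by the SIEM
    if age_days <= floor_days:
        return "cold"        # archived, needs rehydration before querying
    return "past-floor"      # floor satisfied; deletion is a policy decision


for age in (10, 90, 91, 365, 366):
    print(age, storage_tier(age))
```

The floor is a minimum, so "past-floor" means only that deletion is no longer prohibited by the retention requirement, not that it is mandatory.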

SIEM and detection engineering

A SIEM is the system that converts centralised logs into prioritised findings via maintained detection content. Provider-native SIEMs — AWS Security Hub (with GuardDuty, Inspector, Config, Macie, IAM Access Analyzer findings ingested), Microsoft Sentinel (cloud-native SIEM/SOAR with KQL detection rules and built-in workbooks), Google Chronicle Security Operations (paired with Security Command Center for posture findings), and OCI Cloud Guard (with Logging Analytics for log-based detections) — are integrated by default with the corresponding provider's audit, posture, and threat-detection signals. Third-party SIEMs — Splunk, Elastic Security, Sumo Logic, IBM QRadar, Devo — ingest from the same centralised log destinations via Lambda / Logic App / Cloud Function / Service Connector forwarders and are the typical choice when a single SIEM must span multiple clouds plus on-premises sources.

Detection engineering is the discipline of building and maintaining the rules that turn logs into findings. Every detection rule in this corpus follows a four-part contract: hypothesis (a one-paragraph attacker behaviour described in MITRE ATT&CK terms — e.g., "T1078.004 — adversary signs in to a cloud account using valid credentials from an unusual geography"), log-source dependency (which log class and which fields the rule reads), true-positive test case (a reproducible event sequence that triggers the rule, verified at least quarterly), and owner (a named team responsible for tuning the rule). Detection-as-code formalises this: rules live in version control as Sigma (vendor-neutral), Sentinel KQL files, Chronicle YARA-L rules, or Splunk SPL saved searches, with pull-request review and CI-time validation against a golden-event corpus.
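The four-part contract lends itself to a CI-time check. The sketch below models a rule's metadata and reports which parts of the contract are missing or stale. The `DetectionRule` type, its field names, the sample rule, and the 92-day approximation of "quarterly" are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta

MAX_TEST_AGE = timedelta(days=92)  # assumed reading of "at least quarterly"


@dataclass(frozen=True)
class DetectionRule:
    """Hypothetical metadata record accompanying a rule in version control."""
    rule_id: str
    hypothesis: str            # attacker behaviour in ATT&CK terms
    attack_techniques: tuple   # e.g. ("T1078.004",)
    log_sources: tuple         # log class and fields the rule reads
    test_case: str             # reproducible true-positive event sequence
    last_tested: date
    owner: str                 # named team responsible for tuning


def contract_violations(rule: DetectionRule, today: date) -> list:
    """Return a list of contract problems; an empty list means compliant."""
    problems = []
    if not rule.hypothesis.strip() or not rule.attack_techniques:
        problems.append("missing hypothesis or ATT&CK mapping")
    if not rule.log_sources:
        problems.append("missing log-source dependency")
    if not rule.test_case.strip():
        problems.append("missing true-positive test case")
    elif today - rule.last_tested > MAX_TEST_AGE:
        problems.append("true-positive test older than one quarter")
    if not rule.owner.strip():
        problems.append("missing owner")
    return problems


rule = DetectionRule(
    rule_id="example-unusual-geo-signin",
    hypothesis="Adversary signs in with valid cloud credentials from an unusual geography",
    attack_techniques=("T1078.004",),
    log_sources=("control-plane sign-in logs: principal, source IP, result",),
    test_case="Replay a recorded sign-in from an unexpected country",
    last_tested=date(2024, 1, 10),
    owner="detection-engineering",
)

print(contract_violations(rule, today=date(2024, 3, 1)))
print(contract_violations(replace(rule, owner=""), today=date(2024, 6, 1)))
```

Running a check like this in the pull-request pipeline keeps ownerless or untested rules from entering the detection corpus.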

Coverage is measured against MITRE ATT&CK for Cloud (the IaaS, SaaS, Office 365, Azure AD, and Google Workspace sub-matrices) rather than against raw rule counts. A team with 800 detection rules concentrated in two ATT&CK tactics has worse coverage than a team with 200 rules spanning twelve tactics. The detection-engineering output therefore includes a heatmap mapping each maintained rule to one or more ATT&CK techniques and an annual review prioritising new rules into gap regions. CISA and the National Security Agency joint guidance on detection engineering for cloud environments supports this technique-driven coverage model.
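The difference between rule count and coverage can be made concrete with a small tally. The sketch below scales the example above down to two toy teams and counts rules per tactic, then lists the tactics with no rule at all. The rule identifiers, the six-tactic scope, and the mapping at tactic rather than technique granularity are simplifications chosen for brevity.

```python
from collections import Counter


def tactic_coverage(rule_tactics, tactics_in_scope):
    """Return (rules-per-tactic counts, sorted list of uncovered tactics)."""
    counts = Counter(
        tactic
        for tactics in rule_tactics.values()
        for tactic in tactics
        if tactic in tactics_in_scope
    )
    gaps = sorted(set(tactics_in_scope) - set(counts))
    return counts, gaps


# Assumed scope: a subset of ATT&CK tactics, for illustration only.
IN_SCOPE = {
    "Initial Access", "Persistence", "Privilege Escalation",
    "Defense Evasion", "Credential Access", "Exfiltration",
}

# Team A: eight rules concentrated in two tactics.
team_a = {f"a-{i}": {"Initial Access"} for i in range(4)}
team_a.update({f"a-{i}": {"Persistence"} for i in range(4, 8)})

# Team B: four rules spread across five tactics.
team_b = {
    "b-0": {"Initial Access"},
    "b-1": {"Privilege Escalation", "Defense Evasion"},
    "b-2": {"Credential Access"},
    "b-3": {"Exfiltration"},
}

counts_a, gaps_a = tactic_coverage(team_a, IN_SCOPE)
counts_b, gaps_b = tactic_coverage(team_b, IN_SCOPE)
print(len(team_a), "rules, uncovered:", gaps_a)
print(len(team_b), "rules, uncovered:", gaps_b)
```

The per-tactic counts are the raw material for the heatmap; the gap list is what the annual review prioritises.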

Figure 2 — SIEM detection pipeline: centralised hot-tier logs feed the SIEM ingestion layer; detection-as-code rules (Sigma, KQL, YARA-L, SPL) under version control evaluate each event; matching events produce findings ranked by severity; severity-tiered routing pushes CRITICAL findings to the on-call paging channel, HIGH findings to a ticketing queue, MEDIUM and LOW findings to a daily-review dashboard. Each rule is mapped to one or more MITRE ATT&CK techniques for coverage measurement.

Alerting and runbook integration

Alert fatigue is the single most common reason detection programmes fail. A responder who receives 200 alerts per shift, 195 of which are false positives or low-severity noise, will stop reading the channel, and the five real findings then travel through the same dead channel. The mitigation is severity-tiered routing combined with continuous false-positive review. CRITICAL findings page the on-call engineer directly (PagerDuty, Opsgenie, Splunk On-Call). HIGH findings open a ticket in the security ticketing queue (Jira Security, ServiceNow SIR) for review within one business day. MEDIUM findings populate a daily-review dashboard. LOW findings populate a weekly-review dashboard. Each tier carries a documented true-positive rate target; rules whose true-positive rate falls below the target are tuned or retired rather than left to add noise to the channel.
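Severity-tiered routing and the true-positive review can be sketched together. The route names, the per-tier targets, and the rule statistics below are hypothetical values chosen for illustration; the 5-in-200 case mirrors the example in the paragraph above.

```python
# Destination per severity tier, following the routing described in the text.
ROUTES = {
    "CRITICAL": "page-on-call",
    "HIGH": "ticket-queue",
    "MEDIUM": "daily-dashboard",
    "LOW": "weekly-dashboard",
}

# Hypothetical true-positive rate targets; real targets are set per programme.
TP_RATE_TARGET = {"CRITICAL": 0.8, "HIGH": 0.5, "MEDIUM": 0.2, "LOW": 0.05}


def route(severity: str) -> str:
    """Destination for a finding of the given severity."""
    return ROUTES[severity]


def true_positive_rate(true_positives: int, total_alerts: int):
    """Fraction of alerts that were real; None when the rule never fired."""
    if total_alerts == 0:
        return None
    return true_positives / total_alerts


def rules_below_target(stats) -> list:
    """stats maps rule_id -> (severity, true_positives, total_alerts)."""
    flagged = []
    for rule_id, (severity, tp, total) in stats.items():
        rate = true_positive_rate(tp, total)
        if rate is not None and rate < TP_RATE_TARGET[severity]:
            flagged.append(rule_id)
    return sorted(flagged)


# Made-up review-period statistics.
stats = {
    "unusual-geo-signin": ("HIGH", 5, 200),      # 2.5% true positives
    "root-key-created": ("CRITICAL", 9, 10),     # 90% true positives
    "new-rule-no-alerts": ("MEDIUM", 0, 0),      # never fired
}

print(route("CRITICAL"), rules_below_target(stats))
```

Rules that never fired are skipped here rather than flagged, since a zero-alert rule is a question for the true-positive test case, not for false-positive tuning.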

Every CRITICAL and HIGH detection rule is paired with a runbook — a documented step-by-step response procedure that the on-call responder executes. The runbook references the general incident response page for the lifecycle phases (containment, eradication, recovery, post-incident) and the canonical actions per phase. Without a runbook, a paged responder spends the first thirty minutes deciding what to do; with a runbook, those thirty minutes are containment.

Cross-provider equivalence

The four providers implement the logging-and-detection pipeline under different names. The table below maps the principles in this page to the provider-native primitives. Each provider deep-dive — aws/logging.html, azure/logging.html, gcp/logging.html, oci/logging.html — carries the per-service configuration detail and the per-provider detection-content libraries.

| Principle | AWS | Azure | GCP | OCI |
|---|---|---|---|---|
| Control-plane audit log | CloudTrail (Organization Trail) | Activity Log + Microsoft Entra audit and sign-in logs | Cloud Audit Logs (Admin Activity stream) | OCI Audit service |
| Centralisation primitive | Organization Trail → S3 in Log Archive account with Object Lock | Diagnostic Settings → central Log Analytics workspace in security subscription | Organization-level Aggregated Sink → GCS / BigQuery / Pub/Sub in security project | Connector Hub → Object Storage / Logging Analytics in security tenancy compartment |
| Provider-native SIEM | Security Hub + GuardDuty findings aggregation | Microsoft Sentinel | Chronicle Security Operations + Security Command Center | Cloud Guard + Logging Analytics |
| Network flow logs | VPC Flow Logs | VNet Flow Logs (NSG Flow Logs legacy) | VPC Flow Logs | VCN Flow Logs |
| Log integrity / tamper evidence | CloudTrail log file validation + S3 Object Lock | Immutable storage with time-based retention policy | GCS retention policy with bucket lock | Object Storage retention rules with retention-rule lock |

Illustrative control — Centralized immutable audit log

The control-box below is an illustrative example of the markup pattern every provider logging page applies. It is not a production control entry (provider pages carry CLI and IaC remediations specific to each cloud), but the threat-model framing and the HIGH DETECTIVE pairing transfer directly. Reading this box alongside the PREVENTIVE example on the data-protection page exercises the distinction the methodology page emphasises: the same severity can carry a different operational meaning. The illustrative ID gen-log-ex-01 is reserved and is not reused as a real control identifier.

gen-log-ex-01

Centralized immutable audit log for all control-plane API calls

HIGH DETECTIVE

HIGH (not CRITICAL) because the absence of this control does not by itself enable compromise: an attacker still needs an initial-access vector, a credential, or a vulnerability. Its absence does, however, materially increase the likelihood and cost of successful exploitation and forfeits the ability to detect and respond.

DETECTIVE (not PREVENTIVE) because the control surfaces unsafe states after they occur rather than preventing them. Paired with alerting and runbook integration it becomes the trigger for response, but it does not stop an action at the control plane. This exercises the methodology distinction that a HIGH DETECTIVE differs operationally from a HIGH PREVENTIVE even though both carry the same severity colour: with the detective control the responder receives an alert and acts, whereas the preventive control would have refused the action entirely.

Maps cross-provider to CIS AWS Foundations v3.0.0 (CloudTrail enabled in all regions, log file validation enabled, log delivery to dedicated S3 bucket), CIS Microsoft Azure Foundations v3.0.0 (Activity Log diagnostic settings, log retention, immutability), CIS GCP Foundation v4.0.0 (Cloud Audit Logs configured for all services, sink to immutable storage), CIS OCI Foundation v2.0.0 (Audit retention and centralisation), NIST SP 800-53 rev5 AU-2 (event logging), AU-9 (protection of audit information), and AU-12 (audit record generation), ISO/IEC 27001:2022 A.8.15 (logging), and ISO/IEC 27017:2015 CLD.12.4.1 (monitoring of cloud services).