AWS EKS Hardening

Overview

This page covers hardening controls for Amazon Elastic Kubernetes Service (EKS). Both EKS Standard managed node groups and EKS Auto Mode are addressed; mode-specific differences are noted in per-control callouts immediately below each control header. Where a control is enforced by default in EKS Auto Mode, the callout identifies it; where Standard mode requires explicit configuration, the callout shows the Terraform or aws eks command. See general/kubernetes.html for the cross-cutting threat model, cluster-baseline principles, and common misconfigurations that apply to all providers.

Controls are ordered by TSV anchor (01..10) which clusters by topic for cross-provider equivalence; severity ordering is approximately CRITICAL → HIGH → MEDIUM. Terraform examples use hashicorp/aws ~> 5.0. The sealed v1.0 AWS pages use the same provider pin. Supporting IAM prerequisites — including the EKS Pod Identity trust policy template — are on aws/iam.html; VPC patterns (private subnets, NAT egress) are on aws/network.html; CloudWatch sink configuration is on aws/logging.html.

aws-k8s-01 ! CRITICAL PREVENTIVE

EKS Standard: Pass endpoint_public_access = false and endpoint_private_access = true at cluster creation. For required external access, scope public_access_cidrs to a CIDR allow-list. Endpoint access settings can be changed later with aws eks update-cluster-config, but the cluster's VPC is fixed at creation. EKS Auto Mode: The same private-endpoint configuration is supported. Node management is automated, but control-plane endpoint configuration is identical to Standard mode.

Enable a private EKS cluster endpoint so the kube-apiserver is unreachable from the public internet. Combine with public_access_cidrs if external access is genuinely required (CI runners, admin VPNs). A publicly reachable kube-apiserver is among the most commonly exploited Kubernetes exposures: any leaked kubeconfig credential is immediately usable from the internet, with no network-level barrier in the way.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_eks_cluster" "hardened" {
  name     = "hardened-cluster"
  role_arn = aws_iam_role.eks_cluster.arn
  version  = "1.30"

  vpc_config {
    subnet_ids              = var.private_subnet_ids
    endpoint_public_access  = false
    endpoint_private_access = true
    # If public access is unavoidable, scope to a narrow allow-list:
    # public_access_cidrs   = [var.management_cidr]
  }

  access_config {
    authentication_mode                         = "API"
    bootstrap_cluster_creator_admin_permissions = false
  }

  enabled_cluster_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]

  encryption_config {
    provider { key_arn = aws_kms_key.eks_secrets.arn }
    resources = ["secrets"]
  }
}

Remediation — aws eks

aws eks create-cluster \
  --name hardened-cluster \
  --role-arn arn:aws:iam::ACCOUNT:role/eks-cluster-role \
  --resources-vpc-config \
      endpointPublicAccess=false,endpointPrivateAccess=true,subnetIds=subnet-aaa,subnet-bbb \
  --access-config authenticationMode=API,bootstrapClusterCreatorAdminPermissions=false \
  --kubernetes-version 1.30

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EKS cluster with private-only API endpoint, KMS envelope encryption, and audit logs enabled.
Parameters:
  ClusterName:
    Type: String
  ClusterRoleArn:
    Type: String
  ClusterKmsKeyArn:
    Type: String
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
  SecurityGroupIds:
    Type: List<AWS::EC2::SecurityGroup::Id>
Resources:
  PrivateEksCluster:
    Type: AWS::EKS::Cluster
    Properties:
      Name: !Ref ClusterName
      Version: '1.31'
      RoleArn: !Ref ClusterRoleArn
      ResourcesVpcConfig:
        SubnetIds: !Ref SubnetIds
        SecurityGroupIds: !Ref SecurityGroupIds
        EndpointPublicAccess: false
        EndpointPrivateAccess: true
      EncryptionConfig:
        - Provider:
            KeyArn: !Ref ClusterKmsKeyArn
          Resources:
            - secrets
      Logging:
        ClusterLogging:
          EnabledTypes:
            - Type: api
            - Type: audit
            - Type: authenticator
            - Type: controllerManager
            - Type: scheduler

Remediation — AWS CDK (TypeScript)

import * as cdk from 'aws-cdk-lib';
import { aws_eks as eks, aws_ec2 as ec2, aws_kms as kms } from 'aws-cdk-lib';
import { Construct } from 'constructs';

export interface PrivateEksProps extends cdk.StackProps {
  clusterName: string;
  vpc: ec2.IVpc;
  clusterKmsKeyArn: string;
}

export class PrivateEksClusterStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: PrivateEksProps) {
    super(scope, id, props);

    new eks.Cluster(this, 'PrivateCluster', {
      clusterName: props.clusterName,
      version: eks.KubernetesVersion.V1_31,
      vpc: props.vpc,
      vpcSubnets: [{ subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS }],
      endpointAccess: eks.EndpointAccess.PRIVATE,
      secretsEncryptionKey: kms.Key.fromKeyArn(this, 'ClusterKmsKey', props.clusterKmsKeyArn),
      clusterLogging: [
        eks.ClusterLoggingTypes.API,
        eks.ClusterLoggingTypes.AUDIT,
        eks.ClusterLoggingTypes.AUTHENTICATOR,
        eks.ClusterLoggingTypes.CONTROLLER_MANAGER,
        eks.ClusterLoggingTypes.SCHEDULER,
      ],
    });
  }
}

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-01 | CRITICAL | PREVENTIVE | AWS EKS | n/a (managed control plane) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | AC-17; SC-7; SC-8 | A.8.20; A.8.22 | CLD.13.1.4 | NIST SP 800-190 §4.4.1 | NSA/CISA Kubernetes Hardening Guide v1.2 §2 (Network separation)

Log signals

  • CloudTrail eks:UpdateClusterConfig events where requestParameters.resourcesVpcConfig.endpointPublicAccess flips from false to true, or where publicAccessCidrs widens beyond the corporate egress allow-list (matching 0.0.0.0/0 is the canonical regression).
  • EKS audit log entries in /aws/eks/{cluster}/cluster where verb=create and requestObject.kind=Endpoints and the sourceIPs values are not in the documented administrator CIDR list; this indicates the control plane is now reachable from an unexpected network.
  • Config rule eks-endpoint-no-public-access evaluating NON_COMPLIANT on any cluster resource — feeds Security Hub for fleet-wide correlation.

Query

fields @timestamp, eventName, requestParameters.name, requestParameters.resourcesVpcConfig.endpointPublicAccess, requestParameters.resourcesVpcConfig.publicAccessCidrs, userIdentity.arn, sourceIPAddress
| filter eventSource = "eks.amazonaws.com" and eventName = "UpdateClusterConfig"
| filter requestParameters.resourcesVpcConfig.endpointPublicAccess = true
| sort @timestamp desc
| limit 100

Run the CloudWatch Logs Insights query over the org-trail log group spanning all member accounts. This control exists solely to keep the control-plane API off the public internet, so a single hit warrants paging the on-call cluster operator.

Alert threshold

  • Any endpointPublicAccess=true flip on a production cluster — page immediately; cluster tag env=prod drives the routing.
  • A publicAccessCidrs value of 0.0.0.0/0 — block via SCP-deny preview before alert fan-out and treat the change attempt itself as the incident.
  • Three or more UpdateClusterConfig calls touching the VPC config block within a rolling 24 hours from the same principal — suggests configuration churn rather than a one-shot regression and merits a change-management review.
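
The thresholds above can be sketched as a small classifier. This is a minimal illustration, not a reference implementation: the ClusterConfigEvent shape and the classifyEndpointChange name are hypothetical, the fields are assumed to have been extracted from CloudTrail by the caller, and routing on the env=prod tag is left out.

```typescript
// Hypothetical shape: only the CloudTrail fields the thresholds rely on.
interface ClusterConfigEvent {
  eventTime: string;              // ISO-8601 timestamp from CloudTrail
  principalArn: string;           // userIdentity.arn
  endpointPublicAccess?: boolean; // requestParameters.resourcesVpcConfig.endpointPublicAccess
  publicAccessCidrs?: string[];   // requestParameters.resourcesVpcConfig.publicAccessCidrs
}

type EndpointSeverity = "page" | "review" | "none";

const DAY_MS = 24 * 60 * 60 * 1000;

// Classify the newest UpdateClusterConfig event, given earlier events that
// also touched the VPC config block.
function classifyEndpointChange(
  current: ClusterConfigEvent,
  priorEvents: ClusterConfigEvent[]
): EndpointSeverity {
  const cidrs = current.publicAccessCidrs || [];
  // A public flip or an open CIDR pages regardless of call volume.
  if (current.endpointPublicAccess === true || cidrs.indexOf("0.0.0.0/0") !== -1) {
    return "page";
  }
  const now = Date.parse(current.eventTime);
  const recentSamePrincipal = priorEvents.filter(function (e) {
    const age = now - Date.parse(e.eventTime);
    return e.principalArn === current.principalArn && age >= 0 && age <= DAY_MS;
  });
  // The current call counts toward the three-call churn threshold.
  return recentSamePrincipal.length + 1 >= 3 ? "review" : "none";
}
```

A caller would feed this from the query output, passing the matching rows for the same cluster as priorEvents.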

Initial response

  1. Revert the cluster config with aws eks update-cluster-config --name {cluster} --resources-vpc-config endpointPublicAccess=false,endpointPrivateAccess=true; capture the CloudTrail eventID and the userIdentity block of the offending call as forensic ledger entries.
  2. Pivot to the EKS audit log for the exposure window and enumerate every request whose sourceIPs fall outside corporate CIDRs; traffic to the public endpoint terminates on AWS-managed infrastructure, so VPC Flow Logs for the cluster's ENI subnets will not show it. Cross-check the source IPs against GuardDuty findings filtered on resource.eksClusterDetails.
  3. Open an incident via general/ir.html if any such request appears, then revoke the IAM sessions and rotate any access keys or ServiceAccount tokens reachable from a kubeconfig captured during the public window. EKS kubeconfigs authenticate through IAM, so a cluster version upgrade does not invalidate them.

References

Equivalent controls in other providers: GKE private cluster + authorized networks, AKS private cluster, OKE private API endpoint.

aws-k8s-02 ! HIGH PREVENTIVE

EKS Standard: EKS Pod Identity (GA Nov 2023) is the recommended pattern for new clusters. Install the eks-pod-identity-agent managed add-on, then create aws_eks_pod_identity_association resources mapping K8s ServiceAccounts to IAM roles. EKS Auto Mode: The Pod Identity agent is pre-installed and managed by EKS Auto. Create associations the same way; no agent-DaemonSet management required.

Bind Kubernetes ServiceAccounts to AWS IAM Roles via EKS Pod Identity associations. Pod Identity eliminates the per-cluster OIDC provider, uses cluster-scoped associations (not ServiceAccount annotations), and provides faster credential rotation than the legacy IRSA pattern. The default Node IAM role should be least-privilege and decoupled from per-workload AWS permissions; pod-scoped IAM is granted via Pod Identity associations.

Migration from IRSA: Existing IRSA workloads (which use the per-cluster OIDC provider plus the eks.amazonaws.com/role-arn ServiceAccount annotation) remain supported. IRSA is the legacy pattern; new clusters should adopt Pod Identity, and existing clusters can migrate workload-by-workload by creating a Pod Identity association and removing the IRSA annotation.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_iam_role" "app" {
  name = "eks-app-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = { Service = "pods.eks.amazonaws.com" }
      Action = ["sts:AssumeRole", "sts:TagSession"]
    }]
  })
}

resource "aws_eks_pod_identity_association" "app" {
  cluster_name    = aws_eks_cluster.hardened.name
  namespace       = "production"
  service_account = "app-sa"
  role_arn        = aws_iam_role.app.arn
}

# Install the Pod Identity agent add-on (EKS Standard only — EKS Auto manages this)
resource "aws_eks_addon" "pod_identity_agent" {
  cluster_name = aws_eks_cluster.hardened.name
  addon_name   = "eks-pod-identity-agent"
}

Remediation — aws eks

aws eks create-addon \
  --cluster-name hardened-cluster \
  --addon-name eks-pod-identity-agent

aws eks create-pod-identity-association \
  --cluster-name hardened-cluster \
  --namespace production \
  --service-account app-sa \
  --role-arn arn:aws:iam::ACCOUNT:role/eks-app-role

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EKS Pod Identity association binding a Kubernetes service account to a least-privilege IAM role.
Parameters:
  ClusterName:
    Type: String
  Namespace:
    Type: String
  ServiceAccount:
    Type: String
  PodRoleArn:
    Type: String
Resources:
  PodIdentityAssociation:
    Type: AWS::EKS::PodIdentityAssociation
    Properties:
      ClusterName: !Ref ClusterName
      Namespace: !Ref Namespace
      ServiceAccount: !Ref ServiceAccount
      RoleArn: !Ref PodRoleArn

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-02 | HIGH | PREVENTIVE | AWS EKS | n/a (managed control plane) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | IA-2; AC-6; IA-5 | A.5.15; A.5.18 | n/a | NIST SP 800-190 §4.4.2 | NSA/CISA Kubernetes Hardening Guide v1.2 §4 (IAM/RBAC)

Log signals

  • CloudTrail eks:CreatePodIdentityAssociation attaching a role whose attached policies include AdministratorAccess, iam:*, or any AWS-managed *FullAccess policy — the association binds that role to every pod matching the service-account selector, so over-privileged associations widen blast radius for any container compromise.
  • EKS audit events where a service-account token request resolves a pod-identity role-arn that does not appear in the canonical pod-identity-association-allowlist.tsv maintained by the platform team.
  • CloudTrail sts:AssumeRole for a pod-identity-bound role originating from a node-group ENI whose pod CIDR has not been registered for that role — points to either a mislabelled service account or token replay.
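
The first signal above depends on deciding whether a role counts as over-privileged. A minimal sketch of that check follows, assuming the caller has already resolved the role's managed policy names and allowed actions (for example from IAM list and get calls); RolePolicySummary and overPrivilegedReasons are hypothetical names introduced for illustration.

```typescript
// Hypothetical input: policy data already resolved for the role by the caller.
interface RolePolicySummary {
  managedPolicyNames: string[]; // e.g. "AmazonS3ReadOnlyAccess"
  allowedActions: string[];     // Allow-statement actions, e.g. "s3:GetObject"
}

// Return one human-readable reason per over-privileged grant; empty means clean.
function overPrivilegedReasons(role: RolePolicySummary): string[] {
  const reasons: string[] = [];
  role.managedPolicyNames.forEach(function (policyName) {
    // AdministratorAccess and any *FullAccess managed policy are flagged.
    if (policyName === "AdministratorAccess" || /FullAccess$/.test(policyName)) {
      reasons.push("managed policy " + policyName);
    }
  });
  role.allowedActions.forEach(function (action) {
    if (action.toLowerCase() === "iam:*") {
      reasons.push("action " + action);
    }
  });
  return reasons;
}
```

Matching on policy contents rather than role names is a deliberate choice in this sketch: a role named innocuously can still carry iam:*.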

Query

fields @timestamp, eventName, requestParameters.clusterName, requestParameters.namespace, requestParameters.serviceAccount, requestParameters.roleArn, userIdentity.arn
| filter eventSource = "eks.amazonaws.com" and eventName = "CreatePodIdentityAssociation"
| parse requestParameters.roleArn /arn:aws:iam::(?<acct>\d+):role\/(?<role>.+)/
| filter role like /Admin/ or role like /FullAccess/ or role = "OrganizationAccountAccessRole"
| sort @timestamp desc
| limit 50

The CloudWatch Logs Insights regex extracts the role short-name from the ARN so the filter can pattern-match against an organisation-wide naming blocklist; couple this query with a daily diff of pod-identity-associations against the allow-list TSV.
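
The daily diff against the allow-list could look roughly like the sketch below. The TSV column layout (namespace, serviceAccount, roleArn, tab-separated, with # comment lines) is an assumption, since the source does not define the file format, and parseAllowList and unapprovedAssociations are hypothetical helper names.

```typescript
interface PodIdentityBinding {
  namespace: string;
  serviceAccount: string;
  roleArn: string;
}

// Parse allow-list rows. Assumed layout: namespace<TAB>serviceAccount<TAB>roleArn.
function parseAllowList(tsv: string): PodIdentityBinding[] {
  return tsv
    .split("\n")
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0 && line.charAt(0) !== "#"; })
    .map(function (line) {
      const cols = line.split("\t");
      return {
        namespace: cols[0] || "",
        serviceAccount: cols[1] || "",
        roleArn: cols[2] || "",
      };
    });
}

function bindingKey(b: PodIdentityBinding): string {
  return [b.namespace, b.serviceAccount, b.roleArn].join("\t");
}

// Return live associations that have no exact match in the allow-list.
function unapprovedAssociations(
  live: PodIdentityBinding[],
  allowed: PodIdentityBinding[]
): PodIdentityBinding[] {
  const allowedKeys = allowed.map(bindingKey);
  return live.filter(function (b) {
    return allowedKeys.indexOf(bindingKey(b)) === -1;
  });
}
```

The live list would come from listing the cluster's associations; any entry returned by unapprovedAssociations is a candidate for the allow-list ticket described in the alert thresholds.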

Alert threshold

  • Any association attaching a role with iam:* in its policy graph — page immediately, even on lower environments, because the role's pod-execution surface is identical to a CI-runner with break-glass keys.
  • An association whose namespace + serviceAccount tuple is not in the platform allow-list — high-priority ticket within one business hour for the cluster owner.
  • Sustained association-create rate above the 30-day p99 baseline — informational; correlate with platform-team change tickets before escalating.

Initial response

  1. Delete the offending association with aws eks delete-pod-identity-association --cluster-name {cluster} --association-id {id} and force pod restarts in the affected namespace so the token cache invalidates.
  2. Capture the IAM role's effective policy with aws iam simulate-principal-policy against a representative set of read/write actions and attach the simulator output to the incident record.
  3. If the role grants iam:* or sts:AssumeRole on broader principals, escalate per general/ir.html and audit CloudTrail for any sts:AssumeRole traffic from the association during the exposure window — those calls were authenticated against the over-privileged role.

References

Equivalent controls in other providers: GKE Workload Identity Federation, AKS Workload Identity, OKE Workload Identity.

aws-k8s-03 ! HIGH PREVENTIVE

EKS Standard: Envelope encryption can be enabled at cluster creation or added to an existing cluster with aws eks associate-encryption-config, but once a key is associated it cannot be removed or swapped for the lifetime of the cluster, so choose the customer-managed KMS key (CMK) carefully. Enable automatic rotation on the key. EKS Auto Mode: Same configuration model. The customer manages CMK lifecycle (rotation, revocation, grants); EKS uses the CMK to encrypt the Data Encryption Key that wraps Kubernetes Secrets in etcd.

Enable envelope encryption for Kubernetes Secrets using a customer-managed KMS key (CMK). This adds a layer on top of AWS-managed at-rest encryption and gives the customer control over the key lifecycle. Without envelope encryption, AWS holds the encryption key for etcd; with envelope encryption, revoking the CMK (via KMS policy or key disable) makes Secrets unreadable cluster-wide.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_kms_key" "eks_secrets" {
  description             = "EKS envelope encryption CMK"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

resource "aws_kms_alias" "eks_secrets" {
  name          = "alias/eks-secrets"
  target_key_id = aws_kms_key.eks_secrets.key_id
}

resource "aws_eks_cluster" "hardened" {
  name = "hardened-cluster"
  # ... vpc_config, role_arn ...

  encryption_config {
    provider { key_arn = aws_kms_key.eks_secrets.arn }
    resources = ["secrets"]
  }
}

Remediation — aws eks

aws eks create-cluster \
  --name hardened-cluster \
  --role-arn arn:aws:iam::ACCOUNT:role/eks-cluster-role \
  --resources-vpc-config subnetIds=subnet-aaa,subnet-bbb,endpointPublicAccess=false,endpointPrivateAccess=true \
  --encryption-config '[{"provider":{"keyArn":"arn:aws:kms:REGION:ACCOUNT:key/KEY-UUID"},"resources":["secrets"]}]'

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: AWS Config managed rule asserting every EKS cluster uses KMS envelope encryption for secrets.
Resources:
  EksSecretsEncryptedRule:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: eks-secrets-encrypted
      Source:
        Owner: AWS
        SourceIdentifier: EKS_SECRETS_ENCRYPTED
      Scope:
        ComplianceResourceTypes:
          - AWS::EKS::Cluster

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-03 | HIGH | PREVENTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §1.2 (etcd encryption) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | SC-28; IA-5 | A.8.24; A.8.10 | n/a | NIST SP 800-190 §4.3.2 | NSA/CISA Kubernetes Hardening Guide v1.2 §5 (Secrets)

Log signals

  • CloudTrail kms:DisableKey or kms:ScheduleKeyDeletion on the CMK referenced by the cluster's encryptionConfig.provider.keyArn — secret material in etcd becomes unreadable as soon as the key is disabled, with cluster-wide blast radius.
  • CloudTrail kms:PutKeyPolicy on the envelope CMK where the new policy removes the kms:Decrypt permission for the EKS service principal eks.amazonaws.com or the cluster's IAM role.
  • EKS audit log secrets.create or secrets.update requests that return a 500 status with body containing "Internal error occurred: rpc error: code = Internal desc = failed to encrypt" — the kube-apiserver KMS plugin is failing to reach the CMK.

Query

fields @timestamp, eventName, requestParameters.keyId, requestParameters.pendingWindowInDays, userIdentity.arn, sourceIPAddress, errorCode
| filter eventSource = "kms.amazonaws.com" and eventName in ["DisableKey","ScheduleKeyDeletion","PutKeyPolicy"]
| filter requestParameters.keyId in ["KEY-UUID","arn:aws:kms:REGION:ACCOUNT:key/KEY-UUID"]
| sort @timestamp desc
| limit 50

Maintain the list of envelope CMK key IDs and key ARNs as a managed lookup so the CloudWatch Logs Insights filter does not drift. DisableKey, ScheduleKeyDeletion, and PutKeyPolicy accept only a key ID or key ARN, not an alias, and CloudTrail records keyId as the caller supplied it, so the filter must list both forms for each key.

Alert threshold

  • Any DisableKey on a production EKS envelope CMK: page immediately and freeze any concurrent change-management work touching the cluster.
  • ScheduleKeyDeletion with pendingWindowInDays < 30 on an envelope CMK: treat as a likely sabotage attempt; 30 days is the KMS default and maximum waiting period (the minimum is 7), so any shorter value deliberately shrinks the window in which the deletion can be cancelled.
  • PutKeyPolicy events on envelope CMKs — informational at create time but correlate against the prior policy hash; deviations that remove kms:Decrypt for the EKS principal are immediate-page.
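
A compact way to express these thresholds is a single mapping from event to severity. The sketch below is illustrative only: KmsKeyEvent and kmsEventSeverity are hypothetical names, removesEksDecrypt is assumed to be computed elsewhere by diffing the new policy against the prior one, and the "info" level for a full 30-day deletion window is a placeholder because the thresholds above do not assign that case a level.

```typescript
// Hypothetical shape: fields extracted from CloudTrail plus one derived flag.
interface KmsKeyEvent {
  eventName: string;            // CloudTrail eventName
  pendingWindowInDays?: number; // present on ScheduleKeyDeletion when the caller set it
  removesEksDecrypt?: boolean;  // assumed output of a separate policy diff
}

function kmsEventSeverity(e: KmsKeyEvent): "page" | "info" | "none" {
  if (e.eventName === "DisableKey") {
    return "page";
  }
  if (e.eventName === "ScheduleKeyDeletion") {
    // KMS applies a 30-day waiting period when the parameter is omitted.
    const windowDays = e.pendingWindowInDays === undefined ? 30 : e.pendingWindowInDays;
    // Assumption: a full 30-day window is reported as "info" in this sketch.
    return windowDays < 30 ? "page" : "info";
  }
  if (e.eventName === "PutKeyPolicy") {
    return e.removesEksDecrypt === true ? "page" : "info";
  }
  return "none";
}
```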

Initial response

  1. Re-enable the CMK with aws kms enable-key --key-id {key-id} or cancel the pending deletion via aws kms cancel-key-deletion --key-id {key-id}. Both commands take a key ID or key ARN rather than an alias, and a key recovered from pending deletion stays disabled until enable-key is run. If the policy was modified, restore it from the IaC repository (the Terraform configuration in version control, not the live console) so policy drift is closed at the source of truth.
  2. Validate cluster-secret read-back with kubectl get secret -n {namespace} {secret-name} -o yaml | head against a Secret known to exist. Do not use aws-auth as the probe: it is a ConfigMap, which envelope encryption does not cover, and it may be absent on clusters that use only access entries. A working decrypt confirms the encryption path recovered; an EOF or 500 means etcd entries remain unreadable and a restore-from-backup may be required.
  3. Open an incident via general/ir.html and pivot CloudTrail to enumerate every principal that called kms:Decrypt on the envelope CMK over the prior 24 hours to bound the read-side exposure.

References

Equivalent controls in other providers: GKE Cloud KMS secrets encryption, AKS KMS etcd encryption, OKE Vault CMK secrets encryption.

aws-k8s-04 ! HIGH DETECTIVE

EKS Standard: All five control-plane log types are off by default. Enable all five explicitly — api, audit, authenticator, controllerManager, scheduler — to CloudWatch Logs. Without the audit log, lateral movement via kubectl exec is invisible. EKS Auto Mode: Same five log types apply. Enable them in the cluster spec; EKS Auto delivers them to CloudWatch with the same retention and IAM controls.

Enable EKS control-plane logging (five log types) to CloudWatch Logs. The audit log is the most critical: it records every kube-apiserver request, including the calling identity, verb, resource, and response code. CloudWatch Container Insights complements this with worker-node and pod-level telemetry and can be enabled through the Amazon CloudWatch Observability EKS add-on.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_eks_cluster" "hardened" {
  name = "hardened-cluster"
  # ... role_arn, vpc_config ...

  enabled_cluster_log_types = [
    "api",
    "audit",
    "authenticator",
    "controllerManager",
    "scheduler"
  ]
}

resource "aws_cloudwatch_log_group" "eks" {
  name              = "/aws/eks/hardened-cluster/cluster"
  retention_in_days = 365
  kms_key_id        = aws_kms_key.logs.arn
}

Remediation — aws eks

aws eks update-cluster-config \
  --name hardened-cluster \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: CloudWatch Logs metric filter alarming on EKS audit log access-denied spikes.
Parameters:
  AuditLogGroupName:
    Type: String
  AlarmTopicArn:
    Type: String
Resources:
  AccessDeniedMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      LogGroupName: !Ref AuditLogGroupName
      FilterPattern: '{ $.responseStatus.code = 403 }'
      MetricTransformations:
        - MetricName: EksAuditAccessDenied
          MetricNamespace: Security/EKS
          MetricValue: '1'
  AccessDeniedAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: eks-audit-access-denied-spike
      MetricName: EksAuditAccessDenied
      Namespace: Security/EKS
      Statistic: Sum
      Period: 300
      EvaluationPeriods: 1
      Threshold: 20
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - !Ref AlarmTopicArn

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-04 | HIGH | DETECTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §1.2.22 (audit policy) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | AU-2; AU-12; SI-4 | A.8.15; A.8.16 | CLD.12.4.5 | NIST SP 800-190 §4.4.3 | NSA/CISA Kubernetes Hardening Guide v1.2 §6 (Audit logging)

Log signals

  • CloudTrail eks:UpdateClusterConfig where requestParameters.logging.clusterLogging contains any entry with enabled=false for api, audit, authenticator, controllerManager, or scheduler — turning off any of the five log streams is the most common precursor to evading detection in a subsequent breach.
  • Absence-of-signal: the /aws/eks/{cluster}/cluster log group ingest rate (per-minute bytes) drops to zero or below 10% of the trailing 7-day baseline while the cluster's CloudWatch ContainerInsights metrics still report active workloads; this indicates the audit pipeline is silently broken even if the configuration claims it is enabled.
  • CloudTrail logs:DeleteLogGroup targeting /aws/eks/{cluster}/cluster — destroys the audit trail itself rather than disabling the source.

Query

fields @timestamp, eventName, requestParameters.name, requestParameters.logging.clusterLogging, userIdentity.arn, sourceIPAddress
| filter eventSource = "eks.amazonaws.com" and eventName = "UpdateClusterConfig"
| filter requestParameters.logging.clusterLogging.0.enabled = false or requestParameters.logging.clusterLogging.1.enabled = false
| sort @timestamp desc
| limit 100

Pair the CloudWatch Logs Insights query with a CloudWatch metric alarm on IncomingBytes for the cluster log group, with the threshold set at 10% of the 7-day rolling mean. This absence-of-signal alarm covers the case the query cannot see: log delivery stops while the cluster configuration still reports logging as enabled.

Alert threshold

  • Any disable of the audit stream — page immediately; the audit log is the primary forensic source and disabling it is treated identically to disabling CloudTrail at the org level.
  • Disable of api, authenticator, controllerManager, or scheduler — high-priority ticket per stream and per cluster within 30 minutes.
  • Log-group ingest below 10% of 7-day baseline for two consecutive 5-minute windows — informational at first instance, escalate to page on the third consecutive trip.
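
The ingest threshold can be sketched as a streak counter over 5-minute windows. This is one reading of the rule, stated as an assumption: a "trip" is any evaluation where the two most recent windows are both below 10% of baseline, evaluations run once per window, and the third consecutive trip therefore corresponds to four consecutive low windows. The ingestAlertLevel name and its inputs are hypothetical.

```typescript
// windowBytes: IncomingBytes per 5-minute window, oldest first.
// baselineBytes: 7-day mean bytes per 5-minute window.
function ingestAlertLevel(
  windowBytes: number[],
  baselineBytes: number
): "none" | "informational" | "page" {
  const lowWaterMark = baselineBytes * 0.1;
  // Count consecutive low windows ending at the newest sample.
  let lowStreak = 0;
  for (let i = windowBytes.length - 1; i >= 0 && windowBytes[i] < lowWaterMark; i--) {
    lowStreak++;
  }
  // Each extra low window after the first yields one more consecutive trip.
  const consecutiveTrips = Math.max(0, lowStreak - 1);
  if (consecutiveTrips >= 3) {
    return "page";
  }
  if (consecutiveTrips >= 1) {
    return "informational";
  }
  return "none";
}
```

If the platform team counts trips differently (for example as non-overlapping pairs of windows), only the consecutiveTrips line needs to change.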

Initial response

  1. Re-enable every log stream with aws eks update-cluster-config --name {cluster} --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}' and confirm the change committed by reading back describe-cluster.
  2. Cross-reference the disable timestamp against EKS audit-log entries surviving in the pre-disable window for any verb-create activity on RBAC, secrets, or workload resources — anything between the disable and re-enable is a forensic gap that must be reconstructed from node-level logs.
  3. Open an incident per general/ir.html; preserve the prior log-group's retention setting, and if DeleteLogGroup was the disable vector, restore the deleted group from any cross-account log-archive sink before re-enabling.

References

Equivalent controls in other providers: GKE Cloud Audit Logs, AKS control-plane audit logs (diagnostic settings), OKE OCI Audit Logging.

aws-k8s-05 ! HIGH PREVENTIVE

EKS Standard: Managed node groups inherit IMDS settings from the launch template; set http_tokens = "required" and http_put_response_hop_limit = 1. Pods using EKS Pod Identity do not need IMDS access at all; the Pod Identity agent serves credentials on its own link-local address (169.254.170.23) rather than the IMDS endpoint. EKS Auto Mode: IMDSv2 enforcement and hop-limit 1 are the default; verify with aws ec2 describe-instances --instance-ids i-... --query 'Reservations[].Instances[].MetadataOptions'.

Enforce IMDSv2 (http_tokens=required) and a hop-limit of 1 on all worker nodes. Hop-limit 1 prevents containerized workloads from reaching the IMDS endpoint at 169.254.169.254, which would otherwise grant the container the same node IAM role permissions as the host. Pods that run with hostNetwork: true share the node's network namespace and are not stopped by the hop limit, so treat them as having node-level access. The worker-node IAM role itself must be least-privilege (the AWS-managed AmazonEKSWorkerNodePolicy + AmazonEC2ContainerRegistryReadOnly + AmazonEKS_CNI_Policy attachments only, with no application-level IAM grants).
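
As a rough illustration of the check this control implies, the sketch below evaluates one instance's metadata options. The field names follow the MetadataOptions object returned by aws ec2 describe-instances; the imdsFindings function name and the wording of the findings are hypothetical.

```typescript
// Field names as returned under MetadataOptions by `aws ec2 describe-instances`.
interface InstanceMetadataOptions {
  HttpTokens: string;              // "required" (IMDSv2 only) or "optional"
  HttpPutResponseHopLimit: number; // 1 keeps token responses on the node itself
  HttpEndpoint: string;            // "enabled" or "disabled"
}

// Return one finding per deviation from the hardened settings; empty means compliant.
function imdsFindings(opts: InstanceMetadataOptions): string[] {
  const findings: string[] = [];
  if (opts.HttpEndpoint === "disabled") {
    return findings; // IMDS is unreachable, so there is nothing further to check
  }
  if (opts.HttpTokens !== "required") {
    findings.push("IMDSv1 is allowed (HttpTokens is not 'required')");
  }
  if (opts.HttpPutResponseHopLimit > 1) {
    findings.push("hop limit " + opts.HttpPutResponseHopLimit + " lets pods reach IMDS");
  }
  return findings;
}
```

Running this over every instance in a node group gives a quick compliance sweep that can be compared with the launch-template settings.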

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_launch_template" "nodes" {
  name_prefix   = "eks-nodes-"
  image_id      = data.aws_ssm_parameter.bottlerocket_ami.value
  instance_type = "m6i.large"

  metadata_options {
    http_endpoint               = "enabled"
    http_tokens                 = "required"   # IMDSv2 only
    http_put_response_hop_limit = 1            # block from inside containers
    instance_metadata_tags      = "disabled"
  }
}

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.hardened.name
  node_group_name = "bottlerocket-ng"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.private_subnet_ids

  launch_template {
    id      = aws_launch_template.nodes.id
    version = aws_launch_template.nodes.latest_version
  }

  # HCL does not allow semicolon-separated arguments, so use one per line.
  scaling_config {
    desired_size = 3
    min_size     = 3
    max_size     = 6
  }
}

Remediation — aws ec2

# Enforce on a running instance (CI/diagnostic use; prefer launch-template config)
aws ec2 modify-instance-metadata-options \
  --instance-id i-0abc123 \
  --http-tokens required \
  --http-put-response-hop-limit 1 \
  --http-endpoint enabled

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EC2 launch template forcing IMDSv2 + hop-limit 1 for EKS managed node group instances.
Resources:
  NodeLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateName: eks-node-imdsv2
      LaunchTemplateData:
        MetadataOptions:
          HttpEndpoint: enabled
          HttpTokens: required
          HttpPutResponseHopLimit: 1
          InstanceMetadataTags: disabled

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-05 | HIGH | PREVENTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §4.2 (kubelet/node config) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | AC-3; AC-6; SC-7 | A.8.20; A.5.15 | CLD.9.5.2 | NIST SP 800-190 §4.4.4 | NSA/CISA Kubernetes Hardening Guide v1.2 §4 (Worker node hardening)

Log signals

  • CloudTrail ec2:ModifyInstanceMetadataOptions on instances whose resourceArn matches the cluster's node-group launch-template, with requestParameters.httpTokens set to optional or httpEndpoint set to enabled alongside httpPutResponseHopLimit > 1 — the IMDSv1 fallback opens the SSRF pivot from compromised pods to node IAM credentials.
  • EKS audit-log requests where a pod accesses metadata addresses indirectly via a side-car proxy that returns IMDSv1 responses — usually paired with a service-account that lacks pod-identity, indicating workloads still rely on node-role credentials.
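  • Node-level connection telemetry (host firewall or eBPF logs) showing connections to 169.254.169.254 from pod network namespaces, or a non-zero EC2 MetadataNoToken CloudWatch metric on node instances; either confirms the metadata path is reachable from pods or still used over IMDSv1. VPC Flow Logs do not record traffic to the instance metadata service, so they cannot supply this signal.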

Query

fields @timestamp, eventName, requestParameters.instanceId, requestParameters.httpTokens, requestParameters.httpPutResponseHopLimit, userIdentity.arn
| filter eventSource = "ec2.amazonaws.com" and eventName = "ModifyInstanceMetadataOptions"
| filter requestParameters.httpTokens = "optional" or requestParameters.httpPutResponseHopLimit > 1
| sort @timestamp desc
| limit 100

Join the CloudWatch Logs Insights output against the node-group's instance-id set by tag eks:nodegroup-name; instances outside the node-group should be filtered out so the alert focuses on EKS-managed compute exclusively.

Alert threshold

  • Any httpTokens=optional on a production node-group instance — page immediately; the IMDSv1 fallback re-opens the pod-to-node credential bridge that this control specifically closes.
  • httpPutResponseHopLimit set above 1: high-priority ticket; a hop limit of 2 is the setting that lets pods (which sit one network hop beyond the node) reach the metadata service.
  • More than 10 pod-sourced connections to 169.254.169.254 within an hour in node-level telemetry while IMDSv2 is enforced: informational; usually indicates a workload still calling IMDS via a hard-coded SDK path that should be migrated to pod-identity. VPC Flow Logs do not record instance-metadata traffic, so this count cannot come from flow logs.

Initial response

  1. Re-enforce IMDSv2 on the offending node with aws ec2 modify-instance-metadata-options --instance-id {id} --http-tokens required --http-put-response-hop-limit 1 and re-bake the node-group launch-template so the next scale-up inherits the hardened defaults.
  2. Use node-level connection logs to enumerate every pod connection that hit the metadata IP during the relaxation window (VPC Flow Logs do not record instance-metadata traffic); cross-reference the source pod identities against pod-identity association coverage to identify workloads still on the legacy path.
  3. Open an incident per general/ir.html if any unattributed traffic appears, rotate the node IAM role's session credentials, and review sts:AssumeRole CloudTrail for the node role during the exposure window to find any pod-originated token use.

References

Equivalent controls in other providers: GKE legacy metadata + ABAC disable, AKS IMDS NetworkPolicy block, OCI IAM least-privilege cluster access.

aws-k8s-06 ! HIGH PREVENTIVE

EKS Standard: Configure authentication_mode = "API" (Cluster Access Management API only) at cluster creation; use aws_eks_access_entry to map IAM principals to Kubernetes groups. A dedicated security group on worker nodes restricts ingress to the cluster security group only. EKS Auto Mode: The same access model applies; node security groups are managed by EKS Auto with a least-privilege baseline.

This control bundles two access-control concerns: (a) Cluster Access Management API access entries — the modern primary mechanism for mapping AWS IAM identities to Kubernetes RBAC — and (b) security-group segmentation between the EKS-managed control plane and worker nodes. The aws-auth ConfigMap is deprecated as the primary EKS access mechanism — new clusters MUST use Cluster Access Management API access entries via aws eks create-access-entry. The legacy aws-auth path remains supported for backward compatibility only, and the aws-auth ConfigMap is invisible to access-entry tooling.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_eks_cluster" "hardened" {
  name = "hardened-cluster"
  # ... role_arn, vpc_config ...

  access_config {
    authentication_mode                         = "API"   # NOT "API_AND_CONFIG_MAP"
    bootstrap_cluster_creator_admin_permissions = false
  }
}

resource "aws_eks_access_entry" "admin" {
  cluster_name  = aws_eks_cluster.hardened.name
  principal_arn = aws_iam_role.cluster_admin.arn
  type          = "STANDARD"
}

resource "aws_eks_access_policy_association" "admin" {
  cluster_name  = aws_eks_cluster.hardened.name
  principal_arn = aws_iam_role.cluster_admin.arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
  access_scope { type = "cluster" }
}

resource "aws_security_group" "nodes" {
  name        = "eks-nodes-sg"
  description = "EKS worker node SG"
  vpc_id      = var.vpc_id
}

resource "aws_security_group_rule" "nodes_from_cluster" {
  type                     = "ingress"
  from_port                = 0
  to_port                  = 65535
  protocol                 = "tcp"
  security_group_id        = aws_security_group.nodes.id
  source_security_group_id = aws_eks_cluster.hardened.vpc_config[0].cluster_security_group_id
}

Remediation — aws eks

aws eks create-access-entry \
  --cluster-name hardened-cluster \
  --principal-arn arn:aws:iam::ACCOUNT:role/ClusterAdmin \
  --type STANDARD

aws eks associate-access-policy \
  --cluster-name hardened-cluster \
  --principal-arn arn:aws:iam::ACCOUNT:role/ClusterAdmin \
  --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy \
  --access-scope type=cluster

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EKS node security group permitting only cluster-SG ingress on the kubelet port range.
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
  ClusterSecurityGroupId:
    Type: AWS::EC2::SecurityGroup::Id
Resources:
  NodeSg:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: eks-node-sg
      GroupDescription: Node SG — cluster control-plane SG only.
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 10250
          ToPort: 10250
          SourceSecurityGroupId: !Ref ClusterSecurityGroupId
        - IpProtocol: tcp
          FromPort: 1025
          ToPort: 65535
          SourceSecurityGroupId: !Ref ClusterSecurityGroupId

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-06 | HIGH | PREVENTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §5.1 (RBAC) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | AC-2; AC-3; AC-6 | A.5.15; A.5.16; A.5.18 | n/a | NIST SP 800-190 §4.4.2 | NSA/CISA Kubernetes Hardening Guide v1.2 §4 (IAM/RBAC)

Log signals

  • CloudTrail eks:AssociateAccessPolicy (typically following eks:CreateAccessEntry or eks:UpdateAccessEntry) binding a principal to the AmazonEKSClusterAdminPolicy access-policy ARN, especially when the principal is an IAM user (rather than an SSO role) or a role outside the platform-team OU.
  • CloudTrail ec2:AuthorizeSecurityGroupIngress on the security-group attached to the cluster's node ENIs introducing rules with cidrIp=0.0.0.0/0 or targeting kubelet port 10250 from outside the VPC CIDR — exposes the kubelet API to lateral attackers.
  • VPC Flow Logs ACCEPT traffic into node ENIs on port 10250 from any source outside the cluster's pod-CIDR and the control-plane managed ENIs.

Query

fields @timestamp, eventName, requestParameters.principalArn, requestParameters.accessPolicyArn, requestParameters.clusterName, userIdentity.arn
          | filter eventSource = "eks.amazonaws.com" and eventName in ["CreateAccessEntry","AssociateAccessPolicy","UpdateAccessEntry"]
          | filter requestParameters.accessPolicyArn like /AmazonEKSClusterAdminPolicy/
          | sort @timestamp desc
          | limit 50

The CloudWatch Logs Insights filter prioritises cluster-admin policy bindings; pair it with a second query on ec2:AuthorizeSecurityGroupIngress restricted to the node security-group IDs maintained in a managed lookup table.

Alert threshold

  • Any cluster-admin binding via Access Entries to a non-SSO IAM user — page immediately; long-lived IAM users with cluster-admin are an audit-finding red flag and historically the first lateral foothold post-credential-leak.
  • Any SG-ingress rule introducing 0.0.0.0/0 on node ENIs — high-priority ticket; coordinate with the SCP-deny gate in aws-iam-08-scp-deny-list so the rule is rolled back automatically where the SCP is attached.
  • Inbound 10250 traffic from outside cluster CIDRs — informational on first occurrence, escalate to incident on second within 15 minutes (kubelet probes have characteristic burst patterns).
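
The first threshold depends on telling IAM users apart from SSO-provisioned roles. A minimal sketch follows, assuming the record exposes requestParameters.principalArn and requestParameters.accessPolicyArn as in the query above, and relying on the AWSReservedSSO_ name prefix that IAM Identity Center gives its provisioned roles. The function names and the page / review / ignore labels are hypothetical.

```python
ADMIN_POLICY_NAME = "AmazonEKSClusterAdminPolicy"


def principal_kind(arn: str) -> str:
    """Classify an IAM principal ARN as iam-user, sso-role, role, or other."""
    parts = arn.split(":", 5)  # arn:partition:service:region:account:resource
    if len(parts) != 6 or parts[2] != "iam":
        return "other"
    resource = parts[5]
    if resource.startswith("user/"):
        return "iam-user"
    if resource.startswith("role/"):
        # IAM Identity Center (SSO) roles are named AWSReservedSSO_<permission-set>_<suffix>.
        name = resource.rsplit("/", 1)[-1]
        return "sso-role" if name.startswith("AWSReservedSSO_") else "role"
    return "other"


def classify_admin_binding(event: dict) -> str:
    """Page on cluster-admin bound to an IAM user; flag other admin bindings for review."""
    params = event.get("requestParameters") or {}
    if ADMIN_POLICY_NAME not in (params.get("accessPolicyArn") or ""):
        return "ignore"
    kind = principal_kind(params.get("principalArn") or "")
    return "page" if kind == "iam-user" else "review"
```

The review tier for non-user principals is an assumption added for illustration; the thresholds above define only the paging case.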

Initial response

  1. Revoke the offending access-entry with aws eks delete-access-entry --cluster-name {cluster} --principal-arn {arn} and remove the security-group rule via aws ec2 revoke-security-group-ingress referencing the rule-id from the CloudTrail event.
  2. Pull EKS audit logs for the principal's user field over the exposure window and enumerate every verb they issued; secret reads, exec into pods, and RBAC mutations are the priority forensic markers.
  3. Open an incident via general/ir.html; if the principal was bound to cluster-admin, treat every secret read during the window as a confirmed exfiltration and rotate the affected credentials per the playbook in aws-ir-06-credential-rotation-playbook.

References

Equivalent controls in other providers: GKE control-plane access (private cluster + authorized networks), AKS Entra ID + Azure RBAC, OCI IAM least-privilege cluster access.

aws-k8s-07 ! HIGH PREVENTIVE

EKS Standard: Select BOTTLEROCKET_x86_64 (or BOTTLEROCKET_ARM_64) or AL2023_x86_64_STANDARD as the managed node group ami_type. The EKS-optimized Amazon Linux 2 (AL2) AMIs reached end of support in November 2025; migrate to Bottlerocket or AL2023. EKS Auto Mode: Bottlerocket is the only node OS in EKS Auto, so there is no AL2 or AL2023 choice to make; the OS is managed and continually patched by EKS Auto.

Use Bottlerocket (immutable container-optimized OS, read-only root filesystem, atomic image-based updates, SELinux-enforced) or AL2023 (Amazon Linux 2023) as the worker-node operating system. Amazon Linux 2 EKS-optimized AMIs reached end-of-life in November 2025; clusters still running AL2 nodes do not receive kernel security patches. When migrating from Amazon Linux 2, the choice is between AL2023 (incremental upgrade path, same package ecosystem) and Bottlerocket (security-focused redesign with no SSH by default and no general-purpose package manager) — Bottlerocket is the recommended choice for new clusters.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
# No image_id in this launch template: for a managed node group, EKS rejects
# ami_type combined with a launch template that carries its own AMI. The
# ami_type on the node group selects the EKS-optimized Bottlerocket image.
resource "aws_launch_template" "bottlerocket" {
  name_prefix   = "eks-bottlerocket-"
  instance_type = "m6i.large"

  metadata_options {
    http_tokens                 = "required"
    http_put_response_hop_limit = 1
    http_endpoint               = "enabled"
  }
}

resource "aws_eks_node_group" "bottlerocket" {
  cluster_name    = aws_eks_cluster.hardened.name
  node_group_name = "bottlerocket-ng"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.private_subnet_ids
  ami_type        = "BOTTLEROCKET_x86_64"

  launch_template {
    id      = aws_launch_template.bottlerocket.id
    version = aws_launch_template.bottlerocket.latest_version
  }

  # HCL does not accept semicolon-separated arguments in a single-line block.
  scaling_config {
    desired_size = 3
    min_size     = 3
    max_size     = 6
  }
}

Remediation — aws eks

aws eks create-nodegroup \
  --cluster-name hardened-cluster \
  --nodegroup-name bottlerocket-ng \
  --ami-type BOTTLEROCKET_x86_64 \
  --node-role arn:aws:iam::ACCOUNT:role/eks-node-role \
  --subnets subnet-aaa subnet-bbb \
  --scaling-config minSize=3,maxSize=6,desiredSize=3 \
  --launch-template id=lt-0abc,version='$Latest'

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EKS managed node group on Bottlerocket AMI with IMDSv2 launch template.
Parameters:
  ClusterName:
    Type: String
  NodeRoleArn:
    Type: String
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
  LaunchTemplateId:
    Type: String
Resources:
  BottlerocketNodeGroup:
    Type: AWS::EKS::Nodegroup
    Properties:
      ClusterName: !Ref ClusterName
      NodegroupName: bottlerocket-ng
      NodeRole: !Ref NodeRoleArn
      Subnets: !Ref SubnetIds
      AmiType: BOTTLEROCKET_x86_64
      ScalingConfig:
        MinSize: 3
        DesiredSize: 3
        MaxSize: 6
      LaunchTemplate:
        Id: !Ref LaunchTemplateId
        Version: '1'

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-07 | HIGH | PREVENTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §4.1 (worker node config) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | SI-2; SI-7; CM-6 | A.8.8; A.8.9 | CLD.9.5.2 | NIST SP 800-190 §4.4 (Container runtime hardening) | NSA/CISA Kubernetes Hardening Guide v1.2 §4 (Worker node security)

Log signals

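  • CloudTrail eks:CreateNodegroup or eks:UpdateNodegroupVersion where requestParameters.amiType is AL2_x86_64, AL2_ARM_64, AL2023_x86_64_STANDARD, or any value other than BOTTLEROCKET_x86_64 / BOTTLEROCKET_ARM_64. This drifts the node fleet off the minimal-OS posture. AL2023 is an accepted OS under this control, but it is flagged here because the detection assumes a fleet standardised on Bottlerocket; allow-list it where AL2023 is the approved migration path.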
  • CloudTrail ec2:RunInstances within the cluster's launch-template scope where the resolved AMI's Name tag does not start with bottlerocket- — catches self-managed node-groups that bypass the EKS-managed AMI selection path.
  • SSM Inventory results showing kernel package versions inconsistent with the Bottlerocket-published variant string for nodes claiming to run Bottlerocket — indicates either an in-place AMI swap or a chroot-style escape from the immutable host.

Query

fields @timestamp, eventName, requestParameters.nodegroupName, requestParameters.amiType, requestParameters.releaseVersion, userIdentity.arn
          | filter eventSource = "eks.amazonaws.com" and eventName in ["CreateNodegroup","UpdateNodegroupVersion"]
          | filter requestParameters.amiType != "BOTTLEROCKET_x86_64" and requestParameters.amiType != "BOTTLEROCKET_ARM_64"
          | sort @timestamp desc
          | limit 100

The CloudWatch Logs Insights filter is the cheap first cut; for self-managed groups, schedule a Lambda that resolves the AMI ID to its publication metadata via ec2:DescribeImages and asserts on the OwnerId=192696855908 (Bottlerocket publishing account) and the Name prefix.

Alert threshold

  • Any node-group create/update to a non-Bottlerocket AMI type in production — page immediately; the immutable-OS posture is a fleet-wide assumption that other controls (no SSH, no package manager) inherit.
  • A self-managed node-group with an AMI not owned by the Bottlerocket publishing account — high-priority ticket within one business hour; verify with the change-management ticket that the deviation was intentional.
  • releaseVersion older than the Bottlerocket releases supported for the cluster's Kubernetes minor version: informational; track on a monthly cadence rather than as an incident.
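
The first two thresholds reduce to two checks: the requested amiType for managed node groups, and the owner and name of the resolved image for self-managed groups. The sketch below assumes parsed CloudTrail records and an image dictionary shaped like one element of an ec2:DescribeImages response (keys OwnerId and Name). The owner ID is the value quoted on this page and should be verified for your partition; the function names are hypothetical.

```python
APPROVED_AMI_TYPES = frozenset({"BOTTLEROCKET_x86_64", "BOTTLEROCKET_ARM_64"})
BOTTLEROCKET_OWNER_ID = "192696855908"  # as quoted on this page; verify before relying on it


def classify_nodegroup_event(event: dict, approved=APPROVED_AMI_TYPES) -> str:
    """Page when a managed node group is created or updated with a non-approved amiType."""
    if event.get("eventName") not in ("CreateNodegroup", "UpdateNodegroupVersion"):
        return "ignore"
    ami_type = (event.get("requestParameters") or {}).get("amiType")
    if ami_type is None:
        # Assumption: a record without amiType carries no OS change to evaluate.
        return "ignore"
    return "ignore" if ami_type in approved else "page"


def image_is_bottlerocket(image: dict) -> bool:
    """Check a DescribeImages entry for the expected owner and name prefix."""
    return (
        image.get("OwnerId") == BOTTLEROCKET_OWNER_ID
        and str(image.get("Name", "")).startswith("bottlerocket-")
    )
```

Passing a wider approved set (for example one that includes AL2023_x86_64_STANDARD) adapts the check to fleets where AL2023 is permitted.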

Initial response

  1. Roll the node-group back to the latest Bottlerocket release with aws eks update-nodegroup-version --cluster-name {cluster} --nodegroup-name {ng} --release-version {bottlerocket-version}; allow EKS to cordon-drain the deviated nodes during the rolling replacement.
  2. Inspect each non-Bottlerocket node via SSM Session Manager (if the AMI allows it) or kubectl debug node/{node} to verify whether any persistent state (host-path mounts, writable layers) survived the AMI swap; Bottlerocket's read-only root filesystem assumption means any persisted host changes are an integrity break.
  3. Open an incident per general/ir.html; correlate the AMI swap timestamp with EKS audit-log exec verbs into privileged pods, as the most common abuse pattern is to swap to a writable AMI before persisting an implant.

References

Equivalent controls in other providers: GKE Shielded Nodes (closest hardened-OS analog), OKE Oracle Linux 8 node OS hardening parallel.

aws-k8s-08 ! HIGH PREVENTIVE

EKS Standard: The PodSecurity admission controller has been enabled by default in the kube-apiserver since Kubernetes 1.23 and stable since 1.25. Apply namespace labels such as pod-security.kubernetes.io/enforce: restricted to block privileged pods. No add-on required. EKS Auto Mode: Same built-in admission controller. EKS Auto does not add a separate enforcement layer; namespace labels are the contract.

Enforce Pod Security Standards at the namespace level using the built-in PodSecurity admission controller. Target restricted for application namespaces (blocks privileged, hostNetwork, hostPath, runAsRoot, capabilities beyond NET_BIND_SERVICE); use baseline as the floor for legacy workloads during migration; reserve privileged for kube-system with a documented justification. PodSecurityPolicy (PSP), the admission mechanism that Pod Security admission replaces, was removed in Kubernetes 1.25; do not attempt to reintroduce it on EKS.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
# Namespace PSS labels via hashicorp/kubernetes ~> 3.0 provider
resource "kubernetes_namespace" "production" {
  metadata {
    name = "production"
    labels = {
      "pod-security.kubernetes.io/enforce"         = "restricted"
      "pod-security.kubernetes.io/enforce-version" = "latest"
      "pod-security.kubernetes.io/audit"           = "restricted"
      "pod-security.kubernetes.io/warn"            = "restricted"
    }
  }
}

Remediation — kubectl

kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted

# Or as a YAML manifest:
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
EOF

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: AWS Config custom rule wrapping a Lambda evaluator for Pod Security Standards drift on EKS namespaces.
Parameters:
  EvaluatorFunctionArn:
    Type: String
Resources:
  PssEvaluatorRule:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: eks-pod-security-standards-baseline
      Description: Custom evaluator flagging EKS namespaces without pod-security.kubernetes.io/enforce=baseline.
      Source:
        Owner: CUSTOM_LAMBDA
        SourceIdentifier: !Ref EvaluatorFunctionArn
        SourceDetails:
          - EventSource: aws.config
            MessageType: ScheduledNotification
            MaximumExecutionFrequency: TwentyFour_Hours

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-08 | HIGH | PREVENTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §5.2 (Pod security) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | CM-6; AC-6; SI-3 | A.8.9; A.8.28 | CLD.6.3.1 | NIST SP 800-190 §4.2 | NSA/CISA Kubernetes Hardening Guide v1.2 §3 (Pod security)

Log signals

  • EKS audit-log events with annotations["pod-security.kubernetes.io/enforce-policy"] equal to privileged on namespaces that should run under baseline or restricted — direct evidence the enforcement label was downgraded.
  • Audit-log requestObject.kind=Namespace with verb=create or verb=patch whose requestObject.metadata.labels map sets pod-security.kubernetes.io/enforce=privileged from a principal outside the cluster-admin SSO role group.
  • Pod-create admission denials (response code 403) with reason text containing violates PodSecurity "restricted:latest" — useful as a positive signal that the policy is firing, and as a baseline for noise-tuning before alerting on namespace downgrades.

Query

fields @timestamp, user.username, verb, objectRef.namespace, requestObject.metadata.labels, responseStatus.code
          | filter verb in ["create","patch","update"] and objectRef.resource = "namespaces"
          | filter `requestObject.metadata.labels.pod-security_kubernetes_io_enforce` = "privileged"
          | sort @timestamp desc
          | limit 50

EKS audit logs land in CloudWatch Logs as JSON. The CloudWatch Logs Insights query addresses the label key in its flattened form, pod-security_kubernetes_io_enforce, because dots are not legal inside a field-path segment, and wraps the whole path in backticks because a hyphen is not a legal character in a bare field name. Confirm the exact field name in your own log group with a sample fields @message first.

Alert threshold

  • Any namespace downgrade to enforce=privileged outside the documented privileged-namespace allow-list (typically kube-system, kube-public, and the platform's CNI/CSI namespaces) — page immediately.
  • A drop of the enforce label entirely (relying on cluster default) on a namespace that previously had it set — high-priority ticket; the cluster-default may be weaker than the per-namespace setting that was removed.
  • More than 20 pod-admission denials per minute referencing the same namespace — informational; usually a workload migration in progress and the operator needs to be pointed at the violation rather than the policy loosened.
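
The first two thresholds compare a namespace's enforce label before and after a change. A minimal sketch, assuming the old and new label maps have already been extracted from the audit event (or from a periodic namespace snapshot). The allow-list contents, function name, and page / ticket / ignore labels are hypothetical.

```python
ENFORCE_LABEL = "pod-security.kubernetes.io/enforce"
PRIVILEGED_ALLOW_LIST = frozenset({"kube-system", "kube-public"})  # extend with CNI/CSI namespaces


def classify_label_change(namespace: str, old_labels: dict, new_labels: dict,
                          allow_list=PRIVILEGED_ALLOW_LIST) -> str:
    """Classify a change to the namespace's Pod Security enforce label."""
    old = old_labels.get(ENFORCE_LABEL)
    new = new_labels.get(ENFORCE_LABEL)
    # Downgrade to privileged outside the allow-list: highest tier.
    if new == "privileged" and old != "privileged" and namespace not in allow_list:
        return "page"
    # Label removed entirely: the namespace now falls back to the cluster default.
    if old is not None and new is None:
        return "ticket"
    return "ignore"
```

A change from restricted to baseline returns ignore in this sketch because the thresholds above do not name it; teams that treat any weakening as alertable would add a level ordering.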

Initial response

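  1. Restore the namespace label with kubectl label namespace {ns} pod-security.kubernetes.io/enforce=restricted --overwrite. The label is evaluated only at admission, so pods that already violate the policy keep running; hold further rollouts into the namespace and list the privileged ones with kubectl get pods -n {ns} -o json | jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true)) | .metadata.name' (privileged is a container-level field, not a pod-level one).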
  2. Enumerate the principal's audit-log trail over the prior 24 hours for any other namespace-label mutations or RBAC binding changes — the same downgrade pattern often precedes a chained privilege escalation.
  3. Open an incident via general/ir.html if pods were created during the downgrade window with privileged=true, hostPath, hostNetwork, or capabilities.add beyond NET_BIND_SERVICE; treat those pods as a containment-priority population.
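
The containment population in step 3 can be enumerated from the output of kubectl get pods -o json. The following is a sketch under the assumption that each pod is passed in as a parsed dictionary; it checks only the four markers named in step 3 and is not a full Pod Security Standards evaluator. The function name and finding strings are hypothetical.

```python
from __future__ import annotations

ALLOWED_ADDED_CAPS = frozenset({"NET_BIND_SERVICE"})


def risky_findings(pod: dict) -> list[str]:
    """Return the step-3 markers present in one pod manifest."""
    spec = pod.get("spec") or {}
    findings = []
    if spec.get("hostNetwork"):
        findings.append("hostNetwork")
    if any("hostPath" in (vol or {}) for vol in spec.get("volumes") or []):
        findings.append("hostPath")
    # privileged and capabilities live on each container's securityContext.
    containers = []
    for key in ("containers", "initContainers", "ephemeralContainers"):
        containers.extend(spec.get(key) or [])
    for container in containers:
        ctx = container.get("securityContext") or {}
        name = container.get("name")
        if ctx.get("privileged") is True:
            findings.append(f"privileged:{name}")
        added = set((ctx.get("capabilities") or {}).get("add") or [])
        extra = sorted(added - ALLOWED_ADDED_CAPS)
        if extra:
            findings.append(f"capabilities:{name}:{','.join(extra)}")
    return findings
```

Init and ephemeral containers are included because a privileged init container grants the same host access as a privileged main container.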

References

Equivalent controls in other providers: GKE PSS namespace labels, AKS Azure Policy PSS initiative, OKE PSS admission.

aws-k8s-09 ! HIGH PREVENTIVE

EKS Standard: Enable VPC CNI network policy by setting enableNetworkPolicy to "true" in the vpc-cni managed add-on's configuration values (network policy support went GA in 2023), or install Calico as the policy enforcer. Without a NetworkPolicy-capable CNI, NetworkPolicy objects have no effect (silent failure). EKS Auto Mode: The VPC CNI with network policy enforcement is configured by EKS Auto; default-deny NetworkPolicy manifests are honored without additional CNI bootstrapping.

Apply a default-deny NetworkPolicy in every application namespace, then add explicit allow rules for required ingress and egress. Without a default-deny baseline, pod-to-pod and pod-to-external traffic is unrestricted, so any compromised pod can scan the pod subnet and reach internal services freely.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_eks_addon" "vpc_cni" {
  cluster_name = aws_eks_cluster.hardened.name
  addon_name   = "vpc-cni"

  configuration_values = jsonencode({
    enableNetworkPolicy = "true"
  })
}

# Default-deny NetworkPolicy via hashicorp/kubernetes ~> 3.0
resource "kubernetes_manifest" "default_deny" {
  manifest = {
    apiVersion = "networking.k8s.io/v1"
    kind       = "NetworkPolicy"
    metadata = {
      name      = "default-deny-all"
      namespace = "production"
    }
    spec = {
      podSelector = {}
      policyTypes = ["Ingress", "Egress"]
    }
  }
}

Remediation — kubectl

cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EKS managed VPC CNI addon configured with enableNetworkPolicy=true for K8s NetworkPolicy enforcement.
Parameters:
  ClusterName:
    Type: String
Resources:
  VpcCniAddon:
    Type: AWS::EKS::Addon
    Properties:
      ClusterName: !Ref ClusterName
      AddonName: vpc-cni
      ResolveConflicts: OVERWRITE
      ConfigurationValues: !Sub |
        {
          "enableNetworkPolicy": "true"
        }

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-09 | HIGH | PREVENTIVE | AWS EKS | CIS Kubernetes Benchmark v1.11.0 §5.3 (Network policies) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | SC-7; SC-5; AC-4 | A.8.20; A.8.22 | CLD.9.5.1 | NIST SP 800-190 §4.4.1 | NSA/CISA Kubernetes Hardening Guide v1.2 §2 (Network separation)

Log signals

  • EKS audit-log verb=delete events on resources of kind NetworkPolicy in namespaces where a default-deny policy is the documented baseline — especially when the deleted policy's name matches the cluster's standard naming convention (commonly default-deny-ingress / default-deny-egress).
  • Audit-log verb=update on a baseline default-deny NetworkPolicy where the new requestObject.spec.podSelector.matchLabels narrows the selector or where policyTypes drops Egress — weakens the policy without an explicit delete.
  • VPC CNI flow logs (if enabled) showing pod-to-pod traffic between namespaces that should be isolated; for clusters running the AWS VPC CNI with Network Policy support, this surfaces as kubectl logs -n kube-system -l app=aws-network-policy-agent entries with verdict allowed on flows that the policy graph claims should deny.

Query

fields @timestamp, user.username, verb, objectRef.namespace, objectRef.name, requestObject.spec.policyTypes
          | filter objectRef.resource = "networkpolicies"
          | filter verb in ["delete","update","patch"]
          | filter objectRef.name like /default-deny/
          | sort @timestamp desc
          | limit 100

The CloudWatch Logs Insights filter assumes the platform team enforces a default-deny substring naming convention; if the cluster uses Calico or Cilium with their own naming, swap the regex against the actual policy-name allow-list maintained alongside the cluster's GitOps repository.

Alert threshold

  • Any delete of a default-deny NetworkPolicy outside a documented namespace-decommission ticket — page immediately; once removed, the namespace's pod-to-pod surface widens to all peers until the policy is re-applied.
  • An update narrowing podSelector from {} (all pods) to a labelled subset, or removing Egress from policyTypes — high-priority ticket; the deny-by-default invariant is broken even though the policy nominally still exists.
  • The ratio of pod-to-pod flows between isolated namespaces rising above its 7-day p99: informational; typically a side-effect of legitimate cross-namespace service introduction that needs the policy graph extended rather than relaxed.

Initial response

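  1. Re-apply the baseline default-deny policy from the GitOps repository (kubectl apply -f policies/default-deny.yaml -n {ns}) and confirm the policy is enforcing by attempting a curl with a short timeout (for example --max-time 5) from a debug pod to a peer service; NetworkPolicy enforcement drops packets silently, so a timeout rather than a connection-refused error confirms restoration.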
  2. Pull the audit-log trail for the principal that issued the delete/update over the prior 24 hours; if they also touched RBAC, secrets, or workload manifests, expand the investigation scope to the full set of mutations.
  3. Open an incident per general/ir.html if the namespace handles regulated data, and consult the VPC CNI policy-agent logs for any pod-to-pod flows allowed during the policy-absent window — those flows constitute lateral-movement opportunity for any pre-existing compromised pod.

References

Equivalent controls in other providers: GKE Dataplane V2 NetworkPolicy, AKS Azure CNI NetworkPolicy, OKE Calico NetworkPolicy.

aws-k8s-10 ! MEDIUM PREVENTIVE

EKS Standard: Use the EKS-managed add-on lifecycle for vpc-cni, kube-proxy, coredns — plus optionally aws-ebs-csi-driver, eks-pod-identity-agent, amazon-cloudwatch-observability. Managed add-ons receive AWS-signed versions and integrate with the cluster upgrade flow. EKS Auto Mode: Add-on lifecycle is managed by EKS Auto — manual version selection is unnecessary; AWS schedules upgrades aligned to the K8s control-plane version.

Use EKS-managed add-ons for the three core networking and DNS components (vpc-cni, kube-proxy, coredns). Managed add-ons receive AWS-signed versions, security patches via the EKS control-plane upgrade flow, and conflict resolution. Self-managed add-on installation (e.g. raw kubectl apply of community manifests) skips signature verification, version compatibility checks against the control-plane Kubernetes version, and the upgrade orchestration.

Remediation — Terraform

# Terraform AWS provider ~> 5.0
resource "aws_eks_addon" "vpc_cni" {
  cluster_name                = aws_eks_cluster.hardened.name
  addon_name                  = "vpc-cni"
  resolve_conflicts_on_update = "OVERWRITE"
  configuration_values        = jsonencode({ enableNetworkPolicy = "true" })
}

resource "aws_eks_addon" "kube_proxy" {
  cluster_name                = aws_eks_cluster.hardened.name
  addon_name                  = "kube-proxy"
  resolve_conflicts_on_update = "OVERWRITE"
}

resource "aws_eks_addon" "coredns" {
  cluster_name                = aws_eks_cluster.hardened.name
  addon_name                  = "coredns"
  resolve_conflicts_on_update = "OVERWRITE"
}

Remediation — aws eks

VPC_CNI_VERSION=$(aws eks describe-addon-versions \
  --addon-name vpc-cni \
  --kubernetes-version 1.30 \
  --query 'addons[0].addonVersions[0].addonVersion' --output text)

aws eks create-addon \
  --cluster-name hardened-cluster \
  --addon-name vpc-cni \
  --addon-version "$VPC_CNI_VERSION" \
  --resolve-conflicts OVERWRITE \
  --configuration-values '{"enableNetworkPolicy":"true"}'

Remediation — CloudFormation

AWSTemplateFormatVersion: '2010-09-09'
Description: EKS managed addons for a predictable upgrade cadence. No AddonVersion is set, so EKS selects the default version for the cluster's Kubernetes version; add AddonVersion to pin.
Parameters:
  ClusterName:
    Type: String
Resources:
  CoreDnsAddon:
    Type: AWS::EKS::Addon
    Properties:
      ClusterName: !Ref ClusterName
      AddonName: coredns
      ResolveConflicts: OVERWRITE
  KubeProxyAddon:
    Type: AWS::EKS::Addon
    Properties:
      ClusterName: !Ref ClusterName
      AddonName: kube-proxy
      ResolveConflicts: OVERWRITE
  EbsCsiAddon:
    Type: AWS::EKS::Addon
    Properties:
      ClusterName: !Ref ClusterName
      AddonName: aws-ebs-csi-driver
      ResolveConflicts: OVERWRITE

Compliance mapping

Control | Severity | Type | Provider | CIS Kubernetes Benchmark v1.11.0 | CIS Amazon Elastic Kubernetes Service (EKS) Benchmark v1.8.0 | NIST SP 800-53 rev5 | ISO/IEC 27001:2022 | ISO/IEC 27017:2015 | NIST SP 800-190 (Sep 2017) | NSA/CISA Kubernetes Hardening Guide v1.2
aws-k8s-10 | MEDIUM | PREVENTIVE | AWS EKS | n/a (managed add-on lifecycle is provider-specific) | n/a (verify against CIS EKS Benchmark v1.8.0 PDF) | SI-2; CM-7 | A.8.8; A.8.32 | CLD.9.5.2 | NIST SP 800-190 §4.1 (Image risks) | NSA/CISA Kubernetes Hardening Guide v1.2 §7 (Patch management)

Log signals

  • CloudTrail eks:DeleteAddon targeting vpc-cni, kube-proxy, coredns, or aws-ebs-csi-driver: removes the EKS-managed lifecycle hook. With the preserve option the add-on software stays on the cluster as a self-managed workload whose patch cadence is no longer auditable through the EKS API; without it, the software itself is removed from the cluster.
  • CloudTrail eks:UpdateAddon with requestParameters.resolveConflicts=OVERWRITE — silently overwrites operator-applied configuration patches that the cluster relies on, breaking expected behaviour without surfacing through the addon's configurationValues diff.
  • DescribeAddon responses (polled by the platform team's daily audit Lambda) where addonVersion is more than two minor versions behind latestVersion — passive drift that does not generate CloudTrail noise but is the canonical signal for un-applied CVE patches.

Query

fields @timestamp, eventName, requestParameters.clusterName, requestParameters.addonName, requestParameters.addonVersion, requestParameters.resolveConflicts, userIdentity.arn
          | filter eventSource = "eks.amazonaws.com" and eventName in ["DeleteAddon","UpdateAddon"]
          | filter requestParameters.addonName in ["vpc-cni","kube-proxy","coredns","aws-ebs-csi-driver","aws-efs-csi-driver"]
          | sort @timestamp desc
          | limit 100

For the version-drift case, schedule a separate CloudWatch Logs Insights search against the platform-team audit Lambda's log group filtering on the structured field addon_version_lag_minor emitted by the daily DescribeAddon sweep.

Alert threshold

  • Any DeleteAddon on the core four addons in production: page immediately; if the call preserved the workload, the cluster keeps running on a now-unmanaged copy that loses EKS-driven CVE patching, and if it did not, the component itself is gone.
  • UpdateAddon with resolveConflicts=OVERWRITE on any addon — high-priority ticket; surface to the addon's owning team within 30 minutes so they can confirm the overwrite was intentional and re-apply any configuration patches the overwrite stripped.
  • Addon version more than two minor versions behind latest for more than 14 days — informational; track as a monthly hygiene ticket rather than an incident.
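
The version-drift threshold needs a comparison of add-on version strings. The sketch below assumes the vMAJOR.MINOR.PATCH-eksbuild.N shape that EKS add-on versions commonly take and compares the minor component only. The function names, the info / ignore labels, and the choice to reject a major-version mismatch are hypothetical.

```python
import re

_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_addon_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a string such as v1.18.1-eksbuild.3."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognised add-on version: {version!r}")
    return tuple(int(part) for part in match.groups())


def minor_lag(current: str, latest: str) -> int:
    """Number of minor versions that current trails latest (never negative)."""
    cur, lat = parse_addon_version(current), parse_addon_version(latest)
    if cur[0] != lat[0]:
        raise ValueError("major versions differ; minor lag is not comparable")
    return max(0, lat[1] - cur[1])


def classify_drift(current: str, latest: str, days_behind: int) -> str:
    """Informational once the lag exceeds two minors for more than 14 days."""
    if minor_lag(current, latest) > 2 and days_behind > 14:
        return "info"
    return "ignore"
```

A daily sweep could emit minor_lag as the addon_version_lag_minor field mentioned above, which keeps the alert logic in the log query rather than in the Lambda.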

Initial response

  1. If the addon was deleted, re-create the managed addon with aws eks create-addon --cluster-name {cluster} --addon-name {name} --addon-version {pinned-version} --resolve-conflicts PRESERVE using the previously-pinned version captured from the GitOps repository.
  2. Diff the live DaemonSet manifest against the pre-overwrite version stored in cluster backups (if etcd snapshots are retained, otherwise from the GitOps repository) and re-apply any custom configuration that the overwrite stripped — typical losses include CNI prefix-delegation tuning and CoreDNS upstream forwarders.
  3. Open an incident via general/ir.html when the affected addon is vpc-cni or aws-ebs-csi-driver, because both have privileged DaemonSets whose un-managed lifecycle creates a window for malicious image pinning by a compromised cluster-admin.

References

Equivalent controls in other providers: OKE Enhanced Cluster add-on lifecycle (closest 1:1).

Sources