A private cluster places all EC2 instances into private subnets. Nodes have no public IP addresses and all traffic between EMR and AWS services travels over VPC endpoints or a NAT gateway.
Your private subnets must be tagged with "for-use-with-amazon-emr-managed-policies" = true for the managed IAM policy AmazonEMRServicePolicy_v2 to function correctly. See the EMR managed IAM policies documentation for details.
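If your subnets are managed outside the VPC module shown later in this page, the tag can be applied to existing subnets with the `aws_ec2_tag` resource. This is a sketch; the subnet IDs are placeholders:

```hcl
# Tag pre-existing private subnets so AmazonEMRServicePolicy_v2
# can authorize launching clusters into them.
# The subnet IDs below are placeholders -- substitute your own.
resource "aws_ec2_tag" "emr_managed_policies" {
  for_each = toset(["subnet-0abc1234", "subnet-0def5678"])

  resource_id = each.value
  key         = "for-use-with-amazon-emr-managed-policies"
  value       = "true"
}
```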

VPC endpoint requirements

To avoid routing cluster traffic through a NAT gateway, create Interface VPC endpoints for elasticmapreduce and sts, and a Gateway endpoint for s3. The example below includes all three:
module "vpc_endpoints" {
  source  = "terraform-aws-modules/vpc/aws//modules/vpc-endpoints"
  version = "~> 6.0"

  vpc_id = module.vpc.vpc_id

  endpoints = merge({
    s3 = {
      service         = "s3"
      service_type    = "Gateway"
      route_table_ids = module.vpc.private_route_table_ids
      tags = {
        Name = "${local.name}-s3"
      }
    }
    },
    { for service in toset(["elasticmapreduce", "sts"]) :
      replace(service, ".", "_") =>
      {
        service             = service
        subnet_ids          = module.vpc.private_subnets
        private_dns_enabled = true
        tags                = { Name = "${local.name}-${service}" }
      }
  })

  create_security_group = true
  security_group_rules = {
    ingress_https = {
      description = "HTTPS from private subnets"
      cidr_blocks = module.vpc.private_subnets_cidr_blocks
    }
  }
}

Configuration

Choose between instance fleets and instance groups depending on your workload requirements. Instance fleets let you mix instance types within a node group and combine On-Demand and Spot capacity. Instance groups use a single instance type per node group.
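For comparison, a minimal instance-groups configuration might look like the sketch below. The attribute names (`master_instance_group`, `core_instance_group`) are taken from the module's documented inputs; verify them against the module reference before use:

```hcl
# Sketch: the same module configured with instance groups instead of
# instance fleets. A node group uses exactly one instance type.
module "emr_instance_groups" {
  source = "terraform-aws-modules/emr/aws"

  name = "example-instance-group"

  master_instance_group = {
    name          = "master-group"
    instance_type = "m5.xlarge"
  }

  core_instance_group = {
    name           = "core-group"
    instance_type  = "c5.xlarge"
    instance_count = 2
  }

  # ... VPC, logging, and IAM settings as in the fleet example below
}
```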
The example below uses instance fleets. The master fleet runs a single On-Demand m5.xlarge. The core fleet mixes three instance types, and each instance counts toward the fleet's target capacity according to its weighted_capacity: for example, a single c5.xlarge with weighted_capacity = 2 satisfies both units of target_spot_capacity. If Spot capacity cannot be provisioned within five minutes, the fleet falls back to On-Demand (timeout_action = "SWITCH_TO_ON_DEMAND").
module "emr" {
  source = "terraform-aws-modules/emr/aws"

  name = "example-instance-fleet"

  release_label_filters = {
    emr7 = {
      prefix = "emr-7"
    }
  }
  applications = ["spark", "trino"]
  auto_termination_policy = {
    idle_timeout = 14400
  }

  bootstrap_action = [
    {
      path = "file:/bin/echo",
      name = "Just an example",
      args = ["Hello World!"]
    }
  ]

  configurations_json = jsonencode([
    {
      "Classification" : "spark-env",
      "Configurations" : [
        {
          "Classification" : "export",
          "Properties" : {
            "JAVA_HOME" : "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties" : {}
    }
  ])

  master_instance_fleet = {
    name                      = "master-fleet"
    target_on_demand_capacity = 1
    instance_type_configs = [
      {
        instance_type = "m5.xlarge"
      }
    ]
  }

  core_instance_fleet = {
    name                      = "core-fleet"
    target_on_demand_capacity = 2
    target_spot_capacity      = 2
    instance_type_configs = [
      {
        instance_type     = "c4.large"
        weighted_capacity = 1
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        ebs_config = [{
          size                 = 256
          type                 = "gp3"
          volumes_per_instance = 1
        }]
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        instance_type                              = "c6i.xlarge"
        weighted_capacity                          = 2
      }
    ]
    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        block_duration_minutes   = 0
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }

  task_instance_fleet = {
    name                      = "task-fleet"
    target_on_demand_capacity = 1
    target_spot_capacity      = 2
    instance_type_configs = [
      {
        instance_type     = "c4.large"
        weighted_capacity = 1
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        ebs_config = [{
          size                 = 256
          type                 = "gp3"
          volumes_per_instance = 1
        }]
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      }
    ]
    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        block_duration_minutes   = 0
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }

  ebs_root_volume_size = 64
  ec2_attributes = {
    # Subnets must be private and tagged with
    # { "for-use-with-amazon-emr-managed-policies" = true }
    subnet_ids = module.vpc.private_subnets
  }
  vpc_id = module.vpc.vpc_id

  keep_job_flow_alive_when_no_steps = true
  list_steps_states                 = ["PENDING", "RUNNING", "CANCEL_PENDING", "CANCELLED", "FAILED", "INTERRUPTED", "COMPLETED"]
  log_uri                           = "s3://${module.s3_bucket.s3_bucket_id}/"

  scale_down_behavior        = "TERMINATE_AT_TASK_COMPLETION"
  step_concurrency_level     = 3
  termination_protection     = false
  unhealthy_node_replacement = true
  visible_to_all_users       = true

  tags = local.tags
}

Supporting resources

The complete working example at examples/private-cluster/main.tf includes a VPC with private subnets and a NAT gateway, VPC endpoints for S3, EMR, and STS, and an encrypted S3 bucket for logs.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 6.0"

  name = local.name
  cidr = "10.0.0.0/16"

  azs             = local.azs
  public_subnets  = [for k, v in local.azs : cidrsubnet("10.0.0.0/16", 8, k)]
  private_subnets = [for k, v in local.azs : cidrsubnet("10.0.0.0/16", 8, k + 10)]

  enable_nat_gateway = true
  single_nat_gateway = true

  # Tag private subnets so EMR managed policies can reference them
  private_subnet_tags = { "for-use-with-amazon-emr-managed-policies" = true }
}

module "s3_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~> 5.0"

  bucket_prefix = "${local.name}-"
  force_destroy = true

  attach_deny_insecure_transport_policy = true
  attach_require_latest_tls_policy      = true

  server_side_encryption_configuration = {
    rule = {
      apply_server_side_encryption_by_default = {
        sse_algorithm = "AES256"
      }
    }
  }
}

Security considerations

  • Nodes have no public IP addresses, reducing the attack surface from the internet.
  • Use VPC endpoints to keep traffic between EMR and S3/EMR APIs on the AWS network and avoid NAT gateway data transfer charges.
  • The module creates a dedicated service access security group for private clusters, enabling EMR to communicate with cluster nodes over port 8443.
  • Restrict the autoscaling role trust policy to your account and region using aws:SourceAccount and aws:SourceArn conditions, as shown in the full example.
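The last point can be sketched with an aws_iam_policy_document data source. The condition values below are illustrative, derived from the caller's account and region; see the full example for the exact policy the module attaches:

```hcl
# Sketch: restrict which principals may assume the EMR autoscaling
# role by pinning the trust policy to this account and region.
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

data "aws_iam_policy_document" "autoscaling_trust" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type = "Service"
      identifiers = [
        "elasticmapreduce.amazonaws.com",
        "application-autoscaling.amazonaws.com",
      ]
    }

    # Only allow role assumption on behalf of EMR resources
    # in this account ...
    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [data.aws_caller_identity.current.account_id]
    }

    # ... and in this region.
    condition {
      test     = "ArnLike"
      variable = "aws:SourceArn"
      values   = ["arn:aws:elasticmapreduce:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:*"]
    }
  }
}
```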