A private cluster places all EC2 instances into private subnets. Nodes have no public IP addresses and all traffic between EMR and AWS services travels over VPC endpoints or a NAT gateway.
Your private subnets must be tagged with "for-use-with-amazon-emr-managed-policies" = true for the managed IAM policy AmazonEMRServicePolicy_v2 to function correctly. See the EMR managed IAM policies documentation for details.
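If your subnets are managed outside the VPC module shown later in this page, the tag can be applied to existing subnets with the `aws_ec2_tag` resource. This is a sketch; the subnet IDs are placeholders:

```hcl
# Tag pre-existing private subnets so AmazonEMRServicePolicy_v2
# can authorize launching clusters into them.
# The subnet IDs below are placeholders -- substitute your own.
resource "aws_ec2_tag" "emr_managed_policies" {
  for_each = toset(["subnet-0abc1234", "subnet-0def5678"])

  resource_id = each.value
  key         = "for-use-with-amazon-emr-managed-policies"
  value       = "true"
}
```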

VPC endpoint requirements

To avoid routing cluster traffic through a NAT gateway, create Interface VPC endpoints for elasticmapreduce and sts, and a Gateway endpoint for s3. The example below includes all three:
module "vpc_endpoints" {
  source  = "terraform-aws-modules/vpc/aws//modules/vpc-endpoints"
  version = "~> 6.0"

  vpc_id = module.vpc.vpc_id

  endpoints = merge({
    s3 = {
      service         = "s3"
      service_type    = "Gateway"
      route_table_ids = module.vpc.private_route_table_ids
      tags = {
        Name = "${local.name}-s3"
      }
    }
    },
    { for service in toset(["elasticmapreduce", "sts"]) :
      replace(service, ".", "_") =>
      {
        service             = service
        subnet_ids          = module.vpc.private_subnets
        private_dns_enabled = true
        tags                = { Name = "${local.name}-${service}" }
      }
  })

  create_security_group = true
  security_group_rules = {
    ingress_https = {
      description = "HTTPS from private subnets"
      cidr_blocks = module.vpc.private_subnets_cidr_blocks
    }
  }
}

Configuration

Choose between instance fleets and instance groups depending on your workload requirements. Instance fleets let you mix instance types within a node group and combine On-Demand and Spot capacity. Instance groups use a single instance type per node group.
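For comparison, a minimal instance-groups configuration might look like the sketch below. The attribute names (`master_instance_group`, `core_instance_group`) are taken from the module's documented inputs; verify them against the module reference before use:

```hcl
# Sketch: the same module configured with instance groups instead of
# instance fleets. A node group uses exactly one instance type.
module "emr_instance_groups" {
  source = "terraform-aws-modules/emr/aws"

  name = "example-instance-group"

  master_instance_group = {
    name          = "master-group"
    instance_type = "m5.xlarge"
  }

  core_instance_group = {
    name           = "core-group"
    instance_type  = "c5.xlarge"
    instance_count = 2
  }

  # ... VPC, logging, and IAM settings as in the fleet example below
}
```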
The example below uses instance fleets. The master fleet runs a single On-Demand m5.xlarge. The core fleet mixes three instance types, and each instance counts toward the fleet's target capacity according to its weighted_capacity: for example, a single c5.xlarge with weighted_capacity = 2 satisfies both units of target_spot_capacity. If Spot capacity cannot be provisioned within five minutes, the fleet falls back to On-Demand (timeout_action = "SWITCH_TO_ON_DEMAND").
module "emr" {
  source = "terraform-aws-modules/emr/aws"

  name = "example-instance-fleet"

  release_label_filters = {
    emr7 = {
      prefix = "emr-7"
    }
  }
  applications = ["spark", "trino"]
  auto_termination_policy = {
    idle_timeout = 14400
  }

  bootstrap_action = [
    {
      path = "file:/bin/echo",
      name = "Just an example",
      args = ["Hello World!"]
    }
  ]

  configurations_json = jsonencode([
    {
      "Classification" : "spark-env",
      "Configurations" : [
        {
          "Classification" : "export",
          "Properties" : {
            "JAVA_HOME" : "/usr/lib/jvm/java-1.8.0"
          }
        }
      ],
      "Properties" : {}
    }
  ])

  master_instance_fleet = {
    name                      = "master-fleet"
    target_on_demand_capacity = 1
    instance_type_configs = [
      {
        instance_type = "m5.xlarge"
      }
    ]
  }

  core_instance_fleet = {
    name                      = "core-fleet"
    target_on_demand_capacity = 2
    target_spot_capacity      = 2
    instance_type_configs = [
      {
        instance_type     = "c4.large"
        weighted_capacity = 1
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        ebs_config = [{
          size                 = 256
          type                 = "gp3"
          volumes_per_instance = 1
        }]
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        instance_type                              = "c6i.xlarge"
        weighted_capacity                          = 2
      }
    ]
    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        block_duration_minutes   = 0
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }

  task_instance_fleet = {
    name                      = "task-fleet"
    target_on_demand_capacity = 1
    target_spot_capacity      = 2
    instance_type_configs = [
      {
        instance_type     = "c4.large"
        weighted_capacity = 1
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        ebs_config = [{
          size                 = 256
          type                 = "gp3"
          volumes_per_instance = 1
        }]
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      }
    ]
    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        block_duration_minutes   = 0
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }

  ebs_root_volume_size = 64
  ec2_attributes = {
    # Subnets must be private and tagged with
    # { "for-use-with-amazon-emr-managed-policies" = true }
    subnet_ids = module.vpc.private_subnets
  }
  vpc_id = module.vpc.vpc_id

  keep_job_flow_alive_when_no_steps = true
  list_steps_states                 = ["PENDING", "RUNNING", "CANCEL_PENDING", "CANCELLED", "FAILED", "INTERRUPTED", "COMPLETED"]
  log_uri                           = "s3://${module.s3_bucket.s3_bucket_id}/"

  scale_down_behavior        = "TERMINATE_AT_TASK_COMPLETION"
  step_concurrency_level     = 3
  termination_protection     = false
  unhealthy_node_replacement = true
  visible_to_all_users       = true

  tags = local.tags
}

Supporting resources

The complete working example at examples/private-cluster/main.tf includes a VPC with private subnets and a NAT gateway, VPC endpoints for S3, EMR, and STS, and an encrypted S3 bucket for logs.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 6.0"

  name = local.name
  cidr = "10.0.0.0/16"

  azs             = local.azs
  public_subnets  = [for k, v in local.azs : cidrsubnet("10.0.0.0/16", 8, k)]
  private_subnets = [for k, v in local.azs : cidrsubnet("10.0.0.0/16", 8, k + 10)]

  enable_nat_gateway = true
  single_nat_gateway = true

  # Tag private subnets so EMR managed policies can reference them
  private_subnet_tags = { "for-use-with-amazon-emr-managed-policies" = true }
}

module "s3_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~> 5.0"

  bucket_prefix = "${local.name}-"
  force_destroy = true

  attach_deny_insecure_transport_policy = true
  attach_require_latest_tls_policy      = true

  server_side_encryption_configuration = {
    rule = {
      apply_server_side_encryption_by_default = {
        sse_algorithm = "AES256"
      }
    }
  }
}

Security considerations

  • Nodes have no public IP addresses, reducing the attack surface from the internet.
  • Use VPC endpoints to keep traffic between EMR and S3/EMR APIs on the AWS network and avoid NAT gateway data transfer charges.
  • The module creates a dedicated service access security group for private clusters, enabling EMR to communicate with cluster nodes over port 8443.
  • Restrict the autoscaling role trust policy to your account and region using aws:SourceAccount and aws:SourceArn conditions, as shown in the full example.
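The last point can be sketched with an aws_iam_policy_document data source. The condition values below are illustrative, derived from the caller's account and region; see the full example for the exact policy the module attaches:

```hcl
# Sketch: restrict which principals may assume the EMR autoscaling
# role by pinning the trust policy to this account and region.
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

data "aws_iam_policy_document" "autoscaling_trust" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type = "Service"
      identifiers = [
        "elasticmapreduce.amazonaws.com",
        "application-autoscaling.amazonaws.com",
      ]
    }

    # Only allow role assumption on behalf of EMR resources
    # in this account ...
    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [data.aws_caller_identity.current.account_id]
    }

    # ... and in this region.
    condition {
      test     = "ArnLike"
      variable = "aws:SourceArn"
      values   = ["arn:aws:elasticmapreduce:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:*"]
    }
  }
}
```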