Instance fleets let you provision capacity using a mix of On-Demand and Spot instances drawn from multiple instance types. EMR selects instances based on your target capacities and launch specifications, which helps you optimize for cost and availability.
Instance fleets and instance groups are mutually exclusive: you must choose one model for the entire cluster. Setting any *_instance_fleet variable conflicts with the corresponding *_instance_group variable, so do not set both for the same node role.

Fleet types

The module exposes three fleet variables, one per node role:
Variable               Node role   Resource
master_instance_fleet  Master      Inline in aws_emr_cluster
core_instance_fleet    Core        Inline in aws_emr_cluster
task_instance_fleet    Task        Separate aws_emr_instance_fleet resource
The master fleet controls the primary node that coordinates the cluster. For most clusters a single On-Demand master instance is sufficient.
master_instance_fleet.name
string
Display name for the fleet.
master_instance_fleet.target_on_demand_capacity
number
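Number of On-Demand capacity units to provision. The master fleet runs a single instance, so set this to 1 and leave target_spot_capacity unset. EMR accepts only one of the two targets for a master fleet, and its value must be 1.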
master_instance_fleet.target_spot_capacity
number
Number of Spot capacity units to provision.
master_instance_fleet.instance_type_configs
list(object)
List of instance type configurations the fleet can choose from. See Instance type config for nested attributes.
master_instance_fleet.launch_specifications
object
Spot and On-Demand launch strategies. See Launch specifications for nested attributes.
master_instance_fleet = {
  name                      = "master-fleet"
  target_on_demand_capacity = 1
  instance_type_configs = [
    {
      instance_type = "m5.xlarge"
    }
  ]
}
The core fleet stores data in HDFS and runs compute tasks. You can split capacity across On-Demand and Spot to balance cost and resilience.
core_instance_fleet.name
string
Display name for the fleet.
core_instance_fleet.target_on_demand_capacity
number
On-Demand capacity units to provision.
core_instance_fleet.target_spot_capacity
number
Spot capacity units to provision.
core_instance_fleet.instance_type_configs
list(object)
List of instance type configurations. See Instance type config.
core_instance_fleet.launch_specifications
object
Spot and On-Demand launch strategies. See Launch specifications.
core_instance_fleet = {
  name                      = "core-fleet"
  target_on_demand_capacity = 2
  target_spot_capacity      = 2
  instance_type_configs = [
    {
      instance_type     = "c4.large"
      weighted_capacity = 1
    },
    {
      bid_price_as_percentage_of_on_demand_price = 100
      ebs_config = [{
        size                 = 256
        type                 = "gp3"
        volumes_per_instance = 1
      }]
      instance_type     = "c5.xlarge"
      weighted_capacity = 2
    },
    {
      bid_price_as_percentage_of_on_demand_price = 100
      instance_type                              = "c6i.xlarge"
      weighted_capacity                          = 2
    }
  ]
  launch_specifications = {
    spot_specification = {
      allocation_strategy      = "capacity-optimized"
      block_duration_minutes   = 0
      timeout_action           = "SWITCH_TO_ON_DEMAND"
      timeout_duration_minutes = 5
    }
  }
}
The task fleet adds compute-only capacity to the cluster. Task nodes do not store HDFS data, making them safe to run entirely on Spot instances.
task_instance_fleet.name
string
Display name for the fleet.
task_instance_fleet.target_on_demand_capacity
number
On-Demand capacity units to provision.
task_instance_fleet.target_spot_capacity
number
Spot capacity units to provision.
task_instance_fleet.instance_type_configs
list(object)
List of instance type configurations. See Instance type config.
task_instance_fleet.launch_specifications
object
Spot and On-Demand launch strategies. See Launch specifications.
task_instance_fleet = {
  name                      = "task-fleet"
  target_on_demand_capacity = 1
  target_spot_capacity      = 2
  instance_type_configs = [
    {
      instance_type     = "c4.large"
      weighted_capacity = 1
    },
    {
      bid_price_as_percentage_of_on_demand_price = 100
      ebs_config = [{
        size                 = 256
        type                 = "gp3"
        volumes_per_instance = 1
      }]
      instance_type     = "c5.xlarge"
      weighted_capacity = 2
    }
  ]
  launch_specifications = {
    spot_specification = {
      allocation_strategy      = "capacity-optimized"
      block_duration_minutes   = 0
      timeout_action           = "SWITCH_TO_ON_DEMAND"
      timeout_duration_minutes = 5
    }
  }
}
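The point that task nodes can run entirely on Spot can be sketched as follows. This is a minimal illustration rather than a tested configuration: the local name spot_only_task_fleet, the fleet name, and the capacity of 4 are assumptions chosen for the example, and it assumes the module treats an omitted target_on_demand_capacity as unset.

```hcl
# Hypothetical local holding a Spot-only task fleet definition.
locals {
  spot_only_task_fleet = {
    name                 = "task-fleet-spot"
    target_spot_capacity = 4 # illustrative value; no On-Demand target is set

    instance_type_configs = [
      {
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      },
      {
        instance_type     = "c6i.xlarge"
        weighted_capacity = 2
      }
    ]

    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }
}
```

Inside the module block, this would be passed as task_instance_fleet = local.spot_only_task_fleet. Note that with timeout_action set to "SWITCH_TO_ON_DEMAND", the fleet can still end up with On-Demand instances if Spot capacity is not available within the timeout.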

Instance type config

Each entry in instance_type_configs describes one instance type that the fleet may use.
instance_type
string
required
EC2 instance type, for example "m5.xlarge".
weighted_capacity
number
Number of capacity units this instance type contributes toward the fleet’s target. When omitted, each instance counts as one unit.
bid_price
string
Maximum Spot price in USD per instance-hour. Mutually exclusive with bid_price_as_percentage_of_on_demand_price.
bid_price_as_percentage_of_on_demand_price
number
default:"60"
Maximum Spot price expressed as a percentage of the current On-Demand price. Defaults to 60.
ebs_config
list(object)
EBS volumes to attach to each instance of this type.
  • size — Volume size in GiB. Default: 256.
  • type — EBS volume type. Default: "gp3".
  • iops — Provisioned IOPS for io1/io2 volumes.
  • volumes_per_instance — Number of volumes to attach per instance.
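To make the weighted_capacity arithmetic concrete, the sketch below estimates how many instances of a single type would be needed to meet a Spot target of 2 units, using the weights from the core fleet example. The local and output names are hypothetical, and the calculation assumes the whole target is filled by one instance type, whereas EMR may mix types in practice.

```hcl
locals {
  # Target taken from the core fleet example (target_spot_capacity = 2).
  target_spot_capacity = 2

  # Weighted capacity per instance type, as in the core fleet example.
  weights = {
    "c4.large"   = 1
    "c5.xlarge"  = 2
    "c6i.xlarge" = 2
  }

  # Instances required if the target were met with one type only.
  # ceil() rounds up because EMR launches whole instances and may
  # overshoot the target rather than leave it unfulfilled.
  instances_needed = {
    for type, weight in local.weights :
    type => ceil(local.target_spot_capacity / weight)
  }
}

output "instances_needed" {
  value = local.instances_needed
}
```

With these numbers, two c4.large instances or a single c5.xlarge or c6i.xlarge instance would satisfy the target. Because the configuration declares no resources, it can be evaluated with terraform apply without any provider credentials.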

Launch specifications

Launch specifications control how EMR fulfills Spot and On-Demand requests for the fleet.

Spot specification

launch_specifications.spot_specification.allocation_strategy
string
default:"capacity-optimized"
Strategy EMR uses to select Spot pools. "capacity-optimized" prioritizes pools with the most available capacity, reducing interruption risk.
launch_specifications.spot_specification.block_duration_minutes
number
Duration in minutes for a Spot block (defined-duration Spot). Allowed values: 60, 120, 180, 240, 300, 360. Set to 0 to disable.
launch_specifications.spot_specification.timeout_action
string
default:"SWITCH_TO_ON_DEMAND"
Action to take if the Spot target cannot be fulfilled within timeout_duration_minutes. "SWITCH_TO_ON_DEMAND" provisions On-Demand instances to cover the remaining Spot capacity; "TERMINATE_CLUSTER" terminates the cluster.
launch_specifications.spot_specification.timeout_duration_minutes
number
default:"60"
Minutes to wait for Spot capacity before taking timeout_action.

On-Demand specification

launch_specifications.on_demand_specification.allocation_strategy
string
default:"lowest-price"
Strategy for selecting On-Demand instance pools. Currently only "lowest-price" is supported.
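As a rough sketch of how the two specifications sit side by side, the following defines a launch_specifications object that sets both. The local name fleet_launch_specifications and the 10-minute timeout are assumptions for illustration; the remaining values are the defaults listed above.

```hcl
# Hypothetical local combining On-Demand and Spot launch settings.
locals {
  fleet_launch_specifications = {
    on_demand_specification = {
      allocation_strategy = "lowest-price"
    }

    spot_specification = {
      allocation_strategy      = "capacity-optimized"
      timeout_action           = "SWITCH_TO_ON_DEMAND"
      timeout_duration_minutes = 10 # illustrative value
    }
  }
}
```

A fleet definition could then reference it with launch_specifications = local.fleet_launch_specifications, which keeps the core and task fleets on the same launch strategy without repeating the block.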

Complete example

The following example creates a private cluster with all three fleet types configured:
module "emr" {
  source = "terraform-aws-modules/emr/aws"

  name = "example-instance-fleet"

  release_label = "emr-7.9.0"
  applications  = ["spark", "trino"]
  auto_termination_policy = {
    idle_timeout = 3600
  }

  master_instance_fleet = {
    name                      = "master-fleet"
    target_on_demand_capacity = 1
    instance_type_configs = [
      {
        instance_type = "m5.xlarge"
      }
    ]
  }

  core_instance_fleet = {
    name                      = "core-fleet"
    target_on_demand_capacity = 2
    target_spot_capacity      = 2
    instance_type_configs = [
      {
        instance_type     = "c4.large"
        weighted_capacity = 1
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        ebs_config = [{
          size                 = 256
          type                 = "gp3"
          volumes_per_instance = 1
        }]
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        instance_type                              = "c6i.xlarge"
        weighted_capacity                          = 2
      }
    ]
    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        block_duration_minutes   = 0
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }

  task_instance_fleet = {
    name                      = "task-fleet"
    target_on_demand_capacity = 1
    target_spot_capacity      = 2
    instance_type_configs = [
      {
        instance_type     = "c4.large"
        weighted_capacity = 1
      },
      {
        bid_price_as_percentage_of_on_demand_price = 100
        ebs_config = [{
          size                 = 256
          type                 = "gp3"
          volumes_per_instance = 1
        }]
        instance_type     = "c5.xlarge"
        weighted_capacity = 2
      }
    ]
    launch_specifications = {
      spot_specification = {
        allocation_strategy      = "capacity-optimized"
        block_duration_minutes   = 0
        timeout_action           = "SWITCH_TO_ON_DEMAND"
        timeout_duration_minutes = 5
      }
    }
  }

  ebs_root_volume_size = 64
  ec2_attributes = {
    # Subnets should be private subnets and tagged with
    # { "for-use-with-amazon-emr-managed-policies" = true }
    subnet_ids = ["subnet-abcde012", "subnet-bcde012a", "subnet-fghi345a"]
  }
  vpc_id = "vpc-1234556abcdef"

  scale_down_behavior    = "TERMINATE_AT_TASK_COMPLETION"
  step_concurrency_level = 3
  termination_protection = false
  visible_to_all_users   = true

  tags = {
    Terraform   = "true"
    Environment = "dev"
  }
}