Cluster
Controls if resources should be created (affects nearly all resources).
Name of the job flow.
Region where the resource(s) will be managed. Defaults to the Region set in the provider configuration.
Release label for the Amazon EMR release.
Map of release label filters used to look up a release label. Each entry accepts application (optional string) and prefix (optional string).
A case-insensitive list of applications for Amazon EMR to install and configure when launching the cluster.
JSON string for supplying list of configurations for the EMR cluster.
List of configurations supplied for the EMR cluster you are creating. Supply a configuration object for applications to override their default configuration.
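As a rough illustration of the JSON-string form, the sketch below builds a configuration list with Terraform's `jsonencode` function. The local name `example_configurations_json` and the Spark property are assumptions chosen for illustration; the `Classification`/`Properties` keys follow the usual Amazon EMR configuration object shape.

```hcl
locals {
  # Hypothetical local; pass the resulting string to the JSON configurations input.
  example_configurations_json = jsonencode([
    {
      Classification = "spark-defaults"
      Properties = {
        # Example override of an application default (illustrative value)
        "spark.dynamicAllocation.enabled" = "true"
      }
    }
  ])
}
```

Building the string with `jsonencode` rather than a heredoc keeps the JSON syntactically valid and lets Terraform expressions be interpolated into the values.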
JSON string for selecting additional features such as adding proxy information. Note: there is currently no API to retrieve the value of this argument after the EMR cluster is created, so Terraform cannot detect drift if the value is changed outside Terraform.
Ordered list of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Each object accepts:
name (string, required); path (string, required); args (list(string), optional)
Steps to run when creating the cluster. Each object accepts:
action_on_failure (string, required); name (string, required); hadoop_jar_step (object, optional) with jar, args, main_class, properties
List of step states used to filter returned steps.
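The bootstrap action and step objects described above might be shaped as in the following sketch. The local names, the S3 paths, and the step arguments are placeholders invented for illustration; only the field names come from the descriptions above.

```hcl
locals {
  # Hypothetical locals; bucket and script names are placeholders.
  example_bootstrap_actions = [
    {
      name = "install-deps"
      path = "s3://example-bucket/bootstrap/install-deps.sh"
      args = ["--verbose"] # optional
    }
  ]

  example_steps = [
    {
      name              = "example-spark-job"
      action_on_failure = "CONTINUE" # keep the cluster running if this step fails
      hadoop_jar_step = {
        jar  = "command-runner.jar"
        args = ["spark-submit", "s3://example-bucket/jobs/job.py"]
      }
    }
  ]
}
```

Bootstrap actions run in list order before Hadoop starts, so ordering within the list matters when one script depends on another.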
Instance configuration
Configuration block to use an Instance Fleet for the master node type. Cannot be specified if any master_instance_group configuration blocks are set. Accepts:
name (optional string); target_on_demand_capacity (optional number); target_spot_capacity (optional number); instance_type_configs (optional list of objects); launch_specifications (optional object with on_demand_specification and spot_specification)
Configuration block to use an Instance Fleet for the core node type. Cannot be specified if any core_instance_group configuration blocks are set. Accepts the same structure as master_instance_fleet.
Configuration block to use an Instance Fleet for the task node type. Cannot be specified if any task_instance_group configuration blocks are set. Accepts the same structure as master_instance_fleet.
Configuration block to use an Instance Group for the master node type. Accepts:
instance_type (string, required); bid_price (optional string); instance_count (optional number); name (optional string); ebs_config (optional list of objects)
Configuration block to use an Instance Group for the core node type. Accepts:
instance_type (string, required); autoscaling_policy (optional string); bid_price (optional string); instance_count (optional number); name (optional string); ebs_config (optional list of objects)
Configuration block to use an Instance Group for the task node type. Accepts:
instance_type (string, required); autoscaling_policy (optional string); bid_price (optional string); configurations_json (optional string); ebs_optimized (optional bool, default: true); instance_count (optional number); name (optional string); ebs_config (optional list of objects)
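A minimal sketch of the three instance group objects is shown below, written as locals so the shapes can be checked in isolation. The instance types, counts, and bid price are illustrative assumptions, not recommendations.

```hcl
locals {
  # Illustrative values only; instance types, counts, and names are assumptions.
  master_instance_group = {
    name           = "master"
    instance_type  = "m5.xlarge"
    instance_count = 1
  }

  core_instance_group = {
    name           = "core"
    instance_type  = "m5.xlarge"
    instance_count = 2
  }

  task_instance_group = {
    name           = "task"
    instance_type  = "m5.xlarge"
    instance_count = 2
    bid_price      = "0.10" # Spot bid, given as a string per the field type above
    ebs_optimized  = true
  }
}
```

Because a fleet and a group cannot be set for the same node type, a configuration like this would leave the corresponding instance fleet inputs unset.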
Node settings
Size in GiB of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Available in Amazon EMR version 4.x and later.
Custom Amazon Linux AMI for the cluster (instead of an EMR-owned AMI). Available in Amazon EMR version 5.7.0 and later.
Amazon Linux release for all nodes in a cluster launch RunJobFlow request. If not specified, Amazon EMR uses the latest validated Amazon Linux release for cluster launch.
Attributes for the EC2 instances running the job flow. Accepts:
additional_master_security_groups (optional string); additional_slave_security_groups (optional string); emr_managed_master_security_group (optional string); emr_managed_slave_security_group (optional string); instance_profile (optional string); key_name (optional string); service_access_security_group (optional string); subnet_id (optional string); subnet_ids (optional list(string))
Cluster behavior
Way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized.
Number of steps that can be executed concurrently. You can specify a maximum of 256 steps. Only valid for EMR clusters with release_label 5.28.0 or greater. Defaults to 1.
Whether to keep the cluster running when it has no steps or when all steps are complete (default is on).
Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default value is true.
Whether termination protection is enabled (default is false, except when using multiple master nodes). If termination protection is enabled, this configuration must first be applied with the value set to false before the resource can be destroyed.
Whether Amazon EMR should gracefully replace core nodes that have degraded within the cluster.
An auto-termination policy for an Amazon EMR cluster. Accepts:
idle_timeout (optional number): amount of idle time in seconds after which a cluster automatically terminates.
Compute limit configuration for a Managed Scaling Policy. Accepts:
maximum_capacity_units (number, required); minimum_capacity_units (number, required); unit_type (string, required); maximum_core_capacity_units (optional number); maximum_ondemand_capacity_units (optional number); scaling_strategy (optional string); utilization_performance_index (optional number)
The specified placement group configuration. Each object accepts:
instance_role (string, required); placement_strategy (optional string)
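The auto-termination, managed scaling, and placement group objects could look roughly like the sketch below. The local names and all numeric values are assumptions for illustration; `Instances`, `MASTER`, and `SPREAD` are values accepted by the underlying EMR API, but whether they suit a given cluster depends on its configuration.

```hcl
locals {
  # Hypothetical locals; values are illustrative only.
  example_auto_termination_policy = {
    idle_timeout = 60 * 60 # seconds, i.e. terminate after one idle hour
  }

  example_managed_scaling_policy = {
    unit_type                   = "Instances"
    minimum_capacity_units      = 2
    maximum_capacity_units      = 10
    maximum_core_capacity_units = 4 # optional cap on core nodes
  }

  example_placement_group_config = [
    {
      instance_role      = "MASTER"
      placement_strategy = "SPREAD"
    }
  ]
}
```

Writing the timeout as an expression keeps the unit (seconds) visible at the point of use.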
Logging and security
S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created.
AWS KMS customer master key (CMK) key ID or ARN used for encrypting log files. This attribute is only available with EMR version 5.30.0 and later, excluding EMR 6.0.0.
Security configuration to create, or attach if create_security_configuration is false. Only valid for EMR clusters with release_label 4.8.0 or greater.
Name of the security configuration to create, or attach if create_security_configuration is false. Only valid for EMR clusters with release_label 4.8.0 or greater.
Determines whether security_configuration_name is used as a prefix.
Determines whether a security configuration is created.
Kerberos configuration for the cluster. Accepts:
kdc_admin_password (string, required); realm (string, required); ad_domain_join_password (optional string); ad_domain_join_user (optional string); cross_realm_trust_principal_password (optional string)
Identifies whether the cluster is created in a private subnet.
IAM
Determines whether the service IAM role should be created.
The ARN of an existing IAM role to use for the service.
Name to use on the IAM role created.
Description of the role.
Map of IAM policies to attach to the service role.
Name to use on the IAM policy created.
Description of the policy.
Determines whether the autoscaling IAM role should be created.
The ARN of an existing IAM role to use for autoscaling.
Name to use on the IAM role created.
Description of the role.
Determines whether the EC2 IAM role/instance profile should be created.
Name to use on the EC2 IAM role/instance profile created.
Description of the EC2 IAM role/instance profile.
Map of IAM policies to attach to the EC2 IAM role/instance profile.
The ARN of an existing IAM role to use if passing in a custom instance profile and creating a service role.
IAM role path.
ARN of the policy that is used to set the permissions boundary for the IAM role.
Determines whether the IAM role name is used as a prefix.
A map of additional tags to add to the IAM role created.
Security groups
The ID of the Amazon Virtual Private Cloud (Amazon VPC) where the security groups will be created.
Determines whether managed security groups are created.
Name to use on the managed security groups created. Note: -master, -slave, and -service will be appended to this name to distinguish them.
Determines whether the security group name (security_group_name) is used as a prefix.
A map of additional tags to add to the security group created.
Description of the master security group created.
Security group ingress rules to add to the master security group. Each rule accepts cidr_ipv4, cidr_ipv6, description, from_port, to_port, ip_protocol, prefix_list_id, referenced_security_group_id, reference_slave_security_group, name, and tags.
Security group egress rules to add to the master security group. Each rule accepts the same fields as ingress rules.
Description of the slave security group created.
Security group ingress rules to add to the slave security group. Each rule accepts cidr_ipv4, cidr_ipv6, description, from_port, to_port, ip_protocol, prefix_list_id, referenced_security_group_id, reference_master_security_group, name, and tags.
Security group egress rules to add to the slave security group. Each rule accepts the same fields as ingress rules.
Description of the service access security group created.
Security group ingress rules to add to the service access security group. Each rule accepts cidr_ipv4, cidr_ipv6, description, from_port, to_port, ip_protocol, prefix_list_id, referenced_security_group_id, reference_master_security_group, name, and tags.
Security group egress rules to add to the service access security group. Each rule accepts the same fields as ingress rules.
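To make the rule fields concrete, here is a sketch of one ingress rule allowing SSH to the master security group. It assumes the rules are supplied as a map keyed by an arbitrary rule name; that container shape, the local name, and the CIDR are assumptions, and only the field names are taken from the descriptions above.

```hcl
locals {
  # Hypothetical local; the map-of-objects shape is an assumption.
  example_master_ingress_rules = {
    ssh_from_vpc = {
      description = "SSH from inside the VPC"
      cidr_ipv4   = "10.0.0.0/16" # placeholder CIDR
      ip_protocol = "tcp"
      from_port   = 22
      to_port     = 22
    }
  }
}
```

Egress rules accept the same fields, so an equivalent object could be reused for outbound traffic with the CIDR and ports adjusted.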
Tags
A map of tags to add to all resources.