Streamlining Your Infrastructure: Unlocking the Power of Terraform Cloud Agents

Managing and deploying infrastructure can be a complex and time-consuming task in today's rapidly evolving technology landscape. Terraform is a great tool to automate, scale and optimise that infrastructure: simply plug in the provider you need and start managing its resources via Terraform.

Terraform Cloud is offered by HashiCorp as a managed service, built on top of the widely adopted open-source tool Terraform. It provides a centralised and collaborative platform which allows teams to optimise workflows and increase productivity.

With the introduction of Terraform Cloud Agents, we can seamlessly and securely configure and orchestrate resources across our environments. These are purpose-built, lightweight agents that act as a bridge between Terraform Cloud and our infrastructure. Using these agents reduces exposure to external threats, as they operate within our private network while still allowing us to streamline the automation process.

In this article, we will look at how to use these agents to deploy K8s packages or Helm charts using Terraform Cloud on an EKS cluster that is hosted in a private network. So, let’s dive in!


Deploying Agent

We will deploy these agents as a service on ECS Fargate on AWS to minimise cost and simplify their management and scaling. A fully functional VPC with private subnets is required to deploy the agents, so please create one if you don’t already have it.

Step 1: Launching Cluster

First things first, let’s create an ECS cluster in the same VPC where our EKS cluster is hosted or in a VPC from which we can reach the EKS cluster over private IP.

Note: Before running the AWS CLI commands please make sure to configure credentials.
aws ecs create-cluster --cluster-name tfc-agents --capacity-providers FARGATE_SPOT

Fig 1. ECS Cluster

Since this is only a tutorial, I’m using Fargate Spot as the capacity provider to further cut down the cost of running Terraform Cloud Agents. Keep in mind that Fargate Spot tasks can be interrupted when AWS needs the capacity back.

The next step is to create a task execution role.

Step 2: Creating an IAM Role

tfc-agent-role-assume-policy.json

{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Effect": "Allow",
         "Principal": {
            "Service": [
               "ecs-tasks.amazonaws.com"
            ]
         },
         "Action": "sts:AssumeRole",
         "Condition": {
            "ArnLike": {
                "aws:SourceArn": "arn:aws:ecs:REGION:ACCOUNT_ID:*"
            },
            "StringEquals": {
               "aws:SourceAccount": "ACCOUNT_ID"
            }
         }
      }
   ]
}
Note: Make sure to replace REGION and ACCOUNT_ID with their actual values.
aws iam create-role --role-name tfc-agents --assume-role-policy-document file://tfc-agent-role-assume-policy.json
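
If you would rather script the placeholder substitution than edit the policy file by hand, a minimal sketch could look like the one below. The function name render_trust_policy and the template file name in the usage comment are hypothetical, and the sketch assumes that the strings REGION and ACCOUNT_ID appear in the file only as placeholders.

```shell
# Hypothetical helper, not part of the AWS CLI: reads a policy template on
# stdin and replaces the REGION and ACCOUNT_ID placeholders.
#   $1 = AWS region, $2 = 12-digit AWS account ID
render_trust_policy() {
  sed -e "s/REGION/$1/g" -e "s/ACCOUNT_ID/$2/g"
}

# Usage, to be run before the create-role command (file names are assumptions):
#   ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
#   render_trust_policy eu-west-1 "$ACCOUNT_ID" \
#     < trust-policy.template.json > tfc-agent-role-assume-policy.json
```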

Now, let’s attach an AWS managed policy that allows ECS to pull container images and write logs to CloudWatch on behalf of the task.

aws iam attach-role-policy --role-name tfc-agents --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

Fig 2. IAM Role

Step 3: Creating SSM Parameter

Terraform Cloud Agent requires a token to authenticate itself. Instead of hardcoding it in an environment variable of the container definition, we will store it in SSM Parameter Store and reference it from the task definition. ECS will then fetch the value during container startup and inject it into the environment.

To create a token for Terraform Cloud Agent:

  • Go to your Terraform Cloud organization's settings, click Agents, and then click Create agent pool.

  • Provide an Agent Pool Name, and then click Continue.

  • Enter a token Description, and finally click Create Token.

Make sure to copy the token before clicking the Finish button, as we need it in the next step and it cannot be retrieved once you move away from this page.

Fig 3. Terraform Cloud Agent Pool

aws ssm put-parameter --name "/secret/terraform/agent-token" --type "SecureString" --value "xxxxxx"
Note: Make sure to replace xxxxxx in the above CLI command with the actual token value generated earlier in Terraform Cloud.

Keep the SSM Parameter ARN handy because we are going to need it in the following steps.
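
As a rough sketch, the ARN can also be assembled from its parts instead of being copied from the console. The helper name ssm_parameter_arn is hypothetical, and the sketch assumes the standard aws partition (not China or GovCloud regions).

```shell
# Hypothetical helper: builds the ARN of an SSM parameter from its parts.
#   $1 = AWS region, $2 = AWS account ID, $3 = parameter name
# The leading slash of the parameter name is not repeated in the ARN.
# Assumes the standard "aws" partition.
ssm_parameter_arn() {
  printf 'arn:aws:ssm:%s:%s:parameter/%s\n' "$1" "$2" "${3#/}"
}

# Alternatively, ask AWS for the ARN (requires configured credentials):
#   aws ssm get-parameter --name "/secret/terraform/agent-token" \
#     --query "Parameter.ARN" --output text
```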

Fig 4. Terraform Cloud Agent Token

Step 4: Creating Task Definition

To launch a service on ECS we need a task definition. So, let’s create that.

tfc-agents-task-definition.json

{
  "containerDefinitions": [
    {
      "name": "tfc-agent",
      "image": "hashicorp/tfc-agent:latest",
      "cpu": 512,
      "memory": 2048,
      "essential": true,
      "environment": [
        {
          "name": "TFC_ADDRESS",
          "value": "https://app.terraform.io"
        },
        {
          "name": "TFC_AGENT_NAME",
          "value": "tfc-agent"
        }
      ],
      "secrets": [
        {
          "name": "TFC_AGENT_TOKEN",
          "valueFrom": "SSM_PARAMETER_ARN"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-create-group": "true",
          "awslogs-group": "/aws/fargate/service/tfc-agent-ecs-demo",
          "awslogs-region": "AWS_REGION",
          "awslogs-stream-prefix": "ecs-demo"
        }
      }
    }
  ]
}
Note: Before registering the task, make sure to replace the placeholders SSM_PARAMETER_ARN and AWS_REGION with their actual values. SSM_PARAMETER_ARN is the ARN of the SSM parameter that we created above. TFC_ADDRESS in the container definition is set to https://app.terraform.io; if you are running Terraform Enterprise, replace app.terraform.io with the domain you use to access it.
aws ecs register-task-definition --family tfc-agents --execution-role-arn arn:aws:iam::ACCOUNT_ID:role/tfc-agents --network-mode awsvpc --requires-compatibilities FARGATE --cpu 512 --memory 2048 --cli-input-json file://tfc-agents-task-definition.json
Note: When executing the command don't forget to replace ACCOUNT_ID with its actual value.

Fig 5. ECS Task Definition

Great! Now that we have the task definition, we can launch the service. Just before that, we need to create an IAM policy for the task execution role. It lets the role read the value from the SSM Parameter, without which the agent will never be able to authenticate with Terraform Cloud, and allows the task to create the CloudWatch log group if it does not exist.

tfc-agent-inline-policy.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters"
      ],
      "Resource": [
        "SSM_PARAMETER_ARN"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
Note: Make sure to replace SSM_PARAMETER_ARN with its actual value. Also, if you have used a custom KMS key for encryption, don't forget to include the kms:Decrypt permission with the ARN of the key in its Resource element.
aws iam put-role-policy --role-name tfc-agents --policy-name additional-access --policy-document file://tfc-agent-inline-policy.json
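
For the custom KMS key case mentioned in the note, the policy document could be generated rather than edited by hand. This is only a sketch: the function name render_inline_policy is hypothetical, and the extra statement assumes that kms:Decrypt on the key that encrypts the parameter is all that is needed.

```shell
# Hypothetical helper: prints the inline policy for the task execution role.
#   $1 = ARN of the SSM parameter
#   $2 = ARN of a customer managed KMS key (optional)
render_inline_policy() {
  kms_statement=""
  if [ -n "${2:-}" ]; then
    # Only added when a key ARN is passed as the second argument.
    kms_statement=",
    { \"Effect\": \"Allow\", \"Action\": [\"kms:Decrypt\"], \"Resource\": [\"$2\"] }"
  fi
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["ssm:GetParameters"], "Resource": ["$1"] },
    { "Effect": "Allow", "Action": ["logs:CreateLogGroup"], "Resource": ["*"] }${kms_statement}
  ]
}
EOF
}

# Usage (both ARNs are placeholders):
#   render_inline_policy SSM_PARAMETER_ARN KMS_KEY_ARN > tfc-agent-inline-policy.json
```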

Fig 6. IAM Role Inline Policy

Step 5: Launching Terraform Cloud Agent

Before running Terraform Cloud Agent as a Fargate service, we need to create a security group that will allow the agent to connect to Terraform Cloud on port 443.

Note: It is recommended to run the Terraform Cloud Agents in a private subnet because they don't need any ingress connectivity. The agents constantly poll for new jobs and hence only need egress connectivity to Terraform Cloud.
# Creating security group
aws ec2 create-security-group --group-name tfc-agents-sg --description "Security Group for Terraform Cloud Agents" --vpc-id VPC_ID

# Adding egress rule
aws ec2 authorize-security-group-egress --group-id SECURITY_GROUP_ID --protocol tcp --port 443 --cidr 0.0.0.0/0
Note: Please make sure to replace the VPC_ID placeholder with the ID of the VPC in which you want to create the security group and launch the Fargate service. After creating the security group, copy its ID, as you need to substitute it for the SECURITY_GROUP_ID placeholder. Also note that a new security group allows all outbound traffic by default, so remove that default rule if you want egress limited to port 443.
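
Instead of copying the security group ID by hand, it can be captured with the CLI's --query option. The sketch below wraps the two commands above in a hypothetical function named create_agent_security_group; it assumes configured credentials and that no group named tfc-agents-sg exists in the VPC yet.

```shell
# Hypothetical helper: creates the agent security group in the given VPC,
# adds the HTTPS egress rule and prints the new group ID.
#   $1 = VPC ID
create_agent_security_group() {
  sg_id="$(aws ec2 create-security-group \
    --group-name tfc-agents-sg \
    --description "Security Group for Terraform Cloud Agents" \
    --vpc-id "$1" \
    --query GroupId --output text)" || return 1

  # The rule details printed by this call are discarded so that the
  # function outputs nothing but the group ID.
  aws ec2 authorize-security-group-egress \
    --group-id "$sg_id" --protocol tcp --port 443 --cidr 0.0.0.0/0 \
    > /dev/null || return 1

  printf '%s\n' "$sg_id"
}

# Usage (VPC_ID is a placeholder):
#   SECURITY_GROUP_ID="$(create_agent_security_group VPC_ID)"
```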

Fig 7. Security Group

Note: If you have deployed a DNS firewall with a whitelisting approach, you will have to whitelist a few domains for the agent to communicate with Terraform Cloud: app.terraform.io, registry.terraform.io, releases.hashicorp.com and archivist.terraform.io.

Now, let’s launch the agents.

aws ecs create-service --cluster tfc-agents --service-name tfc-agents --task-definition tfc-agents --desired-count 2 --capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=1 --platform-version LATEST --network-configuration "awsvpcConfiguration={subnets=[PVT_SUBNET_ID_1,PVT_SUBNET_ID_2],securityGroups=[SECURITY_GROUP_ID]}"
Note: Remember to replace SECURITY_GROUP_ID with the ID of the security group that we created above and PVT_SUBNET_ID_1 and PVT_SUBNET_ID_2 with the actual IDs of private subnets in your VPC. The service is created with --capacity-provider-strategy rather than --launch-type FARGATE so that the tasks actually run on Fargate Spot; the two options cannot be combined.

Fig 8. ECS Service

Once the service is in the active state, you should see the agents listed in the Terraform Cloud Agents console.
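
To check from the command line whether the tasks have come up, something along these lines may help. The function name agents_running is hypothetical; the sketch assumes configured credentials and simply compares the desired and running task counts of the service.

```shell
# Hypothetical helper: succeeds when the service runs as many tasks as desired.
#   $1 = cluster name, $2 = service name
agents_running() {
  counts="$(aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].[desiredCount,runningCount]' --output text)" || return 1
  # Split "desired<TAB>running" into the positional parameters $1 and $2.
  set -- $counts
  case "${1:-}" in
    '' | *[!0-9]*) return 1 ;;   # service not found or unexpected output
  esac
  [ "$1" = "${2:-}" ]
}

# Usage:
#   agents_running tfc-agents tfc-agents && echo "all agents are running"
# The CLI also ships a waiter that blocks until the service is stable:
#   aws ecs wait services-stable --cluster tfc-agents --services tfc-agents
```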

Fig 9. Terraform Self-hosted Agent

Finally, we need to connect the agent pool to a workspace so that the workspace uses the self-hosted agents for its Terraform runs. To do so, go to the Settings page of the workspace you want to associate the agents with and, under General Settings, scroll down to Execution Mode, select Agent and choose the agent pool you want to associate. Don’t forget to save the settings.
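
The same setting can presumably be applied through the Terraform Cloud workspaces API instead of the console. The sketch below only builds the request body; the function name agent_mode_payload is hypothetical, ORG, WORKSPACE, AGENT_POOL_ID and TFC_API_TOKEN are placeholders, and the attribute names should be verified against the current API documentation before use.

```shell
# Hypothetical helper: prints a JSON:API payload that switches a workspace
# to agent execution mode.
#   $1 = agent pool ID
agent_mode_payload() {
  printf '{"data":{"type":"workspaces","attributes":{"execution-mode":"agent","agent-pool-id":"%s"}}}\n' "$1"
}

# Usage (all upper-case names are placeholders):
#   agent_mode_payload "$AGENT_POOL_ID" | curl --silent --request PATCH \
#     --header "Authorization: Bearer $TFC_API_TOKEN" \
#     --header "Content-Type: application/vnd.api+json" \
#     --data @- \
#     "https://app.terraform.io/api/v2/organizations/ORG/workspaces/WORKSPACE"
```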

Fig 10. Terraform Cloud Agent Association

Fantastic! We have now streamlined our automation to deploy Helm charts or K8s packages to our privately hosted EKS cluster.

Note: The Terraform Cloud Agents must be able to communicate with the EKS cluster to deploy resources. Make sure the security group of the EKS cluster allows inbound connections from the agents. If the agents and the EKS cluster are in different subnets, make sure the NACLs allow traffic to flow between them, and if they are in separate VPCs, make sure peering is set up between the two.
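
One way to open that path is an ingress rule on the cluster's security group that references the agents' security group. This is a sketch under assumptions: the function name allow_agents_to_eks is hypothetical, the EKS API endpoint is assumed to listen on port 443, and both groups are assumed to live in the same VPC (across VPCs, a CIDR-based rule may be needed instead).

```shell
# Hypothetical helper: lets the agents' security group reach the EKS API
# endpoint over HTTPS.
#   $1 = security group ID of the EKS cluster
#   $2 = security group ID of the agents
allow_agents_to_eks() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --ip-permissions "IpProtocol=tcp,FromPort=443,ToPort=443,UserIdGroupPairs=[{GroupId=$2}]"
}

# Usage (both IDs are placeholders):
#   allow_agents_to_eks EKS_SECURITY_GROUP_ID SECURITY_GROUP_ID
```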

Covering the basics

  • Terraform Cloud (TFC) is a multi-tenant managed cloud service, whereas Terraform Enterprise (TFE) is a privately hosted, self-managed offering by HashiCorp to manage Terraform runs centrally.

  • Terraform Cloud has tons of amazing features. To start with, it allows you to manage runs remotely and centrally, store state with versioning support and connect a workspace to version control.

  • Under the hood, Terraform's AWS provider uses the AWS SDK for Go along with the credentials you provide to authenticate and authorise the API calls.


Vimal Paliwal

Vim is a DevSecOps Practitioner with over seven years of professional experience. Over the years, he has architected and implemented full-fledged solutions for clients using AWS, K8s, Terraform, Python, Shell, Prometheus, etc., keeping security as the utmost priority. Alongside this, in more than two years as an AWS Authorised Instructor, he has trained thousands of professionals from organisations ranging from startups to Fortune companies.
