Terraform – TECHNICLOUD

Our team at Rubrik uses Terraform extensively to manage our infrastructure as code. This means that our infrastructure configurations are version controlled and resources are provisioned in an automated fashion through CI/CD workflows. Because it’s a customer-zero environment, we’re constantly evaluating new tools to find better ways to manage and scale the environment. This led us to try out Terraform Cloud.

Easy collaboration is the name of the game with Terraform Cloud. It offers team-oriented remote execution and is consumed as a SaaS platform. In this post, I’ll cover remote state management, cost estimation, and collaboration with Terraform Cloud.

Remote State Management

State files capture the existing state of provisioned infrastructure for a specific workspace. State files are stored on the local machine by default. This becomes unwieldy when the rest of the team is involved.

Remote state management is a design consideration with which we’ve extensively experimented. My colleague, Chris Wahl, has written about using Amazon S3 to store state, which is how we have historically managed state. This resembles the following:

terraform {
  backend "s3" {
    bucket = "technicloud-bucket-tfstate"
    key    = "dev/terraform.tfstate"
    region = "us-east-1"
  }
}

Using Terraform Cloud to manage remote state resembles the following:

terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "technicloud"

    workspaces {
      name = "scaling-compute"
    }
  }
}

With Terraform Cloud, the state file is abstracted from the user; it exists but is secured and managed by the platform. This allows for granular access control, versioning, and backup, so I’m able to review previous points in time. Amazon S3 can provide these same features, but it takes quite a bit more effort to set up. For example, remote state management with Terraform Cloud provides integrated locking, eliminating the need to spin up a DynamoDB table.
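
To make the extra effort concrete, here is a rough sketch of what the S3 backend might look like once encryption and locking are added. The DynamoDB table name is a placeholder I made up for illustration; the table has to be created separately, with a string partition key named LockID.

```hcl
terraform {
  backend "s3" {
    bucket = "technicloud-bucket-tfstate"
    key    = "dev/terraform.tfstate"
    region = "us-east-1"

    # Encrypt the state object at rest in S3.
    encrypt = true

    # Hypothetical table name. The table must already exist and use a
    # string partition key named "LockID" for state locking to work.
    dynamodb_table = "technicloud-tfstate-lock"
  }
}
```

Bucket versioning and access policies would still need to be configured on the bucket itself, which is the part Terraform Cloud handles for you.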

Terraform Cloud enables teams to easily collaborate asynchronously by using the platform as remote state file storage.
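
One way this collaboration can play out is a second configuration reading outputs from a workspace that another teammate manages. The following is a minimal sketch using the terraform_remote_state data source; it assumes the scaling-compute workspace exposes an output named instance_ids, which is a name I invented for the example.

```hcl
# Read the outputs of the "scaling-compute" workspace in Terraform Cloud.
data "terraform_remote_state" "compute" {
  backend = "remote"

  config = {
    organization = "technicloud"
    workspaces = {
      name = "scaling-compute"
    }
  }
}

# "instance_ids" is a hypothetical output of the remote workspace.
output "compute_instance_ids" {
  value = data.terraform_remote_state.compute.outputs.instance_ids
}
```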

Cost Estimation

A very cool feature that stood out was the cost estimation, which displayed an approximate monthly cost with each workflow run. This is particularly beneficial to me because we use Terraform to deploy resources across all three major cloud service providers. Holistic billing management across multiple clouds has long plagued me:

For 59 days I have been focused on migrating and consolidating billing for AWS, GCP, and Azure.

I apologize for whatever I did to deserve this. pic.twitter.com/s0mRdruDIa
— Rebecca Fitzhugh (@RebeccaFitzhugh) August 23, 2019


This standard interface provides a valuable way for our team to analyze, report on, and visualize cloud spend across cloud providers.

While this alone does not give a complete picture of our monthly bill, it certainly helps us be mindful of cost when testing and building demos. We are regularly building demos to showcase our product’s cloud functionality; this process consists of design time spent architecting a solution and then usually a lot of prototyping to get the demo perfect. The prototyping phase consists of deploying and destroying resources numerous times, which can quickly rack up a big bill when not paying attention to cost.

However, the Terraform Cloud Cost Estimation API provides a lot of granular data that can be pulled into our central billing dashboard. This helps us be mindful of the monthly costs of operating our cloud environment. Using this data, we decided to use demo leases of 4 hours to help minimize demo costs; after 4 hours, the resources are stopped. This helps us keep central IT off our backs 🙂

Team Collaboration

Terraform Cloud offers a number of collaboration features to help teams easily work together. Our team prioritizes making our code as reusable as possible; we regularly write modules that fit our design specifications and use cases. The Private Module Registry allows us to easily share the different use case modules that we’ve built.
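
Consuming a module from the Private Module Registry looks much like using a public one, except that the source includes the Terraform Cloud hostname and organization. In this sketch, the module name (wordpress), its version, and its instance_type input are all placeholders rather than a module we actually publish.

```hcl
module "wordpress" {
  # Private registry sources follow <HOSTNAME>/<ORGANIZATION>/<NAME>/<PROVIDER>.
  source  = "app.terraform.io/technicloud/wordpress/aws"
  version = "1.0.0"

  # Hypothetical input variable defined by the module.
  instance_type = "t2.micro"
}
```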

There’s also multi-tenancy: you can create and manage multiple teams and organizations and enforce Role-Based Access Control (RBAC) across the different workspaces. Moreover, you can manage Terraform Cloud configurations using Terraform.

Here’s an example of using the Terraform Cloud provider to create an organization, workspace, team, and permissions:

# Create the Terraform Cloud Organization
resource "tfe_organization" "technicloud" {
  name  = "technicloud"
  email = "rebecca@technicloud.com"
}

# Create the Technicloud Workspace
resource "tfe_workspace" "technicloud-wordpress" {
  name         = "technicloud-wordpress"
  organization = tfe_organization.technicloud.id
}

# Add Web Dev Team
resource "tfe_team" "web-dev" {
  name         = "technicloud-web-dev"
  organization = tfe_organization.technicloud.id
}

# Add User to Web Dev Team
resource "tfe_team_member" "user1" {
  team_id  = tfe_team.web-dev.id
  username = "rfitzhugh"
}

# Grant the Web Dev Team plan access to the workspace
resource "tfe_team_access" "test" {
  access       = "plan"
  team_id      = tfe_team.web-dev.id
  workspace_id = tfe_workspace.technicloud-wordpress.id
}
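
The resources above need a configured tfe provider before they can be applied. A minimal sketch of that configuration might look like the following; the tfe_token variable name is my own choice, and the provider can also pick the token up from the TFE_TOKEN environment variable instead.

```hcl
variable "tfe_token" {
  description = "Terraform Cloud API token (hypothetical variable name)"
  type        = string
}

provider "tfe" {
  hostname = "app.terraform.io"
  token    = var.tfe_token
}
```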


You can find the above code sample on GitHub.

Summary

In this post I reviewed a handful of compelling Terraform Cloud features: remote state management, cost estimation, and collaboration. Consider using Terraform Cloud for state storage and collaboration (especially the Private Module Registry); it’s free for small teams (up to 5)! Since we do not yet use Sentinel, I did not get a chance to test Sentinel policies with Terraform Cloud, but I hope to implement them soon.

If you have any questions, please reach out to me on Twitter.

Hello, Terraform

At work, my team owns and maintains a large lab environment for the development and testing of Rubrik Build projects. It was built in a hurry, causing some of our original design principles to be compromised. My team and I have decided to use this no-travel period as an opportunity to redesign and redeploy our lab environment. I will review our design in a later post.

One of our design goals is to leverage infrastructure as code principles (where possible). The team’s primary tool of choice for provisioning is Terraform.

Terraform allows us to define what resources we need in a declarative manner, where we simply define the end state needed for our infrastructure. Here are a few reasons why we like using Terraform:

Multi-platform, similar operations across a number of providers
Easy provisioning and deprovisioning of resources
Idempotent, saves current state as a file
Detects diffs from current state when applying changes

This post will dive into Terraform syntax, architecture, and operations.

Terraform Syntax

The low-level syntax of Terraform is defined in HashiCorp Configuration Language (HCL). The following example shows a generic configuration code block for Terraform:

command_type "provider_resource_label" "resource_label" {
  argument_name = "argument_value"
  argument_name = "argument_value"
}

Let’s dig into the syntax:

Command: the block type. The resource block type tells Terraform you want to create a resource, such as an S3 bucket or an EC2 instance.
Provider Resource Label: this is the type of resource you want to create. The resource type name is defined by the provider. For example, you may use aws_instance to provision an EC2 instance using the AWS provider.
Resource Label: this is what you want to call the resource within your Terraform configuration. The combination of resource type and label must be unique within the module, as it is used later when referencing the resource.
Arguments: these allow you to specify configuration details for the resource being provisioned. Each is defined as an argument name and an argument value. As an example, when provisioning an EC2 instance, you may want to specify which AMI is used.

Note that single-line comments beginning with # or // are supported, as are multi-line comments delimited by /* and */.
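
As a quick illustration of the comment styles, here is a small sketch; the bucket resource and its name are placeholders used only to give the comments something to sit next to.

```hcl
# Single-line comment (the most common style).
// Alternative single-line comment.

/*
  Multi-line comment that can span
  several lines.
*/
resource "aws_s3_bucket" "example" {
  bucket = "my-example-bucket" # hypothetical bucket name
}
```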

To put these concepts together, an example configuration code block may resemble:

resource "aws_instance" "my-first-instance" {
  ami = "ami-008c6427c8facbe08"
  instance_type = "t2.micro"
  availability_zone = "us-west-2c"
  
  tags = {
    Name = "my-first-instance"
    Environment = "test"
  }
}

This example provisions a single EC2 instance in the us-west-2c availability zone, using the specified AMI, and assigns the two tags.

Most of your Terraform configuration is written in these code blocks. Once you master this, then you’ll be able to quickly write and provision more resources.

Terraform Architecture

A typical Terraform module may have the following structure:

project-terraform-files
│
└─── terraform-module-example01
│   │   main.tf
│   │   variables.tf
│   │   terraform.tfvars
│   │   outputs.tf
│   
└─── terraform-module-example02
│   │   provider.tf
│   │   data-sources.tf
│   │   main.tf
│   │   variables.tf
│   │   terraform.tfvars
│   │   outputs.tf

The names of the files are not important; Terraform loads every configuration file ending in .tf within the directory.

Providers

A provider is the core construct that allows Terraform to interact with the APIs across various platforms (PaaS, IaaS, SaaS). Think of this as the translator between the platform API and the HCL syntax. Before you can begin provisioning resources, you must first define which platform to use by specifying the provider:

provider "aws" {
  region = "us-west-2"
}

Place the provider block in your main.tf file or create a separate provider.tf file.
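
If you are on Terraform 0.13 or later, you may also want to pin the provider source and version so that everyone on the team downloads a compatible plugin. This is a sketch only; the version constraint shown is illustrative rather than a recommendation.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0" # illustrative constraint; pin to the release you test against
    }
  }
}

provider "aws" {
  region = "us-west-2"
}
```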

Resources

I previously covered how to structure resource code blocks in the Terraform Syntax section.

This example defines the creation of an instance based off the defined AMI, sized as t2.micro, and properly tagged:

resource "aws_instance" "my-first-instance" {
  ami = "ami-008c6427c8facbe08"
  instance_type = "t2.micro"
  availability_zone = "us-west-2c"
  
  tags = {
    Name = "my-first-instance"
    Environment = "test"
  }
}

Define the desired outcome for your resources in the main.tf file.

Data Sources

Data sources enable you to reference resources that already exist outside of Terraform or are defined by a separate Terraform configuration. This allows you to extract information that can then be fed into a new resource. First, define the data source and then reference it as an argument value:

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_instance" "my-first-instance" {
  ami = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  availability_zone = "us-west-2c"
  
  tags = {
    Name = "my-first-instance"
    Environment = "test"
  }
}

In this example, I am again creating a new EC2 instance. However, this time I am gathering AMI information using a data source to find and use the latest Ubuntu version instead of manually defining that AMI value. This allows my configuration to be more flexible because I no longer need to manually find and input the appropriate AMI value.

A data source is declared similarly to resources, except that the information provided is used by Terraform to discover existing resources rather than provision. Once defined, data sources can be referenced repeatedly to pass information to new resources.
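
To show one data source feeding several places, here is a small sketch that looks up the availability zones in the current region and reuses the result in a resource and an output. It assumes an AWS provider is already configured; the volume and output names are made up for the example.

```hcl
# Discover the availability zones that are currently usable in the region.
data "aws_availability_zones" "available" {
  state = "available"
}

# Place a small volume in the first zone returned by the data source.
resource "aws_ebs_volume" "data" {
  availability_zone = data.aws_availability_zones.available.names[0]
  size              = 10 # GiB
}

# Reference the same data source again without declaring it twice.
output "first_availability_zone" {
  value = data.aws_availability_zones.available.names[0]
}
```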

Place the data source blocks in your main.tf file or create a separate data-sources.tf file.

Variables

To make your code more modular, you can choose to use variables instead of hard-coding values. Once defined, variables can be referenced:

provider "aws" {
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
  region = var.aws_region
}

I typically declare my essential variables in a separate variables.tf file. This may resemble:

variable "aws_access_key" {
  description = "AWS access key for authorization"
  type        = string
}

variable "aws_secret_key" {
  description = "AWS secret key for authorization"
  type        = string
}

variable "aws_region" {
  description = "AWS region in which resources will be provisioned"
  type        = string
  default     = "us-west-2"
}

In this example, I have declared a default value for the AWS region to be reused when provisioning the defined infrastructure. The descriptions are optional and for the developer’s benefit only, but I always recommend being kind to the next person using your code. Common variable types include string, number, bool, list, and map; in Terraform 0.12 and later, a variable with no type accepts any value. Variables can also be declared without a default, with their values set through environment variables or a .tfvars file.
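
For the collection types, a rough sketch might look like the following. The variable names and default values are invented for illustration, and the syntax assumes Terraform 0.12 or later.

```hcl
variable "availability_zones" {
  description = "Availability zones to deploy into (hypothetical variable)"
  type        = list(string)
  default     = ["us-west-2a", "us-west-2b", "us-west-2c"]
}

variable "instance_tags" {
  description = "Tags applied to every instance (hypothetical variable)"
  type        = map(string)
  default = {
    Environment = "test"
    Owner       = "technicloud"
  }
}

resource "aws_instance" "example" {
  ami               = "ami-008c6427c8facbe08"
  instance_type     = "t2.micro"
  availability_zone = var.availability_zones[0] # first element of the list
  tags              = var.instance_tags         # the whole map
}
```

A variable without a default can be supplied through an environment variable named TF_VAR_ followed by the variable name, for example TF_VAR_aws_region.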

Sometimes the variable definition may be specified as a default in the variables.tf file. Otherwise, this value should be defined by creating a file named terraform.tfvars, which allows variable values to persist across multiple executions. This is especially valuable for sensitive information such as secret keys.

For example, the contents of the terraform.tfvars file may resemble the following variable definition:

aws_access_key = "ABC0101010101CBA"
aws_secret_key = "abc87654321zyxw"
aws_region = "us-west-2"

Terraform automatically loads all files in the current directory with the exact name of terraform.tfvars or any variation of *.auto.tfvars. If the file is named something else, you can use the -var-file flag to specify a file name.

However, keep in mind that these persistent variable definitions often contain sensitive information, such as passwords or API tokens, and should be treated with care. Consider adding these files to your .gitignore file.

Outputs

Outputs can be used to display information needed or export information after Terraform completes a terraform apply command. An example output may resemble:

output "instance_id" {
  value = aws_instance.my-first-instance.id
}

You can save your output blocks in a dedicated file called outputs.tf.
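
Outputs can also carry a description and be marked as sensitive so that the value is hidden in the CLI output. This sketch assumes the aws_instance.my-first-instance resource from the earlier examples exists in the same configuration; the output names are my own.

```hcl
output "instance_public_ip" {
  description = "Public IP address of the example instance"
  value       = aws_instance.my-first-instance.public_ip
}

output "instance_private_ip" {
  description = "Private IP address, hidden from the CLI summary"
  value       = aws_instance.my-first-instance.private_ip
  sensitive   = true
}
```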

State

When you use Terraform to build resources, a state file is created that contains configuration information for the provisioned resources. This is what allows Terraform to determine which parts of the configuration have changed, and ultimately what provides idempotency: Terraform can tell that a resource already exists and does not create it again.

After the terraform apply command is executed, the working directory will contain two new files:

terraform.tfstate
terraform.tfstate.backup

Note: any manual changes made to Terraform-provisioned infrastructure will be overwritten by the next terraform apply.
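
If there are attributes you expect people to change by hand, one option is to tell Terraform to leave them alone with a lifecycle block. The following is a sketch of that approach applied to the example instance, ignoring manual edits to tags; whether that is appropriate depends on your team's conventions.

```hcl
resource "aws_instance" "my-first-instance" {
  ami           = "ami-008c6427c8facbe08"
  instance_type = "t2.micro"

  tags = {
    Name = "my-first-instance"
  }

  lifecycle {
    # Leave out-of-band edits to tags alone instead of reverting them.
    ignore_changes = [tags]
  }
}
```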

Modules

Terraform configuration files can be packaged as modules and used as building blocks to create new infrastructure resources without having to put forth much effort. Modules are available publicly in the Terraform registry, and can be directly added to configuration files for quickly provisioning resources.

If I were to use a pre-packaged module to provision an AWS S3 bucket, the code may resemble:

module "s3_bucket" {
  source = "terraform-aws-modules/s3-bucket/aws"
  bucket = "my-s3-bucket"
  acl    = "private"

  versioning = {
    enabled = true
  }
}

In this case, you are reusing the configurations specified by the module. All you need to input are the configuration values.
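
The same pattern works for modules you write yourself. As a rough sketch, a root configuration could call a local module by path and then surface one of its outputs; the module path, its inputs, and the instance_id output are all hypothetical here.

```hcl
# Root configuration: call a local module (hypothetical path and inputs).
module "web_server" {
  source = "./modules/web-server"

  instance_type = "t2.micro"
  environment   = "test"
}

# Surface a value that the module exposes through its own output block
# (assumes the module declares an output named "instance_id").
output "web_server_instance_id" {
  value = module.web_server.instance_id
}
```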

Terraform Operations

Terraform is managed through a simple CLI. It is a single command-line application, terraform, and you specify the action through a subcommand such as apply or plan.

To view a list of the available commands at any time, just run terraform with no arguments.

To get started, run terraform init to initialize a number of settings that create the required environment for Terraform to proceed. It will also download the necessary plugins for the selected provider.

Before provisioning, you may want to generate an execution plan, otherwise known as a dry run of your changes. Generate one by running terraform plan. Terraform outputs a delta, showing you which resources will be destroyed (marked with a -), which will be added (marked with a +), and which will be updated in-place (marked with a ~).

Once you have reviewed the execution plan and are ready to begin provisioning, run terraform apply to execute the changes. If at any point you need to remove the resources, simply use the command terraform destroy. If there are multiple resources in the module, you can specifically name which resource(s) to destroy. For example:

terraform destroy -target=aws_instance.my-first-instance

In general, once you have defined the infrastructure in the .tf files, working with Terraform is pretty much just running terraform plan and terraform apply repeatedly (unless you use CI).

Summary

In this post, I described Terraform syntax, architecture, and common operations. Throughout the article I used the example of creating an AWS EC2 instance; however, these principles apply to all resource types across providers. I hope this helps you get started in your infrastructure as code journey.

Happy Terraforming!
