The Boring Registry

Managing Terraform Modules with The Boring Registry

The TIER infrastructure team maintains a number of opinionated Terraform modules with our best practices built in. This makes life easier for module callers, who no longer need to worry about things like resource tagging, encryption in transit and at rest, or backups. Because these defaults live in the modules, we can make common changes in one place and propagate them to all environments without major effort.

In this blog post we explain why and how we built the Boring Registry, a lightweight implementation of the Module Registry Protocol.

What is the problem?

Managing Terraform modules can be complicated, especially when it comes to versioning and releasing modules independently inside a monorepo. We need to ensure we don’t release breaking changes unintentionally, and when we do release one, we want to roll it out gradually.

TIER uses trunk-based development, which makes maintaining multiple Git branches impossible. A Git-based approach also leaves no way to “subscribe” to patch and minor versions, because a ref pins exactly one revision. This makes it significantly harder for a caller to pick up new features and fixes. Below is how we currently reference a module:

module "bucket" {
  source = "git@github.com:TierMobility/terraform.git?ref=s3-0.1.0"
  name   = "my-bucket-name"
}

Every instance of a module that references our central GitHub repository triggers a shallow clone of that repository. As a result, terraform init produces many duplicate copies of the monorepo, which doesn’t scale well:

# initialize terraform (get providers, modules etc.)
$ terraform init
# find the number of references pointing to GitHub as a source
$ grep -r 'git@github.com:' --include='*.tf' --exclude-dir=.terraform . | wc -l
   56
# get the number of individual files pulled from GitHub
$ find .terraform/modules -type f | wc -l
   20937
# calculate the amount of data pulled from GitHub
$ du -hc -d 0 .terraform/modules
   167M	 total

As shown above, using a single Git repository for managing Terraform modules wastes compute, network and storage every time the stack is run. This could partly be mitigated by using a dedicated repository per module, but that would still not give us semantic version constraints.


What are we aiming for?

We want our Terraform module callers to receive critical bug fixes and feature improvements while being protected from unexpected breaking changes. The second goal is to version each module inside the monorepo separately, without relying on Git tags and branches.

What are our actions?

After considering possible solutions in the wild, we ended up writing our own module registry. We didn’t aim to implement the complete Terraform Registry, but only a small part of it called the Module Registry Protocol.

The Module Registry Protocol exposes two endpoints, one for listing available module versions and one for downloading a specific version of a module.

Examples:

GET /v1/modules/{namespace}/{name}/{provider}/versions
GET /v1/modules/{namespace}/{name}/{provider}/{version}/download
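To make the two endpoints concrete, here is a minimal Python sketch that builds both request paths and reads the version list out of a response body. The JSON shape (a "modules" list whose entries carry a "versions" list) follows the public Module Registry Protocol documentation; the helper names and the sample body are our own illustration and not part of the Boring Registry.

```python
import json

BASE = "/v1/modules"  # base path shared by the two endpoints above


def versions_path(namespace, name, provider):
    """Path of the 'list available versions' endpoint."""
    return f"{BASE}/{namespace}/{name}/{provider}/versions"


def download_path(namespace, name, provider, version):
    """Path of the 'download a specific version' endpoint."""
    return f"{BASE}/{namespace}/{name}/{provider}/{version}/download"


def parse_versions(body):
    """Extract the version strings from a 'list versions' response body."""
    doc = json.loads(body)
    return [v["version"] for m in doc["modules"] for v in m["versions"]]


# Hypothetical response body for the dynamodb module.
sample_body = json.dumps(
    {"modules": [{"versions": [{"version": "1.0.0"}, {"version": "1.0.1"}]}]}
)

print(versions_path("tier", "dynamodb", "aws"))
print(download_path("tier", "dynamodb", "aws", "1.0.1"))
print(parse_versions(sample_body))
```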

The Boring Registry itself is just a thin layer on top of a storage backend (in our case AWS S3). It doesn’t keep any state or index of modules in its own database, but relies on the storage backend for this. It also doesn’t hand out packages to Terraform directly. Instead, it points Terraform to the storage location of a given module and version. Terraform can fetch modules from many kinds of sources (see the Terraform documentation on module sources); the Boring Registry currently supports S3 and GCS buckets. This makes deploying, configuring and scaling the Boring Registry easy for operators.

Since the Boring Registry points to the location of the module archive, it is essential to provide Terraform itself with the necessary permissions to access the storage backend.
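In the Module Registry Protocol, the download endpoint answers with 204 No Content and puts the module location into the X-Terraform-Get header, which is why Terraform needs its own access to the bucket. The sketch below shows roughly what such a response could look like for an S3-backed module. The s3:: URL form is taken from Terraform’s module source documentation; whether the Boring Registry emits exactly this form is an assumption, and the bucket name and region are hypothetical.

```python
def s3_source(bucket, region, key):
    """S3 module source in the style shown in Terraform's module source docs."""
    return f"s3::https://s3-{region}.amazonaws.com/{bucket}/{key}"


def download_response(bucket, region, key):
    """Return (status, headers) for the download endpoint: no body, only a pointer."""
    return 204, {"X-Terraform-Get": s3_source(bucket, region, key)}


# Hypothetical bucket and region; the key follows the path structure used in this post.
status, headers = download_response(
    "my-module-bucket",
    "eu-central-1",
    "namespace=tier/name=s3/provider=aws/version=1.0.0/tier-s3-aws-1.0.0.tar.gz",
)
print(status)
print(headers["X-Terraform-Get"])
```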

There are advantages from a security perspective as well. An attacker who compromises the Boring Registry can only list archives, not alter or download them: the registry has list-only permissions on the storage backend and only keeps track of archive locations, not the actual data.

The storage backend expects a clear path structure to know where modules live. Example structure:

namespace=tier/name=s3/provider=aws/version=1.0.0

An example bucket looks like this when all modules have been uploaded:

$ tree .
.
└── namespace=tier
    ├── name=s3
    │   └── provider=aws
    │       └── version=1.0.0
    │           └── tier-s3-aws-1.0.0.tar.gz
    └── name=dynamodb
        └── provider=aws
            ├── version=1.0.0
            │   └── tier-dynamodb-aws-1.0.0.tar.gz
            └── version=1.0.1
                └── tier-dynamodb-aws-1.0.1.tar.gz
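Because the path itself carries all the metadata, a bucket listing is enough to answer a “which versions exist?” request. The following Python sketch builds an archive key from module metadata and derives the available versions from a list of keys. The function names are hypothetical, and this illustrates the layout shown above rather than the registry’s actual code.

```python
def module_prefix(namespace, name, provider, version):
    """Directory-like prefix under which one module version is stored."""
    return f"namespace={namespace}/name={name}/provider={provider}/version={version}"


def archive_key(namespace, name, provider, version):
    """Full object key of the archive, matching the tree shown above."""
    filename = f"{namespace}-{name}-{provider}-{version}.tar.gz"
    return f"{module_prefix(namespace, name, provider, version)}/{filename}"


def versions_from_keys(keys, namespace, name, provider):
    """Derive the stored versions of one module from a plain key listing."""
    prefix = f"namespace={namespace}/name={name}/provider={provider}/version="
    found = {k[len(prefix):].split("/", 1)[0] for k in keys if k.startswith(prefix)}
    return sorted(found)  # plain string sort, good enough for this illustration


keys = [
    archive_key("tier", "s3", "aws", "1.0.0"),
    archive_key("tier", "dynamodb", "aws", "1.0.0"),
    archive_key("tier", "dynamodb", "aws", "1.0.1"),
]
print(keys[0])
print(versions_from_keys(keys, "tier", "dynamodb", "aws"))
```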

Running stage

Getting started with the Boring Registry is quite easy. The first step is to set up the storage backend. Currently only two backends are supported (S3 and GCS), but other implementations can be added.

Once the storage bucket is prepared and the module archives are uploaded, the Boring Registry can discover the archives. The Boring Registry comes with a CLI for uploading packages to the storage backend, which can be run by hand or from your CI/CD pipeline.

It is important to define a configuration file boring-registry.hcl before uploading the module archives. The file should be placed at the root of the module directory, otherwise the upload subcommand won’t be able to find Terraform modules.

The boring-registry.hcl file should look like this:

metadata {
  namespace = "tier"
  name      = "s3"
  provider  = "aws"
  version   = "2.0.0"
}

This is an example of uploading module archives to an S3 bucket:

$ boring-registry upload -type=s3 -s3-bucket=${bucket} -s3-region=${region} ${dir}

The upload command uploads modules to the registry’s storage backend so that Terraform can download them later. It doesn’t matter how deeply modules are nested: the Boring Registry searches all folders of the project for a boring-registry.hcl file and creates an archive from each directory that contains one.
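The discovery step can be pictured as a recursive directory walk. The Python sketch below finds every directory that contains a boring-registry.hcl file and packs one of them into a tarball. It is a simplified illustration under our own assumptions (a hypothetical project layout, no parsing of the HCL metadata, no upload), not the CLI’s implementation.

```python
import os
import tarfile
import tempfile

CONFIG_NAME = "boring-registry.hcl"


def find_module_dirs(root):
    """Yield every directory under root that contains a boring-registry.hcl file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if CONFIG_NAME in filenames:
            yield dirpath


def archive_module(module_dir, out_path):
    """Pack one module directory into a gzip-compressed tarball."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(module_dir, arcname=".")
    return out_path


with tempfile.TemporaryDirectory() as root:
    # Hypothetical project: two modules at different depths, one unrelated folder.
    for rel in ("modules/s3", "modules/db/dynamodb", "docs"):
        os.makedirs(os.path.join(root, *rel.split("/")))
    for rel in ("modules/s3", "modules/db/dynamodb"):
        module_dir = os.path.join(root, *rel.split("/"))
        for filename in (CONFIG_NAME, "main.tf"):
            open(os.path.join(module_dir, filename), "w").close()

    found = sorted(os.path.relpath(d, root) for d in find_module_dirs(root))
    archive = archive_module(
        os.path.join(root, "modules", "s3"), os.path.join(root, "s3.tar.gz")
    )
    with tarfile.open(archive) as tar:
        names = tar.getnames()

print(found)
print(names)
```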

It is necessary to specify -type=s3 because the Boring Registry also supports GCS buckets as storage backends.

Setting up the server component is also straightforward. The first step is to grant it access to the storage backend (if applicable), then point the Boring Registry to it. The registry endpoints can be protected with authentication, but at the moment only static API keys are supported.

An example of starting the server, pointing to an S3 bucket:

$ boring-registry server -type=s3 -s3-bucket=${bucket} -s3-region=${region} -api-key=my-secret-api-key

If the server was configured with an API key, add the following lines to the ~/.terraformrc file:

credentials "terraform-registry.${domain}" {
  token = "${api_key}"
}
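Terraform sends the token from a credentials block as a bearer token in the Authorization header of every registry request. A static API key check on the server side can therefore be as small as the following Python sketch. The function name is hypothetical, and the constant-time comparison is our own choice for the illustration, not a statement about how the Boring Registry compares keys.

```python
import hmac


def is_authorized(headers, api_key):
    """Check an 'Authorization: Bearer <token>' header against a static API key."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), api_key.encode())


print(is_authorized({"Authorization": "Bearer my-secret-api-key"}, "my-secret-api-key"))
print(is_authorized({"Authorization": "Bearer wrong-key"}, "my-secret-api-key"))
print(is_authorized({}, "my-secret-api-key"))
```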

Putting it together

Now that the registry has a backend and the service is up, the modules can be referenced in a Terraform configuration file:

module "main-s3" {
  source  = "boring-registry/tier/s3/aws"
  version = "~> 1.0"
}
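This is what “subscribing” to minor and patch releases means in practice. A pessimistic constraint such as ~> 1.0 lets only the rightmost listed component grow, so it accepts any 1.x release but never 2.0.0. The Python sketch below approximates how the newest matching version would be picked from the registry’s version list. It is our own simplification: it ignores pre-release versions and only handles constraints with at least two components.

```python
def parse(version):
    """Turn '1.4.2' into a comparable tuple (1, 4, 2)."""
    return tuple(int(part) for part in version.split("."))


def pessimistic_bounds(constraint):
    """Turn '~> X.Y[.Z]' into (lower, upper): only the rightmost part may grow."""
    parts = parse(constraint.replace("~>", "").strip())
    if len(parts) < 2:
        raise ValueError("this sketch needs at least two version components")
    upper = parts[:-2] + (parts[-2] + 1,)
    return parts, upper


def select(available, constraint):
    """Pick the newest available version inside the constraint, or None."""
    lower, upper = pessimistic_bounds(constraint)
    matching = [v for v in available if lower <= parse(v) < upper]
    return max(matching, key=parse) if matching else None


available = ["1.0.0", "1.0.1", "1.4.2", "2.0.0"]
print(select(available, "~> 1.0"))    # newest 1.x release
print(select(available, "~> 1.0.0"))  # newest 1.0.x release
```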

And here we are: full semantic versioning, no git shallow clones, authentication on the storage backend, no moving parts – just a bucket and a fronting service 🔥