Better CloudFormation automation with tacks

04.08.2015

At kreuzwerker we implement a lot of projects running on AWS (with some small forays into GCP). When dealing with the ever-increasing complexity of building those infrastructures, we tend to put every operation performed into some piece of executable documentation: software that does the creation, updating, configuration and wiring for us in a reproducible and auditable manner. Our approach to infrastructure provisioning on AWS is typically split: roughly 80% is done with CloudFormation templates and 20% with custom tooling on top. Why the dichotomy?

Some of the reasons for the split are as follows:

  1. not every API is available for CloudFormation and not every feature of every API is (fully) exposed via CloudFormation (e.g. Origin Access Identities for CloudFront)
  2. some resources which should be grouped together cannot be, unless you pause while creating the stack, do some magic to fix the missing preconditions, and then continue (e.g. setting up a stack containing Lambda functions together with their S3 source bucket - this triggers a validation for the source key to be present in the bucket, but S3 objects cannot be provisioned with CloudFormation)
  3. there is no sensible way of displaying the results of applying a stack change out of the box
  4. there is no embedded concept of “environments” (e.g. production and staging) or “versions” (for allowing blue / green deployments) - such environments often need different “update” strategies (e.g. never updating a certain production stack but instead creating a new version of it)
  5. there is no simple way of applying a stack against multiple regions (e.g. for enabling CloudTrail support for a new account)
  6. certain resources are more prone to causing a stack update rollback to fail, resulting in a state that can only be recovered by AWS support - to avoid this it’s advisable to split up your stacks (this has been partially addressed recently by the added support for custom Lambda resources in the stack, basically moving the cross-referencing of resources from another stack into a Lambda function that is called by CloudFormation itself)
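
To make point 5 concrete, here is a minimal Python sketch of the kind of custom tooling that ends up sitting on top of CloudFormation. It is not taken from any of our toolchains: the function name `apply_to_regions` and the `client_factory` parameter are made up for illustration, and the only AWS call assumed is the `create_stack` operation of a CloudFormation client (as offered by boto3).

```python
from typing import Callable, Dict, Iterable


def apply_to_regions(
    stack_name: str,
    template_body: str,
    regions: Iterable[str],
    client_factory: Callable[[str], object],
) -> Dict[str, str]:
    """Create the same stack in every region and return {region: stack id}.

    client_factory is a hypothetical hook: it receives a region name and
    returns an object with a create_stack method.
    """
    stack_ids = {}
    for region in regions:
        # One CloudFormation client per region.
        client = client_factory(region)
        response = client.create_stack(
            StackName=stack_name,
            TemplateBody=template_body,
        )
        stack_ids[region] = response["StackId"]
    return stack_ids
```

With boto3 installed, the factory could be something like `lambda region: boto3.client("cloudformation", region_name=region)`. Passing the factory in keeps the loop itself free of AWS dependencies, which makes it easy to try out against a stub. Error handling, updates of already existing stacks and waiting for completion are left out on purpose.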

The net effect is that, for several projects, we have built new toolchains (or built on top of existing ones) over and over again. This is not great: every new developer / operator joining a new project will have to deal with a toolchain that looks almost like the one from the other project but has some quirks or missing features. The need to set up a toolchain for even trivial static site projects (Origin Access Identity again!) makes adoption of CloudFormation for small infrastructure setups (ranging from provisioning IAM users to S3 backup buckets) much harder than it should be.

For some time we’ve been internally building a small toolkit (called tacks) to help ease the pain of using CloudFormation. It’s not quite ready for large projects but it will get there eventually. What it provides today are the following features:

  1. it bundles stack metadata such as tags and region in one file with the stack JSON
  2. it supports multiple environments natively, using the familiar YAML inheritance mechanism (templates are written in YAML 1.2, which enables us to write CloudFormation templates in JSON and/or in YAML - this also enables comments in your stacks, even when they look like JSON)
  3. since comments via # are perfectly legal YAML, tacks can (in fact: should) be run via the shebang line - no more guessing if it’s make or rake or thor etc., just execute the stack itself
  4. it supports pre- and post-operations following the API call (fixing the Origin Access Identity problem, finally)
  5. it allows for the specification of variables, which can be defined / overridden per environment - these variables can be the result of command-line tools (e.g. looking up the id of some VPC)
  6. it has an embedded event viewer which can be triggered standalone or following a stack creation / update
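
The idea behind points 2 and 5 (per-environment settings layered over shared defaults, with variables that may come from a command) can be sketched in a few lines of Python. This is an illustration of the concept only, not the tacks implementation or its file format: the `defaults` / `environments` / `command` keys and all function names are assumptions made up for this example.

```python
import copy
import subprocess

# Hypothetical configuration layout, invented for this sketch.
CONFIG = {
    "defaults": {
        "region": "eu-west-1",
        "tags": {"project": "website"},
        "variables": {"InstanceType": "t2.micro"},
    },
    "environments": {
        "staging": {"tags": {"stage": "staging"}},
        "production": {
            "tags": {"stage": "production"},
            "variables": {
                "InstanceType": "m3.large",
                # Stand-in for a real lookup, e.g. via the AWS CLI.
                "VpcId": {"command": "echo vpc-12345678"},
            },
        },
    },
}


def deep_merge(base, override):
    """Return a copy of base with override layered on top, key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_variable(value):
    """Run the command for {"command": ...} values, pass literals through."""
    if isinstance(value, dict) and "command" in value:
        output = subprocess.check_output(value["command"], shell=True, text=True)
        return output.strip()
    return value


def resolve_environment(config, name):
    """Merge one environment over the defaults and resolve its variables."""
    settings = deep_merge(config["defaults"], config["environments"][name])
    settings["variables"] = {
        key: resolve_variable(value)
        for key, value in settings.get("variables", {}).items()
    }
    return settings
```

Calling `resolve_environment(CONFIG, "production")` yields the default region, both tags, the overridden instance type and a `VpcId` taken from the command's output, while `staging` keeps the default instance type. Merging nested dictionaries instead of replacing them wholesale is what lets an environment override a single tag or variable without repeating the rest.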

Currently the tool is in its early stages (but already perfectly usable) and specifically lacks documentation. We plan to fix this in the near future (as well as adding more features, see the issue tracker for our immediate plans). In the meantime, have a look at the example stacks for an idea of what tacks can do for you (for example: setting up OpsWorks). Pull requests are welcome!

Image attribution goes to derrickcollins.