
Starting off this week with a post on operability, a topic that probably deserves more focus than it normally gets when discussing new technology.

From our sponsor, VictorOps

System visibility is DevOps 101. With cloud-based applications, DevOps teams need a whole different set of tools for monitoring, alerting, and incident management. So, we laid out some helpful resources for maintaining more robust cloud services:


A good slide deck on the importance of operability as a shared concern for development and operations teams. Discussion of modern logging, metrics, runbooks and more.

An excellent experience report on using Istio, looking at the overhead imposed by current versions and the cost/benefit analysis of adopting any middleware.

A slide deck on withstanding regional outages in infrastructure environments. Looking at availability, latency and cost. Lots of diagrams to explain the context.

A handy way of testing your Kubernetes cluster configuration. Specifically, this post shows how to check RBAC rules using the kubectl auth can-i subcommand and then how to run those checks with bats.
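To give a flavour of the idea, here is a minimal sketch of an RBAC check wrapped around kubectl auth can-i. The linked post uses bats; this Python version is a swapped-in illustration, not the post's code. The helper names can_i_command and can_i are hypothetical, and the sketch assumes kubectl is on the PATH and pointed at the cluster under test.

```python
import subprocess
from typing import Callable, List, Optional


def can_i_command(verb: str, resource: str,
                  namespace: Optional[str] = None,
                  as_user: Optional[str] = None) -> List[str]:
    """Build the argument list for `kubectl auth can-i` (hypothetical helper)."""
    cmd = ["kubectl", "auth", "can-i", verb, resource]
    if namespace:
        cmd += ["--namespace", namespace]
    if as_user:
        # --as impersonates a user or service account for the check.
        cmd += ["--as", as_user]
    return cmd


def can_i(verb: str, resource: str,
          namespace: Optional[str] = None,
          as_user: Optional[str] = None,
          run: Callable = subprocess.run) -> bool:
    """Return True if kubectl answers "yes" for the given verb and resource.

    `run` is injectable so the check can be exercised without a cluster.
    """
    result = run(can_i_command(verb, resource, namespace, as_user),
                 capture_output=True, text=True)
    # kubectl prints "yes" or "no" on stdout; "no" also exits non-zero,
    # so the exit code is deliberately not treated as an error here.
    return result.stdout.strip() == "yes"
```

A test suite could then assert, for example, that can_i("list", "pods", namespace="dev", as_user="system:serviceaccount:dev:ci") is true and that the same account cannot delete secrets. The namespace and service account names here are made up for illustration.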

An interesting post on avoiding some of the disadvantages of using an ORM, looking at approval testing to avoid issues introduced at the SQL level.

This set of content rounds up the concept of Chaos Engineering nicely. A bit of history, the importance of failure in complex distributed systems, and how to use Chaos Monkey and more recent tools.

An example of using the kubeless serverless framework on Kubernetes. Useful Python examples of reading Kubernetes events as well.

A detailed set of posts covering both the architecture of Pivotal Cloud Foundry and the important metrics and logs to watch in order to run a production-quality cluster.

A post on balancing building your own tooling with using existing tools, and resisting not-invented-here.


Close.io is hiring a DevOps Engineering Team Lead! We are a ~30 person entirely remote team (~13 engineers) that is profitable and building a product our customers love. You will be doing both hands-on technical work yourself and managing a small remote team (2-3) of exceptional Senior SRE/DevOps Engineers.


Kubehiera is a port of Hiera, the tool popular with Puppet users for managing hierarchical configuration, for example across different environments. In this case Kubehiera is aimed at rendering Kubernetes configuration.

A provisioner for Terraform which allows for running InSpec-based acceptance tests during the run.

