Interesting posts this week on holding incident drills, monitoring ephemeral infrastructure, the glut of high-quality books on devops topics, managing Kubernetes configuration and more.

From our sponsor, VictorOps

Your team builds a stronger culture of accountability and collaboration when your developers take code ownership and handle on-call responsibilities. See how you can bring developers on-call and why it’s important for building reliable systems:
http://try.victorops.com/devopsweekly/why-devops-matters

News

A discussion of the importance of practice, looking at how running incident drills can improve incident management.
https://medium.com/kudos-engineering/faking-fires-get-better-incident-management-with-practise-e61a5d66578d

If you’ve used Kubernetes you have probably come across the kubeconfig file, which configures the cluster you’re working with. This post contains lots of tips for managing these files.
https://ahmet.im/blog/mastering-kubeconfig/

The following pair of posts provides a good introduction to the architecture behind AWS Elastic Container Service, then delves into the metrics that are useful when running ECS workloads and the tools for measuring them.
https://www.datadoghq.com/blog/amazon-ecs-metrics/
https://www.datadoghq.com/blog/ecs-monitoring-tools/

One common approach to building resilient batch workload systems is to rely on checkpointing. This post explores why that doesn’t work as well as you might think.
https://blog.0x74696d.com/posts/checkpointing-failure/

Knative is a project to provide serverless primitives on top of Kubernetes. Up until now, running Knative required Istio, which provides a lot more functionality than Knative needs. This post explores an alternative using Gloo.
https://medium.com/solo-io/gloo-by-solo-io-is-the-first-alternative-to-istio-on-knative-324753586f3a

A look at adding BCC and eBPF monitoring visualisation to the Vector open source performance monitoring tool. Lots of interesting discussion of monitoring in a world of ephemeral containers.
https://medium.com/netflix-techblog/extending-vector-with-ebpf-to-inspect-host-and-container-performance-5da3af4c584b

A discussion of the Kapitan Kubernetes configuration management tool. Rather than a how-to, this post explores some of the problems found when managing Kubernetes configs and discusses the components of potential solutions.
https://medium.com/@alessandro.demaria/let-kapitan-take-the-helm-of-kubernetes-e455e3d9ed08

A good list of books for anyone interested in devops, covering the technical side as well as people and practices.
https://www.thirdrepublic.com/blog/10-books-every-devops-enthusiast-must-read

An example of using Atomist to evolve a codebase, changing file name conventions and imports automatically.
https://the-composition.com/making-change-stick-with-code-transforms-and-autofixes-587d19e0ba1b

Events

Rootconf 2019 is coming up in Bangalore on the 21st and 22nd of June. Topics will include infrastructure security, cloud architecture, cloud optimization and distributed systems.
https://hasgeek.com/rootconf/2019/

Tools

Footloose is a useful-looking tool for providing VM-like machines for local experiments, backed by Docker containers. Think of a Vagrant-like workflow, but with the speed of containers.
https://github.com/dlespiau/footloose

