2 minute read

A collection of posts on monitoring and on the importance of managing cost this week, along with a few good getting started tutorials and other posts. Enjoy.


Attending AWS re:Invent 2017? Visit the VictorOps booth, schedule a meeting, or join us for some after hours fun. See you in Vegas!


An excellent essay on the pitfalls of viewing IT as a cost center, and how not addressing that as part of digital transformation efforts can lead to failure.

Monitoring CPU or memory usage is relatively easy, but the SRE book proposes latency, traffic, errors, and saturation as more useful and important. This post, the first in a series, explores how to monitor these golden signals.

osquery provides powerful capabilities for querying the state of an entire infrastructure. The following post explores why that’s particularly useful for various security use cases, and describes how to build and configure a multi-platform osquery install.

A good blog post listing out a set of guidelines and emerging best practices for observability.

A handy tutorial for getting up and running with OpenFaaS locally using Minikube. Some other interesting observations about latency in FaaS environments.

A deep dive into one of those seemingly easy, but actually fiendishly hard, problems in distributed systems; time. This post explores unified clocks in online games, but it’s a great technical post about the challenges with time.

A relevant post for anyone managing cloud infrastructure, focused on managing cost with a number of practical tips.

Go has become increasingly popular for infrastructure tools in the last several years. This presentation looks at how one organisation adopted it, and features some interesting examples of tooling to assist teams collaborate on projects.

CNCF - Cloud Native Computing Foundation

Free Webinar - What’s New in Prometheus 2.0 November 30, Online

Join members of the Prometheus 2.0 release team to learn about the major features in this release and demo some of the highlights in the areas of storage, staleness handling, snapshot backups, and more.

KubeCon + CloudNativeCon - Join leading Kubernetes, Docker, and Cloud Native technologists, December 6-8, in Austin for a broad range of technical sessions on the cloud native ecosystem. We sold out in Berlin and are excited to see thousands of you from the community join us, this time in Austin!


container-diff is a new tool for comparing container images, either locally or from a remote repository. It detects packages from a number of package management tools, as well as filesystem changes. Useful for debugging image build problems.

Open Policy Agent is an interesting new project providing which provides a DSL for describing policy, and software to ensure described systems remain compliant. Examples include managing Kubernetes API admission, Terraform permissions and SSH authorization.

Attending AWS re:Invent 2017? Visit the VictorOps booth, schedule a meeting, or join us for some after hours fun. See you in Vegas!