The growing cost of cloud services, on-demand staging environments, defining SLOs in code, data sharing and several other topics this week across the devops spectrum.

The DORA/Google State of Devops survey is open, focusing this year on metrics, how SRE fits with Devops, security and compliance, distributed teams and more.

A post on providing on-demand test environments for growing development teams, supported by a developer portal.

OpenSLO is a service level objective (SLO) language that declaratively defines reliability and performance targets using a simple YAML specification. Store your SLOs in Git, with tooling to help validate them in your CI pipeline.

A post on the reality of cloud costs as organisations grow. Lots of useful insights drawn from public data, mainly pointing out that the cost picture is more nuanced at scale for certain types of workloads.

A reminder that system complexity can easily reduce the uptime of the whole. The examples are simplistic, ignoring partial failures, but they are still useful to bear in mind.
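
To make the arithmetic concrete, here is a rough sketch of how availability compounds when a request needs every component in a chain to be up. It is not taken from the linked post: the function names and the figures are made up for illustration, and it assumes failures are independent and all-or-nothing, which is the same simplification noted above.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(availabilities):
    """Availability of a path that needs every component to be up.

    Assumes independent failures and no partial degradation.
    """
    return prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes per (non-leap) year."""
    return (1 - availability) * MINUTES_PER_YEAR


# Hypothetical example: five services in a chain, each at "three nines".
single = 0.999
chain = serial_availability([single] * 5)

print(f"one service:   {single:.5f}, {downtime_minutes_per_year(single):.0f} min/year")
print(f"five in chain: {chain:.5f}, {downtime_minutes_per_year(chain):.0f} min/year")
```

Five services that each hit 99.9% give roughly 99.5% end to end under these assumptions, so every extra hard dependency eats into the budget of the whole.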

A post on profiling production services using Prometheus and Jaeger.

An in-depth look at autoscaling in Kubernetes: what problems it solves, how it is implemented and how it works under the hood.
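
As a taste of the under-the-hood part, the Kubernetes documentation describes the core Horizontal Pod Autoscaler calculation as the current replica count scaled by the ratio of the observed metric to the target, rounded up, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below is a simplified illustration of that rule only, not the linked post's code: the function name is made up, and it ignores min/max replica bounds, stabilisation windows and pods with missing metrics.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified version of the documented HPA scaling rule.

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within `tolerance` of 1.0.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical numbers: 4 pods with a 60% CPU utilisation target.
print(desired_replicas(4, 90, 60))  # well over target, scales up
print(desired_replicas(4, 63, 60))  # within tolerance, unchanged
print(desired_replicas(4, 30, 60))  # well under target, scales down
```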

Delta Sharing is an open protocol for secure real-time exchange of large datasets, aiming to enable data sharing across products.


Open Policy Agent, and its Rego language, can be applied to lots of different problems. Confectionery provides a set of policies for testing Terraform plans. My hope is we’ll see more of these sorts of pre-packaged policies.