Starting out this week with a good post on something I think the devops movement has made more common, or at least more visible; people moving between adjacent engineering roles, whether systems administration to development or to information security. Lots of other posts on large tech company architectures, hindsight applied to Kubernetes and new tools for monitoring and machine learning.
StackHawk sponsors Devops Weekly
Check out how we have built our microservices in Kubernetes here at StackHawk.
A post on moving from systems administration to information security, with good observations of the advantage of being new to a topic and how a variety of skills help when moving to different roles.
Hindsight is an interesting design tool for future systems. This post, from someone ver familiar with its working, looks at what applying hindsight and some opinions to Kubernetes might mean if you build a new container orchestrator.
An interesting post on scaling CI/CD pipelines across development teams, with a focus on self-service.
A breakdown of the recent large AWS outage, based on the public information but with a useful diagram to understand the apparent architecture and a handy list of the proposed plans to avoid similar issues in the future.
Software is always rooted in when it was first written. This post touches on GitHub’s architecture and technology choices and also how, and why, that’s evolving.
Details of scaling datastores, from active/active MySQL clusters to using Vitess.
The Kubernetes API is designed to be extended with new resources. This post looks at a more flexible Deployment resource which supports more fine grained control around rollout and running multiple versions of a service at the same time.
Opstrace is a new horizontally-scalable metrics and logs platform, optimised for installation on cloud platforms. It exposes a prometheus-compatible API, as well as working with a variety of agents like those from Fluentd and Datadog.
Replicate is a new tool that aims to solve version control problems for machine learning. It’s a python library that allows for snapshots to be saved in S3 or Google Cloud Storage and tools for retrieving and reusing those versions.
Nydus is a set of tools that aims to improve over the current OCI image specification in terms of container launching speed, image space and network bandwidth efficiency. The tutorial is a nice way of understanding how it works.