1 minute read

A short issue this week, with posts on Incident response, learning operations, managing configuration, evolving altering systems and a few interesting tools.

StackHawk sponsors Devops Weekly

The StackHawk team will be at CdCon in Austin this week! Stop by our booth (S4) to get some free StackHawk swag and see how our product automates security testing in CI/CD.

News

Human error is often an easy solution to reach for when investigating an incident. Too easy. This post encourages folks to discount human error as a forcing function for better post-incident conversations.
https://surfingcomplexity.blog/2022/05/30/imagine-theres-no-human-error/

One part of a growing team and a move towards service-based architecture is managing alerts. This post highlights some of the challenges and lessons learned.
https://cherkaskyb.medium.com/my-big-fat-monolithic-alert-498f21c1bb8a

Grafana now has first-class support for managing configuration in Git. This post explains why that’s useful and how to use it.
https://dev.to/aws-builders/a-gitops-way-to-manage-grafana-data-sources-at-scale-59la

How do you get started operating a new service? That might be as a new member to a team, or some other situation that brings with it new operational responsibilities. This deck has some suggestions.
https://speakerdeck.com/hgsgtk/a-guide-to-joining-operational-work-in-your-new-devops-team

A detailed post looking at Kubernetes liveness probes and how to implement health checks for your applications.
https://www.containiq.com/post/kubernetes-liveness-probe

Tools

Trdl is a way of providing a secure channel for delivering updates from the Git repository to the end user. It uses TUF (The Update Framework) under the hood.
https://github.com/werf/trdl

caddy-nats is a caddy module that allows the caddy server to interact with a NATS server. This extension supports multiple patterns: publish/subscribe, fan in/out, and request reply.
https://github.com/codegangsta/caddy-nats

Updated: