The rise of very public, in-depth, high-profile incident reports in the last few years is definitely of benefit to the art of systems administration. Atlassian’s post, covering the recent multi-week outage, is a great example. Also this week: posts on organisation design and least privilege, plus some interesting tools for testing and Kubernetes management.

StackHawk sponsors Devops Weekly

ICYMI: The StackHawk & Snyk in Action webinar is up on YouTube. Follow along to see how your team can automate security testing in CI/CD using these integrated tools. Watch now:
https://sthwk.com/snyk-in-action-yt

News

Atlassian had a large global outage last month. This in-depth incident report goes into lots of interesting operational detail about the timeline, what happened and lessons learned.
https://www.atlassian.com/engineering/post-incident-review-april-2022-outage

A great post on organisational design, and in particular dependencies amongst teams adopting more product-centric funding models.
https://betterprogramming.pub/untangling-organizational-dependencies-c52c843bfaf1

An interesting post on monitoring a local environment and using the results to inform least-privilege AWS access control.
https://blog.symops.com/2022/05/06/least-privilege-policies-from-aws-logs/

Open source software is a large part of most systems administration efforts today. This site recounts 30 years' experience of maintaining a critical open source tool, curl.
https://un.curl.dev

Some thoughts on software architecture, advocating for more local-first experiences, using the cloud mainly for storage, synchronising and burst compute.
https://docs.google.com/presentation/d/1YoM54KRkMptZV3K5Wg91C_H2RUrIGUhslxF8iWRlWso/edit

Events

SLOConf kicks off tomorrow, running from the 9th of May to the 12th. A free, online event with a wide range of talks, from those focused on getting started to more advanced topics.
https://www.sloconf.com/

Tools

Korb is a handy tool for working with Kubernetes storage, specifically moving data from PVCs between StorageClasses or renaming them.
https://github.com/BeryJu/korb

Tracetest is a tool for writing end-to-end tests for microservice-based applications, using OpenTelemetry traces to speed up test authoring.
https://github.com/kubeshop/tracetest

Otomi is a platform as a service layer built atop Kubernetes. The focus looks to be on providing a visual management experience and it comes integrated out-of-the-box with Argo, Vault, Prometheus and lots more.
https://github.com/redkubes/otomi-core
