A mix of people and process (on-call for managers, team design), practice (observability, logging) and tooling (Terraform, Kubernetes) posts this week.

StackHawk sponsors Devops Weekly

Hi DevOps Weekly. StackHawk is proud to be a new sponsor of the newsletter. We are application security testing built for CI/CD. Read our post on Developer-Centric AppSec Testing.
http://sthwk.com/dow-appsec

News

Designing on-call often lands on managers, so understanding the difference between good and bad on-call is critically important if you want to be a good engineering manager. This post is a great introduction.
https://charity.wtf/2020/10/03/on-call-shouldnt-suck-a-guide-for-managers/

A look at applying approaches from domain-driven design and team topologies to identify improvements in how teams build reliable co-operating systems.
https://www.joaorosa.io/2020/08/18/using-team-topologies-to-discover-and-improve-reliability-qualities/

Five tips for implementing observability, looking at black box monitoring, service metrics, tracing and more.
https://prometheuskube.com/5-tips-on-implementing-observability

Extensions, an interesting new Lambda feature, open up lots of opportunities for monitoring and security use cases that have been hard to implement until now.
https://lumigo.io/blog/aws-lambda-extensions-what-are-they-and-why-do-they-matter/

Anyone who has worked with logs will be familiar with the concept of log levels. This post has a bit of history and discusses how log levels are commonly used.
https://sematext.com/blog/logging-levels/

Terraform is building up an ecosystem of developer tools around it, but which to try first? This video playlist has several tool reviews which might be of interest.
https://www.youtube.com/playlist?list=PLvz1V_9d3uivwNgADT_eB-wKEWOzOOQXy

Many large organisations end up using multiple cloud providers, whether by inertia or design. This post covers some common issues and misconceptions.
https://www.cloudops.com/blog/the-biggest-myths-of-multi-cloud/

Tools

Troubleshoot is a set of tools for supporting applications deployed to Kubernetes. Preflight provides pre-installation cluster conformance testing and validation, and support-bundle provides post-installation troubleshooting and diagnostics.
https://github.com/replicatedhq/troubleshoot
