DEVOPS WEEKLY ISSUE #433 - 14th April 2019

A discussion of a method for managing on-call, common challenges in adopting chaos engineering, advice for adopting new technology and practices, and several good database posts, amongst other good devops topics this week.

From our sponsor, VictorOps

[You’re Invited] Death to Downtime: How to Quantify and Mitigate the True Costs of Downtime. VictorOps and Catchpoint are teaming up for a live webinar on 5 monitoring and incident response best practices for preventing outages.
http://try.victorops.com/devopsweekly/catchpoint-victorops-webinar

News

Adopting new technologies or practices may give you an advantage, or take you down the wrong path. How do you decide? This post has lots of good concrete suggestions as well as context for the ever-moving landscape.
https://medium.com/@kaimarkaru/quick-note-fomo-vs-legacy-tech-18a993e6948e

A walkthrough of a presentation on chaos engineering traps, common issues that people adopting chaos engineering practices for the first time often run into. Lots of context and good advice.
https://medium.com/@njones_18523/chaos-engineering-traps-e3486c526059

CASE is a method for improving monitoring, in particular on-call handling. It provides teams and organizations a way to discuss the care and feeding of automated alerting with a degree of formal language and emphasis on what’s important.
http://onemogin.com/monitoring/case-method-better-monitoring-for-humans.html

A look at an interesting serverless pattern, removing the need for functions to transform incoming data and moving that responsibility into the API gateway. More examples of why FaaS is only part of the serverless approach.
https://www.stackery.io/blog/future-of-serverless/

As the systems we build get more complex, we change how we manage them. Monitoring (observability) and testing (chaos engineering) have seen lots of discussion, but what about the rest of ITSM?
https://medium.com/itrevolution/learning-from-devops-why-complex-systems-necessitate-new-itsm-thinking-4a9aa5c14c18

The next post provides a useful primer for designing scalable database systems, looking at horizontal and vertical scaling, read replicas, sharding and application to microservices architectures.
https://www.red-gate.com/simple-talk/cloud/cloud-data/designing-highly-scalable-database-architectures/

Another good database post, this one covering implementing a load testing tool for a PostgreSQL cluster. Good technical details for anyone interested in Postgres and a good debugging story too.
https://blog.lawrencejones.dev/building-a-postgresql-load-tester/

I’ve always considered the relationship between testing and operations to be close. This post introduces logging, monitoring and alerting from the point of view of a tester, useful to convince folks who haven’t made the leap yet.
http://thethinkingtester.blogspot.com/2019/04/logging-monitoring-and-alerting.html

Quick tips for anyone building Docker images using the multi-stage build pattern, focused on making sure you’re taking advantage of the layer cache.
https://pythonspeed.com/articles/faster-multi-stage-builds/

Events

The Open Source Datacenter conference (OSDC) is coming up in Berlin on the 14th and 15th of May with talks on log management, devops practices, provisioning, storage, securing databases and more.
https://osdc.de/

Tools

Popeye describes itself as a Kubernetes cluster sanitizer. It scans your cluster looking for unused secrets, namespaces without any objects, images using the latest tag and a dozen other potential misconfigurations.
https://github.com/derailed/popeye

KubeCDN is a distribution of Terraform code and other configuration for standing up your own small content delivery network running on AWS using Kubernetes.
https://github.com/ilhaan/kubeCDN
https://blog.insightdatascience.com/how-to-build-your-own-cdn-with-kubernetes-5cab00d5c258
