DEVOPS WEEKLY ISSUE #369 - 21st January 2018

2 minute read

A range of content this week, from broadly applicable posts on patterns or process or people, to more specific stories of individual technologies and services. One theme recently has definitely been debugging microservices, and I’m finding more and more tools aimed at making that easier (or even just possible).

Sponsor

See how AlienVault focuses their incident management on collaboration and shared responsibility while relying on the rules engine of the VictorOps Transmogrifier.
http://try.victorops.com/DevOpsWeekly/AlienVault

News

If you’re running a production service then it’s wise to have some process around incidents, especially for identify and resolving the really critical ones. This post looks at how to establish an incident management program for high-severity incidents.
https://www.gremlin.com/how-to-establish-a-high-severity-incident-management-program/

A nice story of migrating an application to a new deployment model, and adding in automated deployment and chatops using AWS services.
https://keita.blog/2018/01/01/ecs-chatops-with-codepipeline-and-slack/

Another good migration story, this time for a Rails application moving to Kubernetes on Google Cloud. It moves from the basics of application deployment to rolling updates, load balancing and database snapshots.
http://www.akitaonrails.com/2018/01/09/my-notes-about-a-production-grade-ruby-on-rails-deployment-on-google-cloud-kubernetes-engine

A good post about the fallacy of the single root cause for an incident, and just how easy it is to fall into the trap of looking for the one thing that caused all of the problems, and how you can avoid it.
https://cevo.com.au/post/2017-11-10-fallacy-of-a-single-root-cause/

AWS EC2 has grown over the years into a pretty large service, and knowing exactly what to monitor and how can be a large task. This series of posts takes you through the important metrics and where to find them.
https://www.datadoghq.com/blog/collecting-ec2-metrics/
https://www.datadoghq.com/blog/ec2-monitoring/

As data science becomes more common moving past just writing scripts to the typical issues of running production services is important. This post covers packaging, versioning and continuous integration for those coming from a data science background.
http://www.florianwilhelm.info/2018/01/ds_in_prod_packaging_ci/

A good introduction to Terraform for managing AWS, looking at a simple but very common use case; managing the hosting of static websites using CloudFront.
https://blog.buildo.io/reusable-cloudfront-configuration-with-terraform-4c1de144c735

If you’re running JVM-based applications in containers then it’s important to understand the interactions between cgroups and the JVM in order to avoid various memory issues. This post explains with examples what’s happening and what you can do about it.
https://banzaicloud.com/blog/java-resource-limits/

A look at the immutable infrastructure pattern, with a description of the problems being solved and how taking an immutable approach addresses some common problems.
https://technicloud.com/2018/01/09/delving-into-immutable-infrastructure/

Tools

OpenCensus is a new project which provides a set of libraries for multiple languages in support of metrics collection and distributed tracing. Having this be consistent across different runtimes is an interesting selling point.
http://opencensus.io/
https://github.com/census-instrumentation

Squash aims to be a debugger for microservices. It targets only a few languages and platforms today but allows for jumping in and setting breakpoints in your distributed applications from within your IDE.
https://github.com/solo-io/squash

Stern is a handy tool for anyone debugging Kubernetes services. It allows for tailing logs from multiple containers within a pod, as well as across pods and supports a simple regex interface for finding the logs you’re looking for.
https://github.com/wercker/stern

See how AlienVault focuses their incident management on collaboration and shared responsibility while relying on the rules engine of the VictorOps Transmogrifier.
http://try.victorops.com/DevOpsWeekly/AlienVault

Updated: