[ICYMI] DAST is Dead! Long Live DAST! The Evolution of Dynamic API Security Testing webinar is now available on YouTube. Watch on-demand here.
https://sthwk.com/long-live-dast-webinar
A look at how one team used gamedays as a tool to test and improve performance and resilience.
https://firehydrant.com/blog/improving-signals-speed-and-resilience-through-pressure-testing/
Lots of complex applications are managed by large amounts of configuration, including OpenAPI specs and Kubernetes manifests. This post looks at how API and Kubernetes configuration might come together.
https://wso2.com/library/blogs/streamlining-cloud-native-app-development-in-kubernetes-with-prioritized-api-management/
The Octoverse report from the end of last year, looking at the state of open source, with stats on the growing adoption of AI tools, continued growth in declarative infrastructure configuration and more.
https://github.blog/2023-11-08-the-state-of-open-source-and-ai/
Measuring developer productivity is easy to do, and hard to make useful. This post makes the case that qualitative metrics might be more useful than quantitative ones.
https://martinfowler.com/articles/measuring-developer-productivity-humans.html
Ray LLM is an LLM serving solution that makes it easy to deploy and manage a variety of open source LLMs as well as compare the outputs of different models.
https://github.com/ray-project/ray-llm
Neon is billed as serverless PostgreSQL. It separates storage and compute in order to support autoscaling, branching, and expanded storage.
https://github.com/neondatabase/neon
Zed is a modern code editor, optimised for performance and providing multi-user editing and close integration with generative AI tools.
https://zed.dev/
https://github.com/zed-industries/zed
A look at BuildKit, the modern container image building tool under the hood of Docker Build and other container ecosystem tooling.
https://depot.dev/blog/buildkit-in-depth
A course on API observability, covering an introduction to OpenTelemetry as well as lots of API-specific topics.
https://pages.tyk.io/api-observability-fundamentals-on-demand
The Scaled Agile Devops Maturity Framework is billed as “Enterprise transformation without the risk of culture change!”. Worth a read if you’ve spent time in large organisations :)
https://scaledagiledevops.com/
A post on the benefits of Serverless platforms for small infrastructure automation tasks, with some suggested use cases.
https://stackify.com/why-you-should-go-serverless-for-devops/
Java is coming to RISC-V. A great example of open source innovation at work.
https://devops.com/what-is-risc-v-and-why-has-it-become-important-for-java/
Daytona is a new tool for managing a development environment. It supports both local and remote environments as well as integration with various Git services and IDEs.
https://github.com/daytonaio/daytona
As eBPF becomes more popular, we need tools to help manage applications using it. bpftop provides a dynamic real-time view of running eBPF programs. It displays the average runtime, events per second, and estimated total CPU % for each program.
https://netflixtechblog.com/announcing-bpftop-streamlining-ebpf-performance-optimization-6a727c1ae2e5
https://github.com/Netflix/bpftop
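As a rough illustration of how those three figures relate, here is a minimal sketch that derives them from two samples of a program's cumulative run time and run count. The `Sample` class, the `summarize` function and the numbers are hypothetical, made up for this example; this is not bpftop's code, just the arithmetic under the assumption that such cumulative counters are available.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical snapshot of cumulative counters for one eBPF program."""
    run_time_ns: int  # total time spent running the program so far, in ns
    run_cnt: int      # total number of times the program has run so far


def summarize(prev: Sample, curr: Sample, interval_s: float) -> dict:
    """Derive per-interval figures from two cumulative samples."""
    d_time = curr.run_time_ns - prev.run_time_ns
    d_cnt = curr.run_cnt - prev.run_cnt
    return {
        # Average runtime per invocation during the interval (ns).
        "avg_runtime_ns": d_time / d_cnt if d_cnt else 0.0,
        # Invocations per second during the interval.
        "events_per_sec": d_cnt / interval_s,
        # Share of one CPU's time spent in the program, as a percentage.
        "cpu_percent": d_time / (interval_s * 1e9) * 100,
    }


if __name__ == "__main__":
    before = Sample(run_time_ns=1_000_000, run_cnt=100)
    after = Sample(run_time_ns=3_000_000, run_cnt=300)
    print(summarize(before, after, interval_s=1.0))
```

Working from deltas between samples rather than raw totals keeps the figures meaningful for long-running programs, since the totals only ever grow.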
A look at Platform Engineering, introducing a layered model of platforms, with the oft-missing platform orchestration layer binding together the application and infrastructure.
https://www.syntasso.io/post/platform-engineering-orchestrating-applications-platforms-and-infrastructure
A good troubleshooting writeup (featuring ceph, systemd and containerd), and a good reminder of the layered complexity of the software systems we operate.
https://blog.palark.com/sre-troubleshooting-ceph-systemd-containerd/
Good alert design is hard. This post looks in particular at the problem with alert acknowledgement workflows and their impact on alert fatigue.
https://medium.com/production-care/keep-your-dashboard-clean-acknowledgement-is-not-a-solution-501c3d832c62
A quick post on building a Slack bot to help with handoff for ongoing incidents. A good reminder of the powerful combination of APIs for your ops tools and the low barrier to writing and running bots like this with modern cloud infrastructure.
https://medium.com/@matt_weingarten/creating-an-oncall-handoff-bot-7ee3f67d1033
Another blog post in this series, looking at testing in microservices environments, and how one organisation has solved this with its own custom tooling.
https://medium.com/riskified-technology/elevating-microservices-testing-and-development-using-dynamicenv-852ffeeacff2
A few handy checklists for security-focused code review, for both server and frontend applications. Plus a discussion of security code reviews in general, including reference to threat modelling.
https://axolo.co/blog/p/code-review-security-checklist
LLRT is an experimental, lightweight JavaScript runtime intended for Serverless environments. It targets a subset of the Node API, but enables much faster startup time and faster overall performance, as well as lower running costs.
https://github.com/awslabs/llrt
A good introduction to both Terraform and Ansible, covering similarities and differences, as well as how you might use them together.
https://www.env0.com/blog/ansible-vs-terraform-when-to-choose-one-or-use-them-together
A look at AWS offering extended support for Kubernetes versions in EKS, and the associated costs. Longer support cycles for Kubernetes have been a long-term conversation, but the post outlines some of the downsides of this not being in the mainline.
https://medium.com/@talkimhi/aws-extended-eks-support-a-costly-band-aid-for-kubernetes-clusters-120b8d537abe
A set of posts for anyone running etcd, covering the key metrics to measure, and how to measure them.
https://www.datadoghq.com/blog/etcd-key-metrics/
https://www.datadoghq.com/blog/etcd-monitoring-tools/
If you’re doing analysis on a data set, it’s likely important to understand where that data came from, in order for the analysis to be meaningful. This post covers the concept of data lineage, why it’s important, and some tools that can help.
https://semaphoreci.com/blog/data-lineage-big-data
testkube is a Kubernetes-native testing framework for test execution and orchestration. Store tests from any testing tool as CRDs and run them on the cluster.
https://testkube.io/
https://github.com/kubeshop/testkube
The Kubernetes Telemetry Controller can turn OpenTelemetry event streams – logs, metrics, and traces – into Kubernetes resources.
https://axoflow.com/reinvent-kubernetes-logging-with-telemetry-controller/
https://github.com/kube-logging/telemetry-controller
The challenge with having discrete applications is that they might not work together, which leaves more work for the user. UFO is a UI-Focused Agent for Windows, which can orchestrate actions across multiple apps without configuration, using GPT-Vision to work out how to achieve the instructions.
https://github.com/microsoft/UFO
A post on how to write a good incident postmortem, focused on the importance of understanding context and on applying the 5 whys.
https://medium.com/@vincesackschen/writing-an-excellent-postmortem-8534409f6e0d
An interesting observation about teams banning the use of merge commits in Git, backed by data and with an explanation of why folks are doing so.
https://graphite.dev/blog/why-ban-merge-commits
A breakdown of modern web frameworks, from static site builders to full stack frameworks and simpler/faster alternatives.
https://dev.to/wasp/web-frameworks-we-are-most-excited-for-in-2024-4d15
The end of year report from the Open Source Software Security Initiative, a multi-stakeholder group focused on policy solutions to help improve the security of the open source software ecosystem.
https://whitehouse.gov/wp-content/uploads/2024/01/Securing-the-Open-Source-Software-Ecosystem-OS3I-End-of-Year-Report-MASTERCOPY.pdf
A look at OpenTelemetry’s Semantic Conventions which allow for a common naming scheme for traces that can be standardised across a codebase, libraries, and platforms.
https://www.honeycomb.io/blog/effective-trace-instrumentation-semantic-conventions
A little dated, but a good post on comparing the Serverless framework with CDK, and why you might prefer one over the other.
https://www.alexdebrie.com/posts/serverless-framework-vs-cdk/
Ortelius is a unified evidence store of supply chain data designed to simplify. It provides developers a coordinated view of who is using a service, its version, and inventory across all end-points.
https://ortelius.io/
https://github.com/ortelius/ortelius
Write your build configuration in C# with Nuke. Includes native integration into a variety of CI/CD tools as well, so no need to write additional YAML configuration.
https://nuke.build/
https://github.com/nuke-build/nuke
A good post on some of the history of declarative container image builds, and the complexity of build systems as they grow.
https://www.chainguard.dev/unchained/images-as-code-the-pursuit-of-declarative-image-builds
We’re seeing more research into complex supply chain attacks at the moment, and this next post covers MavenGate, which highlights the issue of abandoned domain names in some software ecosystems.
https://blog.oversecured.com/Introducing-MavenGate-a-supply-chain-attack-method-for-Java-and-Android-applications
Another recent vulnerability disclosure. This one affects container build and runtime environments and allows for a full container escape to the host.
https://snyk.io/blog/leaky-vessels-docker-runc-container-breakout-vulnerabilities/
A couple of posts on evolving incident management practices, looking at the need to introduce gradual changes, standardising severity levels, the importance of training and more.
https://medium.com/dyninno/dyninnos-incident-management-an-introduction-a4516b910269
https://medium.com/dyninno/streamlining-and-implementing-incident-management-at-dyninno-c8ea06327f3a
A good post looking at integrating accessibility testing into developer workflows. Good discussion of modern toolchain challenges and integration options.
https://innovation.ebayinc.com/tech/engineering/introducing-an-accessibility-linter-for-marko-shortening-the-accessibility-testing-pipeline/
Glasskube is a new package manager for Kubernetes. It ships with a GUI and CLI tooling, as well as a central public package repository and the ability to auto-update packages.
https://github.com/glasskube/glasskube/
https://glasskube.dev/
APISIX is an API Gateway with a range of traffic management features including load balancing, dynamic upstream, canary release, circuit breaking, authentication, observability and more.
https://github.com/apache/apisix
https://apisix.apache.org/
An interesting post on the perils of productivity metrics for software development, in particular considering the impact of generative AI developer tools.
https://isthisit.nz/posts/2024/engineering-productivity-metrics-genai/
A recent paper on developer experience, including evidence of the benefits of investment in this area, and good points on selling the business benefits.
https://queue.acm.org/detail.cfm?id=3639443
When should you issue an alert? This post talks about the messiness of alerting, and in particular the need to constantly recalibrate alerts to make them useful.
https://www.honeycomb.io/blog/alerts-are-fundamentally-messy
A few predictions for devops in 2024. Observability, generative AI, Infrastructure as Code, multi-cloud and security.
https://securityboulevard.com/2024/01/navigating-the-future-devops-predictions-for-2024/
Another 2024 post, with similar conclusions on the future and opportunities of devops.
https://www.kovair.com/blog/future-of-devops-and-opportunities/
A pair of posts on evolving JSON schemas, exploring pros and cons of different approaches. Some of this is specific to Kafka, but much of the content is more widely applicable too.
https://www.creekservice.org/articles/2024/01/08/json-schema-evolution-part-1.html
https://www.creekservice.org/articles/2024/01/09/json-schema-evolution-part-2.html
The top 10 AWS blog posts from 2023. Interesting to see what’s popular in such a large ecosystem.
https://aws.amazon.com/blogs/devops/the-most-visited-aws-devops-blogs-in-2023/
The 13th annual DevOpsDay LA will happen on Friday, March 15th, as part of SCALE21x at the Pasadena Convention Center in Pasadena, California.
https://www.socallinuxexpo.org/scale/21x/events/devopsday-la
https://www.socallinuxexpo.org/scale/21x
What does it look like to deploy an application 30-40 times a day, with contributions from a large number of teams? A post on the move from people-managed deployments to a fully automated system.
https://slack.engineering/the-scary-thing-about-automating-deploys/
An updated long-form post on continuous integration. Although not a new practice, it’s often misunderstood.
https://martinfowler.com/articles/continuousIntegration.html
A detailed post defining microflows, how they differ from workflows, and why the distinction is useful in building distributed systems.
https://wso2.com/whitepapers/towards-a-precise-definition-of-microflows-distinguishing-short-lived-orchestration-from-workflows/
A quick guide to using containers to run Llamafile, a project for running open source Large Language Models such as Llama-2-7B or Mistral 7B.
https://dev.to/spara_50/a-quick-guide-to-containerizing-llamafile-1101
Some predictions for Devops in 2024, focused on the continued move to the cloud, the importance of a security-first approach, AI/ML adoption and more.
https://medium.com/@jimoh_abdol/embracing-the-future-devops-in-2024-14e9c835ae11
Octoherd is a toolset for manipulating lots of GitHub repositories at once. Lots of example scripts, for things like renaming the main branch, enabling branch protection and more.
https://github.com/octoherd/octoherd
https://github.com/octoherd
TypeSpec is a language for describing APIs and generating other API description languages, client and service code, documentation, and other assets. Support for OpenAPI, GraphQL, gRPC and more.
https://typespec.io/
https://github.com/microsoft/typespec
Prodzilla is a synthetic monitoring tool. Define probes and stories in a YAML configuration and run them to check a web application is behaving as expected.
https://github.com/prodzilla/prodzilla
An epic post that’s well worth the long read. A look at each of the 14 points from Deming’s System of Profound Knowledge with modern cyber security examples.
https://itrevolution.com/articles/out-of-the-cyber-crisis-deming-in-the-world-of-cybersecurity/
The Developer Productivity and Happiness Framework is a useful set of documents formalising how to talk about and measure developer productivity and set goals to make improvements.
https://linkedin.github.io/dph-framework
A look at a new tool called DynamicEnv which rapidly spins up custom testing/dev environments for complex microservices architecture, streamlining workflow and boosting team agility.
https://medium.com/riskified-technology/revolutionizing-development-and-testing-with-dynamic-environment-a-solution-to-microservices-chaos-abad8a1865a7
Some good tips for being more user centred in IT departments, including multidisciplinary service teams. I do think how IT is treated from the outside has a lot to do with the problem as well though.
https://public.digital/2024/01/17/user-centred-it-why-best-practice-isnt-good-enough-in-information-technology
A post for other DSL geeks. Mainly discussing Polar, but I think this applies to other policy languages and domains too. Do you want your DSL to be Turing complete?
https://www.osohq.com/post/is-polar-turing-complete-and-why-i-hope-not
Vanna allows you to train a model on your database content, and then ask questions of an LLM which will return SQL queries to query it.
https://github.com/vanna-ai/vanna
Chalk is a new tool that captures metadata at build time, and can add a small ‘chalk mark’ with that information to any artefacts (like compiled binaries or container images).
https://github.com/crashappsec/chalk
[Jan. 17] Webinar: “DAST is Dead! Long Live DAST! The Evolution of Dynamic API Security Testing.” Drop by to learn more about the new era of API Security Testing with StackHawk. Register here:
https://sthwk.com/long-live-DAST
Alert fatigue quickly becomes a problem as systems grow, and monitoring software does its thing. This next post talks about how to prevent it.
https://www.datadoghq.com/blog/best-practices-to-prevent-alert-fatigue/
A nice rundown of patterns for scaling infrastructure management.
https://spacelift.io/blog/scalable-infrastructure
Never waste a crisis. This post describes the response to a major downtime incident and a commitment and investment in reliability, resulting in data centre expansion and architecture improvements.
https://www.infoq.com/news/2024/01/roblox-cellular-infrastructure/
A look back at OpenTelemetry and observability advances in 2023, and what the new year might bring.
https://thenewstack.io/opentelemetry-and-observability-looking-forward/
Internal Developer Portals are all the rage at the moment. This post ties this to the pursuit of improved developer productivity.
https://devops.com/the-state-of-internal-developer-portals-idps/
DevOpsDays Kansas City is back, with the event on May 15th and 16th, and the CFP open now. With Devops coming up on 15 years old, the CFP is looking for retrospective talks and ignites in particular.
https://talks.devopsdays.org/devopsdays-kc-2024/cfp
SBOMit is a new tool to add in-toto attestations to SBOMs. It allows for embedding assurance about certain practices and steps in a build process, which can be verified by a consumer.
https://openssf.org/blog/2023/12/13/introducing-sbomit-adding-verification-to-sboms/
https://github.com/SBOMit
Pgxman is a package manager for PostgreSQL extensions, along with a repository of packages. It integrates with native build systems for installation.
https://pgxman.com/
The CI tool Dagger now has a repository of modules in the latest experimental version. This makes composition of pipelines much quicker. There are already quite a few modules available too.
https://daggerverse.dev/