SRE / DevOps / Kubernetes Weekly Collection #55 (Week 7, 2021)

  • In this blog post series, I collect the following three weekly mailing lists I subscribe to and add some comments as an aide-memoire, along with useful links.
  • I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a reference for people browsing this kind of information.

DEVOPS WEEKLY ISSUE #529 February 14th, 2021
SRE Weekly Issue #257 February 14th, 2021
KubeWeekly #251 February 19th, 2021

DEVOPS WEEKLY ISSUE #529 February 14th, 2021


Lots of details about Kubernetes liveness probes, including the problems they look to solve and some of the common implementation errors.

  • The title is “Kubernetes Liveness Probes — Examples & Common Pitfalls”.
  • It was covered in KubeWeekly #250 last week, so I will skip it.

A post on potential trends in and around infrastructure and operations in 2021. Continued evolution rather than revolution.

  • The title is “Top 8 DevOps Trends for 2021”.
  • As the title suggests, the first half explains the following eight trends, and the second half talks about the future outlook under the title of “The Future of DevOps in 2021 (and Beyond)”.
  1. Maturation of Infrastructure Automation (IA) Tools
  2. The Use of Application Release Orchestration (ARO) Tools
  3. More Complex Toolchains
  4. The Rise of DevSecOps
  5. Application Performance Monitoring (APM) Software
  6. A Wider Scope of Cloud Management Platforms (CMPs)
  7. More Uncertain Goals and Requirements
  8. Further Growth of AgileOps

Open source software has been an important part of the devops story. This paper explores some of the recent conversations about licensing, ethics and the issues with open communities.

  • The title of the paper is “The Tyranny of Openness: What Happened to Peer Production?”.
  • It discusses the ongoing “cultural war” among the software peer-production communities. I will read it again later.

A post on automating the release process for a complex monorepo project using Lerna.

  • The title is “Automated release process for (Lerna) mono repo”.
  • It explains how to automate the process of releasing a new version of software with a mono-repository managed by “Lerna”.
  • Lerna is a tool that optimizes your workflow for managing multi-package repositories using git and npm.

Logging is too easy to get wrong. This post covers lots of details about Java logging, useful for those building or operating Java applications.

  • The title is “Log4j Tutorial: How to Configure the Logger for Efficient Java Application Logging”.
  • It provides information for understanding the current Log4j setup (specifically Log4j 2.x).
  • Although Log4j 1.x is EOL, it is still widely used in legacy apps all over the world, so the article also devotes space to its EOL status and to migrating to Log4j 2.x.


I’m a big fan of making it possible to experiment and learn with even complex platforms locally. Knative on Kind (Konk) does just that for Knative.

  • A GitHub page of “KonK (Knative on Kind)”. I will try this.

A simple but handy utility to resize EBS volumes on AWS.

  • As the name suggests, the GitHub page of “ebs-autoresize”, a tool that automatically resizes AWS EBS volumes.

Another useful utility, SecretScanner does exactly what you’d think, finding secrets from file systems or container images.

  • A GitHub page of Secret Scanner, a tool that allows users to scan a container image or local directory on a host and output a JSON file containing details of all the secrets found.
  • The “Disclaimer” clearly states that it should not be used for hacking purposes.
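To picture what “scan a local directory and output JSON” means, here is a minimal sketch in Python. It is not how SecretScanner itself is implemented. The two regex rules and the function name `scan_directory` are my own assumptions for illustration, and a real scanner ships a far larger rule set.

```python
import json
import re
from pathlib import Path

# Hypothetical rule set for illustration only; not SecretScanner's rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def scan_directory(root):
    """Walk a directory tree and return one finding per matching line and rule."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are dropped so binary files do not raise.
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        for line_no, line in enumerate(text.splitlines(), start=1):
            for rule, pattern in PATTERNS.items():
                if pattern.search(line):
                    findings.append(
                        {"file": str(path), "line": line_no, "rule": rule}
                    )
    return findings


def to_json(findings):
    """Serialize the findings as a JSON report."""
    return json.dumps(findings, indent=2)
```

Calling `to_json(scan_directory("."))` would give a JSON report for the current directory. Scanning container images, as the real tool does, would need the image layers to be unpacked first, which this sketch does not attempt.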

Monorepos appear particularly popular with some JavaScript communities. Rush is a new toolkit helping to build and publish many packages from a common Git repo.

  • As the Editor commented, the web page of “Rush”, a tool developed to make life easier for JavaScript developers who build and publish many npm packages at once from a mono-repository.

SRE Weekly Issue #257 February 14th, 2021


Sometimes alerts have inobvious reasons for existing

This one really got me thinking. Make sure you document why an alert exists, not just what it checks for.

Chris Siebenmann

  • The author wrote this post after the following observation: “Somewhat recently I saw people saying negative things about common alerting practices, specifically such as generating some sort of alert when a TLS certificate was getting close to expiring”. I think what the author wants to convey is what the Editor's comment above says.

Incident response from monolith to microservices

If you start with a monolith and adopt a microservice architecture, your incident response process will need to change as well.

Mya Pitzeruse — effx

  • An article that explains the content of the title with the following five points.
    ○ Know the key differences
    ○ Establish developer accountability
    ○ Enable teams with visibility and access
    ○ Invest in your SRE team and practices
    ○ Parting thoughts

Minesweeper automates root cause analysis as a first-line defense against bugs

Another one that needs a disclaimer: there’s no single “root cause” for an incident, and this article is not about that. This is about using statistical software to aid humans in debugging by looking at the activities performed by different users before they encounter a given bug.

Vijay Murali, Edward Yao, Umang Mathur, Satish Chandra — Facebook

  • An article on Facebook’s Engineering blog that introduces “Minesweeper”, a tool that automates RCA (root cause analysis) by identifying the root cause of a bug from its symptoms. See the Editor’s comment above for the disclaimer.

On Not Being a Cog in the Machine

A new SRE at Honeycomb shares insight on the job and SRE attitudes in general.

Fred Hebert — Honeycomb

  • An article written by the author in the first week as Honeycomb’s first dedicated SRE. The background is described as “I was asked if I wanted to write a blog post about my first impressions and what made me decide to join the team?”. The content is as the Editor commented above, and it proceeds with the following three points.
    ○ Fostering Human Processes
    ○ Sociotechnical systems and context awareness
    ○ Adapting and sharing observability

Slack’s Jan 2021 outage: a tale of saturation

This post considers the January 4th Slack outage as a set of cases of saturation.

Lorin Hochstein

  • It takes up the post “Jan. 4, 2021 outage” on Slack’s engineering blog, which summarizes the outage of January 4, 2021, and explains it from the perspective of saturation.
  • It quotes the following from Slack’s blog article “Building the Next Evolution of Cloud Networks at Slack”, which describes the time when Slack ran AWS with a single account, and then explains three types of saturation. With this background, the situation is easy to picture.
    ○ As our customer base grew and the tool evolved, we developed more services and built more infrastructure as needed. However, everything we built still lived in one big AWS account. This is when our troubles started. Having all our infrastructure in a single AWS account led to AWS rate-limiting issues, cost-separation issues, and general confusion for our internal engineering service teams.
  • I certainly would not want to manage CIDRs for VPC Peering across hundreds of accounts. So the following technologies were adopted.
    ○ AWS shared VPCs
    ○ AWS Transit Gateway Inter-Region Peering
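
The CIDR headache comes from the fact that VPC peering does not work between VPCs whose address ranges overlap. Below is a minimal sketch of the kind of check this implies, using Python's standard `ipaddress` module. The account names and CIDR blocks are made up for illustration and do not come from Slack's article.

```python
import ipaddress
from itertools import combinations


def find_overlaps(vpc_cidrs):
    """Return the pairs of VPC names whose CIDR blocks overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()
    }
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical accounts: account-b sits inside account-a's range.
example = {
    "account-a": "10.0.0.0/16",
    "account-b": "10.0.128.0/17",
    "account-c": "10.1.0.0/16",
}
conflicts = find_overlaps(example)
```

The check compares every pair, so the number of comparisons grows quadratically with the number of VPCs, which is one way to see why this gets painful at hundreds of accounts.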

KubeWeekly #251 February 19th, 2021

The Headlines

Editor’s pick of the highlights from the past week.

Kubernetes README: What books to read to learn more about Kubernetes

Chris Short, Red Hat

Cool resource! Find out what books to read to learn more about Kubernetes. Please submit pull requests for books, tutorials, or other assets that would be useful to folks.

  • This looks like a good way into the world of Kubernetes. As Chris, one of the editors of KubeWeekly, says above, you can send a PR, so if you know any useful English resources, I think you should suggest them.

The Technical

Tutorials, tools, and more that take you on a deep dive into the code.

Configure multi-tenancy with Kubernetes namespaces

Mike Calizo,

  • It explains how to partition a single Kubernetes cluster using Namespaces, a built-in Kubernetes feature, together with some basic RBAC configuration.
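
As a rough picture of this combination, the sketch below builds a Namespace, a Role scoped to it, and a RoleBinding for one user, and serializes them as JSON (which `kubectl apply -f` accepts as well as YAML). The tenant name, user name, role name, and the chosen resources and verbs are all my assumptions, not taken from the article.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def tenant_manifests(tenant, user):
    """Build a Namespace plus a namespaced Role and RoleBinding for one tenant."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": tenant},
    }
    role = {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "Role",
        "metadata": {"name": "tenant-editor", "namespace": tenant},
        "rules": [
            {
                # "" is the core API group (pods, services); "apps" has deployments.
                "apiGroups": ["", "apps"],
                "resources": ["pods", "services", "deployments"],
                "verbs": ["get", "list", "watch", "create", "update", "delete"],
            }
        ],
    }
    binding = {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "tenant-editor-binding", "namespace": tenant},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": "tenant-editor", "apiGroup": RBAC_GROUP},
    }
    # A v1 List lets all three objects be applied from a single file.
    return {"apiVersion": "v1", "kind": "List", "items": [namespace, role, binding]}


manifest_json = json.dumps(tenant_manifests("team-a", "alice"), indent=2)
```

Because the Role and RoleBinding are namespaced, the user gets these permissions only inside the tenant's namespace and nothing cluster-wide.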

NetworkPolicy Editor: Create, Visualize, and Share Kubernetes NetworkPolicies

Sergey Generalov, Isovalent

  • It introduces “Network Policy Editor”, a tool that supports the creation of YAML files for Kubernetes Network Policy.
  • It looks good, but it does not work in my PC/monitor environment. It refused to run with the message “Policy Editor doesn’t support small screens Please use desktop or expand the window”. I tried moving the browser window from the monitor to the PC's own display and enlarging it, but that did not help.

csantanapr/knative-kind: Knative on Kind (KonK)

Carlos Santana, IBM

  • The GitHub page of “KonK”, which is also featured in DEVOPS WEEKLY ISSUE #529 above.

Creating an Argo Workflow With Vault Integration Using Helm

Jason Froehlich, Red Hat

  • Argo provides a convenient way to access Red Hat OpenShift secrets. For companies that use Vault, this article explains how to use it with Argo and how to package the setup in a Helm Chart for easy installation and reuse.

OpenShift Administrator’s Office Hour: Windows Containers w/ Special Guest Christian Hernandez

Andrew Sullivan, Christian Hernandez, Chris Short, Red Hat

  • A webinar video with the above title. The blog post is here, and the video is embedded in it, so it might be better to watch it from there.

shell-operator & addon-operator news: hooks as admission webhooks, Helm 3, OpenAPI, Go hooks, and more!

Ivan Mikheykin, Flant

Rate Limiting in controller-runtime and client-go

Daniel Mangum

  • It explains the content of the title, section by section.

What was observability again?

Cristian Klein, Elastisys

  • Observability is explained in detail from the following two viewpoints.
    ○ Various types of observability
    ○ The technical implications of implementing observability
  • The article aims for the reader to understand the following:
    ○ At the end of this post, you will understand why you should resist the temptation to save a few bucks on observability.
  • When I saw the following figure, “In fact, observability is so critical that as of February 2021, the Cloud Native Computing Foundation (CNCF) lists 102 projects in that category”, I did a double take at the number of related projects.

Building Custom Control Planes using Crossplane

Sahil Lakhwani, InfraCloud

  • It explains how to use Crossplane to create your own control plane on top of your cloud provider. As an example, it walks through a pattern that uses an AWS environment.

Build and publish container images to any cloud with Infrastructure as Code

Joe Duffy, Pulumi

  • A Pulumi blog post that explains how to build, publish, and use a simple container image on any cloud using just a few lines of code. It proceeds in the following sections.
    ○ Approach
    ○ Prepare a Container Registry
    ○ Build and Publish Your Container
    ○ Consume the Container Image
    ○ Wrapping Up

ICYMI: CNCF online programs this week

A weekly summary of CNCF online programs from this week.

Toward Hybrid Cloud Serverless Transparency with Lithops Framework

Gil Vernik @IBM

  • It takes a deep dive into how to make serverless computing easy to use in a wide range of scenarios, including high performance computing, Monte Carlo simulation, Big Data pre-processing, and molecular biology.

This Week in Cloud Native (Livestream): KCD El Salvador

  • A session in Spanish. I think it would be nice to have regular CNCF Online Programs in Japanese as well.

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

Datadog and the Container Report, with Michael Gerstenhaber

Craig Box, Kubernetes Podcast from Google

  • The Kubernetes Podcast, run by Google employees. The current co-host is Craig Box; Adam Glick has moved on to greener pastures. Past guests will be invited as guest hosts for several weeks.
  • This week's guest host is Saad Ali of Google, who appeared in Episode #103 and has led the development of Kubernetes storage, including CSI (Container Storage Interface) and the volume subsystem.
  • The guest is Michael Gerstenhaber, Director of Product Management at Datadog and curator of the company’s annual Container Report.
  • The topics that interested me in the News of the week are as follows.
    ○ Jetstack Secure
    ○ Kong Konnect is GA

Kubernetes Deployment Antipatterns — part 1

Kostis Kapelonis

  1. Deploying images with the “latest” tag
  2. Hardcoding configuration inside container images
  3. Coupling the application with cluster constructs
  4. Mixing infrastructure deployments with application releases
  5. Doing manual deployments using kubectl
  6. Using kubectl for debugging clusters
  7. Not understanding the Kubernetes network model
  8. Wasting resources on static environments instead of dynamic ones
  9. Mixing production and non-production workloads in the same cluster
  10. Not understanding memory and CPU limits
  11. Misusing health probes
  12. Not understanding the benefits of Helm
  13. Not having effective application metrics
  14. Handling secrets in an ad-hoc manner
  15. Adopting Kubernetes even when it is not the proper solution
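
Item 1 is easy to check mechanically. Below is a minimal sketch, not taken from the article, that flags image references resolving to the “latest” tag, including references with no tag at all. The function name and the sample image names are my own.

```python
def uses_latest_tag(image):
    """Return True if the image reference resolves to the mutable "latest" tag."""
    # A digest (name@sha256:...) pins the exact image content.
    if "@" in image:
        return False
    # Only the last path component can carry a tag;
    # a colon earlier in the reference is a registry port.
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return True  # no tag given, so "latest" is implied
    return last_component.rsplit(":", 1)[1] == "latest"


# Hypothetical image references for illustration.
images = ["nginx", "nginx:1.19", "localhost:5000/app", "registry.example.com/app:latest"]
flagged = [image for image in images if uses_latest_tag(image)]
```

A check like this could run in CI against the images named in the manifests, so that an unpinned image is caught before it is deployed.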

Kubernetes Pods Advanced Concepts Explained

Regis Wilson, Release

  • It describes certain advanced concepts related to Kubernetes init containers, sidecars, config maps, and probes.

Discover and invoke services across clusters with GKE multi-cluster services

Emeka Nwafor, Product Manager, and Jeremy Olmsted-Thompson, Staff Software Engineer, Google Cloud

  • An introductory article marking the GA of MCS (multi-cluster services), a Kubernetes-native mechanism for cross-cluster service discovery and invocation.
  • In “Common MCS use cases”, the following comment from Mercari, an early adopter, is introduced.
    ○ “We have been running all our microservices in a single multi-tenant GKE cluster. For our next-generation Kubernetes infrastructure, we are designing multi-region homogeneous and heterogeneous clusters. Seamless inter-cluster east-west communication is a prerequisite and multi-cluster Services promise to deliver. Developers will not need to think about where the service is running. We are very excited at the prospect.” — Vishal Banthia, Engineering Manager, Platform Infra, Mercari

Upcoming CNCF Online Programs

CNCF End User technology radar, February 2021 — Secrets Management
James Nugent @Apple, Steve Nolan @RStudio, Andrea Galbusera @AuthKeys, and Tyler Gass @Peloton
February 23, 2021
Register Now

This Week in Cloud Native (Livestream): Fluent Bit updates and Stream Processing
Anurag Gupta @FluentBit
February 24, 2021 at 12:00 pm PT
Register Now

The Container Security Checklist
Liz Rice @Aqua Security
February 25, 2021
Register Now

CNCF Online Programs Playlist on YouTube
Check out our playlist for more curated content you don’t want to miss! New content is added every Friday.

How about those articles? Did any of them interest you?

Actually, there is some content I cannot digest at this stage, so I will use this aide-memoire and the links to catch up myself, too.

Bye now!!

Yoshiki Fujiwara

An infra engineer in Tokyo, Japan. Grew up in Athens, Greece (1986–1992). #Network, #Kubernetes, #CKA, #CKAD, #Certified AWS SAP
