SRE / DevOps / Kubernetes Weekly Collection #22 (Week 27)

  • In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, and add some comments as an aide-memoire along with useful links.
  • Actually, I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a reference for people browsing this kind of information.

DEVOPS WEEKLY ISSUE #496 June 28th, 2020
SRE Weekly Issue #225 June 28th, 2020
KubeWeekly #223 July 2nd, 2020

DEVOPS WEEKLY ISSUE #496 June 28th, 2020

Multi-cloud, whereby organisations, either by design or simply through scale, use multiple cloud providers, is likely to continue growing. This post looks ahead at what that might mean for the software (and service) stack we use to build applications.

  • The title is “The coming SMOKEstack: rethinking and retooling “multi-cloud””.
  • An article by Redmonk (an industry analysis company focusing on developers) that analyzes the current state of hybrid cloud and multi-cloud and explains SMOKEstack.
  • The SMOKEstack acronym below was proposed by Mark Hinkle. The author came across it while exploring the “Serviceful” approach of the serverless community, and covers it here.
    ○ Serviceful
    ○ Mashable
    ○ Open
    ○ K(C)omposable
    ○ Event-driven

A good introduction to Open Policy Agent, based on notes taken by a new user.

  • The title is “First look at OPA (Open Policy Agent)”.
  • An article that outlines OPA, demos, how to write policies, and how to use OPA.

A post making the case for adding security testing to your CI pipeline, with a discussion of different types of security testing.

  • The title is “Why Doesn’t Your CI Pipeline Have Security Bug Testing?”
  • The article raises the issue that “While clearly vitally important, current AppSec models are broken. The traditional approaches to application security prioritize training over tooling and finding over fixing.” It then explains the benefits of running application security tests in every build and how to get started.

A discussion of Serverless adoption and the current barriers to entry, both organisational and technical.

  • The title is “For the Love of Serverless: Brian Le Roux of Begin”.
  • A blog series that has serverless leaders as guests. Past themes include “thriving community”, “rapid upward trajectory” and “heightened accessibility”.
  • With Brian Le Roux (CTO and Co-Founder at Begin) as a guest, the article touches on the industry’s competitive landscape, opportunities to improve onboarding, and why we’re excited about the new experimental JavaScript runtime.

An interesting look at the evolution of a technology stack (in this case for Slack) over the course of several years.

  • The title is “Stack History: A Timeline of Slack’s Tech Stack Evolution.”
  • As the title says, this article describes the evolution of Slack’s technology stack in chronological order. I’ve never used StackShare, so I created an account. It would be nice to see the technology stack of each company and ideas for each tool.

A good outage report and investigation into a Cassandra cluster issue caused by counter columns.

  • The title is “Cassandra counter columns: nice in theory, hazardous in practice”.
  • An article summarizing precautions for using Cassandra counters at scale in production, based on the incidents and countermeasures at “Ably Realtime”.
  • I love this closing phrase, “Unlike the priestess Cassandra of greek myth, Ably Realtime is not in the business of making prophecies, whether they are believed or not. Just that expecting practice to always live up to theory can be hazardous. Oh, and don’t use Cassandra counter columns. Not even once.” May Apollo’s wrath not fall on this Cassandra.

A podcast recording, and transcript/notes from a discussion on devops workflows and Kubernetes. Some good points about the importance of knowledge sharing.

  • The title is “Nigel Poulton on How Kubernetes Can Make or Break the Devops Workflow”.
  • A Semaphore podcast episode. With Nigel Poulton, the author of The Kubernetes Book: Updated Feb 2020 Kindle Edition, as a guest, the theme is “How Kubernetes Can Make or Break the Devops Workflow”.
  • He likes Kubernetes himself, and speaks mainly from the point of view of someone who teaches Kubernetes to others.
    ○ “Kubernetes is a monster of complexity”
    ○ “I see people struggling with it all the time, and almost being forced into deploying to Kubernetes without having enough sort of knowledge. A lot of people deploying to Kubernetes are deploying with just enough Kubernetes knowledge, and that worries me.”
    ○ “I feel like Kubernetes is not for everyone, but it’s been marketed as if it is for everyone”

A useful paper for anyone running containers, describing a methodology for penetration testing Docker-based systems.

  • The title is “A Methodology for Penetration Testing Docker Systems”.
  • Joren Vrancken’s bachelor’s thesis explains how to perform penetration tests on Docker-based systems. The prerequisite knowledge about Docker is well organized, which is nice.

King is looking for new members for the infrastructure engineering teams to help develop, manage and expand our software based networking setup across datacenters and (Google) cloud. Please take a look at the open role for networking engineer. We’re also still looking for both database and streaming data engineers, if that is more your style.

  • Ongoing job listings from King (at that moment). The postings did not seem to have changed. They appeared to be looking for SRE, Database SRE, and Network SRE.

awsls is a handy utility for listing resources in AWS. Given the huge number of different APIs this should be useful for anyone working regularly with AWS.

Konstraint is a tool for anyone using Open Policy Agent Gatekeeper. It makes it easy to generate ConstraintTemplates from Rego policy files, making it easier to use standard Open Policy Agent tooling.

  • The GitHub page for “Konstraint”, an OSS CLI tool that supports creating and managing constraints when using “Gatekeeper”.

KUTTL is a new testing tool for Kubernetes clusters. It’s focused on integration and end-to-end testing of Kubernetes operators.

SRE Weekly Issue #225 June 28th, 2020

Catchpoint’s SRE Report 2020 — The Highlights

This suggests an upcoming shift in our field:

50 percent of SREs believe they will be working remotely post COVID-19, as compared to only 20 percent prior to the pandemic.

Kameerath Kareem — Catchpoint

BONUS CONTENT: An outside take on the survey results is here (Mike Vizard —

  • An article that introduces “Catchpoint’s SRE Report 2020” and highlights the points.
  • It says, “According to Google, there should be an upper bound goal of 50% ops work and 50% dev, but this 50/50 split may just be a pipe dream. Based on the survey results, most of the SRE work is dominated by operations-type activities.”

Even Experts Need Experts

No one person can (or should) know everything. How do we allocate expertise and build connections in order to maximize resilience and adaptive capacity?

Will Gallego

  • An article in which the author, prompted by calling in a handyman to fix a plumbing problem, shares thoughts and questions relevant to software engineering, such as the standardization of incident response.

Heroku Incident Follow-up: Incident #2038

A new feature was accidentally rolled out to too wide an audience, causing log message loss.


  • Follow-up information for the issue that occurred on Heroku from 02/06/2020 14:34 UTC to 06/20/2020 18:42 UTC.

The impact of slow NFS on data systems

[…] one slow block device can affect the performance of processes even when those processes don’t use the slow block device.

Kalyanasundaram Somasundaram — LinkedIn

  • A post on the LinkedIn Engineering blog about how they found and solved a problem with Espresso, LinkedIn’s de facto NoSQL database solution.
  • They clarified the behavior of shared page caches between block devices and discovered how one slow block device can affect the performance of a process even when the slow block device is not in use.

SRE error budgets and maintenance windows

Should you count scheduled maintenance against your error budget? It depends.

Jesus Climent — Google

  • An article on the Google Cloud website on the theme “How maintenance windows affect the error budget - SRE tips”.
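
To make the “it depends” a little more concrete, here is a minimal sketch of the arithmetic behind an error budget. The 30-day window, the 99.9% availability SLO, the function names, and the sample downtime figures are all illustrative assumptions of mine, not taken from the article.

```python
# Sketch: how counting (or not counting) scheduled maintenance changes
# the remaining error budget. All numbers below are hypothetical.

def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)


def remaining_budget_minutes(
    slo: float,
    unplanned_minutes: float,
    maintenance_minutes: float,
    count_maintenance: bool,
    window_days: int = 30,
) -> float:
    """Budget left after unplanned downtime and, optionally, maintenance."""
    budget = error_budget_minutes(slo, window_days)
    spent = unplanned_minutes
    if count_maintenance:
        # Maintenance is treated like any other downtime users experience.
        spent += maintenance_minutes
    return budget - spent


if __name__ == "__main__":
    slo = 0.999  # assumed 99.9% availability target
    print("budget:", round(error_budget_minutes(slo), 1), "min")
    for counted in (True, False):
        left = remaining_budget_minutes(slo, 10, 30, counted)
        print("maintenance counted:", counted, "-> remaining:", round(left, 1), "min")
```

With these assumed numbers the budget is about 43.2 minutes per 30 days, so a single 30-minute maintenance window consumes most of it if it is counted. That is why the decision matters in practice.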

Cassandra counter columns: nice in theory, hazardous in practice

An investigation in response to three incidents led to this stark conclusion about Cassandra’s “counter columns” feature:

In fact, they don’t appear to have any properties that make them a useful primitive for building predictable distributed systems.

Paddy Byers — Ably

  • I will skip this one, as it is covered under DevOps Weekly above.

How to Be a Financially Conscious Site Reliability Engineer

This article explains why we should have cost data at our fingertips as we design cloud-based systems.

[…] a well-architected system is often a cost-efficient system.


A Shared Pilot-Autopilot Control Architecture for Resilient Flight

This is a new concept to me, and I really like it:

Capacity for maneuver (CfM) is a measure of how much adaptability or room to respond to a new challenge that a given part of the system has, whether a person or autonomous agent.

Amir B. Farjadian, Benjamin Thomsen, Anuradha M. Annaswamy, and David D. Woods (original paper) Thai Wood — Resilience Roundup (summary)

An update to our nameservers has been rolled back. We are monitoring recovery.

KubeWeekly #223 July 2nd, 2020

Editor’s pick of the highlights from the past week.

KubeCon + CloudNativeCon North America is Going Virtual + CFP extended!

We have some news — KubeCon + CloudNativeCon North America is going online!

Connecting and collaborating is in the DNA of the open source cloud native community. With over 92,000 contributors to CNCF projects, 11 new sandbox projects, and 30 new members in Q2 (bringing membership to 570 organizations!), KubeCon + CloudNativeCon has been a key place to keep conversations going and continue building cloud native’s momentum. Although we can’t wait for the day we’ll be able to get together in-person, our virtual events are essential in educating and keeping our community thriving.

This also means the CFP has been extended! The deadline to submit a talk is Sunday, July 12 at 11:59 pm PDT. Learn more about the CFP and other event details here. We look forward to bringing the community together soon!

  • An article by CNCF announcing that KubeCon + CloudNativeCon North America would be held virtually and that the CFP deadline had been extended (at that moment).
  • It describes virtual events as essential for educating the community and keeping it thriving, while expressing eagerness for the day when the community can get together in person again.

KubeCon + CloudNativeCon EU Virtual Session Spotlight

The countdown to KubeCon + CloudNativeCon EU Virtual on August 17–20, 2020 is on! As we approach the event, we curated a few recommended sessions that we don’t want you to miss. Please see the feature for this week and be sure to register today!

The Beginners Guide to the CNCF TOC

Presented by: Liz Rice, VP of Open Source Engineering at Aqua Security

Who is the Technical Oversight Committee? What do its members do? How do projects get picked for adoption into the CNCF? Let’s shine a light on this group who determine which projects are adopted by the CNCF, set the future direction of the cloud native landscape, and are even responsible for the definition of the term “cloud native.”

This talk discusses the pros & cons of a project’s participation in the CNCF from the perspective of end users, vendors, contributors, and maintainers. It covers the lifecycle for a CNCF project, including:

– why projects want to be in the CNCF
– how the project adoption process works
– the requirements that the CNCF has on projects at different phases of maturity

Attendees will leave this talk with insights into how the technical arm of the CNCF works, why it’s important, what the TOC wants to do next, and how they can get involved.

Register now!

  • KubeCon + CloudNativeCon EU Virtual highlights the “The Beginners Guide to the CNCF TOC” session. Schedule: Tuesday, August 18 16:57–17:13 CEST (Central European Summer Time).
  • Liz Rice (VP of Open Source Engineering at Aqua Security/CNCF’s TOC chair) will introduce the TOC (Technical Oversight Committee) of CNCF, and will explain the pros and cons of participating in CNCF from various viewpoints.

Weekly recap of CNCF member and project webinars that you might have missed.

You can view all CNCF recorded and upcoming webinars here.

CNCF Ambassador Webinar: Commoditise Kubernetes with cluster-api

Gianluca Arbezzano, Senior Staff Software Engineer @Packet

  • It contains a demonstration and explanation using cluster-api on bare metal.
  • For about 20 minutes, without slides, the presenter explains the mechanisms and background of Kubernetes to date.

CNCF Member Webinar: Best Practices for Running and Implementing Kubernetes

Kendall Miller, President @Fairwinds and Robert Brenna, Director of Open Source @Fairwinds

  • It explains the considerations when using Kubernetes and common pitfalls.
  • I liked how the two presenters went through the slides together, explaining and supplementing each other. The slides listed questions and doubts, each answered with ideas and best practices.

CNCF Member Webinar: 7 Critical Reasons for Kubernetes-Native Backup

Deepika Dixit, Member of Technical Staff @Kasten and Mark Severson, Member of Technical Staff @Kasten

  • It contained demos using CNCF projects (Kubernetes, kind, CSI), and explained a cloud-native backup strategy and its benefits.

CNCF Member Webinar: Pivoting Your Pipeline from Legacy to Cloud Native

Nathan Martin, CEO @Sagecore Technologies and Tracy Ragan, CEO @DeployHub

  • It explains why the pipeline approach needs to shift to a service-based one and how to handle that shift.
  • The title of the slide was “Pivoting Your Pipeline for Microservices”

Tutorials, tools, and more that take you on a deep dive into the code.

ConfigMaps in Kubernetes: how they work and what you should remember

Flant staff

  • An article explaining how ConfigMap works with Kubernetes and what you need to remember. It said in the beginning that “Please note that this is not a complete guide, but rather a reminder/tips collection for those who already use ConfigMap in Kubernetes or are in the middle of preparing their applications to use it.”.

Docker and Kubernetes — root vs. privileged

Bryant Hagadorn

  • An article that compares and explains running as root on UNIX-based macOS and Linux versus the Docker --privileged flag.

Verify your Kubernetes Cluster Network Policies: From Faith to Proof

Jan Harrie

  • An article that sets Kubernetes network policy, considers how to test the validity of the setting, and implements/explains it.

Introducing Frigate: A documentation generation tool for Kubernetes Helm Charts

Jacob Tomlinson

  • An article that introduces Frigate, a tool that automatically generates documentation for Helm charts.

User-defined Webhooks in Puppet Relay with Knative and Ambassador API Gateway

Noah Fontes, Puppet

  • An article on Ambassador’s blog explaining how to set up a user-defined webhook for Puppet Relay with Knative and Ambassador API Gateway.

The Building Blocks of DX: K8s Evolution from CLI to GitOps

Katie Gamanji, American Express

  • Focusing on how cluster DX has evolved over time, it introduces tools that contributed to the growth of Kubernetes adoption.

Kubernetes Operator (GitHub)

This operator deletes stale feature branches in a Kubernetes cluster.

  • The GitHub page of an operator that removes stale feature branches in a Kubernetes cluster.

Articles, announcements, and more that give you a high-level overview of challenges and features.

Mirantis and Docker Enterprise, with Adrian Ionel

Craig Box and Adam Glick, Kubernetes Podcast from Google

  • Kubernetes Podcast by Google employees. The current co-hosts are Craig Box and Adam Glick.
  • Adrian Ionel (Co-founder and CEO) of Mirantis was welcomed as a guest.
  • Following the story from the founding of Mirantis, they talk about bringing OpenStack to enterprises (the NASA example, etc.), the Kubernetes community as seen through the OpenStack experience, and the acquisition of Docker Enterprise.
  • The hosts had prepared their questions well, so the conversation flowed smoothly.
  • What have you learned from your experience with OpenStack? “Robustness and simplicity is very important, that’s the key lesson we learnt.” I’m also interested in the Airship project.
  • Much of the News of the week was already covered in this article. The remaining topics of interest are:
    ○ ACI and Docker integration now public
    ○ gRPC-Web for .NET now GA
    ○ Episode 94, with Richard Belleville

Introducing the Hewlett Packard Enterprise Ezmeral software portfolio

Kumar Sreekanti, Hewlett Packard Enterprise

  • An article introducing the “Hewlett Packard Enterprise Ezmeral software portfolio” on the page of HPE.
  • “Ezmeral” is derived from Spanish and means “emerald”. The article says, “Emeralds have the mysterious power to strengthen intelligence, anticipate future events, relieve stress and boost immunity,” and the name is also meant to evoke helping customers with AI and data-driven innovation.

Building Cloud-Scale DBaaS with Kubernetes Operators

Benjamin Anderson, IBM Cloud Databases

  • IBM blog. IBM Cloud runs several Database-as-a-Service (DBaaS) products directly on top of Kubernetes, building a control plane based on the Operator pattern.
  • The article starts from the history of stateful services to explain this approach, its motivation, and its implications.

I Found A Painless Way To Manage Secrets In Google Kubernetes Engine

Merlin, Hacker Noon

  • An article that explains how to manage GKE’s Secret using the OSS tool Berglas.

Optimize the Kubernetes Developer Experience with Version 0

Richard Li, Ambassador

  • An article that points out that microservices may not work well and introduces a “Version 0 Strategy” that helps integrate developer experience into an organization’s development workflow.

How Microservices facilitate Feature Teams’ work

Mia-Platform Team

  • An article that describes what a “Feature Team” is and how microservices can facilitate that work.
  • There are various approaches to assigning names and roles to teams, but I think it is important to give teams a clear role and viewpoint that cuts across functions, services, and organizations, which otherwise tend to depend on individuals.

Kubernetes static code analysis with Checkov

Jon Jozwiak, Bridgecrew

  • An article introducing “Checkov”, an OSS infrastructure analysis tool. It scans Kubernetes manifests to identify security and configuration issues in Kubernetes workloads.
  • It also covers infrastructure-as-code security scanning for Terraform and CloudFormation across AWS, Azure, and GCP, catching misconfigurations and helping maintain cloud security best practices.

You can check some Recorded Webinars and Upcoming Webinars here. The following are posted as Upcoming CNCF webinars at that moment.

Member Webinar: Stay on top of ongoing Kubernetes security hygiene
Zohar Kaufman, Co-Founder and VP R&D
Ariel Shuper, VP Product
July 2, 2020 10:00 AM Pacific Time

Member Webinar: Optimize your Kubernetes Clusters on Azure with Built-in Best Practices
Jorge Palma, Senior Program Manager @Microsoft
July 7, 2020 10:00 AM Pacific Time

Member Webinar: The Challenges and Countermeasures of Service Mesh Practice
Fei Pei (裴斐), Cloud Computing Technical Expert and Architect, NetEase Hangzhou Research Institute @NetEase (网易)
This webinar will be delivered in Chinese.
July 8, 2020 10:00 AM China Standard Time

Project Webinar: What’s new in Linkerd 2.8 : Multi-cluster Kubernetes made simple and secure by default
Oliver Gould, Linkerd Project Lead, co-founder & CTO @Buoyant
July 8, 2020 10:00 AM Pacific Time

Member Webinar: Building Production-ready Services with Kubernetes and Serverless Architectures
Mike Metral, Software Architect and Engineer @Pulumi
Jason (Jay) Smith, App Modernization Specialist @Google Cloud
July 8, 2020 1:00 PM Pacific Time

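Member Webinar: How to Put Service Mesh into Practice: From Technology Selection to Implementation (如何落地 Service Mesh — 从技术选型到实践)
Ruofei Ma (马若飞), Principal Engineer, FreeWheel Beijing R&D Center @FreeWheel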
This webinar will be delivered in Chinese.
July 9, 2020 10:00 AM China Standard Time

Member Webinar: The top 10 most-useful Kubernetes APIs for comprehensive cloud-native observability
Caleb Hailey, Co-founder and CEO @Sensu
July 9, 2020 10:00 AM Pacific Time

Member Webinar: Securing and Accelerating the Kubernetes CNI Data Plane with Project Antrea and NVIDIA Mellanox ConnectX SmartNICs
Antonin Bas, Maintainer of Project Antrea and Staff Engineer @VMware
Moshe Levi, Sr. Staff Engineer @NVIDIA
July 14, 2020 10:00 AM Pacific Time

Member Webinar: Serving Millions of Customers with Cloud Native and DevSecOps
Chris Hollies, CTO, Oracle Practice @Capgemini
Akshai Parthasarathy, Principal Director, Cloud Native and DevOps @Oracle Cloud
July 15, 2020 7:00 AM Pacific Time

Member Webinar: Advancing image security and compliance through Container Image Encryption!
Brandon Lum, Senior Software Engineer @IBM
July 15, 2020 10:00 AM Pacific Time

Member Webinar: Kubernetes and storage. Kubernetes for storage. An overview.
Kiran Mova, Chief Architect at MayaData and core maintainer of OpenEBS @MayaData
July 16, 2020 10:00 AM Pacific Time

Member Webinar: Learn how to clean up your cloud-native “DevOps Dumping Ground”
Melissa Sussmann, Product Marketing Lead @Puppet
Kenaz Kwa Principal Product Manager @Puppet
July 17, 2020 10:00 AM Pacific Time

Member Webinar: Kubernetes Security Anatomy and the Recently Disclosed CVEs
Gadi Naor, CTO & Co-Founder @Alcide
July 21, 2020 10:00 AM Pacific Time

Member Webinar: Implementing Canary Releases on Kubernetes w/ Spinnaker, Istio, and Prometheus
Oleg Chunikhin, CTO @Kublr
July 22, 2020 1:00 PM Pacific Time

Member Webinar: Observability of multi-party computation with OpenTelemetry
Antoine Toulme, Engineering Manager @Splunk
Dave McAllister, Sr. Technical Evangelist @Splunk
July 23, 2020 10:00 AM Pacific Time

Member Webinar: Kubernetes Policies 101
Eran Leib, Founder, VP Product Management @Apolicy
Spenser Paul, Director of Sales, North America @DoiT International
July 28, 2020 10:00 AM Pacific Time

Member Webinar: Cluster API — Yesterday, Today, Tomorrow
Saad Malik CTO & Co-Founder @Spectro Cloud
Jun Zhou Chief Architect @Spectro Cloud
July 30, 2020 10:00 AM Pacific Time

Project Webinar: How We Doubled System Read Throughput with Only 26 Lines of Code
TiKV team
July 31, 2020 10:00 AM Pacific Time

Project Webinar: Kubernetes 1.19
Kubernetes release team
Aug 28, 2020 10:00 AM Pacific Time

Member Webinar: Getting started with container runtime security using Falco
Loris Degioanni, CTO and Founder @Sysdig
Sept 2, 2020 1:00 PM Pacific Time

How did you find these articles? Did any of them catch your interest?

Actually, there is some content I have not been able to digest yet, so I will use this aide-memoire and these links to catch up myself too.

Bye now!!

Written by Yoshiki Fujiwara

An infra engineer in Tokyo, Japan. Grew up in Athens, Greece(1986–1992). #Network, #Kubernetes, #GCP, #AWS SAP, #National Tour Guide for English
