SRE / DevOps / Kubernetes Weekly Collection #40 (Week 45)

  • In this blog post series, I collect the following three weekly mailing lists I subscribe to and add some comments and useful links as an aide-memoire.

DEVOPS WEEKLY ISSUE #514 November 1st, 2020
SRE Weekly Issue #242 November 1st, 2020
KubeWeekly #240 November 6th, 2020

DEVOPS WEEKLY ISSUE #514 November 1st, 2020


A fun set of 10 short stories highlighting the reality of production incidents.

Digital transformation is increasingly a strategic priority for all large organisations. That means it’s important for business executives to be familiar with the need to modernise applications and platforms.

  • The title is “App Modernization 101: An Executive’s Guide to Shipping Better Software”.

Pulumi allows for defining infrastructure using general purpose programming languages. With the new automation API, it’s now possible to embed this capability in other programs, with initial support for Typescript and Go.

  • The title is “The Pulumi Automation API — The Next Quantum Leap in IaC”.

A quick case study of operating a large multi-tenant Kubernetes cluster in the public cloud. Covers provisioning, management, visibility and more important operations challenges.

  • The title is “How Salesforce Operates Kubernetes Multi-tenant Clusters in Public Cloud at Scale”.

The term cloud native has become increasingly prevalent. This post talks about why, and breaks down several tooling areas to focus on.

  • The title is “How to Become Cloud Native — And the Tools to Get You There”.

Both microservice and serverless architectures push for smaller units of execution. This post looks at the differences between the two.

  • The title is “Microservices & Serverless Functions — The difference”.


WTF Are Microservices? Join Sam Newman, author of Monolith to Microservices, on 5 November at 11:30 CET for a 90-minute crash course in microservices architecture: WTF it is, but also when you should and shouldn’t use it. Register now

  • As mentioned above, a 90-minute course will be held on Thursday, 5 November at 11:30 CET (Central European Time).

WTF Is Cloud Native? It’s blogs, videos, events, and more, about an ever-changing world of strategy, culture, technology, and more, brought to you by Container Solutions. Let’s f*#king do this! Subscribe to the newsletter.

  • An information page from Container Solutions, the provider of the “WTF Are Microservices?” course above, with blogs, videos, events, and a newsletter about cloud native.


Earthly is an interesting new build tool focused on repeatable builds. It combines Dockerfile and Make and makes it easier to run isolated tests and other commands.

  • The web page of “Earthly”, a build automation tool for the container era. It runs all builds in containers and can produce Docker images and artifacts (binaries, packages, arbitrary files, etc.).

Tempo is an easy-to-use and high-scale distributed tracing backend. Tempo is integrated with cloud-based object storage and can be used with a variety of tracing protocols, including Jaeger, Zipkin and OpenTelemetry.

  • The GitHub page of “Grafana Tempo”, an open-source, high-scale distributed tracing backend.

Ripgrep is a local code search tool that’s optimised for performance and nicely integrates with other developer tools like gitignore files.

  • The GitHub page of the line-oriented search tool “Ripgrep” that recursively searches for regular expression patterns in the working directory.

SRE Weekly Issue #242 November 1st, 2020


Here are 4 Ways SRE Helps New Employees Onboard

The work of SREs and the material we produce can be an excellent source of information to onboard new employees (not just SREs!).

Emily Arnott — Blameless

  • The article starts from the idea that “The SRE mentality can provide insights into many areas, including onboarding itself.” and explains how SRE can take onboarding to the next level with the following four points.
    ○ Runbooks as guides for new employees
    ○ Incident retrospectives as a library of learning
    ○ SLIs, SLOs, and error budgets as focal points and confidence boosters
    ○ Refining onboarding with an SRE mentality
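
To make the third point concrete for myself, here is a minimal sketch of the arithmetic behind an error budget. The function name, the request counts, and the 99.9% target are my own illustrative assumptions, not figures from the article.

```python
def error_budget_report(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarise an availability SLO over one measurement window.

    All inputs are illustrative; a real SLI would come from monitoring data.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # SLI: the measured fraction of successful requests.
    sli = (total_requests - failed_requests) / total_requests
    # Error budget: how many failures the SLO tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_consumed": failed_requests / allowed_failures,
        "slo_met": sli >= slo_target,
    }


# Hypothetical window: one million requests, 250 of them failed, SLO of 99.9%.
print(error_budget_report(slo_target=0.999, total_requests=1_000_000, failed_requests=250))
```

With these made-up numbers, roughly a quarter of the budget is used, which is the kind of figure a new employee could read as "there is still room to ship changes".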

Sharp tools for emergencies and the --clowntown flag

Having safeguards in your tools to prevent errors is wise. Allowing the user to disable those safeguards when the need arises is even wiser.

Rachel by the bay

  • Using the Facebook-internal term “clowntown” (or “clown town”) as an example, it explains how to keep sharp tools ready for emergencies.
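
As a note to myself, the idea of a safeguard that can be explicitly switched off might look like the sketch below. The host-drain tool, the 25% limit, and every function name are hypothetical and invented for illustration; only the flag name is borrowed from the post's title.

```python
import argparse

MAX_SAFE_FRACTION = 0.25  # hypothetical limit: refuse to drain more than 25% of the fleet


class SafeguardError(Exception):
    """Raised when a request trips the safeguard and no override was given."""


def check_drain(requested: int, fleet_size: int, override: bool = False) -> int:
    """Return the number of hosts to drain, or raise if the safeguard blocks it."""
    if fleet_size <= 0 or not 0 <= requested <= fleet_size:
        raise ValueError("requested must be between 0 and fleet_size")
    if requested / fleet_size > MAX_SAFE_FRACTION and not override:
        raise SafeguardError(
            f"{requested}/{fleet_size} hosts exceeds the safe limit; "
            "pass --clowntown to proceed anyway"
        )
    return requested


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Hypothetical host-drain tool")
    parser.add_argument("requested", type=int, help="number of hosts to drain")
    parser.add_argument("fleet_size", type=int, help="total hosts in the fleet")
    parser.add_argument("--clowntown", action="store_true",
                        help="disable the safeguard (for emergencies only)")
    args = parser.parse_args(argv)
    try:
        count = check_drain(args.requested, args.fleet_size, args.clowntown)
    except SafeguardError as exc:
        print(f"refused: {exc}")
        return 1
    print(f"draining {count} hosts")
    return 0
```

Calling `main(["50", "100"])` is refused, while `main(["50", "100", "--clowntown"])` goes ahead. The default stays safe, and the override has to be typed deliberately.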

United States Air Force Aircraft Accident Investigation Board Report — F-35A, T/N 12–005053

Lots of factors contributed to the crash and destruction of this $175 million USD aircraft. The pilot escaped with minor injuries.

Colonel Bryan T. Callahan et al. — USAF

  • U.S. Air Force fighter accident report. When I saw the list of “ACRONYMS AND ABBREVIATIONS”, I thought “There are so many!”

The Future of Ops Careers

Serverless isn’t going to make ops go away. NoOps is a myth.

Charity Majors — Honeycomb

  • The article opens with “Even if you don’t run any servers or have any infrastructure of your own, you’ll still have to deal with operability and operations engineering problems. I hate to be the bearer of bad news (not really), but the role of operations isn’t going away. At best, the shifts that supposedly reduce your ops are simply delegating the operability of your stack to someone that does it better. The reality for most teams is that operations engineering is more necessary than ever.” and goes on to discuss the future of ops.

The KPIs of improved reliability

In this blog post, we’ll present reliability-centric metrics and key performance indicators (KPIs) that show the positive impact that reliability has on businesses.

Andre Newman — Gremlin

  • It introduces reliability-centric indicators and KPIs that show the positive impact of reliability on business.

The failure of a computer you didn’t even know existed

“Outage of a CRL server” isn’t the first thing that would come to mind when diagnosing a database connection failure.

Oren Eini — RavenDB

  • It describes an incident in which the author’s blog went down and asks readers for their opinions.

Telltale: Netflix Application Monitoring Simplified

Telltale combines anomaly detection, alerting, dashboarding, and incident management.

Andrei Ushakov, Seth Katz, Janak Ramachandran, Jeff Butsch, Peter Lau, Ram Vaithilingam, and Greg Burrell — Netflix

  • It explains Netflix’s in-house application monitoring tool “Telltale”, starting from the background that made it necessary.

File Descriptor Transfer over Unix Domain Sockets

What?! I had no idea this was possible! You can transfer file descriptors (and the open files they point to) to another process, even outside of the normal parent/child process relationship.

Cindy Sridharan

  • The author explains what she learned from reading a paper and the impression it made on her.
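
To check my own understanding of the technique named in the title, here is a minimal sketch of passing a file descriptor over a Unix domain socket as `SCM_RIGHTS` ancillary data. It is not code from the article. To keep it runnable in one file, both ends of the socket live in the same process; in practice the receiver would be a separate process. The helper names are my own.

```python
import array
import os
import socket
import tempfile


def send_fd(sock: socket.socket, fd: int) -> None:
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # One byte of ordinary data is sent alongside the ancillary message.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock: socket.socket) -> int:
    """Receive one file descriptor sent by send_fd."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo() -> bytes:
    """Pass an open temp file across a socket pair and read it on the other side."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        with tempfile.TemporaryFile() as f:
            f.write(b"hello from the sender")
            f.flush()
            send_fd(left, f.fileno())
            received = recv_fd(right)
        # The sender's descriptor is closed now; the received one still
        # refers to the same open file.
        try:
            os.lseek(received, 0, os.SEEK_SET)
            return os.read(received, 100)
        finally:
            os.close(received)
    finally:
        left.close()
        right.close()


print(demo())
```

The received descriptor is a new number that points at the same open file, which is why it stays readable after the sender closes its own copy.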


  • GeoComply
    GeoComply, a geo-location service used by most online gaming sites in the US to monitor the physical location of their customers, experienced a major outage.

  • Outage information for the company above.

KubeWeekly #240 November 6th, 2020

The Headlines

Editor’s pick of the highlights from the past week.

Honoring Dan Kohn

This weekend, we lost a titan of the open source community with the passing of Dan Kohn. CNCF, the foundation Dan helped build as its Executive Director, will always be home to Dan’s legacy as a pioneer and innovator in the world of technology. As a community, we remain humbled and grateful to the tireless effort Dan gave to this foundation, his colleagues, and his friends. His work in creating an inclusive foundation that was welcoming and safe was momentous and beneficial to all. The strong and diverse leadership we experience today stems from Dan’s determination. Dan was unwavering in his passion for and belief in open source. His presence will be severely missed, but never forgotten by those who knew his gentle nature and felt his supportive touch. Our thoughts and prayers remain with the Kohn family, who so gracefully shared Dan’s light with us for so many years. While it’s almost impossible to imagine CNCF without Dan, we know there would never be a CNCF without him, either, and for that, we are truly thankful. Thank you, Dan.

  • An article by the CNCF honoring the late Dan Kohn, who contributed significantly to the founding and growth of the CNCF as its Executive Director.

ICYMI: CNCF Webinars

You can view all CNCF recorded and upcoming webinars here.

CNCF Member webinar: Security in the world of service meshes

John A. Joyce, Principal Engineer @Cisco

  • It provides an overview of security in the world of service meshes, starting with an introduction to key security concepts and describing the key system components that implement those concepts.

CNCF Member webinar: Managing your policies and standards

Ahmed Badran, Chief Technology Officer @Magalix

  • It explains the following points and gives a real-world example of implementing a simple governance framework using Rego and OPA.
    ○ What is governance and why it is important
    ○ How to establish a governance framework
    ○ How Open Policy Agent and the Rego language could help
    ○ Example policies for Kubernetes

CNCF Member webinar: Building edge as a service

Dr. Bin Ni, CTO @Wangsu Science & Technology / CDNetworks

  • It shares a conceptual model for efficiently achieving the goal of establishing a standard way to provide edge computing to developers as a service.

Tutorials, tools, and more that take you on a deep dive into the code.

Ensuring YAML best practices using KubeLinter

Saiyam Pathak, Civo

Set up your K3s cluster for high availability on DigitalOcean

Alex Ellis, OpenFaaS

  • It provides an overview of the reference architecture for setting up K3s in a high availability (HA) configuration.

metal3-io / baremetal-operator : Bare metal host provisioning integration for Kubernetes

  • The GitHub page of “Bare Metal Operator” that implements the Kubernetes API for managing bare metal hosts.

Using WireGuard to extend OpenShift networks

Sebastian Jug, Red Hat

  • Red Hat’s PSAP (Performance Sensitive Applications) team presents its work on the topic in the title: extending OpenShift networks with WireGuard.

Security hardening Kubernetes


  • A YouTube webinar video by Johan Tordsson, CTO of Elastisys.

The road to Flux v2 — November update

Daniel Holbach, Weaveworks

  • The post opens by pointing readers to the following documents and an embedded video.
    ○ If you are new to the community and GitOps, you might want to check out our GitOps manifesto or the official GitOps FAQ.
    ○ If you want to see the latest demo of GitOps Toolkit in action, check out this video:

CI/CD with Chris Short (2/2) — YouTube

  • A YouTube video featuring Chris Short, one of KubeWeekly’s editors. It is episode 216 of the YouTube channel “Roaring Elephant”.

How to use skopeo to migrate off Docker Hub

JJ Asghar

  • It shows how to migrate from Docker Hub to another registry, such as GitHub Container Registry, using “skopeo”, a tool provided by Red Hat.

Oracle continues building DTrace for Linux atop BPF


  • As the title suggests, it covers the history and latest developments of Oracle’s DTrace for Linux.

Disposable Kubernetes clusters

Garry Wilson, Curve

  • A case study from Curve. It gives an overview of how they manage the Kubernetes clusters that handle live Curve card transactions and upgrade them without downtime. They have switched from kops to EKS and upgraded their EKS version, and they aim for full automation in the future.

Reminiscing control theory and the future of observability

Michael Hausenblas, AWS

  • At the beginning, he touches on his own background in control theory and its connection to observability (o11y), then explains recent developments in observability and its future.

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

CNCF welcomes Katie Gamanji as Ecosystem Advocate

Cheryl Hung, CNCF

  • An article reporting that Katie Gamanji of American Express, a member of the CNCF’s TOC (Technical Oversight Committee), has been appointed as the CNCF’s Ecosystem Advocate. An interview video with Cheryl Hung is embedded.

Antrea, with Antonin Bas

Adam Glick and Craig Box, Kubernetes Podcast from Google

What’s new in CKA/CKAD with CKS coming up!

Saiyam Pathak (Civo) and Walid Shaari

  • A YouTube video that describes the latest changes in CKA / CKAD and some use cases, as well as the new CKS certification.

Preparing Google Cloud deployments for Docker Hub pull request limits

Michael Winser and Dhaivat Pandit, Google Cloud

D2iQ takes the next step forward

Tobi Knaup, D2iQ

  • An announcement that D2iQ is shifting its focus from the Mesosphere platform to the Kubernetes-based DKP (D2iQ Kubernetes Platform). The Mesosphere platform has begun the process towards end of life.

Cloud native explained. An interview with Cheryl Hung, VP Ecosystem at CNCF

John Leonard, Computing

  • This article requires membership registration. I tried to register but gave up because the form would not accept an address in any format other than a UK one.

A sysadmin’s guide to containerizing applications

Scott McCarty, Red Hat

Argo CD and Tekton: Match made in Kubernetes heaven

Siamak Sadeghianfar and Burr Sutter, Red Hat

  • A web page with an embedded webinar that explains how to combine the power of Tekton Pipelines with Argo CD to achieve a declarative approach to CI/CD based on GitOps principles.

4 ways to run Kubernetes locally

Mike Callzo,

A fireside chat to demystify KEPs

Amanda Katona, VMware

  • A CNCF interview article giving an overview of the Kubernetes Enhancement Proposal (KEP) process and the effort to revamp the Approval Plugin.

How Discord (somewhat accidentally) invented the future of the internet

David Pierce, Protocol

  • The path Discord has taken is very interesting.

Upcoming CNCF webinars

You can check recorded and upcoming webinars here. The following were listed as upcoming CNCF webinars at the time of writing.

Member Webinar: Kubernetes in the context of on-premises edge and network edge computing
Amr Mokhtar, Network Software Engineer @Intel Corporation
Nov 10, 2020 10:00 AM Pacific Time

Member Webinar: MicroK8s HA under the hood: Kubernetes with Dqlite
Konstantinos Tsakalozos, Senior Software Engineer @Canonical
Nov 11, 2020 7:00 AM Pacific Time

Member Webinar: The what and why of distributed tracing
Dave McAllister, Sr. Technical Evangelist @Splunk
Nov 13, 2020 10:00 AM Pacific Time

Member Webinar: Discover, analyze, and secure your APIs…anywhere
Pranav Dharwadkar, VP of Products
Jakub Pavlik, Director of Engineering
Dec 1, 2020 10:00 AM Pacific Time

Member Webinar: Metal³: Kubernetes-native bare metal host management
Maël Kimmerlin, Senior Software Engineer @Ericsson Software Technology
Dec 10, 2020 10:00 AM Pacific Time

What did you think of these articles? Did any of them interest you?

Honestly, there is some content here that I have not been able to digest yet, so I’ll use this aide-memoire and these links to catch up myself.
Bye now!!

Yoshiki Fujiwara

An infra engineer in Tokyo, Japan. Grew up in Athens, Greece(1986–1992). #Network, #Kubernetes, #GCP, #Certified AWS SAP
