SRE / DevOps / Kubernetes Weekly Collection #7 (Week 12)

  • In this blog post series, I collect the following three weekly mailing lists I subscribe to, adding comments as an aide-memoire along with useful links.
  • I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a reference for people browsing this kind of information.

DEVOPS WEEKLY ISSUE #481 March 15th, 2020
SRE Weekly Issue #211 March 15th, 2020
KubeWeekly #208: March 20th, 2020

DEVOPS WEEKLY ISSUE #481 March 15th, 2020

A post (fair warning, from one of my colleagues) on the ongoing evolution of application security as cloud adoption shifts more responsibilities towards developers.

  • The title is “How cloud transforms IT security into AppSec”.
  • Based on the view that “the cloud has made the infrastructure part of the app,” he describes how security practices have changed between “before the cloud” and “after the cloud”.

Some interesting data points from a recent survey on Serverless adoption. Everything from concerns and benefits to adoption and tooling.

  • The title is “O’Reilly serverless survey 2019: Concerns, what works, and what to expect”.
  • A November 2019 article. O’Reilly conducted its first “serverless adoption” survey and received responses from over 1,500 people across a broad range of locations, companies, and industries. Surveys and results with clear assumptions, targets, and objectives are interesting and informative.

A nice example of building a deployment workflow for network code, in this case with Calico. Lots of demos, including using Conftest and GitHub Actions to verify network policies and ArgoCD to deploy it.

  • The title is “Decentralized Calico Network Security Policy Deployment for GitOps — Part 2.”
  • Part 2 of Tigera’s GitOps blog series. Click here for Part 1, “Enforcing Network Security Policies with Git Ops”. Part 3, the final part of the trilogy, does not seem to have appeared yet as of March 20th (Fri).
  • It first defines the scope (how to enable distributed policy workflows), then the challenges (policy creation complexity, governance checks) of applying network security policies in Kubernetes using GitOps, and builds an end-to-end policy workflow as an example.

A discussion of where the CNAB (Cloud Native Application Bundles) packaging format fits in, and how it bridges technologies to enable a more PaaS-like experience for users.

A post on the evolution of interfaces in open source infrastructure, looking mainly at Kubernetes, the work done on CNI, CRI, CSI and SMI.

  • The title is “Interoperability of open-source tools: the emergence of interfaces”.
  • From the perspective that “OSS interoperability supports scalability and innovation” in the development of Kubernetes, the article covers the Container Runtime Interface (CRI), Container Network Interface (CNI), Container Storage Interface (CSI), Service Mesh Interface (SMI), and Cluster API, along with their respective backgrounds and features.

The Bazel build-tool has a high barrier to entry. This post is a good starting point if you’re building Go applications and outputting to container images.

  • A May 2018 article that opens with: “Dependency management and binary building are probably the most frustrating and least satisfying part of software development, and these frustrations get worse as the application grows.”
  • It mentions the tools that have been developed and open-sourced to tame this complexity: Blaze, the build tool used internally at Google, and tools derived from it such as Facebook’s Buck and Foursquare’s Pants. It then introduces what Bazel, the main subject, brings, how to configure it for Go, and how to build and test an application.

A look at Cilium Cluster Mesh, including a deep-dive into eBPF and CNI networking stacks.

  • The title is “Kubernetes Multi-Cluster Networking - Cilium Cluster Mesh”.
  • Starting from the observation that “in a dynamically changing and very complex ecosystem of microservices, traditional IP address and port management tends to cause problems from a management and scale perspective,” the article introduces Cilium, BPF (Berkeley Packet Filter), and YugabyteDB.
  • Not only the diagrams but also the output shown during the hands-on steps are explained carefully, so it looks like a good read.

Monitoror is a wallboard monitoring app to monitor server status; monitor CI builds progress or even display critical values.

  • The web page of “Monitoror”, a tool that monitors server status and visualizes it on a single screen. There is also a live demo where you can see what the display looks like.
  • Click here for the GitHub page.

SRE Weekly Issue #211 March 16th, 2020

SREcon20 Asia/Pacific

SRECon20 Asia/Pacific is rescheduled to September 7–9, 2020.

  • Information about SREcon20 Asia/Pacific, scheduled to be held in Sydney, Australia.
  • At the time of writing, the full program announcement and registration were expected to begin in May.

Business continuity at Slack: Keeping our customers up and running during COVID-19

This article has a definite marketing slant. It’s nonetheless interesting to see how Slack is handling the situation.

Cal Henderson and Robby Kwok, Slack

  • Slack’s announcement of its BCP (business continuity plan) for COVID-19 (the novel coronavirus disease).
  • The post is available in multiple languages; the Japanese version is here.
  • They say they are continuing business under their BCP and pandemic plan: even with all employees working remotely, work is not hindered, and they can handle increases in capacity and load, so there is no need to worry.

Journey into Observability: Glitch’s journey

I love this gem:

I’m not surprised [that] companies that are far into their observability journey start advocating for testing in production — once you have the data and you can slice & dice it as you see fit, testing in production seems like a totally reasonable thing to do.

Mads Hartmann

  • This article focuses on why Glitch started investing in tools to ensure observability, the current situation and how it got there, and finally what remains to be done.

Lessons in Distributed Communication From Incident Response

With many companies suddenly shifting into figuring out how to become distributed organizations overnight, we can learn many lessons by looking at incident response patterns.

George Miranda — PagerDuty

  • With COVID-19 rapidly pushing organizations toward remote work, the article draws a parallel with the earlier shift from on-site operations to distributed cloud operations, and explains its points with reference to PagerDuty’s own setup and its response to the coronavirus.
  • GitLab’s “Communication Practices”, the post-incident “Blameless Postmortem”, PagerDuty’s blog post “Effective Remote Work”, and others are introduced.

When correlation (or lack of it) can be causation

Today’s post is a double header. I’ve chosen two papers from NSDI’20 that are both about correlation.

Paper #1 is a tool that helps identify when files A and B are often changed at the same time, and warns you if you forgot B.

Mehta et al. — NSDI’20 (original paper #1)

Paper #2 is a tool for finding correlated failure risks that threaten reliability.

Zhai et al. — NSDI’20 (original paper #2)
Adrian Colyer — The Morning Paper (summaries)
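
As a note to myself, the idea behind paper #1 (files that usually change together) can be sketched in a few lines of Python. This is not the algorithm from the paper, only a simple co-change count over a made-up commit history. The file names, the thresholds, and the function names are all my own assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each file, and each unordered pair of files, changes."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return file_counts, pair_counts


def missing_companions(changed, file_counts, pair_counts,
                       min_support=2, min_confidence=0.8):
    """Return files that usually change with a file in `changed` but are absent."""
    changed = set(changed)
    warnings = set()
    for (a, b), together in pair_counts.items():
        if together < min_support:
            continue  # too few joint changes to trust the pattern
        for present, absent in ((a, b), (b, a)):
            if present in changed and absent not in changed:
                # Share of `present`'s changes that also touched `absent`.
                if together / file_counts[present] >= min_confidence:
                    warnings.add(absent)
    return sorted(warnings)


# Hypothetical history: each inner list is the set of files in one commit.
history = [
    ["service.yaml", "deploy.yaml"],
    ["service.yaml", "deploy.yaml"],
    ["service.yaml", "deploy.yaml", "README.md"],
    ["README.md"],
]
file_counts, pair_counts = co_change_counts(history)
print(missing_companions(["service.yaml"], file_counts, pair_counts))
```

In this toy history, changing only service.yaml triggers a warning about deploy.yaml, because the two files changed together in every commit that touched service.yaml.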

Great Incident Response Requires 3 Major Components

The components from the article are:

Ability to recognize how bad the situation really is, and prioritize it
Effective communication skills
Compassionate responses to mistakes and a learning mindset

Hannah Culver — Blameless

  • As remote work becomes more familiar and distributed teams become the norm, troubleshooting becomes trickier.
  • The three components above are explained as essential for incident response regardless of where people work.

Announcing Failover Conf

We’re pleased to announce Failover Conf, a conference focused on building resilient systems. The conference will be held online on April 21 and session submissions will be accepted through March 23.

CFP open through March 23.


  • Failover Conf will be held online on April 21, and the CFP is open until March 23. Since unforeseen circumstances affected events that had been planned offline, it was decided, as the name implies, to hold this one online as a failover.
  • I feel the author’s pride as a practitioner in the phrase “Practicing resilience”.

Grow your blame-free culture with these postmortem best practices | FireHydrant

There are some good tips in here, especially if you’re new to this.

Mandy Mak

  • A proposal for spreading a “blame-free culture” in your organization through three postmortem best practices.

1. The “blameless postmortem” focuses on learning
→ allowing engineers to respond with better information.

2. Make it a group effort
→ Collect various viewpoints asynchronously. Give a voice to members outside the incident response itself, such as the CS team, and include information such as what customers reported. Keep everyone informed, including new members.

3. Tolerate mistakes.
→ Keep in mind that “all members did their best at the time and tried to make the best choice, regardless of the size of the incident”.
→ Postmortems encourage honesty and transparency and help ensure psychological safety.

How network automation helps Fastly support the world’s biggest live-streaming moments

Fastly’s APS tool (Auto Peer Slasher) detects when a link is nearing saturation and automatically reroutes traffic through a different interface.

Ryan Landry — Fastly

Full disclosure: Fastly is my employer.

  • A story about how Fastly’s network automation and team of experts support live streaming of the Super Bowl, the event that generates the most traffic in the US each year.
  • They prepared thoroughly, peering directly with many domestic ISPs so that traffic could be delivered as close to end users as possible.
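
To make the quoted description of APS more concrete for myself, here is a minimal Python sketch of the general idea: find links that are near saturation and choose a less loaded link to shift traffic to. It is not Fastly’s implementation. The link names, the 90% threshold, and the “pick the least utilized link” rule are assumptions I made up for illustration.

```python
def utilization(used_gbps, capacity_gbps):
    """Fraction of a link's capacity currently in use."""
    return used_gbps / capacity_gbps


def plan_reroutes(links, threshold=0.9):
    """Map each near-saturated link to the least utilized healthy link.

    `links` maps a link name to a (used_gbps, capacity_gbps) tuple.
    Links at or above `threshold` utilization count as near saturation.
    """
    util = {name: utilization(used, cap) for name, (used, cap) in links.items()}
    healthy = [name for name, value in util.items() if value < threshold]
    plan = {}
    for name, value in util.items():
        if value >= threshold and healthy:
            plan[name] = min(healthy, key=lambda n: util[n])
    return plan


# Hypothetical link measurements in Gbps: (used, capacity).
links = {
    "peer-a": (95.0, 100.0),
    "peer-b": (40.0, 100.0),
    "transit-1": (10.0, 40.0),
}
print(plan_reroutes(links))
```

A real system would also need to check that the target link has enough headroom for the shifted traffic and avoid flapping between links; this sketch only shows the detection and selection step.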

KubeWeekly #208: March 20th, 2020

Editor’s pick of the highlights from the past week.

Join SIG Scalability and Learn Kubernetes the Hard Way

Alex Handy,

Contributing to SIG Scalability is a great way to learn Kubernetes in all its depth and breadth, and the team would love to have you join as a contributor. I took a look at the value of learning the hard way and interviewed the current SIG chairs to give you an idea of what contribution feels like.

  • The article invites contributors to SIG (Special Interest Group) Scalability and introduces the idea of learning the hard way, recommending SIG Scalability as a realistic way to “Learn Kubernetes the Hard Way”.
  • If you are interested, you can register here.

Kong Ingress Controller and Service Mesh: Setting up Ingress to Istio on Kubernetes

Kevin Chen, Kong

Kubernetes has become the de facto way to orchestrate containers and the services within services. But how do we give services outside our cluster access to what is within? Kubernetes comes with the Ingress API object that manages external access to services within a cluster.

  • How to deploy Kong Ingress controller as Ingress layer for Istio mesh.

Weekly recap of CNCF member and project webinars that you might have missed.

CNCF Member Webinar: Small Is Not Always Beautiful — Moving Enterprise Applications to the Cloud

Paul Jenkins, Product Manager @Oracle Cloud Infrastructure (OCI) Cloud Native Services, and Tony Vertenten, Co-Founder and CTO @Intris

  • A webinar video in which Oracle Product Manager Paul Jenkins and Intris Co-Founder and CTO Tony Vertenten explain why “small is not always beautiful”, based on the cloud migration of Intris’s applications.

CNCF Member Webinar: Democratizing Analytics with Cloud Native Data Warehouses on Kubernetes

Robert Hodges, CEO @Altinity, and Vladislav Klimenko, Senior Software Engineer @Altinity

CNCF Member Webinar: Calico Networking with eBPF

Chris Hoge, Developer Advocate @Project Calico, and Shaun Crampton, Core Developer @Project Calico

  • A webinar video in which Chris Hoge, Developer Advocate at Project Calico and Tigera, and Shaun Crampton, Core Developer at Project Calico and Principal Engineer at Tigera, explain Calico’s new eBPF (Berkeley Packet Filter) based data plane.

Tutorials, tools, and more that take you on a deep dive into the code.

On the state of Envoy Proxy control planes

Matt Klein

  • A personal blog post by Matt Klein, Software Engineer at Lyft and creator of Envoy, on the state of Envoy Proxy control planes and his outlook for the next few years.

Introducing istiod: simplifying the mesh control plane

Craig Box, Google

  • Craig Box of Google introduces Istiod, which consolidates the control plane functions into a single binary as of Istio 1.5. It is a follow-up article, published on 3/19, to the Istio 1.5 release that I mentioned in last week’s post.
  • He touches on the history of the Istio control plane, the cost of its past complexity, the benefits of consolidating into Istiod, additional information for experts, and thoughts on the future.

Introducing the Calico eBPF dataplane

Shaun Crampton, Tigera

  • Shaun Crampton, Tigera’s Principal Engineer, introduces Calico’s new eBPF (Berkeley Packet Filter) based data plane in this article dated 2/25.
  • It covers the same topic as the webinar above, “CNCF Member Webinar: Calico Networking with eBPF”. The webinar video includes a more recent Q&A, so I think it is worth watching as well.

Directing Kubernetes traffic with Traefik

Lee Carpenter

  • An article that describes how to do two simple deployments, use Traefik to pass traffic from the outside to the internal cluster, and then remove the Kubernetes resources.

Your own Kubernetes controller — Laying out the work

Nicolas Fränkel

  • Part 1 of a three-part series. This article describes how to get started implementing your own controller in languages other than Go.

Migrating from Helm v2 to v3

Tutorial Cloud Native

  • It explains the security risks of Tiller in Helm up to version 2, introduces the new features of version 3, and describes how to migrate from version 2 to 3, along with points to watch out for during migration.

Show Me Your Code with Walter Dal Mut: Extend Kubernetes in NodeJS

Gianluca Arbezzano

  • A YouTube video on “Kubernetes development on NodeJS” by Gianluca Arbezzano, SRE at InfluxData and CNCF Ambassador, and Walter Dal Mut, Solutions Architect at Corley. The opening part where Gianluca speaks is hard to hear, so it may be better to skip it.

5 tips for troubleshooting apps on Kubernetes

Alex Ellis

  • 5 useful options for troubleshooting using the kubectl command and how to use them.

Our failure story with Redis operator for K8s (+ a brief look at Redis data analysis tools)

Flant staff

  • An article sharing Flant’s failure story with a Redis operator and the lessons learned.
  • It also touches on six OSS tools for analyzing Redis data.

Introduction to Security Contexts and SCCs

Alexandre Menezes, Red Hat

  • An article that proposes setting security contexts and SCCs (Security Context Constraints) as a means to prevent the following scenario.
  • On container platforms, created objects are protected by good RBAC practices, but nodes may not be protected.

Creating Workspaces with the HashiCorp Terraform Operator for Kubernetes

Rosemary Wang, Hashicorp

  • Hashicorp’s article about the alpha release of HashiCorp Terraform Operator. A YouTube demo video is also embedded in the article.

Recommended Steps to Secure a DigitalOcean Kubernetes Cluster

Damaso Sanoja

  • A DigitalOcean article on how to keep your DigitalOcean Kubernetes (DOKS) cluster secure.

13 Kubernetes tutorials in Spanish, 20 Kubernetes tutorials in Portuguese, and 14 Kubernetes tutorials in Russian

Digital Ocean

  • A multilingual (English, German, Spanish, Portuguese, Russian) tutorial page for developers and system administrators on DigitalOcean’s community site. The links are keyword searches for Kubernetes.
  • However, no German results are available for the keyword Kubernetes as of March 21 (Sat).

Articles, announcements, and more that give you a high-level overview of challenges and features.

etcd, with Xiang Li

Adam Glick and Craig Box, Kubernetes Podcast from Google

Managing Harbor at cloud scale : The story behind Harbor Kubernetes Operator

Maxime Hurtrel

  • The story of how OVHcloud created a Kubernetes operator based on the Harbor project and open-sourced it under the goharbor project at CNCF.

Securing Kubernetes Networking with Nicole Hubbard, Hashicorp

The New Stack Makers

  • A podcast with Nicole Hubbard, Developer Advocate at HashiCorp.
  • There is also a YouTube video.
  • The YouTube video explains how to use Envoy and Consul Connect to keep data communication secure between different Kubernetes clusters and microservices.
  • I have always listened to “The New Stack” as a podcast, but this time I noticed there was a video. The slides are easy to follow, so I recommend watching the video.

Day 2 for the Operator Ecosystem

Gerred Dillon and Matt Jarvis,

  • Starting from why operators are needed in Kubernetes, the article touches on the build tools Kubebuilder, Metacontroller, and KUDO, and explains how they differ, including in the expertise each requires.
  • It also introduces Kuttl, a tool for writing tests that check whether operators behave correctly.

4 ways to manage Kubernetes resources

Tomasz Cholewa

  • From the viewpoint of simplicity and complexity, the resource management method of Kubernetes using the following four tools is explained.

1. kubectl command and yaml file

2. kustomize command

3. Helm chart

4. operator

Interoperability of open-source tools: the emergence of interfaces

Katie Gamanji

  • I will skip it because it was covered in DEVOPS WEEKLY ISSUE #481 above.

You can check recorded and upcoming webinars here. The following were listed as upcoming CNCF webinars at the time of writing.

Lowering the Barrier to Kubernetes Proficiency — Navigating the Stormy Seas of Information
Chris Black, Sr. Solutions Engineer @CircleCI
Member webinar
March 25, 2020 10:00 AM Pacific Time

Continuous profiling Go application running in Kubernetes
Gianluca Arbezzano, Site reliability engineer @InfluxData
Ambassador webinar
March 27, 2020 10:00 AM Pacific Time

Container Security at Scale: Lessons Learned from the Front Lines with ABN AMRO and Palo Alto Networks
Wiebe de Roos, CI/CD Consultant @Flusso and ABN Amro
Keith Mokris, Technical Marketing Engineer @Palo Alto Networks
Member webinar
April 1, 2020 10:00 AM Pacific Time

Taming Your AI/ML Workloads with Kubeflow The Journey to Version 1.0
David Aronchick @Microsoft
Elvira Dzhureava, Technical Product Engineer AI/M @Cisco
Johnu George, Technical lead @Cisco Systems
Member webinar
April 2, 2020 9:00 AM Pacific Time

Welcome to CloudLand! An Illustrated Intro to the Cloud Native Landscape
Kaslin Fields, Developer Advocate @Google
Ambassador webinar
April 3, 2020 10:00 AM Pacific Time

Best Practices for Deploying a Service Mesh in Production: From Technology to Teams
Member webinar
April 8, 2020 10:00 AM Pacific Time

Declarative Host Upgrades From Within Kubernetes
Adrian Goins, Director of Community and Evangelism @Rancher Labs
Dax McDonald, Software Engineer @Rancher Labs
Jacob Blain Christen, Principal Software Engineer @Rancher Labs
Member webinar
April 14, 2020 10:00 AM Pacific Time

杨雨 Alex Yang, 解决方案架构师 Solution Architect @Mirantis
张文墨Larry Zhang, 解决方案架构师 Solution Architect @Mirantis
Member webinar
This webinar will be delivered in Chinese
April 23, 2020 10:00 AM China Standard Time

Kubernetes 1.18
Kubernetes team
Project webinar
April 23, 2020 9:00 AM Pacific Time

Tracy Ragan, CEO of DeployHub and CDF Board Member
Member webinar
June 30, 2020 10:00 AM Pacific Time

How about those articles? Did any of them interest you?

Actually, there is some content I cannot digest at this stage, so I’ll also use this aide-memoire and these links to catch up myself.

Bye now!!

Written by Yoshiki Fujiwara

An infra engineer in Tokyo, Japan. Grew up in Athens, Greece (1986–1992). #Network, #Kubernetes, #GCP, #AWS SAP, #National Tour Guide for English
