SRE / DevOps / Kubernetes Weekly Collection #5 (Week 10)

Yoshiki Fujiwara
15 min read · Jun 24, 2020
  • In this blog post series, I collect the following three weekly mailing lists I subscribe to and add some comments and useful links as an aide-memoire.
  • I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a reference for people browsing this kind of information.

DEVOPS WEEKLY ISSUE #479 March 1st, 2020
SRE Weekly Issue #209 March 1st, 2020
KubeWeekly #206: March 6th, 2020

DEVOPS WEEKLY ISSUE #479 March 1st, 2020

News

An excellent talk from RSA on the intersection of governance, risk and compliance with devops practices.

  • The title is “How to GRC Your Dev Ops”. GRC stands for governance, risk management, and compliance.
  • A YouTube video of a talk from the RSA Conference held in San Francisco from February 24 to 28.
  • The speaker is Susan Allspaw Pomeroy, Technology Compliance Manager at Fastly.
  • Her career is remarkable: she moved from vendors doing government-related work to startups, adapting to completely different values along the way.
  • At the start, she effectively called out to the audience and won their empathy. A good presentation that reflects the speaker's character.

Conftest, the Open Policy Agent based tool for testing infrastructure as code, now has a handy plugin model. This post covers a few examples, for Kubernetes and AWS, and explains how to build your own.

  • The title is “Extending conftest with plugins”.
  • For those who are not familiar with conftest, the author points to his previous blog post, “Building in compliance in your CI/CD pipeline with conftest”.
  • Conftest is a tool that helps you write tests for structured configuration data. You can write tests for Kubernetes configurations, Tekton pipeline definitions, Terraform code, serverless configurations, or other structured data.
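To make the idea of "tests for structured configuration data" concrete, here is a minimal Python sketch. It is not conftest itself (conftest policies are written in Rego for Open Policy Agent); the function name `deny_reasons`, both rules, and the sample manifest are assumptions made up for illustration.

```python
def deny_reasons(manifest: dict) -> list:
    """Return human-readable policy violations for a Deployment-like dict."""
    reasons = []
    if manifest.get("kind") != "Deployment":
        return reasons  # this hypothetical policy only covers Deployments
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        image = container.get("image", "")
        # Hypothetical rule 1 (simplified): the image needs a tag other than "latest".
        if ":" not in image or image.endswith(":latest"):
            reasons.append(f"container {name}: image must use a pinned tag")
        # Hypothetical rule 2: the container must declare runAsNonRoot.
        if not container.get("securityContext", {}).get("runAsNonRoot", False):
            reasons.append(f"container {name}: must set runAsNonRoot")
    return reasons


# Made-up manifest, already parsed from YAML into a dict.
sample = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "web", "image": "nginx:latest"},
    ]}}},
}

for reason in deny_reasons(sample):
    print(reason)
```

The point is only the shape of the check: parsed configuration goes in, a list of violations comes out, and a CI job fails when the list is not empty.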

A look at Gandalf; an intelligent, end-to-end analytics service for safe deployment in cloud-scale infrastructure.

  • The title is “Gandalf: an intelligent, end-to-end analytics service for safe deployment in cloud-scale infrastructure”.
  • An article introducing Gandalf, a software deployment monitoring tool for Microsoft Azure.
  • It operates at Azure's scale, analyzing more than 20 TB of data per day (270K platform events [up to 770K at peak], 600 million API calls) and more than 2,000 fault types.
  • Gandalf reminds me of the wizard from “The Lord of the Rings”. Is the name meant to evoke a reliable, wise figure?

An interesting set of examples and exercises around Kubernetes security, looking at built-in Kubernetes capabilities.

  • The link goes to the Exercises page, a hands-on resource for a live workshop by Connor Gilbert of StackRox at the BSidesSF 2020 conference, held in San Francisco from February 22 to February 24.
  • It looks good: each item comes with a careful explanation and CLI commands, under the theme “Built-in Kubernetes controls to secure your applications”. I want to run through it.

Another RSA talk, this one looking at the potential for attackers who know how Kubernetes works under-the-hood. Some pretty nefarious ideas demonstrated well.

  • The title is “Advanced Persistence Threats: The Future of Kubernetes Attacks”.
  • A YouTube video of a talk from the RSA Conference held in San Francisco from February 24 to 28.
  • The speakers are Ian Coldwater, Lead Platform Security Engineer at Salesforce, and Brad Geesaman of DARKBIT.
  • As Kubernetes becomes more popular, attackers are becoming more sophisticated, and more countermeasures are needed.
  • Guidance on practical attack detection and defense for Kubernetes, with examples of new attacks and their demonstrated effects.
  • The introduction makes it easy to get into. The materials and demonstrations are easy to follow, the speakers are easy to listen to, and the tempo is good. I wish I could give a presentation like this.

One of the advantages of Kubernetes as a platform is its extensibility. This post looks at two mechanisms for this: adding your own scheduler and creating an operator.

  • The title is “Extending Kubernetes for our needs”.
  • Two patterns for extending Kubernetes are given as examples.
  • Creating a custom scheduler: the author's other blog post on this is linked from the article.
  • Building operators and controllers: this article focuses on this pattern.
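The operator and controller pattern boils down to a reconcile loop: compare the desired state declared in a resource with the observed state, then act on the difference. The following is a minimal Python sketch under stated assumptions: it works on plain values rather than the Kubernetes API, and the `reconcile` function and its action tuples are hypothetical names for illustration only.

```python
def reconcile(desired_replicas: int, running_pods: list) -> list:
    """Return the actions that move the observed state toward the desired state."""
    actions = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: ask for the missing ones to be created.
        actions.extend(("create", None) for _ in range(diff))
    elif diff < 0:
        # Too many pods: remove the surplus ones at the end of the list.
        for pod in running_pods[diff:]:
            actions.append(("delete", pod))
    return actions


print(reconcile(3, ["pod-a"]))
print(reconcile(1, ["pod-a", "pod-b", "pod-c"]))
```

A real controller would run this comparison every time the watched resources change and would apply the actions through the API server instead of returning them.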

A low-level look at how the logging framework Fluentd gathers metadata from Kubernetes.

  • The title is “How Fluentd collects Kubernetes metadata”.
  • The author recently needed to stream Fluentd’s logs to the platform of his company, Zebrium, for self-monitoring, and he shares what he found when he looked into how Fluentd collects Kubernetes metadata.

Jobs

env0 makes Infra-as-Code easy, empowering every dev and test case to have its own environment, while minimizing maintenance effort, costs and risk. We are a rapidly growing and well-funded startup based both in the San Francisco Bay Area and in Tel Aviv. We believe software development is a team effort, and are looking for people who strive for excellence, and enjoy the journey getting there.

  • env0 was looking for a DevOps Relations Advocate in the San Francisco Bay Area (at that time).

Tools

Dispatch is an open source crisis management orchestration framework. It integrates with Slack, Google Apps, Jira, etc. to make it easier to react to assembling participants, sending out notifications, tracking tasks, and assisting with post-incident reviews.

  • The title is “Introducing Dispatch”.
  • The article tells the story of open-sourcing Netflix's in-house tool “Dispatch” and introduces it.
  • Dispatch is a tool that manages security incidents effectively by integrating deeply with the tools an organization already uses (Slack, Jira, G Suite, etc.).
  • Click here for the GitHub page.
  • They were looking for OSS contributors and colleagues (at that time).

Ever wanted to query your Kubernetes cluster using SQL? Kube Query provides a bridge between osquery and Kubernetes to do just that.

  • The title is “Kube-Query: A Simpler Way to Query Your Kubernetes Clusters”.
  • A bridge that extends the OSS tool “osquery”. It visualizes the Kubernetes cluster. Still at an experimental level.
  • Click here for the GitHub page.

Efficient management of SQL schema evolutions allows DevOps professionals to deploy code quickly and reliably with little to no impact. Learn how modern teams are building out zero impact SQL database deployment workflows here:

  • The title is “Zero Impact SQL Database Deployments”.
  • An article with best practices for overcoming the interesting and complex challenges that connections between apps and SQL databases create when you update your schema while trying to maintain a consistent user experience.

SRE Weekly Issue #209 March 1st, 2020

Articles

Gandalf: an intelligent, end-to-end analytics service for safe deployment in cloud-scale infrastructure

Azure developed this tool to sniff out production problems caused by deploys and guess which deploy might have been the culprit. Its accuracy is impressive.

Adrian Colyer — The Morning Paper (summary)

Li et al. — NSDI’20 (original paper)

  • I skipped this one because I covered it in DevOps Weekly above.

fork() can fail: this is important

This one made me laugh out loud. Better check those system call return codes, people.

rachelbythebay

  • A conversational piece about the danger of ignoring a failed fork().
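As a reminder of why the return code matters: POSIX fork() returns -1 on failure, and kill() treats a pid of -1 as "every process the caller is allowed to signal", so an unchecked failure that is later handed to kill() can hit far more than one child. In Python, os.fork() raises OSError instead of returning -1, but a guard before signalling is still cheap. A minimal sketch; the helper name `safe_terminate` is hypothetical.

```python
import os
import signal


def safe_terminate(pid: int) -> bool:
    """Send SIGTERM to one specific process; refuse pid values with special meaning."""
    if pid <= 0:
        # 0 targets the caller's process group, -1 targets every process the
        # caller may signal, and other negative values target a process group.
        return False
    os.kill(pid, signal.SIGTERM)
    return True


# A stored "pid" of -1 (a failed fork in C) is rejected instead of signalled.
print(safe_terminate(-1))
```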

Managing the Hidden Costs of Coordination

This caught my eye:

In addition, what is seen as the IC maintaining organizational discipline during a response can actually be undermining the sources of resilient practice that help incident responders cope with poorly matched coordination strategies and the cognitive demands of the incident.

Laura M.D. Maguire — ACM Queue Volume 17, Issue 6

  • It opens with a service's community Slack being flooded with user reports of “502 errors” and the response getting under way, and then moves into each discussion.
  • Under the title, she discusses “Controlling coordination costs when multiple, distributed perspectives are essential”, “The Need for Coordination Design”, “Hidden Costs of Coordination”, “Attempts at Supporting Coordination”, and “Conclusion”.
  • “Coordination remains an integral part of a large distributed work system, but the lack of coordinated coordination design continues to add hidden cognitive costs to the practitioner,” she says. The article gives me ideas from the perspective of organizational theory, software platforms, and so on. I want to read this one thoroughly, so it is my bookmark article this week.

How much money do SREs make?

A guide on salary expectations for various levels of SRE, especially useful if you’re changing jobs.

Gremlin

  • This is a theme everyone is interested in: “How much do SREs earn?” Keep in mind that the US surveys and statistics cited range from 2018 data to private and government releases from early 2020.
  • In the author's opinion, “If you are the sort that enjoys the challenge of keeping things running, you won’t regret making the move from a financial standpoint.”

3 microservices resiliency patterns for better reliability

The flipside of microservices agility is the resiliency you can lose from service distribution. Here are some microservices resiliency patterns that can keep your services available and reliable.

Joydip Kanjilal

  • It explains three microservices resiliency patterns (Retry, Circuit Breaker, and Correlation ID) for higher availability.
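Two of the patterns named in this entry, Retry and Correlation ID, fit in a few lines of Python. This is a minimal sketch under stated assumptions, not the article's code: the function name `call_with_retry`, its parameters, and the idea of passing the ID as the only argument are all hypothetical.

```python
import time
import uuid


def call_with_retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func(correlation_id), retrying failures with exponential backoff."""
    # One ID for all attempts, so logs across services can be tied together.
    correlation_id = str(uuid.uuid4())
    for attempt in range(1, attempts + 1):
        try:
            return func(correlation_id)
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait 0.1s, 0.2s, 0.4s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

In a real service the correlation ID would usually travel in a request header and appear in every log line, and the retry would be limited to errors that are safe to repeat.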

It’s time for smart home devices to have local failover options during cloud outages

There have been several recent failures of consumer devices based on a cloud service outage, and this author argues for change.

Kevin C. Tofel — Stacey on IoT

  • Due to a 17-hour outage at Nest, a Google subsidiary, Nest Cam indoor and outdoor cameras and Nest Cam IQ cameras were unable to record video “because the software update of the storage server did not behave as expected”. Live streaming was also unavailable.
  • The author argues that companies making smart home products need to provide smart failover that saves to local storage even when the cloud side fails.
  • Google and Amazon announced in 2019 that “we will bring more localized devices and smart devices to the edge side,” but no progress can be seen.

Human error, miscommunication and lack of training behind false alarm at Pickering nuclear station

This sounds familiar

Durham Radio News

  • Thousands received false alerts about a possible “nuclear accident” at the Pickering Nuclear Power Plant in Ontario, Canada, due to human error.
  • The first alert was sent at 7:23 am and the second (a correction) at 9:11 am. The correction took so long because neither the person in charge nor their supervisor understood the setup.

Friday deploys: comfort, not pressure

Essentially, you’re taking that risk of the Friday afternoon deployment, and spreading it thinly across many deployments throughout the week.

Ben New

  • Starting from “No deployments please, it’s Friday!”, he goes on to discuss more comfortable, pressure-free deployment and automation. I want to read it all in one sitting.

Outages

KubeWeekly #206: March 6th, 2020

The Headlines

Editor’s pick of the highlights from the past week.

KubeCon + CloudNativeCon Europe postponed

The health, safety, and wellbeing of our attendees and staff are our highest priority, and we know that what makes KubeCon + CloudNativeCon such a great event is the people who gather there. Thus, after discussions with many community members, we have made the difficult decision to postpone KubeCon + CloudNativeCon Europe (originally set for March 30 to April 2, 2020) to instead be held in July or August 2020. (We’re finalizing the date and will announce it shortly.) We expect that by mid-summer, there will be more clarity on the effectiveness of control measures to enable safe travel to industry events like this one.

  • KubeCon + CloudNativeCon Europe and the other co-located events in Amsterdam have been postponed to July or August. The official schedule will be decided depending on the situation.
  • KubeCon + CloudNativeCon + Open Source Summit China in July has been cancelled. Stay tuned for 2021.
  • North America in Boston remains scheduled for November 17 to 20.
  • What sponsors, speakers, and attendees should do is explained on the page above and its links. If you have a question that is not in the FAQ, contact the address that matches your role.
  • Hotel rooms booked through the event organizer's official block are said to have been canceled automatically by March 6 with no cancellation fee. Click here for the confirmation page.

Contributor Summit Amsterdam Postponed

Dawn Foster, VMware and Jorge Castro, VMware

  • In addition, they are considering virtual contributor activities. They will share the final plan in the above blog post and through the regular channels.

The Technical

Tutorials, tools, and more that take you on a deep dive into the code.

Bring your ideas to the world with kubectl plugins

Cornelius Weig, TNG Technology Consulting GmbH

  • A blog post on kubernetes.io that briefly explains the procedure and time (a Kubernetes Enhancement Proposal) required to add a desired feature to kubectl, and then explains why plugins are an effective alternative.

What is circuit breaking?

Peter Jausovec, Learn Cloud Native

  • It explains, quite literally, “What is circuit breaking?” with a YouTube video and text, commands, YAML files, command output, and so on. Very easy to listen to.
  • The demo and the explanation are shown together, so it is better to watch it on a monitor.
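For readers who prefer code to video, here is a minimal Python sketch of the circuit-breaking idea, independent of any particular service mesh or library: after a number of consecutive failures the breaker "opens" and rejects calls immediately, and after a cool-down it lets one trial call through. The class name, parameter names, and default values are assumptions for illustration.

```python
import time


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to stay open
        self.clock = clock
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: half-open, allow one trial call.
            self.opened_at = None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```

The design choice worth noting is that a rejected call fails fast without touching the struggling backend, which gives it room to recover.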

Everyone might be a cluster-admin in your Kubernetes cluster

Jeff Geerling

  • People operating Kubernetes clusters may be at risk without knowing it, because the “cluster-admin” role, which grants administrator privileges over the cluster, may have been granted more widely than they realize.
  • The irony is that the YAML file of spekt8, a Kubernetes cluster visualization tool, grants the cluster-admin role to all pods in the default Namespace (!!).
  • It explains that even when delete permission appears to be locked down in the RBAC settings, the cluster-admin role above carries the authority to delete the entire default Namespace, and it shows how to verify this yourself. Even in a test environment, I find it scary to try the deletion on a fresh installation.
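One way to check your own cluster is to list who is bound to cluster-admin. The sketch below assumes its input is the parsed JSON from `kubectl get clusterrolebindings -o json`; the function name `cluster_admin_subjects` and the sample data are hypothetical.

```python
def cluster_admin_subjects(binding_list: dict) -> list:
    """Return (kind, namespace, name) for every subject bound to cluster-admin."""
    found = []
    for item in binding_list.get("items", []):
        if item.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        # "subjects" can be missing or null on a binding, hence the "or []".
        for subject in item.get("subjects") or []:
            found.append(
                (subject.get("kind"), subject.get("namespace"), subject.get("name"))
            )
    return found


# Made-up sample in the shape of a ClusterRoleBindingList.
sample_bindings = {
    "items": [
        {
            "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
            "subjects": [
                {"kind": "ServiceAccount", "namespace": "default", "name": "default"}
            ],
        },
        {
            "roleRef": {"kind": "ClusterRole", "name": "view"},
            "subjects": [{"kind": "Group", "name": "developers"}],
        },
    ]
}

for kind, namespace, name in cluster_admin_subjects(sample_bindings):
    print(kind, namespace, name)
```

A ServiceAccount named default in the default Namespace showing up here is exactly the situation the article warns about, since pods in that Namespace use it unless told otherwise.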

Kubernetes & Kublr Architecture Infographic

Kublr team

  • Documentation of the multiple components, such as logging, monitoring, and RBAC integration, needed to use Kubernetes in production.
  • Interestingly, the navigation arrows on the embedded image make it easy to zoom in and out, and explanations appear as you go.

Kubeflow goes 1.0

Thea Lamkin, Google Cloud

  • Jeremy Lewi (Google), Josh Bottum (Arrikto), Elvira Dzhuraeva (Cisco), David Aronchick (Microsoft), Amy Unruh (Google), Animesh Singh (IBM), and Ellis Bigelow (Google) are co-authors of the article. It announces that the multi-vendor OSS project Kubeflow has reached its major release v1.0 and covers its historical background and its current, added, and future features.
  • As an open community, they invite people to join their Slack channels, mailing lists, and weekly community meetings.

Implementing FaaS in Kubernetes Using Kubeless

Mohamed Ahmed, Magalix

  • A story of implementing Function as a Service (FaaS) with Kubeless, an OSS, Kubernetes-native serverless framework.
  • The author describes Kubeless as an add-on to Kubernetes and explains that it lets you create custom resources and controllers and issue commands with an easy-to-use CLI.
  • It walks from installation to execution with a very simple, minimal FaaS model in a web application. It looks interesting, so I want to run it.

Spotify Open-Sources Terraform Module for Kubeflow ML Pipelines

Anthony Alford

  • It starts by introducing Spotify's open-source Terraform module for Kubeflow ML pipelines running on GKE, and the results of switching from an in-house ML platform to Kubeflow.
  • It touches on part of the evolution of Spotify’s ML platform and carries many informative links, so my interest never runs out.
  • He touches on the ML-related projects that Spotify has open-sourced and concludes with a link to the GitHub page.

Kubernetes Namespaces Explained in 15 mins (YouTube)

TechWorld with Nana

  • A YouTube video explaining Kubernetes Namespaces by Nana, a senior freelance DevOps and software engineer.
  • When I looked at her YouTube channel, I saw that she has been making tutorial videos on Docker & Kubernetes, Jenkins, and more over the last few months. It was more impressive than I had imagined.

Advanced Persistent Threats: The Future of Kubernetes Attacks (YouTube)

Ian Coldwater and Brad Geesaman

  • I skipped this one because I covered it in DevOps Weekly above.

Our migration journey from AWS to Google Cloud

Part 1
Part 2
Tim Little

  • A story about Kudos's migration from AWS to GCP. The three main migrations are as follows.
  1. An application written in Ruby, from Amazon EC2 instances to Kubernetes Pods on GKE.
  2. AWS Application Load Balancer (ALB) to Istio.
  3. AWS Aurora MySQL database to Google Cloud SQL.
  • The aim was to treat infrastructure like cattle, not pets, as the well-known “Pets vs Cattle” analogy puts it.
  • The comparison and testing of each service is interesting, so I’d like to take some time and read it carefully.

Multicluster Kubernetes with Service Mirroring

Thomas Rampelberg, Linkerd

ICYMI: CNCF Webinars

Weekly recap of CNCF member and project webinars that you might have missed.

CNCF Member Webinar: Getting Started with Containers and Kubernetes

Wayne Warren, Software Engineer @DigitalOcean

  • A webinar video for beginners on containers and Kubernetes by Wayne Warren, Software Engineer at DigitalOcean.
  • It’s very easy to understand, and I think it suits people who want to get a grasp of the keywords and the overall picture.

CNCF Member Webinar: Service Mess to Service Mesh

Kavya Pearlman, Cybersecurity Strategist @Wallarm, and Rob Richardson, Technical Evangelist @MemSQL

  • A webinar video on service mesh by Kavya Pearlman, Cybersecurity Strategist at Wallarm, and Rob Richardson, Technical Evangelist at MemSQL.
  • It compares Linkerd and Istio, includes demos, and so on.
  • If the introduction or the exchanges between the speakers feel long, you can skip them, or scroll down, check the slides via DOWNLOAD SLIDES at the lower right, and skip the parts you don't need.

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

Kubeflow 1.0, with Jeremy Lewi

Adam Glick and Craig Box, Kubernetes Podcast from Google

  • The guest is Jeremy Lewi, a Google software engineer, one of the founders of the Kubeflow project, and a core contributor.
  • It also covers a lot of the week's news that isn’t covered by KubeWeekly.

Starting work on an Operator Working Group

Alois Reitbauer

  • A thread at cncf.io, started by Alois Reitbauer of Dynatrace, looking for contributors to help start the Operator Working Group and take an active part in it.

Helm and Operators on OpenShift Part 2

Daniel Messer, Red Hat

  • Part 2 of a two-part blog series discussing Helm and operators as options for deploying software on OpenShift.

Google Cloud goes after the telco business with Anthos for Telecom and its Global Mobile Edge Cloud

Frederic Lardinois, TechCrunch

  • A TechCrunch article about Google Cloud's announcement of Anthos for Telecom. The focus is on expanding edge computing as a service for telecom carriers.
  • Google Cloud also announced Global Mobile Edge Cloud (GMEC). This gives operators access to more than 130 edge locations worldwide, as well as Google’s 20+ data center regions.
  • Looking at it optimistically, I’m looking forward to future collaborations and technological innovations. Since I have mainly worked with networks, this news was particularly interesting.

Kubernetes operators: Embedding operational expertise side by side with containerized applications

Scott McCarty, Enable Sysadmin

  • The subtitle is “Kubernetes isn’t complex, your business problem is. Learn how operators make it easy to run complex software at scale.” In other words, what is complicated is not Kubernetes but your business problem.
  • If I’m still agonizing over applying Kubernetes as a problem-solving tool, it probably means the problem to be solved is not yet well organized.

7 best practices: Building applications for containers and Kubernetes

Kevin Casey, The Enterprisers Project

  • The subtitle is “Let’s examine key considerations for building new applications specifically for containers and Kubernetes, according to cloud-native experts”.
  • This article focuses on important considerations for building new applications dedicated to containers and Kubernetes.
  • The article focuses on new applications because the so-called “greenfield” approach can be a better starting point for individuals and teams who are newcomers to orchestration.

Migrating applications to containers and Kubernetes: 5 best practices

Kevin Casey, The Enterprisers Project

  • The subtitle is “Let’s examine key considerations for migrating existing applications to containers and Kubernetes, according to experts”.
  • If you open the link and get a sense of déjà vu, you are right: it is by Kevin Casey, the author of the article above. Kevin, turning out articles this quickly is awesome.
  • This time, he explains five best practices for migrating apps to containers and Kubernetes over two pages in an easy-to-understand manner.

Kubernetes wallpapers

Daniel Weibel

  • Free Kubernetes wallpapers for PC. Now you can enjoy the feeling of summoning Kubernetes as a magician, along with an assortment of sweet Kubernetes tools.

Webinar Registration

You can check recorded and upcoming webinars here. The following were listed as upcoming CNCF webinars at that time.

What’s New in Linkerd 2.7
Linkerd team
Project webinar
March 6, 2020 10:00 AM Pacific Time

Use Open Source, Bare Metal, & 5G to achieve autonomous drone delivery!
Cody Hill, Field CTO @Packet
Member webinar
Mar 10, 2020 10:00 AM Pacific Time (US and Canada)

Kubernetes Security Best Practices for DevOps
Connor Gorman, Principal Engineer @StackRox
Member webinar
March 11, 2020 10:00 AM Pacific Time

Immutable Infrastructure for your Kubernetes Host Environment
Timothy Gerla, CEO @Talos Systems
Member webinar
March 12, 2020 9:00 AM Pacific Time

Welcome to CloudLand! An Illustrated Intro to the Cloud Native Landscape
Kaslin Fields, Developer Advocate @Google
Ambassador webinar
March 13, 2020 10:00 AM Pacific Time

Democratizing analytics with cloud native data warehouses on Kubernetes
Robert Hodges, CEO @Altinity
Vladislav Klimenko, Senior Software Engineer @Altinity
Member webinar
March 18, 2020 10:00 AM Pacific Time

Small Is Not Always Beautiful — Moving Enterprise Applications to the Cloud
Paul Jenkins, Product Manager @Oracle Cloud Infrastructure (OCI) Cloud Native Services
Tony Vertenten, co-founder and CTO @Intris
Member webinar
March 19, 2020 9:00 AM Pacific Time

How to migrate a MySQL Database to Vitess
Liz van Dijk, @PlanetScale
Project webinar
March 20, 2020 10:00 AM Pacific Time

Argo CD, Flux CD and the GitOps Revolution
Jay Pipes, Principal Open Source Engineer @Amazon Web Services
Member webinar
March 24, 2020 10:00 AM Pacific Time

Lowering the Barrier to Kubernetes Proficiency — Navigating the Stormy Seas of Information
Chris Black, Sr. Solutions Engineer @CircleCI
Member webinar
March 25, 2020 10:00 AM Pacific Time

Best Practices for Deploying a Service Mesh in Production: From Technology to Teams
Buoyant
Member webinar
April 8, 2020 10:00 AM Pacific Time

Kubernetes 1.18
Kubernetes team
Project webinar
April 23, 2020 9:00 AM Pacific Time

Pivoting Your Pipeline from Legacy to Cloud Native
Tracy Ragan, CEO of DeployHub and CDF Board Member
Member webinar
June 30, 2020 10:00 AM Pacific Time

How about those articles? Did any of them catch your interest?

Actually, there is some content I cannot fully digest at this stage, so I’ll use this aide-memoire and these links to catch up myself too.

Bye now!!

Yoshiki Fujiwara

・Cloud Solutions Architect - AWS@NetApp in Tokyo, Japan. #AWS Certified Solution Architect&DevOps Professional, #Kubernetes, ・Opinions are my own.