SRE / DevOps / Kubernetes Weekly Collection #59 (Week 11, 2021)

Yoshiki Fujiwara
Mar 22, 2021
  • In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, adding brief comments and useful links as an aide-memoire.
  • I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a reference for people browsing for this kind of information.

DEVOPS WEEKLY ISSUE #533 March 14th, 2021
SRE Weekly Issue #261 March 14th, 2021
KubeWeekly #255 March 19th, 2021

DEVOPS WEEKLY ISSUE #533 March 14th, 2021

News

Infrastructure and operations. Two words that folks typically consider as a moment in time, rather than as something more fundamental. A good post on the changes in infrastructure, and in how we operate it, and why change is always part of a career in operations.

  • The title is “The Future of Ops Careers”.
  • The post recommends watching “Lambda: A Serverless Musical” (I found the video interesting) and then works through its topic under the following headings.
    ○ Where Does Ops Fit, Anyway?
    ○ What Is Infrastructure?
    ○ *-As-a-Service Is Really Just Code for “Outsourcing”
    ○ How to Outsource Things Well
    ○ What This Means For Operationally Minded Engineers

A look at the scale of secrets and sensitive data leaking into source code repositories, and where it’s most commonly found across different types of files.

  • The title is “File types that most commonly contain sensitive information”.
  • It reviews data collected by the security company GitGuardian in 2020, explains how to prevent secret leaks, and lists the following 10 file extensions in which secrets are leaked most frequently.
  1. py — Python files
  2. js — JavaScript
  3. env — Environment files
  4. json — JSON files
  5. properties — Properties files
  6. pem — PEM files
  7. php — PHP files
  8. xml — XML files
  9. yml & yaml — YAML files
  10. ts — TypeScript files
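
As a rough illustration of how the ranking above could be put to use, the following Python sketch walks a directory tree and lists the files whose extension appears in the top ten, so that they can be reviewed first. The function name `find_risky_files` and the idea of prioritizing a review by extension are my own assumptions, not something taken from the GitGuardian data. A real secret scanner inspects file contents rather than file names.

```python
from pathlib import Path

# Extensions from the list above (file types where secrets leaked most often).
RISKY_EXTENSIONS = {
    ".py", ".js", ".env", ".json", ".properties",
    ".pem", ".php", ".xml", ".yml", ".yaml", ".ts",
}


def find_risky_files(root):
    """Return sorted relative paths under root whose extension is in the list.

    Hypothetical helper for illustration only; it looks at names, not contents.
    """
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # pathlib reports no suffix for a bare ".env" file, so check the name too.
        if path.suffix.lower() in RISKY_EXTENSIONS or path.name == ".env":
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Calling `find_risky_files(".")` from the top of a repository gives a first list of candidates to check by hand or to feed into a proper scanner.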

The reason I’m a fan of Open Policy Agent is just how general purpose it is. Here’s a great example. Using OPA itself to lint OPA rego policies.

  • The title is “LINTING REGO WITH … REGO!”.
  • The title and the editor’s comment above say it all. I feel some pressure to learn Rego.

A detailed overview of different open source chaos engineering tools for Kubernetes.

  • The title is “Open Source solutions for chaos engineering in Kubernetes”.
  • It introduces a little history of chaos engineering and the following tools.
  1. kube-monkey
  2. chaoskube
  3. Chaos Mesh
  4. Litmus Chaos
  5. Chaos Toolkit
  6. KubeInvaders (and similar projects)
    ○ Kube DOOM
  7. Other tools
    ○ PowerfulSeal
    ○ Pod-reaper
    ○ kube-entropy
    ○ Fabric8 Chaos Monkey
    ○ Kubernetes by Gremlin
    ○ Mangle by VMware

Sigstore is a new project aiming to lower the barrier to entry to using signing to ensure the integrity of software releases. At the prototype stage but interesting to track progress.

An argument for using Mage (a Go build tool) in place of Make, along with a basic introduction to Mage.

  • The title is “Mage is My Favorite Make”.
  • It explains what the author likes about “Mage”, their favorite build tool.

Lots of practical tips for running Vault on Kubernetes. High availability, end-to-end encryption and more.

  • The title is “5 best practices to get to production readiness with Hashicorp Vault in Kubernetes”.
  • As the title suggests, it describes the following five best practices to help you deploy, run, and configure your Vault server securely on Kubernetes.
  1. Initialize and bootstrap a Vault server
  2. Run Vault in isolation
  3. Implement end-to-end encryption
  4. Ensure traffic is routed to the active server
  5. Configure and manage Vault for tenants with Terraform

Tools

Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a git repository. Connect to Dolt just like any MySQL database to run queries or update the data using SQL commands.

  • The GitHub page for “Dolt”, the SQL database with the features described in the editor’s comment above.

PacBot is a platform for continuous compliance monitoring, compliance reporting and security automation for the cloud. Custom rules in Java, as well as a plugin model for ingesting other data sources.

  • A web page for PacBot (Policy as Code Bot), a platform for continuous cloud compliance monitoring, compliance reporting, and security automation.
  • Click here for the GitHub page.

Mockintosh is a framework for mocking microservices, with support for performance testing, asynchronous communications, multiple-services and more.

  • The link above points to the post titled “Open Source Microservice Mocking: Introducing Mockintosh”.
  • Click here for Mockintosh’s web page.
  • A tool that provides regular HTTP mock service functionality with a small resource footprint and is suitable for microservices applications.

SRE Weekly Issue #261 March 14th, 2021

Articles

What Do Fighter Pilots and Incident Management Have in Common?

I find it really refreshing that fighter pilots have a retrospective about every single mission, successful or not. There’s always something to learn.

Jessica Abelson — Transposit

  • There is much to learn from fighter pilots, such as why retrospectives matter, what an ideal one looks like, and the influence they have.

Incident Response at Heroku

Heroku applies the Incident Management System, designating an Incident Commander who keeps the incident on track and oversees communications, both external and internal.

Guillaume Winter — Heroku

  • A 2020 post that updates a 2014 post on the same theme. In addition to the editor’s comment above, I got the impression that the document itself has become more readable.

How Khan Academy Successfully Handled 2.5x Traffic in a Week

This story is becoming common: Khan had a sudden influx of traffic when pandemic lockdowns began. Their strategy involved the use of the cloud and a CDN.

Marta Kosarchyn — Khan Academy
Full disclosure: Fastly, my employer is mentioned.

  • Along with the content of the title, it describes aspects of the architecture that play an important role in the scalability of the site.

Under the Hood: Ensuring Site Reliability

Here’s a great summary of how Squarespace does SRE.

Franklin Angulo — Squarespace

  • The author states the aim of the post as follows.
    ○ Read on to learn more (and nerd out) about all that goes into making sure every Squarespace site is reliable and running smoothly.

[Increment: Reliability] Reliability at scale

Leaders at Deliveroo, DigitalOcean, Fastly, and Headspace share how their organizations think about reliability and resiliency and their advice to engineering orgs embarking on reliability journeys.

The leaders each answer a series of questions about how their organization handles reliability, giving an interesting compare-and-contrast overview.

Increment

Full disclosure: Fastly is my employer.

  • As the excerpt above says, the post summarizes the answers given by the leaders of each company to the following questions.
    ○ How does your organization think about resilience and reliability writ large?
    ○ Does your organization have dedicated reliability engineers?
    ○ What measures or metrics do you use to capture investment in reliability?
    ○ When it’s been a while since your last incident, how do you keep your teams sharp and ensure continued investment in reliability?
    ○ How do you think about and/or fund projects to address low-probability but high-risk events?
    ○ What would you share with rapidly growing tech companies to help them on their own reliability journeys?
  • The bottom line is as follows.
    ○ The bottom line: Have transparency and clear communication around decisions and trade-offs, while having the flexibility to align with business priorities and meet hard deadlines.

[Increment: Reliability] Case study: Resilience as adaptability at Freshworks

Using a disaster plan created after a devastating hurricane, Freshworks survived and thrived during the pandemic, delivering a major new product by its pre-pandemic deadline.

Ipsita Agarwal — Increment

  • An interesting case study, including the situation of COVID-19 in India.

What Is a Canary Deployment?

This one explains what a canary deployment is, how it can help you, and how canary deployments differ from blue/green deployments.

LaunchDarkly

  • As the title suggests, it explains canary deployments, with figures, under the following headings.
    ○ Benefits of a canary release
    ○ Canary deployments give you more control and confidence over releases
    ○ Visual Example: Canary deployment followed by a progressive rollout based on geography
    ○ Blue-green deployments and what a canary release is not
    ○ Can you do Kubernetes canary deployments?
    ○ An effective deployment strategy
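
To make the idea of a canary release a little more concrete, here is a minimal Python sketch of one common way to send a fixed percentage of users to the new version. It is my own illustration, not code from the LaunchDarkly article, and the function names `in_canary` and `choose_version` are hypothetical. It assumes each user has a stable string ID.

```python
import hashlib


def in_canary(user_id, percent):
    """Return True if this user should be served by the canary release."""
    # Hash the user ID so the same user always lands in the same bucket (0-99).
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    # Users in buckets below the threshold get the canary.
    return bucket < percent


def choose_version(user_id, percent):
    """Pick which release serves the user for a given canary percentage."""
    return "canary" if in_canary(user_id, percent) else "stable"
```

Because the bucket depends only on the user ID, raising `percent` from 10 to 50 keeps the original canary users in the canary group and adds new ones, which is what a progressive rollout needs.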

How to Build an SRE Team with a Growth Mindset

This article explains the meaning of a growth mindset and shows how it applies to SRE.

Emily Arnott — Blameless

  • The following is explained with a comparison table of Growth mindset and Fixed mindset.
    ○ What a growth mindset is and why it helps your SRE team
    ○ How to hire for a growth mindset
    ○ How to develop people into SREs with a growth mindset
    ○ How a blameless culture empowers a growth mindset

Outages

  • Fastly
    Full disclosure: Fastly is my employer.
  • OVH Cloud
    This week, there was a major fire at an OVH Cloud datacenter. As a result, Rust (an MMOG) permanently lost data, according to its creators.
  • All domains containing “t.co” in Russia
    It appears that Russia tried to impair access to Twitter’s URL-shortening domain t.co, but their pattern-matching was overzealous and affected any domain that contained “t.co” (think reddit.com, microsoft.com, and many others).
  • Dyn
    Dyn had a DNS outage. I noted impact to Heroku, but I didn’t see any other related outage postings.
  • Chef
  • GitHub
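
The “t.co” item in the list above is a nice example of how a loose pattern can match far more than intended. The Python sketch below contrasts a plain substring check with a stricter domain check. It is only an illustration under my own assumptions: the actual filtering mechanism used in Russia is not described in the source, and the function names are hypothetical.

```python
def naive_match(domain, blocked="t.co"):
    """Overzealous check: true for any domain that merely contains the string."""
    return blocked in domain


def exact_match(domain, blocked="t.co"):
    """Stricter check: true only for the blocked domain or its subdomains."""
    domain = domain.lower().rstrip(".")
    return domain == blocked or domain.endswith("." + blocked)


for name in ["t.co", "reddit.com", "microsoft.com"]:
    print(name, naive_match(name), exact_match(name))
```

With the substring check all three names match, while the stricter check matches only t.co itself.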

KubeWeekly #255 March 19th, 2021

The Headlines

Editor’s pick of the highlights from the past week.

Statement from CNCF General Manager Priyanka Sharma on the unacceptable attacks against AAPI and Asian communities

CNCF

  • As the title says, CNCF GM Priyanka Sharma stresses the importance of solidarity and diversity, standing with AAPI and Asian colleagues in the face of rising racism and attacks on the global Asian community.

The technical

Tutorials, tools, and more that take you on a deep dive into the code.

Cloud Native CI/CD with Tekton — Laying The Foundation

Martin Heinz, IBM

  • As the title suggests, this is a series on building CI/CD with Tekton. This first installment covers deploying, installing, and customizing Tekton as the starting point of a cloud native CI/CD journey on Kubernetes.
  • The “Tekton CI/CD Kickstarter”, an environment where the author has put together all the resources, scripts, and files needed to use CI/CD with Tekton, can be found in this GitHub repository.

Cosign — Signed Container Images

Dan Lorenc, Google Cloud

  • It introduces “cosign”, a tool created by the author for signing container images.
  • Cosign is developed as part of the “sigstore” project featured in DEVOPS WEEKLY ISSUE #533.

Is Crossplane the Infrastructure LLVM?

Daniel Mangum, Crossplane

  • It explains Crossplane by comparing it with LLVM.

Migrate from Docker to Containerd in Kubernetes

Dennis Kruyt

  • It explains how to migrate the container runtime of a Kubernetes cluster from Docker (used through “dockershim”) to “containerd”.
  • Dockershim, the layer between the kubelet and Docker, has been deprecated since Kubernetes v1.20 and is slated for removal in a release after v1.22.

Kubernetes: What Are Endpoints

Arseny Zinchenko, ITNext

  • An article explaining one of the Kubernetes resources. It covers the “Endpoints” object that sits behind a “Service” and tracks the Pods (for example, those created by a “Deployment”) that receive its traffic.

Migrating from Docker Compose to Skaffold

Vic Iglesias, Google Cloud

  • It explains how to migrate a development workflow from Docker Compose to Skaffold and Minikube, starting from the background that motivated the move.

Open Source Solutions for Chaos Engineering in Kubernetes

Vasily marble breeze

  • Skipped here because it is already covered under DEVOPS WEEKLY ISSUE #533 above.

Prometheus Definitive Guide Part I — Metrics and Use Cases

Deepankur Singh Baliyan, Infracloud

  • It explains monitoring in general and Prometheus from a beginner’s point of view. The next post will explain the PromQL query language in detail.

How I Test OpenFaaS Changes with Kubernetes

Alex Ellis, OpenFaaS

  • OpenFaaS Founder Alex Ellis explains:
    ○ What is OpenFaaS
    ○ What is deployed when you install OpenFaaS
    ○ How to test OpenFaaS changes using Kubernetes
    ○ An overview of approaches that will help new and potential contributors to experiment quickly for the first time.

Creating a Single-node Kubernetes Cluster on Atomic Pi with kubeadm

Kristijan Mitevski, Mitevski Tech Blog

  • It explains the procedure and commands for building the environment described in the title.

Configure Your Kubernetes Ingress with Ingress Builder

Viewer Fadeyi, Jetstack

  • It introduces “Ingress Builder”, a tool developed by Jetstack for correctly configuring Ingress resources.

OpenFaaS and GKE Autopilot

Johan Siebens

  • It deploys OpenFaaS on a GKE Autopilot cluster and explains how the two work together.
  • The tools needed to follow the hands-on article are:
    ○ gcloud: the CLI tool to create and manage Google Cloud resources
    ○ arkade: portable Kubernetes marketplace
    ○ kubectl: the Kubernetes command-line tool, allows you to run commands against Kubernetes clusters
    ○ faas-cli: the official CLI for OpenFaaS
    ○ hey: (optionally) a tiny program that sends some load to a web application

ICYMI: CNCF online programs this week

A weekly summary of CNCF online programs from this week.

Data Protection in a Kubernetes Native World

Michael Cade, Kasten

  • It discusses seven key considerations for Kubernetes native backup and demonstrates its importance in implementing a cloud-native backup strategy to protect business-critical data on a developer-centric platform.

Your Own Kubernetes Castle

Adam Kozlowski, GrapeUp

  • It explains how to choose among the many options in the Kubernetes ecosystem, how those options differ, and how to assemble your own stack from them.
  • Each slide was easy to read and got its point across visually, so I want to learn from it.

Hacking Kubernetes

Ben Hirschberg, ARMO

  • It uses “Ezuri”, a memory loader, to break into a live Kubernetes (K8s) environment, load fileless malware that is not detected by common security tools, and steal SSL keys. It also explains what countermeasures can be taken.

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

Take the Puppet State of DevOps 2021 survey

Nigel Kersten, Puppet

Who Needs Open Policy Agent?

Christopher Tozzi, Fixate IO

  • The following points are explained along with the content of the title.
    ○ What Is Open Policy Agent?
    ○ What Problem Does OPA solve?
    ○ Do Enterprises Need OPA?
    ○ When Should You Adopt OPA?
    ○ Which Skills Do You Need to Adopt OPA?
    ○ Learn More about OPA

Kubernetes Autoscaling, Explained

Kevin Casey, Red Hat

  • It explains what Kubernetes autoscaling is and how this automation eliminates the need to manually provision and scale down resources as demand changes.
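
For readers who want a feel for how such scaling decisions are made, here is a small Python sketch of the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation: desired replicas = ceil(current replicas × current metric / target metric). It is a simplified illustration of my own, not code from the article. The real autoscaler also applies a tolerance, stabilization windows, and other safeguards, and the function name and the default minimum and maximum here are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA-style calculation (illustrative defaults, not Kubernetes')."""
    # Core ratio: scale the replica count by how far the metric is from its target.
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))
```

For example, with 4 replicas averaging 200m CPU against a 100m target, the function asks for 8 replicas, and when the load drops to 50m it asks for 2.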

Tinkerbell, with Gianluca Arbezzano

Craig Box, Kubernetes Podcast from Google

  • An episode of the Kubernetes Podcast from Google. The hosts this time are Craig Box and guest host Vic Iglesias (a Solutions Architect at Google Cloud and a maintainer of the Helm charts repository).
  • Gianluca Arbezzano, Principal Engineer at Equinix Metal and maintainer of “Tinkerbell”, is the guest.
  • The topics that interested me in the news of the week are as follows.
    ○ Fairwinds introduces Saffire
    ○ Announcing Private Clusters on Oracle Cloud Infrastructure (OCI) Container Engine for Kubernetes (OKE)
    ○ Linkerd 2.10

CD Foundation Announces Industry Initiative to Standardize Events from CI/CD Systems

  • It introduces “Events SIG”, the new Special Interest Group (SIG) for events from the Continuous Delivery Foundation (CDF).

The Evolution of Kubernetes Dashboard

Marcin Maciaszczyk & Sebastian Florek, Kubermatic

  • The following points are explained along with the content of the title.
    ○ How It All Began
    ○ Growing Up — The Big Migration
    ○ Where Are We Standing in 2021?
    ○ What’s Next
    ○ The Kubernetes Dashboard in Numbers
    ○ Join Us

Unironically Using Kubernetes for my Personal Blog

Marcus Buffett

  • The author, who uses Kubernetes on his blog and small side projects, explains why he likes Kubernetes.

Take the CNCF Microsurvey on Kubernetes at the Edge

  • It introduces a CNCF microsurvey on edge computing and Kubernetes.
  • This survey provides insights into Kubernetes IoT/Edge scenarios, challenges, and opportunities. The CNCF will summarize the data and share its insights at KubeCon + CloudNativeCon EU — Virtual 2021 and the co-located Kubernetes on Edge Day event on May 4.

Join CNCF and Docker for the experimental “Container Garage” series starting April 1st!

Ihor Dvoretskyi, CNCF

  • CNCF and Docker collaborate on an experiment called “Container Garage”. It focuses on a specific theme for each event, such as runtime, image, security, etc.
  • Docker Captains and CNCF Ambassadors will lead the planning and execution of Container Garage events. This includes recruiting speakers for lectures, demos and live panels.
  • The first event will take place on April 1st on the topic of container runtimes. Click here to register.

Upcoming CNCF Online Programs

Cloud Native Live: Crossplane — GitOps-based Infrastructure as Code through Kubernetes API
Viktor Farcic @CodeFresh

March 24, 2021 at 12pm PT

Automating SRE from “Hello World” to Enterprise Scale with Keptn
Jürgen Etzlstorfer & Andi Grabner @Dynatrace

March 25, 2021

Flux is Incubating + The Road Ahead
Stefan Prodan @Weaveworks

March 25, 2021

Securing Access to your Kubernetes Applications — Using Dex for Authentication and Role Based Access Control (RBAC) for Authorization
Deepika Dixit & Onkar Bhat @Kasten by Veeam

March 25, 2021

Scaling Monitoring at Databricks from Prometheus to M3
Martin Mao @Chronosphere

March 25, 2021

CNCF Online Programs Playlist on YouTube

Check out our playlist for more curated content you don’t want to miss! New content is added every Friday.

How did you find these articles? Did any of them catch your interest?

To be honest, there is some content I have not been able to digest yet, so I will use this aide-memoire and its links to catch up myself too.

Bye now!!

Yoshiki Fujiwara


・Cloud Solutions Architect - AWS@NetApp in Tokyo, Japan. #AWS Certified Solution Architect&DevOps Professional, #Kubernetes, ・Opinions are my own.