SRE / DevOps / Kubernetes Weekly Collection #46 (Week 51)

  • In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, and add some comments and useful links as an aide-memoire.

DEVOPS WEEKLY ISSUE #520 December 13th, 2020
SRE Weekly Issue #248 December 13th, 2020
KubeWeekly #244 December 18th, 2020

DEVOPS WEEKLY ISSUE #520 December 13th, 2020


Databases have limits that if you build a popular service and run it for a long time you’ll undoubtedly hit and need to plan for. This post talks about one such case, migrating a single table with 70 billion records and growing at more than 100 million rows a week.

  • The title is “The Boring Option”.
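The SRE Weekly entry further down notes that the limit in this case was an integer primary key column. As an aide-memoire, the runway arithmetic can be sketched as below. Only the growth rate of 100 million rows a week comes from the blurb; the function name, the column width and the current key value are hypothetical, since the digest does not state them.

```python
def weeks_until_exhausted(current_max_id: int, max_key_value: int, ids_per_week: int) -> int:
    """Whole weeks of inserts left before the key column runs out of values."""
    if ids_per_week <= 0:
        raise ValueError("ids_per_week must be positive")
    remaining = max_key_value - current_max_id
    # A column that is already past its limit has zero weeks left.
    return max(remaining, 0) // ids_per_week


INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer column

# Hypothetical current key value; the growth rate is the one quoted in the blurb.
runway = weeks_until_exhausted(
    current_max_id=1_500_000_000,
    max_key_value=INT32_MAX,
    ids_per_week=100_000_000,
)
```

With these assumed inputs the runway is only a handful of weeks, which is why a jump in traffic can pull such a deadline forward so sharply.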

I wrote a post this week for SecAdvent, an introduction to the topic of software bill of materials, applicable use cases and in particular looking at the CycloneDX set of specifications and tools.


Lots of teams have small home-grown monitoring services that sometimes see less testing and automation than the services they monitor. Sometimes changes to those services can lead to unexpected downside like with this interesting incident report.

  • The title is “It’s Just a Monitoring Change”.

Large organisations are rapidly changing how they work, adopting lots of devops practices and better integrating previously separate business units. This post summarises some of that towards a new operating model.

  • The title is “The New Operating Model Is Upon Us”.

Terraform, or other infrastructure as code tools, provide a programming language. But how often do we apply patterns and practices learned from other programming languages? This post takes us through a nice refactoring exercise to make the point.

  • The title is “Infrastructure-as-code-as-Software”.

Another post on applying structure and patterns to Terraform. This post looks at the roles and profiles pattern which evolved from the Puppet community and applying it to Terraform and Terragrunt.

  • The title is “The Role-Profiles Pattern Across Infrastructure as Code”.

How do we relate conversations about digital transformation at the business level to devops practices and to agile software development? This post takes a run at providing an answer, and discusses why this is relevant to leaders at different levels of an organisation.

  • The title is “Accelerating Digital Transformation: What Every CEO Needs to Know About Software Delivery Automation”.
  1. Software delivery success ultimately depends on decisions made by the CEO of an organization

Access control systems can be complex, and Kubernetes RBAC is no exception. This post covers some behind-the-scenes details as well as pitfalls to avoid.

  • The title is “Kubernetes RBAC Security Pitfalls”.

Lots of organisations deploy to Linux but develop on other platforms. Sometimes it’s useful to have a local Linux VM. This post covers how one open source project has been adopting multipass.

  • The title is “containerd development with multipass”.


Workplace culture often gets relegated because it’s so intangible, but it will make or break your Cloud Native transformation. Join Holly Cummins and Jamie Dobson for insights, conversations and of course, industry gossip. Sign up for Container Solutions’ last WhatTheFinar of the year: Tuesday 15th Dec, 11am CET.

  • The event “WTF Is Cloud Native Culture?” by Container Solutions was featured.


Bicep is a new and experimental declarative language which compiles down to Azure Resource Manager (ARM) templates.

  • The GitHub page of the Domain Specific Language (DSL) project “Bicep” that deploys Azure resources declaratively.

Localizer is a no-frills development tool for Kubernetes that aims to let you mainly ignore Kubernetes, instead proxying services so they appear as local services on the host.

  • The GitHub page of “localizer”, a new CLI tool for plain local development for developer environments using Kubernetes.

A handy small utility for any AWS admin. Provide an AWS IP address and digaws will return details like region, AWS service and more.

  • The GitHub page of “digaws”, a dig-like tool for AWS admins that looks up an AWS-owned IP address and displays its region, service and other information.
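To make the idea concrete, here is a minimal sketch of this kind of lookup. It assumes the data is shaped like the "prefixes" list in the ip-ranges.json file that AWS publishes; whether digaws works this way internally is an assumption on my part. The sample entries use documentation-only address ranges and are made up, and `lookup` is a hypothetical name.

```python
import ipaddress

# Made-up sample entries in the shape of AWS's published ip-ranges.json.
# The prefixes are reserved documentation ranges, not real AWS ranges.
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "EC2"},
        {"ip_prefix": "198.51.100.0/24", "region": "eu-west-1", "service": "S3"},
    ]
}


def lookup(ip: str, ranges: dict = SAMPLE_RANGES) -> list:
    """Return every sample prefix entry that contains the given IP address."""
    addr = ipaddress.ip_address(ip)
    matches = []
    for entry in ranges["prefixes"]:
        if addr in ipaddress.ip_network(entry["ip_prefix"]):
            matches.append(
                {
                    "prefix": entry["ip_prefix"],
                    "region": entry["region"],
                    "service": entry["service"],
                }
            )
    return matches
```

Calling `lookup("203.0.113.10")` returns the first sample entry, while an address outside both prefixes returns an empty list.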

SRE Weekly Issue #248 December 13th, 2020


SLOs That Lie — SRE Journal

It’s really easy to get an “uptime” SLO wrong, and a lying SLO can give you a false sense of security.

Piyush Verma — Last9

  • The post presents the following three options for measuring service downtime, touching on tools such as Prometheus and Operations (formerly known as Stackdriver) in its commentary.
    ○ Option 1: SDK (Measure at each caller)
    ○ Option 2: Uptime (actually Downtime) Monitors
    ○ Option 3: State-based Monitors
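Whichever option supplies the measurement, the arithmetic that turns measured downtime into an SLO verdict is the same. A minimal sketch, assuming a time-based availability SLO over a 30-day window; the function names and the 99.9% target are illustrative and do not come from the post.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime budget, in minutes, implied by an availability target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the downtime budget still unspent (negative once the SLO is breached)."""
    budget = allowed_downtime_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


# Illustrative numbers: a 99.9% target with 21.6 minutes of measured downtime.
budget = allowed_downtime_minutes(0.999)
remaining = budget_remaining(0.999, downtime_minutes=21.6)
```

A 99.9% target over 30 days leaves roughly 43 minutes of budget, so downtime that a monitor fails to see eats a large share of it unnoticed. That is one way an SLO ends up "lying".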

It’s Just a Monitoring Change

I love this quote. I feel like this is the “root cause” of every incident:

As for the underlying cause of the incident (or the “root cause” if you insist on using such language), that has to be the fact that our assumptions as teams or individuals are ultimately formed by our past experiences.

Oliver Leaver-Smith — Sky Betting & Gaming

  • Since it was covered in DEVOPS WEEKLY ISSUE #520 above, I will skip it here.

Complexity Has to Live Somewhere

I really love the concept of requisite complexity. This article has me thinking about a big project I’m working on in a new light.

Fred Hebert

The Boring Option

They expected to max out an integer primary key column sometime in 2021. Then the pandemic hit and their timetable suddenly accelerated along with their traffic.

Jeff Pollard — Awesome

  • I will skip it because it is covered in DEVOPS WEEKLY ISSUE #520 above.

Scary sysadmin Halloween stories

I shouldn’t enjoy reading these so much… got any of your own to share?

Dean Wilson

  • The post shares stories from “#talesfromtheNOC”, Gremlin’s recent Twitter hashtag challenge that invited people to share their scary sysadmin stories.

Borrow Expertise With Runbook Automation

The idea of borrowing expertise makes me think of Bainbridge’s Ironies of Automation.

Bath Walls — PagerDuty

  • In line with the title, the post covers the following four items.
    ○ What Is Runbook Automation?
    ○ Borrow Expertise From Your Experts
    ○ The Benefits of Automation
    ○ Learn More About Runbooks and Automation

Heroku Incident #2127 Follow-Up: Issues with starting new dynos

Heroku’s report explains how their service was impacted as a result of the big Amazon Kinesis outage a couple weeks back.


  • As mentioned above, the failure of Kinesis, a service from the upstream provider (AWS), affected Heroku users. Heroku takes the view that it should not have, and presents a remedy: a self-healing recovery plan rather than the manual recovery used this time.

Setting Business Goals with SLO

This primer focuses on ensuring that your SLOs actually match up with business objectives.

Irving Popovetsky — Honeycomb

  • It opens by noting that this is the “season for setting goals for 2021”, then gives practical examples of how to use (and not use) SLOs when setting annual goals, covering the following points.
    ○ Aligning business goals and engineering work
    ○ The common language of SLOs
    ○ Getting started with SLOs in the real world
    ○ Gathering data for your own SLOs
    ○ Setting your company goals this year


KubeWeekly #244 December 18th, 2020

The Headlines

Editor’s pick of the highlights from the past week.

Cloud Native Computing Foundation receives renewed $3 Million Cloud Credit Grant from Google Cloud

Today, CNCF announced that Google Cloud has recommitted $3 million for another year in cloud credits to maintain its support of the Kubernetes project. This grant is a continuation of Google Cloud’s $3 million per year investment in Kubernetes development and distribution, which started back in 2018. The grant has primarily gone to — and will continue to support — scalability testing and maintenance of the infrastructure required to run Kubernetes development, which is indispensable for ensuring Kubernetes remains battle-tested and enterprise-ready.

  • News of Google Cloud’s continued support for CNCF, with $3 million a year in cloud credits.

ICYMI: CNCF Webinars

You can view all CNCF recorded and upcoming webinars here.

CNCF Member Webinar: Reducing your Kubernetes Cloud Spend

Webb Brown, CEO @Kubecost Niko Kovacevic, Founding Engineer @Kubecost

  • The Kubecost team provides hands-on examples and best practices for reducing spending without sacrificing performance or reliability.

CNCF Member Webinar: Implementing automated managed k8s service

Mason Choi(Moonhyuk Choi), Senior Engineer @Samsung SDS Kangsub Song(Kangseop Song), Senior Engineer @Samsung SDS

  • The page says “This webinar has passed.” and there is no video. The English slides can be viewed on the linked page, so check them if you are interested.

CNCF Member Webinar: Argo: Real enterprise-scale with k8s

Al Kemner, Principal Software Engineer and Architect @New Relic Daniel Jimbel, Staff Engineer @New Relic Caleb Troughton, Product Manager, Telemetry Data Platform @NewRelic

  • It explains how to use ArgoCD and Rollouts to manage the canary deployment process in Kubernetes.

CNCF Member Webinar: Machine learning for K8s logs and metrics

Larry Lancaster, Founder and CTO @Zebrium

  • It describes machine learning techniques for logs and metrics and shows what they actually look like.

The Technical

Tutorials, tools, and more that take you on a deep dive into the code.

Deploying to OpenShift using GitHub Actions

Tim Etchells, OpenShift

  • It gives an overview of Red Hat Actions and some of the workflows they can simplify, and shows how to get your application up and running quickly and easily on OpenShift.

Create a Kubernetes Operator in Golang to automatically manage a simple, stateful application

Priyanka Jiandani, Red Hat

  • It demonstrates how to deploy a stateful application using a Kubernetes Operator.

How to use Kubernetes resource quotas

Mike Calizo, Red Hat

  • In line with the title, the post covers the following.
    ○ What are resource quotas?
    ○ Prerequisites
    ○ Set up a resource quota
    ○ Deploy the pods
    ○ Clean up
    ○ Planning your quotas
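As a memo on what a resource quota does, here is a simplified model of the admission check. This is not the Kubernetes implementation (the real check is enforced by the API server on ResourceQuota objects). The resource names `pods`, `requests.cpu` and `requests.memory` are real quota keys, while the `admit` function, the units (millicores and MiB) and the numbers are assumptions for illustration.

```python
def admit(hard: dict, used: dict, requested: dict):
    """Admit a pod only if every quota-tracked resource stays within its hard limit.

    Returns (admitted, usage_after). Usage is unchanged when the pod is rejected.
    """
    for name, amount in requested.items():
        if name in hard and used.get(name, 0) + amount > hard[name]:
            return False, used
    new_used = dict(used)
    for name, amount in requested.items():
        new_used[name] = new_used.get(name, 0) + amount
    return True, new_used


# Hypothetical namespace quota: CPU in millicores, memory in MiB.
hard = {"pods": 2, "requests.cpu": 1000, "requests.memory": 1024}
pod = {"pods": 1, "requests.cpu": 600, "requests.memory": 512}

first_ok, usage = admit(hard, {}, pod)       # fits within the quota
second_ok, usage = admit(hard, usage, pod)   # would need 1200 millicores of CPU
```

The second pod is rejected because its CPU request would push the namespace past the hard limit, even though the pod count and memory would still fit.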

K8Spin Operator: Kubernetes multi-tenant operator. Enables multi-tenant capabilities in your Kubernetes Cluster.

  • The GitHub page of “K8Spin Operator”, a Kubernetes multi-tenant operator. Its features are as follows.
    ○ Enable Multi-Tenant: Adds three new hierarchy concepts (Organizations, Tenants, and Spaces).
    ○ Secure and scalable cluster management delegation: Cluster admins create Organizations and then delegate access to them to users and groups.
    ○ Cluster budget management: Assigning resources in the organization definition makes it possible to understand how many resources are allocated to a user, team, or the whole company.

Portainer For Kubernetes

Saiyam Pathak, Civo

  • The YouTube video covers the following four points.
    ○ Motivation and vision behind Portainer
    ○ We explore Portainer Features and how can we use it
    ○ We discuss both CE and BE edition
    ○ We also talk about the community involvement

How to monitor multi-cloud Kubernetes with Prometheus and Grafana

Inlets blog

  • It aims to help readers understand how easy it is to connect services running in multiple isolated Kubernetes clusters that are distributed across cloud providers or running on-premises.

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

Akri, with Kate Goldenring

Adam Glick and Craig Box, Kubernetes Podcast from Google

The Level Up Hour (S1E19): Containers, data science and replication

Langdon White, Chris Short, and Matt Micene, Red Hat

  • A YouTube video in which the three people above discuss the topics in the title.

Kubernetes Clinic Spotlight on Tabitha Sable: Helping people level up

Kendall Miller, Fairwinds

  • They have Tabitha Sable as a guest and have an interview-style question and answer session.

The Cloud Native Landscape: The orchestration and management layer

Catherine Paganini and Jason Morgan, Buoyant

  • A series of articles focusing on explaining each category of Cloud Native Landscape to non-technical readers and engineers who are just starting out with cloud native.

Upcoming CNCF webinars

You can check recorded and upcoming webinars here. The following was posted under Upcoming CNCF webinars at the time of writing.

Thanks again for participating in CNCF webinars in 2020! Stay tuned for our expanded Online Programs in 2021.

What did you think of these articles? Did any of them catch your interest?

To be honest, there is some content here that I cannot digest yet, so I will use this aide-memoire and these links to catch up myself too.

Bye now!!

Yoshiki Fujiwara

An infra engineer in Tokyo, Japan. Grew up in Athens, Greece (1986–1992). #Network, #Kubernetes, #GCP, #Certified AWS SAP
