SRE / DevOps / Kubernetes Weekly Collection #91 (Week 43, 2021)
- In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, and add some comments as an aide-memoire along with useful links.
- Actually, I have already published the same content on my Japanese blog and am catching up in English in this series.
- I hope it serves as a useful reference for people browsing this kind of information.
DEVOPS WEEKLY ISSUE #565 October 24th, 2021
News
- It covers three articles recapping KubeCon + CloudNativeCon NA 2021.
- The title of the first article in the above link is “KubeCon NA 2021 Key Takeaways: DevX, Security, and Community”. The explanation is based on the tweets of the author and other participants.
- The title of the second article is “KubeCon 2021 Los Angeles Wrapup”. It looks back on the event through tweets.
- The title of the third article is “KubeCon 2021 Top 3 Announcements: APIClarity, HashiCorp Waypoint, and Dell EMC CSM”. The author introduces three new products that stood out for their high potential to push back the roadblocks currently slowing down application modernization.
- The title is “INFRASTRUCTURE IN YOUR SOFTWARE PACKAGES”.
- By detailing the current situation and assessing how software delivery has evolved over time, it analyzes what the future of shipping infrastructure alongside software will look like.
- The title is “Diablo II: Resurrected Outages: An explanation, how we’ve been working on it, and how we’re moving forward”.
- It explains the causes of the outages that have resulted from multiple server issues since the launch of “Diablo II: Resurrected”, and the steps taken by the team in charge, in order to provide some transparency. It also provides insight into how the team is moving forward.
- The title is “Iterating on how we do NFS at Wikimedia Cloud Services”.
- As the editor commented above, Wikimedia’s cloud services team reviewed how it runs NFS and shares the improvements it made.
- The title is “What’s in a hostname?”.
- It dives deep into hostnames, checking them against the relevant RFCs.
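As an aide-memoire for myself, here is a minimal sketch of the hostname syntax rules as I understand them from RFC 952 and RFC 1123: labels of 1 to 63 letters, digits, or hyphens, no hyphen at either end of a label, and a commonly cited limit of 253 characters for the whole name. It is not taken from the article, and the function name `is_valid_hostname` is my own.

```python
import re

# One label: 1-63 letters, digits, or hyphens, with no hyphen at the
# start or end (RFC 952 as relaxed by RFC 1123, which allows a leading digit).
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name: str) -> bool:
    """Return True if `name` is a syntactically valid hostname."""
    # A single trailing dot (fully qualified form) is allowed and ignored.
    if name.endswith("."):
        name = name[:-1]
    # 253 is the commonly cited limit for the textual form of a name.
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))
```

Note that this only checks syntax; it says nothing about whether the name resolves, and names that work in DNS (for example ones containing underscores) can still fail this hostname check.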
- The title is “Exploring Kubernetes Operator Pattern”.
- It takes a closer look at the Operator pattern, using as many images as possible to show which Kubernetes parts are involved in implementing an Operator and why Operators feel like “first-class Kubernetes citizens”.
- It explains that the Kubernetes API is probably the main driver of Kubernetes extensibility.
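To remind myself what an Operator does at its core, here is a minimal sketch of a reconcile step: compare the desired state declared in a custom resource with the observed state, and work out the actions that close the gap. This is plain Python with no Kubernetes client, and the names `Desired` and `reconcile` and the pod naming scheme are hypothetical, not from the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Desired:
    """Hypothetical custom resource spec: how many replicas the user wants."""
    replicas: int


def reconcile(desired: Desired, running: List[str]) -> List[str]:
    """Return the actions that move the observed state toward the desired one."""
    diff = desired.replicas - len(running)
    if diff > 0:
        # Too few pods: create the missing ones.
        return [f"create pod-{len(running) + i}" for i in range(diff)]
    if diff < 0:
        # Too many pods: delete the surplus from the end of the list.
        return [f"delete {name}" for name in running[desired.replicas:]]
    # Already converged: nothing to do.
    return []
```

A real Operator would run a step like this every time the Kubernetes API reports a change to the resources it watches, which is why the API is so central to the pattern.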
A post on introducing a production readiness review process, in particular in smaller teams.
- The title is “How we’re building a production readiness review process at Grafana Labs”.
- It describes the Production Readiness Review (PRR) process and some of the best practices developed during that process.
- PRR is a process that started at Google and is described in the company’s well-known SRE book as the first step in SRE’s efforts.
Tools
- The GitHub page of “hcltm”, which provides a DevOps-first approach to documenting system threat models, with a focus on the following goals:
○ Simple text-file format
○ Simple cli-driven user experience
○ Integration into version control systems (VCS)
- As mentioned above, the GitHub page of “Snowcat”, a tool that collects and analyzes the configuration of Istio clusters and audits it for potential violations of security best practices.
- Click here for the GitHub page.
SRE Weekly Issue #293 October 24th, 2021
Articles
The Downside of Hospitals Becoming “Highly Reliable”
It’s one thing to say you accept call-outs of unsafe situations — it’s another to actually do it. This cardiac surgeon shares what it’s like when high reliability organizations get it wrong.
Robert Poston, MD
- Regarding hospitals that aim to be high reliability organizations (HROs), the author says, “A lack of transparency and passion leaves them with a series of well packaged ideas that end up looking like high reliability but never able to operate like one.”
- Article dated November 6, 2019.
The game has been a victim of its own success, and the developers have had to put in quite a lot of work to deal with the load.
PezRadar — Blizzard
- Since it is covered in DEVOPS WEEKLY ISSUE #565 above, I will skip it.
An Introduction to Incident Response Roles
This includes some lesser-known roles like Social Media Lead, Legal/Compliance Lead, and Partner Lead.
JJ Tang — Rootly
This article is published by my sponsor, Rootly, but their sponsorship did not influence its inclusion in this issue.
- It explains, in the following sections, how to define incident response roles in order to build a team that works as effectively and efficiently as possible.
○ What is an incident response team?
○ Structuring incident response roles
○ Other potential incident response roles
○ Conclusion: The best incident response team is a flexible team
There are a couple of great sections in this article, including “blameless” retrospectives that aren’t actually blameless, and being judicious in which remediation actions you take.
Chris Evans — incident.io
- As the title suggests, it explains the pitfalls of postmortems in the following sections.
○ When blameless postmortems actually aren’t
○ Incidents are always going to happen again
○ Take time before you commit to all the actions
○ Incidents as a process, not an artifact
The danger of hidden functional roles
I love the idea that chaos monkey could actually be propping your infrastructure up. Oops.
Lorin Hochstein
- The opening story, about something that unintentionally played the role of a family alarm clock, and the way it connects to Chaos Monkey in the latter half are both good. I had never considered the possibility that Chaos Monkey, by terminating instances, was replacing them before a problem could occur.
I have to say, I’m really liking this DNS series.
Jan Schaumann
- Since it is covered in DEVOPS WEEKLY ISSUE #565 above, I will skip it.
Crew member yelled ‘cold gun’ as he handed Alec Baldwin prop weapon, court document shows
What? Why the heck am I including this here?
First, let’s all keep in mind that this situation is still very much unfolding, and not much is concretely known about what happened. It’s also emotionally fraught, especially for the victims and their families, and my heart goes out to them.
The thing that caught my eye about this article is that this looks like a classic complex system failure. There’s so much at play that led to this horrible accident, as outlined in this article and others, like this one (Julia Conley, Salon).
Aya Elamroussi, Chloe Melas and Claudia Dominguez — CNN
- At first glance, I thought, “Why is this article here?” As mentioned in the editor’s comment above, it was picked up because it looks like a classic complex system failure.
KubeWeekly #281 October 29th, 2021
The Headlines
Editor’s pick of the highlights from the past week.
Kubernetes Podcast from Google: Jasmine James, KubeCon + CloudNativeCon co-chair
Jasmine James is an Engineering Manager within the Engineering Effectiveness organization at Twitter, focused on their internal developer experience. She was also the co-chair of the recent KubeCon + CloudNativeCon. Jasmine talks about the events she’s led and the ones to come, and her feelings about being in a room in front of other people — up to 3,000 of them — for the first time in a long while.
- The Kubernetes Podcast, run by Google employees. This time the host is Craig Box and the guest host is Jimmy Moore.
- Jasmine James, who is the Engineering Manager of Twitter’s Engineering Effectiveness organization and co-chair of KubeCon + CloudNativeCon NA 2021, is invited as a guest.
- The topics that interested me in the news of the week are as follows.
○ Google Cloud Next:
● BigQuery Omni is GA
● Managed Service for Prometheus
○ KubeCon + CloudNativeCon
● Cilium joins the CNCF
● Cloud Native security microsurvey results
○ Kubernetes documentary trailer
ICYMI: CNCF online programs this week
A weekly summary of CNCF online programs from this week.
Securing your workload communications with Open Service Mesh
Phillip Gibson, Microsoft
- An approximately 46-minute session that introduces the latest integrations and techniques for enhancing workload communication using Open Service Mesh.
Introducing Kubescape — open-source tool to test Kubernetes deployment
Amir Kaushansky, ARMO
- An approximately 50-minute session that explains how to operate Kubescape, its supported frameworks, key features, and CI/CD integration.
How to design a multi-cloud deployment
Dave Blakely, Snapt
- An approximately 40-minute session that explains the purpose of migrating to multi-cloud, how to select a cloud provider, how to deploy to multi-cloud, and how to keep multi-cloud secure.
Project Calico network policies
Nigel Douglas, Tigera
- An approximately 41-minute session that explains the content of the title with the following points.
○ How does Project Calico enable network policies in K8s?
○ How to implement basics?
○ Creating and managing policies in your clusters
Abubakar Siddiq Ango, GitLab
- An approximately 30-minute session explaining GitOps, its use cases, and if/when you need GitOps.
The Technical
Tutorials, tools, and more that take you on a deep dive into the code.
What you need to know about Kubernetes Network Policy
Mike Calizo, Red Hat
- Kubernetes NetworkPolicy is explained in the following sections, with example YAML manifests.
○ The NetworkPolicy concept
○ Applying a network policy
○ NetworkPolicy limitations
○ Summary
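For my own reference, here is a minimal sketch of what such a policy looks like, built as a Python dictionary and printed as JSON rather than YAML. The field names follow the `networking.k8s.io/v1` NetworkPolicy API; the policy name, namespace, labels, and port are hypothetical values I made up, and the helper `allow_ingress_policy` is not from the article.

```python
import json


def allow_ingress_policy(name, namespace, target_labels, client_labels, port):
    """Build a NetworkPolicy that lets only `client_labels` pods reach `target_labels` pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The pods this policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Only pods with these labels may connect...
                    "from": [{"podSelector": {"matchLabels": client_labels}}],
                    # ...and only on this TCP port.
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical example: only "api" pods may reach "db" pods on port 5432.
policy = allow_ingress_policy("db-allow-api", "demo", {"app": "db"}, {"app": "api"}, 5432)
print(json.dumps(policy, indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be saved to a file and applied with `kubectl apply -f`. As the article's limitations section points out, a policy like this only takes effect if the cluster's network plugin enforces NetworkPolicy.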
The life of an API gateway request (part 1)
Enrique García Cota, Kong
- Part 1 of a series of articles that discusses how Kong Gateway handles requests by breaking the abstraction space into the following four layers. A video of about 13 minutes is embedded.
○ Infrastructure
○ Nodes
○ Phases
○ Plugins
Optimizing Kubernetes applications with Kubecost and Spinnaker
Alex Thilen, Kubecost
- The content of the title is explained with a processing flow and UI screenshots. The following two videos are embedded.
○ Demo of Kubecost + Spinnaker integration in action
○ Spinnaker Workshop: Cost Optimization with Kubecost’s founders
Announcing HAProxy Kubernetes Ingress Controller 1.7
Ivan Matmati & Zlatko Bratkovic, HAProxy
- The changes in line with the release of version 1.7 of HAProxy Kubernetes Ingress Controller are introduced in detail in the following points.
○ Custom Resource Definitions
○ CRD Examples
○ Distribution of connections to services/pods
○ New ALPN option
○ Implementation specific path type in ingress rules
○ Multiarch Support
○ s6 Init system
○ Nightly builds
○ External mode
○ Contributions
○ Conclusion
Connecting services to Kubernetes clusters with inlets, VPC Peering and direct uplinks
Alex Ellis, OpenFaaS Ltd.
- It explains how to connect services to Kubernetes clusters using Inlets, VPC Peering, and direct uplinks.
Transitioning from Monolith to Microservices
Michael Bogan, Dev Spotlight
- I like the structure of this article: before introducing the transition to microservices as the title suggests, it opens with the following points under “You might not need microservices architecture if …”.
○ You’re not having trouble scaling.
○ Your monolithic architecture is already flexible enough to meet market demands.
○ You’re not having issues with deploying your application.
Securing a Kubernetes pod with Regula and Open Policy Agent
Becki Lee, Fugue
- It shows how to run Regula against a Kubernetes manifest to detect insecure pods, and then explains how to secure them.
Structure testing for Docker containers
Tomas Fernandez, Semaphore CI
- As a way to test Docker containers before deploying, it introduces Google’s open source container testing tool “Container Structure Tests”.
Kustomize tutorial: Creating a Kubernetes app out of multiple pieces
Nick Chase, Mirantis
- The content of the title is explained in the following items.
○ What is Kustomize?
○ Benefits of Using Kustomize
○ Installing Kustomize
○ Combining Specs
○ Managing Multiple Directories
○ Changing Parameters for a Component Using Kustomize Overlays
○ Creating a Kustomize Patch
○ Using Kubectl with Kustomize
○ Example: Kustomize Secret Generator
○ Conclusion
Kube-fledged: Cache container images in Kubernetes
Senthil Raja Chermapandian, Ericsson
- It explains how to use the open source project “kube-fledged” to build and manage a cache of container images in a Kubernetes cluster.
Kubernetes logging in production
Kentaro Wakayama
- The content of the title is explained in the following structure. The points are very well organized and easy to follow.
○ Logging Architectures
○ Logging Patterns
○ Pros and Cons
○ Putting Theory into Practice
○ Conclusion
How to develop a custom provider in Terraform
Saravanan Gnanaguru, InfraCloud Technologies
- This article is intended for Terraform users who have a basic knowledge of Terraform and how to use it and are likely to develop custom Terraform providers.
Database security best practices on Kubernetes
Jonathan S. Katz, Crunchy Data
- The content of the title is explained in the following items.
○ Run as an Unprivileged User
○ Encrypt your Data
○ Credential Management
○ Keep Database Software Up-to-Date
○ Follow Configuration Best Practices
○ Limit Where You Can Write
○ Securing The “Weakest Link”
○ Conclusion
How Linkerd retries HTTP requests with bodies
Eliza Weisman, Linkerd
- It describes how Linkerd proxies reduce copying and allocation to minimize the performance overhead of buffering request bodies, how the proxy determines which requests can be retried, and some edge cases to consider.
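To help myself picture the idea, here is a minimal sketch of retrying a request whose body is a stream: the chunks are remembered as they are read so that they can be replayed, and the request stops being retryable once the body outgrows a size cap. This is my own simplification, not Linkerd's actual implementation (which is written in Rust); the names `ReplayBody` and `send_with_retry`, the `send` callback, and the 64 KB default cap are assumptions for illustration.

```python
from typing import Callable, Iterable, Iterator, List


class ReplayBody:
    """Wrap a streaming body and remember its chunks while they fit in `limit` bytes."""

    def __init__(self, source: Iterable[bytes], limit: int) -> None:
        self._source = iter(source)
        self._limit = limit
        self._chunks: List[bytes] = []
        self._size = 0
        self.replayable = True

    def __iter__(self) -> Iterator[bytes]:
        # First replay whatever earlier attempts already read...
        yield from list(self._chunks)
        # ...then keep reading from the original stream.
        for chunk in self._source:
            self._size += len(chunk)
            if self._size <= self._limit:
                self._chunks.append(chunk)
            else:
                # Too large to keep: drop the buffer and give up on retries.
                self.replayable = False
                self._chunks.clear()
            yield chunk


def send_with_retry(
    body: Iterable[bytes],
    send: Callable[[Iterator[bytes]], bool],
    limit: int = 64 * 1024,  # assumed cap for this sketch
    max_attempts: int = 3,
) -> bool:
    """Call `send` until it succeeds, replaying the body only while it is buffered."""
    replay = ReplayBody(body, limit)
    for attempt in range(max_attempts):
        if attempt > 0 and not replay.replayable:
            return False
        if send(iter(replay)):
            return True
    return False
```

Buffering lazily, as the first attempt reads the body, avoids waiting for the whole body before sending anything, and the cap keeps memory use bounded for large uploads.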
The Editorial
Articles, announcements, and more that give you a high-level overview of challenges and features.
Kubernetes co-founder Joe Beda interview
euro interview
- Since it was covered in last week’s DEVOPS WEEKLY ISSUE #564, I will skip it.
Kubernetes cost management and analysis guide
Kasper Siig, CloudForecast
- It examines the main reasons why it’s so difficult to manage costs with Kubernetes. And as a way to significantly improve cost management, it shows you how to use the AWS Pricing Calculator to estimate the costs associated with running a workload on a custom Kubernetes cluster compared to running an EKS cluster.
I attended Kubecon 2021 in-person, here are my top six takeaways
Amanda Mitchell, Chronosphere
- The author who participated in KubeCon + CloudNativeCon NA 2021 explains the following 6 takeaways.
○ 1) A green light for more (safe) in-person events
○ 2) Quantity isn’t everything
○ 3) KubeCon 21 felt like old times (aka two years ago)
○ 4) Love notes and theCube
○ 5) Observability and other key themes
○ 6) Inclusivity themes abound at KubeCon 21
KaaS, KPaaS & CaaS: Explained and compared
Lars Larsson, Elastisys
- It compares managed services for modern, containerized applications.
Alkin Tezuysal, Vitess
- An overview article accompanying the release of Vitess 12. See the release notes for details.
Upcoming CNCF Online Programs
Please note that no Online Programs are scheduled for this upcoming week. Check out our full playlist of content on the button below!
Visit our Online Programs playlist on YouTube.
Learn more about CNCF Online Programs
How about those articles? Did any of them interest you?
Actually, there is some content I have not been able to digest at this stage, so I will use this aide-memoire and these links to catch up myself, too.
Bye now!!