SRE / DevOps / Kubernetes Weekly Collection #68 (Week 20, 2021)
- In this blog post series, I collect the following three weekly mailing lists I subscribe to, and add some comments and useful links as an aide-memoire.
- I have already published the same content on my Japanese blog, and I am catching up in English in this series.
- I hope it serves as a reference for people browsing this kind of information.
DEVOPS WEEKLY ISSUE #542 May 16th, 2021
SRE Weekly Issue #270 May 16th, 2021
KubeWeekly #262 May 21st, 2021
DEVOPS WEEKLY ISSUE #542 May 16th, 2021
News
- The title is “Metric Display Standards”.
- It describes standard object models and data presentations that provide a reliable level of utility, ease of use, and accessibility for all metrics in every use case.
- It makes effective use of images and diagrams that aid understanding, so I saved it as good teaching material.
- The title is “Developer portals are a super power”.
- A counter-article to the article “developer portals are an anti-pattern” featured last week.
- Regarding the point “As evidence of the apparent ills of developer portals, Corey offers up the fact that he hasn’t seen Backstage deployed in any company other than Spotify.”, it notes that Expedia Group, Zalando, and American Airlines have specifically chosen Backstage for their internal developer portals. It also points out that the adopting organizations are published in “ADOPTERS.md” on GitHub, with “many more participants listed”.
- It covers two articles together. The title of the first article is “KubeCon Europe 2021 Wrapup”.
○ The author looks back on his third consecutive virtual KubeCon.
○ Last year he was able to watch most of the program live, including the keynotes, but this year the sessions started earlier. He found it inconvenient that the keynotes started around 1 am his local time, but he accepted it as reasonable because the schedule was aligned with the European venue.
- The title of the second article is “KubeCon EU 2021: Developers, Developers, Developers (and Control Planes)”.
○ It focuses on themes such as “developers and developer experience within cloud”, as its key takeaways show:
● Developers, and developer experience, within cloud is a big deal
● End users are making a big impact in the cloud native world right now
● Networking in the cloud (and K8s) is still evolving
● Open standards are providing key abstractions, extensibility, and innovation
● Control planes are where the most end user value is being created
● Anyone can (and should) contribute to the community: Docs are a great place to start
- The title is “Rule number one: Avoid vendor lock-in”.
- Because of the author’s position, the following disclaimer appears at the beginning.
○ Extra-prominent disclaimer: Views expressed here are my own, and don’t necessarily reflect the views of the Government of Canada. Products mentioned below are examples and not endorsements.
- It explains the topic carefully under the following headings. It helps you picture concretely what to do and in what situation. In particular, “public money? Public code.” is a simple and convincing message.
○ What is vendor lock-in?
○ How do you avoid vendor lock-in?
○ Own your data (and make sure you can move it somewhere new)
○ Own your front-end interfaces
○ Own your software source code
○ Avoid long-term contracts
○ What should vendors do, then?
- Under “What should vendors do, then?”, it proposes that vendors provide high-quality services to meet such demands, and gives specific examples of using external vendors.
- The title is “DevOps Practices for Continuous Deployment”.
- It describes three DevOps practices and how they were applied using the OSS tool “Batect”, which performs tasks in Docker containers.
A quick overview of the metrics you should be measuring if you’re running Kubernetes clusters.
- The title is “Key Kubernetes Metrics and Resources to Monitor for Peak Cluster Performance”.
- It details the key metrics that Kubernetes exposes and how to interpret them, with the following structure.
○ Kubernetes Objects
○ Kubernetes Cluster & Node Metrics
○ Kubernetes Deployments & Pod Metrics
● Kubernetes Metrics
● Container Metrics
● Application Metrics
○ Monitoring Kubernetes with Sematext
○ Conclusion
Tools
- The web page of the tool “vcluster”, which provides a virtual Kubernetes cluster in a namespace within a Kubernetes cluster.
- Click here for the GitHub page.
- The GitHub page of “Lima (Linux-on-Mac)”.
- An article from the Microsoft Open Source Blog titled “Making eBPF work on Windows”. As the title suggests, it covers eBPF on Windows.
- Click here for the GitHub page.
- The GitHub page of “ZX”. It provides a convenient wrapper around child_process that escapes arguments and supplies appropriate defaults.
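The idea of escaping arguments before they reach a shell can be sketched in Python. This is not the ZX API (ZX is a JavaScript tool); the helper name `run` and its `{}` placeholder convention are hypothetical, chosen only for illustration, and the sketch assumes a POSIX shell.

```python
import shlex
import subprocess


def run(template: str, *args: str) -> str:
    """Run a shell command, quoting every interpolated argument.

    `template` uses `{}` placeholders (a made-up convention for this
    sketch). Each argument goes through shlex.quote, so spaces and shell
    metacharacters stay literal. shlex.quote targets POSIX shells only.
    """
    command = template.format(*(shlex.quote(a) for a in args))
    result = subprocess.run(
        command,
        shell=True,           # the command string is handed to /bin/sh
        check=True,           # raise CalledProcessError on non-zero exit
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With this helper, `run("printf %s {}", "a b; echo injected")` passes the whole string to `printf` as a single argument, so the `echo` never runs as a separate command.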
Ahoy is a user-friendly dashboard for managing Helm-deployed applications on Kubernetes.
- As mentioned above, an introductory article on “Ahoy!”, the user-friendly Helm dashboard for Kubernetes.
- Click here for the GitHub page.
SRE Weekly Issue #270 May 16th, 2021
Articles
Thundering herds, noisy neighbours, and retry storms
This is an in-progress document about the kinds of patterns we see or use when designing systems. The author warned me that it’s a work in progress and maybe not ready for prime-time, but I think this is exactly the time when I should get it in front of your eyes.
I’d love your help growing this list. If you know of a name that is missing from the list please send me a tweet with the name and a short description of it and I’ll include it in the list with a link to your tweet
Mads Hartmann
- As mentioned above, this is a list of patterns you see when designing systems, compiled by the author of the article. The term “operational patterns”, proposed in the article by Lex Neva, the editor of SRE Weekly, seems like a good one.
Whoa, a podcast dedicated to picking apart public incident postings! I love this, because there’s a lot that’s left to shorthand, and a live conversation is a great way to flesh it out.
Tom Kleinpeter and Jamie Turner
- As mentioned above, this introduces a podcast. Starting with the introduction on March 20, 2021, it shares each week the lessons engineering teams learned from outages and postmortems.
Health boss unsure how many hospital patients were overdosed due to Windows upgrade
There’s a really interesting undercurrent in this story about resilience. Nurses can catch these kinds of errors, but this is just one layered protection among many. If the system is reduced to relying on that second-layer defense, the overall resilience is diminished.
Daniel Keane — ABC News
- As the title suggests, an error occurred after an upgrade to Windows 10, and patients may have received more than 10 times the required amount of medication.
Have you ever seen a car crash test? That’s Chaos Engineering
Of course, before reaching this stage, all of the pieces are tested in isolation. But until they’re all put together, it’s almost impossible to predict the behavior of the finished product during an accident.
Mikolaj Pawlikowski
- As the title suggests, it explains why chaos engineering is necessary by comparing it to car crash tests.
4 attributes of a great site reliability engineer
The attributes discussed are:
* Problem solving
* Awareness building
* Collaboration
* Empathy
Jayne Groll
- The author asked DevOps Institute ambassadors and experts in SRE-related fields “What makes for a great SRE?” and explains the four main attributes above that they proposed.
How to hire Site Reliability Engineers (SREs): 5 top qualities
Wait, more attributes? Oh, and by the same author, too:
* “Great SREs have a passion for high-quality automation.”
* “A great SRE ensures SLOs (Service Level Objectives) are set at correct boundaries of service; […]”
* Prize Communication.
* Look for longer-term support experience.
* Look for a person that demonstrates empathy.
Jayne Groll
- I have exactly the same impression as the editor above. The same author as the previous article asked “What makes for a great SRE?” and explains the five main attributes the interviewees proposed.
- Personally, it raises questions for me: “Why, and by what criteria, were these two articles written?” and “Why aren’t they written so that they connect to each other?”
Site Reliability Engineering for Native Mobile Apps
This one explores the application of SRE principles to mobile app design.
Abhijith Krishnappa
- It describes how to apply SRE principles to the reliability of mobile apps.
Choosing SLOs that users need, not the ones you want to provide
This two-part series uses a narrative case study format to show how SLOs can be misleading. You might have great numbers, but what are the numbers actually measuring?
Adam Hammond — Squadcast
- Regarding SLOs, it tackles the following problem.
- A lot of IT professionals tend to think that they know the best metrics, and they do; the only problem is that they are the best metrics for monitoring systems, not for improving customer satisfaction.
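The gap between system-centric and user-centric metrics can be illustrated with a toy calculation. The sketch below is not taken from the article: the `Request` type, the user IDs, the step names, and the sample data are all hypothetical. It compares the share of successful requests with the share of users whose whole journey succeeded.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str   # hypothetical user ID
    step: str   # hypothetical journey step name
    ok: bool    # True when the server returned a success response


def request_success_ratio(requests):
    """System-centric SLI: share of individual requests that succeeded."""
    return sum(r.ok for r in requests) / len(requests)


def journey_success_ratio(requests):
    """User-centric SLI: share of users whose every request succeeded."""
    outcome = {}
    for r in requests:
        outcome[r.user] = outcome.get(r.user, True) and r.ok
    return sum(outcome.values()) / len(outcome)


# Made-up sample: four users each perform five steps,
# and the "pay" step fails for two of them.
STEPS = ["login", "search", "cart", "pay", "confirm"]
sample = [
    Request(user, step, ok=not (user in {"u3", "u4"} and step == "pay"))
    for user in ["u1", "u2", "u3", "u4"]
    for step in STEPS
]

print(f"requests OK: {request_success_ratio(sample):.0%}")
print(f"journeys OK: {journey_success_ratio(sample):.0%}")
```

In this made-up sample, 18 of 20 requests succeed (90%), yet only 2 of 4 users complete their journey (50%), which is the kind of mismatch the quote warns about.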
Outages
- A major US oil pipeline
The pipeline was targeted by a ransomware attack.
- GasBuddy
This app for finding gasoline prices seems to have been impacted by a flood of user traffic driven by the US oil pipeline outage. In fact, their front page seems to be very slow for me as I write this.
- Salesforce
The outage was widespread and even affected their status page.
- eBay
- Microsoft Outlook
KubeWeekly #262 May 21st, 2021
The Headlines
Editor’s pick of the highlights from the past week.
Deadline approaching: KubeCon + CloudNativeCon North America 2021 CFP closes on May 23!
KubeCon + CloudNativeCon North America 2021 is happening October 12–15 in Los Angeles, CA along with a virtual experience for those who can’t travel!
Are you ready to see your name in lights and potentially have the opportunity to speak on a real stage again? Apply to speak now — the Call for Proposals (CFP) is open until Sunday, May 23, 11:59 PM Pacific Daylight Time. Since the event will be a hybrid experience, you can submit to speak either in person or virtually.
Is this your first time submitting? Check out our submission guidelines to make your proposal shine.
- A reminder of the KubeCon + CloudNativeCon North America 2021 CFP deadline, with information on how to submit a proposal.
Take the CNCF Cloud Native Survey — Part 1
We want to hear from you! Be sure to take the Cloud Native Survey — Part 1 to share your thoughts on cloud, containers, and Kubernetes. We have a full conference pass for KubeCon + CloudNativeCon Europe 2022 to give away — complete the survey by June 15 for a chance to win!
- CNCF is running the survey described above, and it seems that answering enters you for a chance to win a pass for “KubeCon + CloudNativeCon Europe 2022”.
- After answering, you are asked to enter your name, company name, and email address. The pass will presumably be awarded at a later date.
ICYMI: CNCF online programs this week
A weekly summary of CNCF online programs from this week.
Where to begin your dev-centric cloud infosec journey
Guy Eisenkot & Ashley Ward, Palo Alto Networks
- It explains where to start the infosec journey in the cloud and how to incorporate security into existing processes so as to minimize interruptions and maximize productivity.
Nabarun Pal & Anna Jung, Kubernetes 1.21 Release Team
- A release webinar with the release team leads and enhancement leads, held to coincide with the release of Kubernetes 1.21. It confirms at the beginning that the release cycle will change from four to three releases a year starting with 1.22.
Leveling Up Kubernetes with kube-vip
Dan Finneran, Equinix Metal
- It shows you why they developed “kube-vip” for HA and load balancing inside and outside the Kubernetes cluster, and explains how it works with a demo.
The Technical
Tutorials, tools, and more that take you on a deep dive into the code.
Damian Peckett, kloeckner.i
- It briefly describes Cloud SQL, then introduces how to use it at kloeckner.i and the tools built for workflows with low administrative overhead.
How to create a Ubuntu Packer Image and deploy on a Bare Metal Server
Chitrabasu Khare, InfraCloud
- It describes how to use Packer to create a minimal raw Ubuntu image and deploy it to a bare metal server using the provisioning engine “Tinkerbell”.
Workload mobility in a service mesh world
Cody De Arkland, Kong
- It notes that the “Zero Trust Security” aspect has received a great deal of attention, and explores the theme “What is the next chapter of Zero Trust for Service Mesh?” with the following structure.
○ Same Problem, Different Data Center
○ Creating Environment Consistency
○ Progressive Delivery, Migration and Controlling Flow
○ The Next Generation of Service Mesh
Key Kubernetes metrics to monitor for peak cluster performance
Adnan Rahic, Sematext
- Since it is covered in DEVOPS WEEKLY ISSUE #542 above, I will skip it.
Learn how to build functions faster using Rancher’s kim and K3s
OpenFaaS Ltd.
- It focuses on how to test “kim (The Kubernetes Image Manager)”, a new project by Rancher Labs that allows you to build container images directly into the node’s image library.
Autoscaling Kubernetes clusters
Puja Abbassi, Giant Swarm
- It focuses on “Cluster Autoscaler” and, in line with the title, is structured as follows.
○ Cluster Autoscaler
○ Cloud Providers
○ Scaling Up
○ Scaling Down
○ Scaling Latency
○ All the Autoscalers
○ Spot Instances
○ Conclusion
AWS Secrets Manager on Kubernetes using AWS Secrets CSI driver Provider
Theo “Bob” Massard, Particule
- It follows up with a test of the AWS Secret Store Provider and explains how to use it as a bridge between the AWS Secrets Manager and your app’s environment.
Kube-Prometheus — A complete monitoring stack using Jsonnet
Lennart Jern, Elastisys
- It explains how to use “kube-prometheus” to achieve accurate monitoring of apps.
The easiest way to debug Kubernetes workloads
Martin Heinz
- It examines and explains the topic in the title with the following structure.
○ There Might Just Be a Better Way…
○ Configuring Feature Gates
○ Process Namespace Sharing
○ Putting It To Good Use
○ Bonus: Debugging Cluster Nodes
○ Conclusion
Kubernetes capacity planning: How to rightsize your cluster
Jesus Ángel Samitier, Sysdig
- It explains how to identify unused resources and how to properly set the capacity of the cluster.
Learn how Istio can provide a service mesh for your functions
Alex Ellis, OpenFaaS Blog
- It starts with a brief introduction to get you started with Istio and OpenFaaS integration, and explains how to measure your cluster’s resource consumption and how to create an Istio Gateway TLS certificate.
GitOps Guide to the Galaxy (E15): Introducing the App of Apps and ApplicationSets
Christian Hernandez, Chris Short, Red Hat
- It explains the concept of “App of Apps” and how to mitigate the issues that exist in GitOps.
The Editorial
Articles, announcements, and more that give you a high-level overview of challenges and features.
Pixie, with Zain Asgar and Ishan Mukherjee
Craig Box, Kubernetes Podcast from Google
- A new episode of the Kubernetes Podcast, which is run by Google employees. This time the host is Craig Box, with Alex Ellis as guest host. Alex Ellis previously appeared in the following episode.
○ Independent Open Source, with Alex Ellis
- The guests are Zain Asgar and Ishan Mukherjee, co-founders of Pixie, which was recently acquired by New Relic.
- The topics that interested me in the news of the week are as follows.
○ eBPF for Windows
○ GKE Dataplane V2 is GA
○ VMware Tanzu SQL, with MySQL, for Kubernetes, 1.0
It’s Been a Full Year Since we Launched OpenShift TV
Chris Morgan, Red Hat
- As the title says, the article marks one year since “OpenShift TV” started. The embedded video is a session of over two hours on the theme “Playing with Prometheus”.
CloudSkiff
- It introduces the OSS CLI tool “driftctl”, starting from the background of why it is needed.
Yoko Kawamoto
- Another introductory article on “Ahoy!”, which is also featured in DEVOPS WEEKLY ISSUE #542 above. It has more images than text, which makes it good for grasping the content visually.
Kubernetes, (Almost) Love at First Sight with Chris Ferreira
Committing to Cloud Native Podcast
- As of 2021/05/23 23:21 (JST) and 2021/05/24 23:31 (JST), the URL could not be accessed and returned a “DNS_PROBE_FINISHED_NXDOMAIN” error. Perhaps it was accessible last Saturday? I will try again later.
○ As of 2021/05/29 10:48 (JST), it worked.
ICYMI: CNCF YouTube Channel featuring all talks from KubeCon EU 2021 are now available!
- A playlist of the published session videos from KubeCon EU 2021. I haven’t watched any of them yet, so I want to catch up.
Upcoming CNCF Online Programs
Live webinar
- May 25 @10am PT: Service mesh configuration using xDS protocol on gRPC, and using Envoy presented by Megan Yahya, gRPC & Yan Avlaslov, Envoy (member submission by Google) — RSVP
Cloud Native Live
- May 26 @8am PT: Universal Crossplane presented by Dan Mangum, Upbound — RSV
On-demand webinars
- May 27: Containing your microservice sprawl presented by Tracy Ragan, DeployHub — RSVP
- May 27: Defense strategy against Kubernetes attack TTPs presented by Manoj Ahuje, Tigera — RSVP
- Looking for more great curated content? Visit our Online Programs playlist on YouTube.
How did you find those articles? Did any of them interest you?
Actually, there is some content I cannot digest at this stage, so I will use this aide-memoire and these links to catch up myself, too.
Bye now!!