SRE / DevOps / Kubernetes Weekly Collection #53 (Week 5, 2021)
- In this blog post series, I collect items from the following three weekly mailing lists I subscribe to and add some comments and useful links as an aide-memoire.
- I have already published the same content on my Japanese blog and am catching up in English in this series.
- I hope it serves as a reference for people browsing this kind of information.
DEVOPS WEEKLY ISSUE #527 January 31st, 2021
SRE Weekly Issue #255 January 31st, 2021
KubeWeekly #249 February 5th, 2021
DEVOPS WEEKLY ISSUE #527 January 31st, 2021
News
- The title is “A deeper dive into our May 2019 security incident”.
- After discussions with law enforcement agencies over time, the authors can now explain in more detail what happened in the May 2019 security incident, how it happened, and what they did to address the underlying issues that caused it.
- The title is “2021 is the Year of Reliability”.
- It gives an overview of what is expected of software in 2021 and how to meet those expectations, under the following headings.
○ Customers want reliable software.
○ Operators want reliable software.
○ How do we achieve reliable reliability?
○ SLOs
○ Every day is a chance to be more reliable.
- The title is “Securing the NCSC’s web platform”.
- The UK’s NCSC (National Cyber Security Centre) explains the key points of how it secures the website it operates, under the following headings.
○ As secure as necessary
○ The cost of security controls
○ Sensible security architecture
○ The web platform will never be ‘done’
○ The balancing act is hard
- It covers proportionate risk management, usability, functionality, cost, Game Days, and more. The perspectives and ideas seemed more advanced and flexible than the organization’s name would suggest.
- The title is “Bad Pods: Kubernetes Pod Privilege Escalation”.
- It describes the following eight insecure pod configurations and the corresponding ways to perform privilege escalation.
○ Bad Pod #1: Everything allowed
○ Bad Pod #2: Privileged and hostPid
○ Bad Pod #3: Privileged only
○ Bad Pod #4: hostPath only
○ Bad Pod #5: hostPid only
○ Bad Pod #6: hostNetwork only
○ Bad Pod #7: hostIPC only
○ Bad Pod #8: Nothing allowed
- The article and its accompanying repositories were created to help penetration testers and administrators better understand common misconfiguration scenarios.
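As an aide-memoire for the settings named in the list above, here is a minimal sketch that scans a pod manifest, loaded as a plain Python dict, for them. The function name and the two sample manifests are hypothetical and not taken from the article; the field names (`hostPID`, `hostNetwork`, `hostIPC`, `hostPath`, `securityContext.privileged`) follow the Kubernetes Pod spec as I understand it.

```python
# Toy audit of a pod manifest (as a plain dict) for the host-level settings
# named in the Bad Pods list. Function name and sample data are hypothetical.

def risky_settings(pod: dict) -> list:
    """Return the names of risky settings found in a pod manifest dict."""
    spec = pod.get("spec", {})
    found = []

    # Pod-level flags that share host namespaces with the pod.
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if spec.get(flag) is True:
            found.append(flag)

    # A hostPath volume mounts part of the node's filesystem into the pod.
    if any("hostPath" in volume for volume in spec.get("volumes", [])):
        found.append("hostPath")

    # "privileged" is set per container, under securityContext.
    for container in spec.get("containers", []):
        if container.get("securityContext", {}).get("privileged") is True:
            found.append("privileged")
            break

    return found


# Hypothetical manifest resembling "Bad Pod #1: Everything allowed".
everything_allowed = {
    "spec": {
        "hostPID": True,
        "hostNetwork": True,
        "hostIPC": True,
        "volumes": [{"name": "noderoot", "hostPath": {"path": "/"}}],
        "containers": [
            {"name": "shell", "securityContext": {"privileged": True}}
        ],
    }
}

# Hypothetical manifest resembling "Bad Pod #8: Nothing allowed".
nothing_allowed = {"spec": {"containers": [{"name": "shell"}]}}

print(risky_settings(everything_allowed))
print(risky_settings(nothing_allowed))
```

A check like this only flags the settings; the article itself explains what an attacker can do with each combination.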
- The title is “Why does it take so long to build software?”.
- It explores the question in the title under the following headings.
○ Different types of complexity? That’s complex.
○ Here comes the accidental complexity.
○ How does this apply to software?
○ We are asking more and more of our software.
○ The volume of software within companies is exploding.
○ The pace of new technology adoption is increasing.
○ Is there hope?
- A future post will discuss the impact of accidental complexity on software projects and how teams can avoid it more effectively while still meeting the needs of the business.
- The title is “Introducing Hex Preview”.
- As the title suggests, it announces the release of “Hex Preview”, an online tool for viewing the source files of Hex packages.
An example of writing unit tests for Helm charts using Go.
- The title is “How to unit-test your helm charts with Golang”.
- I will skip it because it was covered in KubeWeekly#248 last week.
Events
- It introduces “The Container Plumbing Days”, a two-day ‘lower-level’ open source container technologies event.
- The schedule is as follows.
○ Tuesday, March 9th, 2021, 15:00 to 19:00 UTC (10am to 2pm Eastern)
○ Wednesday, March 10th, 2021, 15:00 to 19:00 UTC (10am to 2pm Eastern)
- The main projects and technologies expected at the event, as listed in its About section, are as follows.
○ Buildah, CRI-O, Katacontainers, Kubevirt, Clair, Skopeo, Cgroups2, Krustlet, Seccomp, Podman, KIND, Tern, and many others.
Tools
- A GitHub page of the library “simdjson” that parses JSON at high speed.
- A GitHub page of the tool named “Etok”, which stands for Execute Terraform On Kubernetes.
- The “Why” in the README is as follows.
○ Leverage Kubernetes’ RBAC for terraform operations and state
○ Single platform for end-user and CI/CD usage
○ Queue terraform operations
○ Leverage GCP workload identity and other secret-less mechanisms
○ Deploy infrastructure alongside applications
- A GitHub page of “Litestream”, a standalone streaming replication tool for SQLite.
- It runs as a background process and safely replicates changes incrementally to another file or S3. Litestream communicates with SQLite only through the SQLite API, so it will not corrupt the database.
SRE Weekly Issue #255 January 31st, 2021
Articles
Why It Should Be Service, Not Site Reliability
It really should! Even Google is much more accurately described as a “service” than a “site”.
Chris Riley — Splunk
- The S in SRE stands for “site”, but the article argues that “service” better matches what developers deliver today, citing the following points:
○ Subscription-based business models
○ Application architectures
○ Modern delivery chain
○ Cross-platform
○ Customer-centric
Migrations: the sole scalable fix to tech debt.
There are migrations, and then there’s the time between migrations.
Will Larson
- It tells the story of Uber moving from Puppet-managed services to a fully self-service provisioning model.
- Personally, I felt the important point is to reduce the cost of a migration by not rushing it: first verify that the migration actually solves the intended problem, then proceed.
2021 is the Year of Reliability
2020 was the year mainstream folks realized how important reliability is. Will overall reliability improve in 2021?
Robert Ross — FireHydrant
- I will skip it, because it is covered in DEVOPS WEEKLY ISSUE#527 above.
This SRE attempted to roll out an HAProxy config change. You won’t believe what happened next…
I love this for the click-bait title and the content. An HAProxy feature designed for HA had a surprising and unexpected behavior.
Andre Newman — GitLab
- It details what they discovered while investigating strange behavior from HAProxy.
- The TL;DR is as follows.
○ HAProxy has a server-state-file directive that persists some of its state across restarts.
○ This state file contains the port of each backend server.
○ If a haproxy.cfg change modifies the port, the new port will be overwritten with the previous one from the state file.
○ A workaround is to change the backend server name, so that it is considered to be a separate server that does not match what is in the state file.
○ This has implications for the rollout procedure we use on HAProxy.
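To keep the TL;DR above straight in my head, here is a toy model of the described behavior. It is not HAProxy code and does not read real state files; the function name, the dict layout keyed by (backend, server name), and the sample ports are all hypothetical.

```python
# Toy model of the behavior in the TL;DR; this is not HAProxy code.
# Servers are keyed by (backend name, server name); values are ports.

def effective_ports(config: dict, state_file: dict) -> dict:
    """Ports in effect after a restart, under the behavior described above."""
    result = {}
    for key, configured_port in config.items():
        # A server that is found in the state file keeps its previous port,
        # even if the configuration now specifies a different one.
        result[key] = state_file.get(key, configured_port)
    return result


# State persisted before the restart (hypothetical values).
state = {("web", "app1"): 8080}

# Case 1: the port is changed but the server name stays the same.
same_name = {("web", "app1"): 9090}

# Case 2: the workaround, where the server is renamed as well.
renamed = {("web", "app1-v2"): 9090}

print(effective_ports(same_name, state))  # old port from the state file wins
print(effective_ports(renamed, state))    # no match, so the new port applies
```

The second case shows why renaming the backend server works as a workaround: the new name no longer matches any entry in the state file.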
Tyler Wells on building a culture of reliability at Twilio
Twilio builds customer trust through a reliability culture, customer empathy, and accountability.
Andre Newman — Gremlin
- The following points are excerpted from a talk by Tyler Wells, Senior Director of Engineering at Twilio.
○ Reliability is built on customer trust
○ Culture
○ Customer empathy
○ Accountability
○ Reliability is a journey
This WTFinar tackles the beginning of understanding SRE. It focuses on service level indicators (SLIs) and service level objectives (SLOs) — components of error budgets.
Container Solutions
- It features Container Solutions’ webinar “WTF is SRE?”.
- It focuses on SLI and SLO as a starting point for understanding SRE.
- It starts on Tuesday, February 9 at 15:00 CET (Central European Time), which is 23:00 Japan time.
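Since the webinar treats SLIs and SLOs as components of error budgets, here is a minimal sketch of the arithmetic as I understand it: the error budget is the share of failures that the SLO still allows. The function names and the sample numbers are my own assumptions, not taken from the webinar.

```python
# Minimal error-budget arithmetic. Function names and sample values are
# hypothetical; the webinar may define things differently.

def error_budget_minutes(slo: float, window_days: int) -> float:
    """Downtime allowed by an availability SLO over a time window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_consumed(slo: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget used, based on a request-based SLI."""
    allowed_bad = (1.0 - slo) * total_events
    actual_bad = total_events - good_events
    return actual_bad / allowed_bad


# A 99.9% SLO over 30 days allows roughly 43 minutes of downtime.
print(round(error_budget_minutes(0.999, 30), 1))

# 500 failed requests out of 1,000,000 use about half of a 99.9% budget.
print(round(budget_consumed(0.999, 999_500, 1_000_000), 2))
```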
KubeWeekly #249 February 5th, 2021
The Headlines
Editor’s pick of the highlights from the past week.
Welcome to our 5 new TOC members!
Chris Aniszczyk, CNCF
Help us give a warm welcome to the newest members of the TOC:
* Erin Boyd, Apple
* Cornelia Davis, Weaveworks
* Lei Zhang, Alibaba
* Dave Zolotusky, Spotify
* Ricardo Rocha, CERN
Learn more about the TOC and newest members in the latest blog post.
- An article announcing the selection of the above five new members of CNCF’s TOC (Technical Oversight Committee).
- It introduces the role of the TOC and the biographies of the members selected by the Governing Board (GB) and the End User Community (EUC).
- It also thanks the following three members who have completed their terms.
○ Brendan Burns (@brendandburns)
○ Matt Klein (@mattklein123)
○ Xiang Li (@xiangli0227)
Cloud Native Computing Foundation Announces Open Policy Agent Graduation
CNCF blog
Congratulations to Open Policy Agent (OPA) for hitting graduated status! OPA has demonstrated widespread adoption, an open governance process, feature maturity, and a strong commitment to community, sustainability, and inclusivity to graduate.
- As the title suggests, the article announces that OPA has reached the Graduated maturity level in CNCF.
- OPA was accepted into the CNCF Sandbox in April 2018 and promoted to Incubation a year later. More than 90 people from about 30 organizations have contributed to OPA, and its maintainers come from four organizations: Google, Microsoft, VMware, and Styra.
- For the promotion requirements at each maturity level as of 2021/02/06, see CNCF Graduation Criteria v1.3.
The Technical
Tutorials, tools, and more that take you on a deep dive into the code.
Connor Brewster, Replit
- To address the problem that “Docker takes more than 30 seconds to forcibly terminate all containers on the VM”, it explains, based on their investigation, how they terminate the containers themselves and the effect this had.
Kubernetes — How to Debug CrashLoopBackOff in a Container
David Giffin, Release App
- Rather than explaining how to configure Kubernetes properly, it focuses on debugging your own and other people’s code when a “CrashLoopBackOff” error occurs in a container.
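As background for the “BackOff” part of the name: as far as I know from the Kubernetes documentation, the kubelet restarts a crashing container with an exponentially growing delay (10s, 20s, 40s, ...) capped at five minutes. The sketch below only reproduces that arithmetic; the function name and defaults are hypothetical, and the actual values may differ between Kubernetes versions.

```python
# Sketch of the restart back-off arithmetic behind CrashLoopBackOff.
# The base delay and cap are assumptions based on the Kubernetes docs.

def restart_delays(restarts: int, base_seconds: int = 10,
                   cap_seconds: int = 300) -> list:
    """Delay before each restart: doubles every time, up to the cap."""
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(restarts)]


# Delays, in seconds, before the first seven restarts.
print(restart_delays(7))
```

This is why a pod stuck in CrashLoopBackOff appears to retry less and less often while you are debugging it.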
Hunting for Malware with Falco
Dan Lorenc
- It explains how to build a platform to look for malicious behavior hidden behind the scenes.
Deliver your applications to edge and IoT devices in rootless containers
Ilkka Tengvall, Red Hat
- It explains how to use systemd, Podman, and Red Hat Ansible Automation to automate software and push it as a container to small edge and Internet of Things (IoT) gateway devices.
Building a Kubernetes CI/CD Pipeline with GitLab and Helm
Dan Slapelis, Nextlink Labs
- Treating a CI/CD pipeline on Kubernetes as a puzzle, it explains how to bolt on the continuous delivery (CD) pieces, build the pipeline, and deploy an app to Kubernetes. It starts by explaining Helm, which is an important piece of the puzzle.
Kubernetes vs Docker: Understanding Containers in 2021
Tomas Fernandez, Semaphore
- A few weeks earlier, the Kubernetes development team announced the deprecation of Dockershim. The article answers the most common questions about it, starting from the basics of containers, Docker, and Kubernetes.
- For those who already know Docker and Kubernetes, I recommend skipping ahead to the “How does the Dockershim deprecation impact you?” section of the article.
ICYMI: CNCF online programs this week
A weekly summary of CNCF online programs from this week.
CNCF On-demand webinar: Policy as Code to manage security risk in K8s before & after deployment
Cesar Rodriguez @Accurics
- It introduces the on-demand webinar with the above title. If you are interested, please register and watch. It is available only to registrants, from February 4, 2021 0:00 to February 10, 2021 23:59 (PST).
- It explains how to use open standards such as OPA (Open Policy Agent) and open source IaC scanners such as Terrascan to improve security with policy as code.
This Week in Cloud Native (Livestream): Kubernetes Policies-as-Code
Jim Bugwadia @Nirmata
- A session showing how Nirmata uses the CNCF sandbox project Kyverno to enable “policy as code”.
- As of 02/06/2021, there was no embedded video or link on the above page, but “This Week in Cloud Native: Kubernetes Native Policy As Code” posted on YouTube appears to be the recording.
The Editorial
Articles, announcements, and more that give you a high-level overview of challenges and features.
Backstage, with Lee Mills and Matt Clarke
Craig Box, Kubernetes Podcast from Google
- The Kubernetes Podcast, run by Google employees. Craig Box is the current host. Co-host Adam Glick has moved on to greener pastures, so past guests are being invited as guest hosts for several weeks.
- This week’s guest host is John Belamaric, who appeared in Episode #106. He is a Senior SWE at Google, co-chair of Kubernetes SIG Architecture, a core maintainer of the CoreDNS project, and an author of the O’Reilly book “Learning CoreDNS: Configuring DNS for Cloud Native Environments”.
- The guests are Lee Mills and Matt Clarke from Spotify, the maintainers of “Backstage.”
○ Backstage is a platform for building a developer portal using a centralized service catalog.
○ Open source developed by Spotify and donated to CNCF in 2020
- The topics that interested me in the News of the week are as follows.
○ Longhorn 1.1
○ Sonobuoy adds reliability scanning
○ Announcing the Linkerd steering committee
Vamp.io Introduces Research Report The 2021 State of Cloud-Native
- It presents the report named in the title, which describes challenges, trends, and opportunities for improvement in software release and verification in production as of 2021.
- The study highlights the tough challenges that both small businesses and enterprises face in their cloud, Kubernetes, and microservices journeys.
Upcoming CNCF Online Programs
CNCF Live webinar: How to Manage Kubernetes Application Life Cycle Using Carvel
presented by VMware
February 9, 2021 at 10:00 am PT
CNCF On-demand webinar: Debugging Kubernetes On The Fly
presented by Rookout
February 11, 2021
CNCF On-demand webinar: Otomi Container Platform Open Source Announcement
presented by Red Kubes BV
February 11, 2021
For more information, please visit our updated Online Programs page.
How about these articles? Do any of them interest you?
To be honest, there is some content I cannot digest at this stage, so I will also use this aide-memoire and these links to catch up myself.
Bye now!!