SRE / DevOps / Kubernetes Weekly Collection #14 (Week 19)

  • In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, and add some comments as an aide-memoire along with useful links.
  • I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a reference for people browsing this kind of information.

DEVOPS WEEKLY ISSUE #488 May 3rd, 2020
SRE Weekly Issue #217 May 4th, 2020
KubeWeekly #215 May 8th, 2020

DEVOPS WEEKLY ISSUE #488 May 3rd, 2020

News

A detailed summary of modern deployment tools, looking at Jenkins-X, Flux and ArgoCD. A nice mix of facts and opinions.

  • The title is “FluxCD, ArgoCD or Jenkins X: Which Is the Right GitOps Tool for You?”.
  • A good article that explains the features, pros/cons, and usage of the above tools. The Jenkins X part was especially helpful because I had not grasped its characteristics at all. These are not so much competing tools as tools with different use cases, so the right choice seems to depend entirely on the organization and purpose.

A well-documented set of security architecture antipatterns, mainly focused on the risks of management access.

  • The title is “Security architecture anti-patterns”.
  • “Six design patterns to avoid when designing computer systems.” The glossary-style explanations of anti-pattern, trust, and information technology vs. operational technology were good for establishing a shared meaning of each term with readers.

A set of posts looking at Tekton, explaining what it is (a platform for building CD tools) and why it’s important.

  • The title is “What’s Going on With Tekton? (Part 1)”. It is a two-part series, and Part 2 is also available.
  • Jenkins X came up in the article above, so the topics feel connected. The number of tools I want to try is increasing.

Bash (or shell scripts in general) are still incredibly useful. This post points out a few common problems, and pointers for writing better bash scripts.

  • The title is “Anybody can write good bash (with a little effort)”. The article was published on 1/23.
  • The article starts from the observation that almost everyone in the programming community has had a terrible experience with shell, yet development environments and projects are held together by monstrous shell scripts, and then introduces tips for writing better bash. I also want to improve my shell skills.

A comprehensive body of knowledge around modern digital and IT practices, based on Agile, cloud-native, Lean, and DevOps principles.

  • GitLab’s page on Community Edition.
  • I think it is important that this kind of systematic knowledge is shared and maintained so that it can support development (I did not know it existed).

While it can be easy to think everyone already has automated deploying applications, it’s definitely not the case. This post contains a good list of benefits for those still making the business case.

  • The title is “The big 5 benefits of automated deployment”.
  • The author tells everyone at events and conferences that “Every software development team should have a fully automated deployment process”, but it isn’t happening in the field. The article argues that automation tends to lose out to short-term gains, and presents five benefits. The company’s own tools are smoothly promoted as a way to support implementation.

A useful look at extracting binaries from container images without needing to pull the full image. Another example of the flexibility of the OCI spec.

  • The title is “Extracting a single artifact from a Docker image without pulling”.
  • The author was inspired to write this article by a post by Tõnis Tiigi on the Docker blog.
  • He experimented with pulling a single binary from a Docker image stored in Docker Hub instead of the whole image.

A look at adding policy controls (using Open Policy Agent and Conftest) to Terraform deployments using Atlantis.

  • The title is “Terraform and Open Policy Agent With Atlantis”.
  • An article that introduces a setup combining Terraform, the Terraform pull request automation tool Atlantis, and OPA (Open Policy Agent), together with Conftest, which is used to test against OPA rules.

Something that has a lot of bearing on operations is complexity, and I think this post points to one issue with seemingly simple services being complex to operate in aggregate.

  • The title is “Complexity Has to Live Somewhere”.
  • The author says, “We try to get rid of the complexity, control it, and seek simplicity. I think framing things that way is misguided. Complexity has to live somewhere”. The article discusses why complexity needs a place to live and how systems and organizations should be designed around it and adapted to it.

Jobs

King is looking for new members for the Infrastructure engineering team to help manage the streaming data platform and the MySQL based backend for its games. Are you interested in helping games develop faster and scale to global presence? Take a look at our open roles.

  • SRE jobs at King, a mobile game company based in Stockholm, Sweden (at that time).

Tools

Kubexit is a command supervisor for coordinated Kubernetes pod container termination. The README has a nice set of use cases that explain where it’s useful.

  • The GitHub page of the OSS tool “Kubexit” that manages the termination process of the container included in the Kubernetes pod.
  • As I understand the README, Kubexit “carves” a tombstone for each application it supervises, marking when the application starts and when it exits.

SRE Weekly Issue #217 May 4th, 2020

Articles

Pre-requisites to Practicing Reliability?

Reliability is something you do, not something you buy.

When discussing SRE, I love to pose the question, “What does it mean to engineer reliability?”. That’s what this article is all about.
Russ Miles — ChaosIQ

  • In conclusion, the author’s idea is that “Practicing reliability does not rely on any prerequisites”.
  • The SRE Weekly editor also highlighted the phrase “Reliability is something you do, not something you buy.” in the summary above. It is a straightforward statement of the attitude we should have.

Thought Leadership Panel: What is a ‘Real’ SRE?

Blameless recently had the privilege of hosting SRE leaders Craig Sebenik, David Blank-Edelman, and Kurt Andersen to discuss how can SREs approach work as done vs work as imagined, how to define SRE and DevOps and the complementary nature of the two, the ethics of purchasing packaged versions of open source software, and more.

Amy Tobey, with guests Craig Sebenik, David Blank-Edelman, and Kurt Andersen — Blameless

  • Blameless invited SRE leaders Craig Sebenik, David Blank-Edelman and Kurt Andersen to a panel covering topics such as “Recruitment (especially in the current market situation)” and “SaaS/Vendor Relationships”. The article summarizes the discussion. There is a lot of content, so this is my homework.
  • The video of the whole panel discussion is available for download on request.

The inevitable double bind

Whenever an agent is under pressure to simultaneously act quickly and carefully, they are faced with a double-bind. If they proceed quickly and something goes wrong, they will be faulted for not being careful enough. If they proceed carefully and something goes wrong, they will be faulted for not moving quickly enough.

Lorin Hochstein

  • An article that cites three articles about COVID-19 and argues for accepting that you are in a double-bind situation and preparing to make effective decisions when a similar situation arises.
  • At first, I misread the title as “double blind” and misunderstood it.

The Post-Incident Review Issue #3

It’s time for another issue already! This one contains a really great essay by Jamie Woo entitled “What Does Fairness Mean for On-call Rotations?”, about how not all on-call shifts are equal.

Jamie Woo and Emil Stolarsky — Incident Labs

  • The follow-up to “The Post-Incident Review Issue #2”, which I covered in #10 of this series (4/5~4/10). The illustrations are still cute.
  • This time it looks at a GitHub outage. The content is database-related, which I covered on this blog before.

The Tail at Scale

If your frontend has a hard dependency on multiple microservices, their failure rates are compounded. This article fills in the math behind the paper The Tail at Scale and shows that your backends’ SLOs may have to be significantly tighter than the frontend’s.

Bill Duncan

  • The article sheds some light on what objectives are needed from the backends to support the already-determined user-level objectives.
  • It fills in the math missing from the important paper “The Tail at Scale”. If you haven’t read it yet, the original paper and the commentary on it in “the morning paper” are recommended reading.
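
As a memo to myself, here is a minimal sketch of the kind of arithmetic involved. It is my own illustration, not the article's code, and it assumes that the backends behave independently and that every one of them is a hard dependency. The function names are hypothetical.

```python
def compounded_success(per_backend: float, n_backends: int) -> float:
    """Probability that all n independent backend calls meet their objective."""
    return per_backend ** n_backends


def required_per_backend(frontend_target: float, n_backends: int) -> float:
    """Per-backend objective needed so that n independent hard dependencies
    together still meet the frontend target."""
    return frontend_target ** (1.0 / n_backends)


if __name__ == "__main__":
    # Ten backends at 99.9% each leave the frontend at roughly 99.0%.
    print(f"{compounded_success(0.999, 10):.4f}")
    # To keep the frontend at 99.9%, each backend needs roughly 99.99%.
    print(f"{required_per_backend(0.999, 10):.5f}")
```

Under these assumptions, the per-backend objective has to gain about one extra "nine" for a fan-out of ten, which matches the point that backend SLOs may need to be much tighter than the frontend's.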

Heroku Incident #2021 Follow-up

This post-incident analysis details a case of a hard dependency that needn’t be hard, taking down the Heroku API, along with a fall-back that didn’t work as intended.

  • Follow up information for Redis outage on Heroku.

Why strace doesn’t work in Docker

I love Julia Evans’s ability to teach me something new that I didn’t realize I didn’t know.

Julia Evans

  • Since I covered it under KubeWeekly #214 last week, I will skip it.

Outages

KubeWeekly #215 May 8th, 2020

The Headlines

Editor’s pick of the highlights from the past week.

Kubernetes Podcast from Google: Helm, with Matt Butcher

Matt Butcher created Helm while at Deis, and despite his PhD in philosophy and love of all things Ancient Greek, thankfully gave it a short, easy-to-pronounce English name. He shares the story of Helm with hosts Craig Box and Adam Glick, as well as how an explanation to the Deis finance team led to the canonical Kubernetes children’s book.

With Kubernetes, the U.S. Department of Defense Is Enabling DevSecOps on F-16s and Battleships

Before DevSecOps came to the U.S. Department of Defense, software delivery could take anywhere from three to ten years for big weapons systems.

“It was mostly teams using waterfall, no minimum viable product, no incremental delivery, and no feedback loop from end users,” says Nicolas M. Chaillan, Chief Software Officer of the U.S. Air Force. Plus, “cybersecurity was mostly an afterthought.”

To find out more about the Department of Defense’s cloud native journey, read the full case study and check out the video!

  • A Kubernetes case study from the US Department of Defense. They do DevSecOps both physically and logically: Kubernetes that runs even on fighter jets. The video ends with a line to the effect of “If it works for our missions and weapon systems, it can work for your business,” followed by the US Department of Defense logo. Follow the link above for a video that feels like a movie trailer.
  • If you would like more details, I recommend the presentation the same person gave at KubeCon NA last year, which included a Q&A with plenty of attendees. The presentation closes with “Join/Contact Us!”, so the US Department of Defense is also looking for people who can handle Kubernetes.

ICYMI: CNCF Webinars

Weekly recap of CNCF member and project webinars that you might have missed.

You can view all CNCF recorded and upcoming webinars here.

CNCF Project Webinar: What’s New in Kubernetes 1.18

Jeremy Rickard, Enhancements Lead, Jorge Alarcon, Release Lead, and Karen Chu, Communications Lead

  • Webinar video introducing changes in Kubernetes 1.18 by the CNCF release team.
  • It covers the logo designed for each release, the updated schedule for the next release, 1.19 (target date changed from the original 6/30 to 8/4 due to COVID-19), improvements to individual features, and more.

CNCF Member Webinar: Making the Most of Helm 3

Dan Garfield, Full-Stack Engineer @Codefresh and Anna Baker, Software Engineer/Technical Writer, and DevOps Evangelist @Codefresh

  • It explains “Changes from Helm 2 to 3 (the Tiller has finally disappeared)”.

CNCF Member Webinar: Encrypting data in Kubernetes deployments. Protect your data, not just your Secrets

Maksim Yankovskiy, VP of Engineering @Zettaset

CNCF Member Webinar: The KUbernetes Test TooL (kuttl)

Gerred Dillon, Principal Engineer @D2iQ and Ken Sipe, Distributed Application Engineer @D2iQ

  • Using kuttl, you can test Kubernetes operators, Helm charts, Kubernetes distributions, Kubernetes itself, and more.

Alex Chircop, Founder and CEO @StorageOS

  • It explains how Kubernetes manages persistent volumes and integrates them with storage solutions.
  • The presentation is careful and thorough, which is reassuring. It concludes with a live demo running a stateful workload on Kubernetes, followed by a question and answer session.

CNCF Member Webinar: How AWS uses Firecracker and Fargate to run serverless Kubernetes pods in Amazon EKS

Mo Ziyuan 莫梓元 解决方案架构师 @AWS

  • This webinar, “How AWS uses Firecracker and Fargate to run serverless Kubernetes pods in Amazon EKS”, is delivered in Chinese.

The Technical

Tutorials, tools, and more that take you on a deep dive into the code.

Creating an Ansible Operator from scratch

Red Hat OpenShift Twitch

  • A video, streamed on Twitch, explaining how to make an Ansible Operator from scratch.
  • Around 29:48, he said, “The Operator Framework is in the process of being donated to the CNCF.” Red Hat has a great culture of giving back to the community. I think the success of Kubernetes is largely due to Google’s early invitation to Red Hat.

Helm & Kustomize Better Together

Povilas Versockas

  • An article that explains both Kustomize and Helm using Loki as an example. He advised that “I think learning Helm & Kustomize is a good way to practice for your Certified Kubernetes Application Developer exam.”

WireGuard on K8s (road-warrior-style VPN server)

Stephen Levine

  • An article explaining how to run WireGuard, the VPN built into the Linux kernel, as a VPN server on K8s (actually on a single-node K3s cluster).

Domesticating Kubernetes

Vladimir Akopyan

  • He built Kubernetes on his home network and used it as a home server for blogs, media libraries, smart homes, etc.
  • The author said, “The cluster is actually straight-forward to set up, but we, developers are so cuddled, we are forgetting some basic networking and other low-level stuff — I found the experience educational.”
  • I will bookmark this article, because I find the architecture diagram so interesting that I lose track of time looking at it.

Speed up administration of Kubernetes clusters with k9s

Jessica Cherry, opensource.com

  • An article introducing “K9s”, a CLI tool for Kubernetes cluster management. It complements the official README, and both the article itself and the tool's view of cluster resources are very easy to read.

DNS issues in Kubernetes. Public postmortem #1

Amet Umerov, Preply

  • Preply’s public post-mortem article on DNS failures.
  • In my opinion, it is important to write the “Where we got lucky:” section properly in a postmortem. It lets you appreciate past improvement work and pick out operations and features that can still be improved.
  • “I didn’t actually check that things were healthy here.” “There was a lack of consideration, but the system covered for it.” “I tried to perform an unnecessary/dangerous operation, but thanks to advice and features, I stopped.” These are items that cannot be written without psychological safety.

A Hacker’s Guide To Moving Linux Services Into Containers

Scott McCarty, Red Hat

  • An article that carefully explains points to consider, tips, procedures, etc. when migrating a service running on Linux to a container, including his background and bias based on it.

Enhancing Kubernetes Security with Open Policy Agent (OPA) — Part 1

Karen Bruner, StackRox

  • Part 1 of an article that suggests using the Open Policy Agent (OPA) to increase the security of Kubernetes.
  • This part explains OPA itself and its components (Rego/Gatekeeper). Part 2 is said to cover more practical content (longer practical examples of Gatekeeper, the importance of comprehensive policy testing, troubleshooting).

Case Study: IT Modernization at Tidepool, an 8 part series

Betty Junod, Solo.io

  • Tidepool is an NPO (nonprofit organization) that was also featured in The Editorial section of the previous KubeWeekly. This time it is a case study that divides the IT modernization journey into 8 parts. Links to each part are included in the article, so you can pick the parts you are interested in.

Nodejs App From Docker To Kubernetes Cluster

Muhammad zarak bin kaleem, Magalix

  • An article that explains the flow of starting a Nodejs application locally, building a Docker image, and deploying it to Kubernetes. Easy to see and simple.

Decoding the Self-Healing Kubernetes: Step by Step

  • An article explaining how Kubernetes’s self-healing works with two verified examples.
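
For my own understanding, self-healing can be pictured as a loop that compares the desired state with the actual state and corrects the difference. The sketch below is a toy model of that idea in plain Python. It is not Kubernetes code and is not taken from the article; the function and pod names are made up for illustration.

```python
from typing import List


def reconcile(desired: int, running: List[str], prefix: str = "pod") -> List[str]:
    """Return the pod list after one reconciliation pass.

    Toy model only: missing pods are "started" by adding a new name,
    and surplus pods are "stopped" by trimming the list.
    """
    pods = list(running)
    counter = 0
    while len(pods) < desired:
        name = f"{prefix}-{counter}"
        counter += 1
        if name not in pods:  # skip names that are already in use
            pods.append(name)
    return pods[:desired]


if __name__ == "__main__":
    # One of three pods has died; the loop brings the count back to three.
    print(reconcile(3, ["pod-0", "pod-2"]))
```

A real controller runs this kind of comparison continuously against the cluster state rather than once on a list, but the desired-versus-actual idea is the same.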

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

Chris Short from Red Hat talks Operators and Kubernetes

  • Podcast with Chris Short, CNCF Cloud Native Ambassador, DevOps’ish & KubeWeekly author, and Principal Technical Marketing Manager at Red Hat. He was talking to the Red Hat OpenShift team in the Twitch video of “Creating an Ansible Operator from scratch” above. Again, they are talking about Operators.
  • I didn’t know DevOps’ish, but it looks good, so I would like to read it.

What is a Service Mesh? — the breakout area chat between an Account manager and a Solutions Engineer

Sachin Jha, Digital Ocean

  • An article explaining what a service mesh is, framed as a conversation between an account manager and a solutions engineer in the break area.
  • It is written in a conversational style, and there are diagrams so it is easy to read.

Kubernetes: The Universal Control Plane

Cedric Charly

  • Part 1 of an article explaining Kubernetes in two parts.
  • They talked about “What are the key ideas that influence the design of Kubernetes? What about Kubernetes sets it apart from other platforms?”.

Explore Anthos with a sample deployment

Aurelie Fonteny and Tony Pujals, Google Cloud

Kubernetes Governance, What You Should Know

Oleg Chunikhin, Kublr

  • Under the theme of what you should know in order to apply governance with Kubernetes, he touches on security policy, image management, network policy management, and configuration restrictions and policies, and explains three governance frameworks. He wraps up by introducing his company's product as a fit for the last framework.

Podcast Takeaways: Cloud Developer Experience, Staging Environments, and Continuous Delivery

Daniel Bryant, Datawire.io

  • A detailed write-up of takeaways (conclusions; I personally think of them as souvenirs) from episodes featuring four distinguished podcast guests.
  • The episode embedded in the page that opens features Nic Jackson, a developer advocate at HashiCorp.

Home buying & selling platform Orchard deploys Kubernetes to AWS with Maestro

Cloud 66

  • Cloud 66 case study article. Introducing how Orchard, a platform for buying and selling homes, deploys Kubernetes on AWS using Cloud 66 Maestro.
  • It’s interesting to see Maestro Kubernetes selected over EKS while running the platform on AWS. Modernization of the QA environment, ease of deployment, and fast deployments are cited as factors. I would have liked some more visual information in this article.
  • When asked why AWS was chosen as the cloud platform for Kubernetes, the answer “because the team is familiar with AWS” makes sense to me, and they also answered that “it fits seamlessly with Maestro Kubernetes”.

Upcoming CNCF webinars

You can check some Recorded Webinars and Upcoming Webinars here. The following are posted as Upcoming CNCF webinars at that moment.

Member Webinar: How OpenTelemetry is Eating the World
Steve Flanders, Director of Engineering @Splunk
May 8, 2020 10:00 AM China Standard Time

Member webinar: Data Services for Cloud Native Workloads
Diamanti
May 12, 2020 10:00 AM Pacific Time

Member Webinar: Piraeus: Dynamic Provisioning, Resource Management and High Availability for Local Persistent
Philipp Reisner, CEO @Linbit
Sun Liang, 资深存储架构师 @DaoCloud
Alex Zheng, 资深存储工程师 @DaoCloud
This webinar will be delivered in Chinese.
May 13, 2020 10:00 AM China Standard Time

Member Webinar: End YAML engineering with cdk8s!
Elad Ben-Israel, Principal Software Engineer @AWS, Developer Tools
Nathan Taber, Senior Product Manager @AWS, Kubernetes
May 13, 2020 8:00 AM Pacific Time

Member Webinar: The Rosetta Stone Guide to Compliance in a Cloud-Native World
Cynthia Burke, Program Manager @Capsule8
May 13, 2020 10:00 AM Pacific Time

Member Webinar: Navigating the Sea of Local Kubernetes Clusters
Ara Pulido, Developer Advocate @Datadog
May 14, 2020 10:00 AM Pacific Time

Member Webinar: Influencing DevOps without Authority — how “DevOps engineer” can advance real DevOps
Baruch Sadogursky, Head of Developer Advocacy @JFrog
May 15, 2020 10:00 AM Pacific Time

Member webinar: Cloud Native Monitoring: Scaling Prometheus
Aaron Newcomb, Director, Product Marketing, Monitoring @Sysdig
Carlos Arilla Navarro, Technical Marketing Engineer @Sysdig
May 19, 2020 10:00 AM Pacific Time

Member Webinar: How to Keep Your Clusters Safe and Healthy
Shuting Zhao, Software Engineer @Nirmata
Jim Bugwadia, Founder and CEO @Nirmata
May 20, 2020 10:00 AM Pacific Time

Member Webinar: Take Your Monitoring to the Next Level
Liran Haimovitch, Co-Founder & CTO @Rookout
Mickael Alliel, DevOps @Rookout
May 21, 2020 10:00 AM Pacific Time

Project Webinar: Harbor, the trusted cloud native registry for Kubernetes
Michael Michael, VMware
May 28, 2020 10:00 AM Pacific Time

Member Webinar: Trivy Open Source Scanner for Container Images — Just Download and Run!
Teppei Fukuda, Open Source Engineer @Aqua Security
June 3, 2020 10:00 AM Pacific Time

Member webinar: Kubernetes Cost Allocation Done Right
Webb Brown, Co-founder and CEO @Kubecost
June 24, 2020 10:00 AM Pacific Time

Member Webinar: Pivoting Your Pipeline from Legacy to Cloud Native
Tracy Ragan, CEO of DeployHub and CDF Board Member
June 30, 2020 10:00 AM Pacific Time

How were those articles? Did any of them catch your interest?

There is some content here that I have not been able to digest yet, so I’ll use this aide-memoire and these links to catch up myself too.

Bye now!!

Written by Yoshiki Fujiwara

An infra engineer in Tokyo, Japan. Grew up in Athens, Greece (1986–1992). #Network, #Kubernetes, #GCP, #AWS SAP, #National Tour Guide for English
