SRE / DevOps / Kubernetes Weekly Collection #1 (Week 06)

- In this blog post series, I collect items from the following 3 weekly mailing lists I subscribe to, and add some comments and useful links as an aide-memoire.
- I have already published the same content on my Japanese blog, and I am catching up in English in this series.
- I hope it serves as a reference for people browsing this kind of information.
DEVOPS WEEKLY ISSUE #475 February 2nd, 2020
SRE Weekly Issue #205 February 2nd, 2020
KubeWeekly #202 February 7th, 2020
Disclaimers:
- If you have some questions or comments, please feel free to contact me.
- I would really appreciate it if you also checked the original resources via the linked URLs.
- I may have misunderstood some points or phrased things vaguely; I still leave my comments even where I am not a professional or not qualified to speak on the content.
- Since there is a lot of information, I pick out only text and links (not images).
- Some resources date from before 2019; the newsletter authors do not pick only brand-new ones.
DEVOPS WEEKLY ISSUE #475–2nd February 2020
News
- It explains what modern business requires of CIOs as a new kind of tech leader, based on 3 overall vectors of transformation and 5 traits of CIOs who drive innovation.
- It explains secure (and insecure) containerization with many memorable images.
- It starts a series on implementing a container manager (the author’s term for a higher-level component that controls multiple OCI runtime instances) and explains it through conman.
- This article was written on October 6, 2019, and you can find the follow-up articles on the website.
- The author/editor personally selected the 30 best technical talks from last year.
- It links to each presentation video, so you can browse through them casually.
- Each presentation is of very high quality, and I was fascinated even by genres I am not very familiar with. It works well as a reference.
- It discusses the current barriers of containers and their future, and introduces MicroVMs, Unikernels, and container sandboxes.
- It introduces cf-operator, Cloud Foundry Quarks, Eirini, and kubecf as tools for running Cloud Foundry on Kubernetes. The presenter gave an imperfect demo, because these tools still leave a lot to be desired for production use.
Jobs
- The job listings page on Stack Overflow at that time. Arweave was looking for a DevOps professional in Berlin.
Tools
- Introduction of Rode. It gathers policies backed by OPA (Open Policy Agent) and Grafeas, and handles authentication and application of policies as a software supply chain tool. The diagram in README.md is easy to follow.
- Introduction of the service integration between VictorOps and ServiceNow. My impression was of “collaboration” rather than “integration”. It may be easiest to read it as an introduction to the cooperation between VictorOps, a Splunk group product, and SNOW (ServiceNow).
SRE Weekly Issue #205 2nd February 2020
Articles
The Myth of the Blameless Retrospective
This article hints at the fact that blame and sanction (punishment) are two different things.
Bonus content: Dr. Richard Cook on blameless vs sanctionless retrospectives
Bob Reselman
- The article title, “The Myth of the Blameless Retrospective”, contrasts with the “Blameless Postmortem” in the SRE book.
- It says that “Newer companies such as Google, Etsy, and Airbnb are, in many ways, poster children for the Agile and DevOps sensibilities that champion the value of the blameless retrospective”.
- And “But, most IT Departments live in old-school business sectors such as insurance, banking, defense, medicine, retail, entertainment (think Boston Red Sox and Landmark Cinemas), and manufacturing (Maytag, Mack Truck, and Pioneer Seed). These companies are sitting on a pile of legacy code and legacy processes, as well as a pretty entrenched legacy business culture”.
- I take it as a caution for people who blindly try to adopt a different culture.
(A few) Ops Lessons We All Learn The Hard Way
Here we have a few lessons in operations that we all (eventually) (have to) learn; often the hard way.
Jan Schaumann
- It lists 88 operational lessons, each with a deep meaning.
What are Service Level Objectives (SLOs)? Lessons Learned
I especially like the emphasis on reducing pager fatigue through thoughtfully selected SLOs.
Emily Arnott — Blameless
- It describes the benefits of good SLOs (Service Level Objectives).
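As a rough illustration of why a well-chosen SLO helps with paging decisions, here is a minimal Python sketch of the common “error budget” arithmetic. It is not taken from the article: the function names and the 30-day window are my own assumptions, and the SLO is treated as a simple availability ratio.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Downtime (in minutes) that an availability SLO allows over a window.

    `slo` is a ratio such as 0.999; the 30-day default window is an assumption.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once it is blown)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(error_budget_minutes(0.999))
    # After 10.8 minutes of downtime, about three quarters of the budget is left.
    print(budget_remaining(0.999, 10.8))
```

One way to use this is to page a human only when the remaining budget is shrinking quickly, instead of on every individual error.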
The four concepts, drawn from a paper by Dr. David Woods, are:
- Rebound
- Robustness
- Graceful extensibility
- Sustained adaptability
Thai Wood — Resilience Roundup
- The article summarizes and explains Dr. Woods’s paper.
- The term “resilience” has come to mean different things in different contexts to different people.
- It explains resilience through the 4 concepts above.
How an Alleged “Space Strike” Beautifully Demonstrates Work-As-Imagined Versus Work-As-Done
Understanding the difference between work-as-imagined and work-as-done is critical to the reliability of a complex system.
Jaime Woo and Emil Stolarsky — The Morning Mind-Meld
- It discusses the theme of Work-As-Imagined versus Work-As-Done using an alleged “Space Strike”, a 90-minute silence that occurred during a space mission.
- When I first read it, I could not make sense of the story, but @inductor explained it to me in Japanese (if the following makes no sense, it is due to my bad translation…).
- “To operate complex systems like that, you need to design a realistic way of operating, or the operation might collapse”, “Personally, I think it provides many important perspectives for maintaining reliability”. → After I got this comment, I read it again, got the context, and understood it better. Thank you!!
Tracking toil with SRE principles
There’s a useful survey in here if you’re trying to measure or track toil in your organization.
Eric Harvieux — Google
- Introduction of how to define toil in line with SRE principles and how to track it.
- Do not regard technical and organizational complexity as toil.
Site Wide Memory Leak: An On-Call Story
A nice little debugging story hinging on a bug in an upstream library.
Sanket Patel
- The story starts with frequent alerts from different hosts on one particular weekend.
- He found the root cause, a memory leak that triggered the “Memory Usage Threshold Over” alert, and was satisfied with solving the mystery.
Outages
Pinterest
Microsoft Office 365 Sharepoint Online
TD Bank
Google Drive, Docs, Sheets, and Slides
Facebook and Instagram
Gandi
They posted a quite candid analysis, concluding that they’re not sure what went wrong.
- It lists outage information for the above companies.
KubeWeekly #202: February 7, 2020
The Headlines
Editor’s pick of the highlights from the past week.
Congratulations to the newest TOC members!
Please help us in welcoming the newest members of the CNCF TOC including Katie Gamanji, Liz Rice, Saad Ali, Sheng Liang, and Justin Cormack.
- On Twitter, Michelle Noorali welcomed the new members of the CNCF TOC (Technical Oversight Committee), Katie Gamanji, Saad Ali, Sheng Liang, and Justin Cormack, along with the re-elected Liz Rice, and thanked Alexis Richardson and Joe Beda for their time and work.
Announcing the containerd Project Journey Report
CNCF
CNCF just released the containerd Project Journey Report, the fourth such report issued for CNCF graduated projects. This report attempts to objectively assess the state of the containerd project and how CNCF has impacted the progress and growth of containerd.
- My thoughts and observations are covered by the summary above.
The Technical
Tutorials, tools, and more that take you on a deep dive into the code.
How to Develop and Debug Python Applications in Kubernetes with Okteto
Ramiro Berrelleza, Okteto
- It explains how to develop and debug Python-based applications on Kubernetes using Okteto.
- Okteto was open-sourced in 2019. It also has an enterprise version. If you are interested, you can find the #Okteto channel in the Kubernetes Slack.
Karan Sharma, Zerodha
- It explains the structure of DNS in Kubernetes, based on CoreDNS, the default DNS system in the current Kubernetes version.
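To make the lookup behavior a little more concrete, here is a small Python sketch of how a short Service name is usually expanded inside a Pod. It is my own simplification and not code from the article: it assumes the default cluster domain `cluster.local`, the usual three search domains, and `ndots:5`, and it ignores any extra search domains inherited from the node. The function name `candidate_names` is hypothetical.

```python
def candidate_names(name: str, namespace: str,
                    cluster_domain: str = "cluster.local",
                    ndots: int = 5) -> list[str]:
    """Return the fully qualified names a resolver would try, in order.

    Assumes the typical Pod resolv.conf search list:
    <namespace>.svc.<cluster_domain>, svc.<cluster_domain>, <cluster_domain>.
    """
    # A trailing dot marks the name as already fully qualified.
    if name.endswith("."):
        return [name]

    search = [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ]
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = f"{name}."

    # With fewer dots than ndots, the search list is tried before the bare name.
    if name.count(".") < ndots:
        return expanded + [absolute]
    return [absolute] + expanded


if __name__ == "__main__":
    for candidate in candidate_names("web", "shop"):
        print(candidate)
```

Under these assumptions, looking up an external name such as `example.com` from a Pod also walks the search list first, which is one reason a single lookup can turn into several DNS queries.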
Continuous Profiling Go applications running in Kubernetes
Gianluca Arbezzano, InfluxDB
- The author first points to articles such as “Continuous profiling in Go with Profefe” and “Custom pprof profiles”, and then introduces Profefe, a new OSS continuous profiling infrastructure made up of collectors, repositories, and an API to store, retrieve, and query profiles, summarizing the README.md on its GitHub page.
A bit of Istio before tea-time
Alex Ellis, OpenFaaS
- It introduces a very short Istio demo, using a public IP address and your own laptop, that you can finish before tea-time.
Latest Jepsen Results against etcd 3.4.3
Xiang Li, Alibaba Group
- Jepsen, a research company, tested and analyzed etcd 3.4.3, and the etcd community team received good results and useful feedback from them.
- If you want to check the overall report, you can click here.
Konveyor: Open Source, Migration Assistance for Kubernetes
Konveyor Project
- It introduces Konveyor, an OSS project for migrating existing applications to Kubernetes.
- It includes links to GitHub, Form, Slack, and Get Started. At that time, “Get Started” was incomplete.
Troubleshoot Kubernetes with the power of tmux and kubectl
Abhishek Tamrakar, Opensource.com
- It introduces how to troubleshoot using kubectl and tmux.
- It suggests using aliases. If you want simple, useful aliases for kubectl and combinations of its options, you can check this repository.
Load balancing and scaling long-lived connections in Kubernetes
Daniele Polencic, LearnK8s
- It shows how to scale and load balance “long-lived connections”, for which Kubernetes offers no built-in mechanism.
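The underlying problem can be shown with a toy simulation in Python. This is only my own sketch and not code from the article: it assumes a backend is chosen once per connection, and it uses a simple round-robin picker as a stand-in for whatever choice the cluster actually makes. The function names are hypothetical.

```python
from itertools import cycle


def connection_level(clients: int, backends: list[str],
                     requests_per_client: int) -> dict[str, int]:
    """Each client opens one long-lived connection and reuses it for every request."""
    load = {backend: 0 for backend in backends}
    picker = cycle(backends)  # assumption: stand-in for the choice made at connect time
    for _ in range(clients):
        backend = next(picker)  # chosen once, when the connection is opened
        load[backend] += requests_per_client
    return load


def request_level(clients: int, backends: list[str],
                  requests_per_client: int) -> dict[str, int]:
    """Client-side balancing: a backend is picked again for every request."""
    load = {backend: 0 for backend in backends}
    picker = cycle(backends)
    for _ in range(clients * requests_per_client):
        load[next(picker)] += 1
    return load


if __name__ == "__main__":
    pods = ["pod-a", "pod-b", "pod-c"]
    # Two clients with persistent connections leave one Pod idle.
    print(connection_level(2, pods, 300))
    # Choosing a backend per request spreads the same 600 requests evenly.
    print(request_level(2, pods, 300))
```

In this model, adding more Pods does not help the connection-level case while the number of clients stays the same, which is why balancing per request, for example in the client, is a natural fix.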
Emit Datadog monitors based on Kubernetes state
Astro is an operator that emits Datadog monitors based on Kubernetes state.
- A link to the GitHub repository of Astro, a Kubernetes Operator that simplifies Datadog monitor administration.
- For more details, please check README.md.
The Editorial
Articles, announcements, and more that give you a high-level overview of challenges and features.
Craig Box and Adam Glick, Kubernetes Podcast from Google
- An episode of the weekly Kubernetes Podcast, hosted by community members who work at Google.
- Marin Jankovski, Engineering Manager at GitLab, is the guest.
- Many of the “News of the week” items are also covered in KubeWeekly.
HPE acquires zero-trust networking, security firm Scytale
Charlie Osborne, ZDNet
- The news that HPE (Hewlett Packard Enterprise) acquired Scytale.
- You can see HPE’s release here.
- Because I attended the SPIFFE Meetup in Tokyo twice, I am interested in this news. Zero-trust networking is a field I want to understand better.
Run Windows Server Containers on GKE
Tim Anderson, The Register
- It explains the beta support for Windows Server containers on GKE (at that time) and Config Connector, and analyzes Google’s aims with Kubernetes and the surrounding tech companies.
Kubestone — Kubernetes & OpenShift performance benchmarking
Kubestone is a benchmarking Operator that can evaluate the performance of Kubernetes installations.
- It introduces Kubestone, an operator used for benchmarking the performance of Kubernetes installations.
- From the title, I expected OpenShift to be mentioned alongside Kubernetes. However, I could not find anything about OpenShift in the body of the article or its references…
Kubernetes’ Inevitable Takeover of the Data Center
Scott Fulton III, DataCenter Knowledge
- This series regards Kubernetes as the most disruptive technology in IT, one that is taking over the data center. DCK (Data Center Knowledge) analyzes it carefully, from its rise through the past few years of its development.
If You’ve Got It, Flaunt It — Kubernetes Experience, That Is
Sydney Sawaya, SDxCentral
- It explains the market gap between the demand for and the supply of engineers with Kubernetes experience, and then briefly introduces the CNCF training menu.
Kubernetes Operators: 4 facts to know
Kevin Casey, The Enterprisers Project
- It opens with “Without real automation, you won’t realize the full potential of containers. That’s where Kubernetes Operators play a growing role” and lists 4 facts about Kubernetes Operators that IT leaders should know.
Register Now: KubeCon + CloudNativeCon EU Day Zero Events
Kim McMahon, CNCF
- It introduces the co-located events held as Day Zero of KubeCon + CloudNativeCon, as planned at that time. The conference is now planned to be held as a virtual event from August 17th to 20th.
How Frame.io Built a Full Security Program Around Its Video Cloud with Falco
CNCF
- As titled, it explains “How Frame.io Built a Full Security Program Around Its Video Cloud with Falco”. Frame.io serves some of the most prominent creators of video and film content, from Netflix to Fox Sports and Vice.
- To see more details of the use case, check it here.
What did you think of these articles? Did any of them interest you?
There is some content I cannot digest at this stage, so I will use this aide-memoire and these links to catch up myself too.
Bye now!!