- In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, and add some comments and useful links as an aide-memoire.
- I have already published the same content on my Japanese blog and am catching up in English in this series.
- I hope it serves as a reference for people browsing this kind of information.
DEVOPS WEEKLY ISSUE #501 August 2nd, 2020
- The title is “Introducing Domain-Oriented Microservice Architecture”
- I will skip it since it was covered in KubeWeekly last week
The adoption of devops practices has gone hand-in-hand with organisations embracing digital transformation. Both phrases risk overuse but these posts discuss some useful mental models to help focus conversations.
- The titles are “Mental Models to Clarify the Goals of Digital Transformation, Part 1” and “Mental Models to Clarify the Goals of Digital Transformation, Part 2”.
- It proposes “8 mental models” that cover various aspects of digital transformation in two parts
- Gaining Speed
- Using Digital Technology
- Interacting Digitally
- Becoming Customer-Centric
- Being Data-Driven
- Increasing Resilience
- Becoming Future Ready
- Building the Enterprise of the Future
- What impressed me most was the claim that “speed improves security.” I had tended to see speed and security as a trade-off, so this was an opportunity to change my perspective.
○ Speed improves security because we can respond more quickly to changes in the threat landscape, and we can repeatedly, quickly, and with low overhead test our systems as they are being built and after they are deployed.
What properties of developer platforms lead to adoption? The following post is specifically about a large scale edge platform, but it’s a great read for anyone building platforms of all kinds for developers, including those doing so in internal platform teams.
- The title is “The Edge Computing Opportunity: It’s Not What You Think”.
- It starts from the announcement that a series of extensions to the edge computing platform “Cloudflare Workers”, which was announced nearly three years ago and is now two years past GA, will be announced during the following week, called Serverless Week. This article was published on 2020/07/26.
- Matthew’s Hierarchy of Developers’ Needs
- Speed As the Killer Feature?
- Speed Alone Is Niche
- Zero Nanosecond Cold Starts
- More Efficient Architecture Means Lower Costs
- From Limits to Limitless
- Ease of Use
- The Bezos Rule
- The Coming Era of Data Sovereignty
- Edge Computing to the Rescue
- Serverless Week
When first embracing devops practices and cloud services it’s common in large organisations to build a centre of excellence. There are some traps to avoid when taking this approach that the following post discusses.
- The title is “Building a Cloud Centre of Excellence in 2020: 13 Pitfalls and Practical Steps”.
- Personally, this is the topic I was most interested in among this week’s items. It starts from the premise that “As more and more organizations are looking to expand their cloud journey, many companies have set up a Cloud Center of Excellence (CCoE) to manage the initial embedding and launch of cloud adoption within their business.”
- The article mainly describes “6 Common Traps to AVOID When Establishing a CCoE” and “Practical Next Steps: Overcoming 7 Common CCoE Challenges”.
- 6 Common Traps to AVOID When Establishing a CCoE
- Getting hung up on the term ‘centre’
- Controlling consumption of cloud services and blocking innovation opportunities
- Ignoring developer experience
- Applying traditional architecture standards and processes
- Confusing guardrails with blockers
- Catering for a single cloud maturity level
- Practical Next Steps: Overcoming 7 Common CCoE Challenges
Challenge #1: Building your CCoE — who should be part of it?
Challenge #2: Getting started — How big should it be?
Challenge #3: How much and what involvement should the business have in the CCoE?
Challenge #4: Instilling Modern Ways of working
Challenge #5: How should your CCoE evolve over time?
Challenge #6: Should the CCoE be in charge of the cloud costs being incurred?
Challenge #7: How can the CCoE successfully scale cloud adoption across the organisation?
The videos from the recent DevSecCon online conference are all available, with a range of interesting topics covered including infrastructure as code security, continuous audit compliance, supply chain attacks and more.
- The title is “DevSecCon24–24hr virtual conference”.
- Summary page of the virtual conference “DevSecCon”, held on June 15th across 3 time zones (APAC/EU/AMERICAS).
- I am grateful that the playlists of the sessions are linked. I had registered as well, but could not watch any of it.
- It seems that a session without a link has no published video. As of August 3, 2020, the following sessions have no link and so appear to be unpublished.
○ APAC PANEL: The Future of DevSecOps — (Moderator) Mohammed A. Imran, Mohan Yelnadu, Jerome Walter, Stefan Streichsbier & Sarah Young
○ EU KEYNOTE: Design Thinking for Secure Development — Wolfgang Goerlich
○ The Container Security Checklist — Liz Rice
○ GitOps Progressive Security For Kubernetes — Gadi Naor
○ Threat Modeling the Death Star — Mario Areias
○ IGNITES: Unquantified Serendipity: Diversity in Development — Quintessence Anx
- The title is “My monolith doesn’t fit in your serverless”
- The author opens with “I have come to the conclusion that the problems I’m tasked to solve are tricky to get right using a serverless approach. Here’s my take on why not serverless-all-the-things.” and then explains his thinking.
- At the bottom of the article, he also answers some questions he received.
- The title is “The Seccomp Notifier — New Frontiers in Unprivileged Container Development”.
- I will skip it since it was covered in KubeWeekly last week.
- RBAC.dev web page. The description is as follows.
- Advocacy site for Kubernetes RBAC
- A site dedicated to good practices and tooling around Kubernetes RBAC. Both pull requests and issues are welcome.
- The title is “STATE OF SERVERLESS SURVEY 2020”.
- The target audience is as follows. It seems the survey can be completed in about 7 to 8 minutes without much burden.
- This survey is for tech leaders, engineering managers and developers.
- You will receive a free report and one of four Amazon eGift cards (each worth $20).
SRE Weekly Issue #230 August 2nd, 2020
LaunchDarkly started off with a polling-based architecture and ultimately migrated to pushing deltas out to clients.
Dawn Parzych — LaunchDarkly
- It describes how they moved from a polling architecture to a streaming architecture and how they addressed the “build vs. buy” question.
- When they first introduced the streaming architecture, they chose “buy” and partnered with a third party. As they continued to grow, issues began to arise, so they switched to “build” and built their own.
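To make the difference between the two models concrete, here is a minimal sketch of a client-side flag cache that can be refreshed either by a full poll or by pushed deltas. It is not LaunchDarkly's implementation: the `Delta` format, the `FlagStore` class, and the version-number rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Delta:
    """One change pushed by the server (hypothetical wire format)."""
    version: int
    key: str
    value: Optional[Any]  # None means the flag was deleted


@dataclass
class FlagStore:
    """Client-side flag cache, refreshed by polling or by deltas."""
    flags: Dict[str, Any] = field(default_factory=dict)
    version: int = 0

    def replace_all(self, snapshot: Dict[str, Any], version: int) -> None:
        # Polling model: every poll transfers and installs the full flag set.
        self.flags = dict(snapshot)
        self.version = version

    def apply_delta(self, delta: Delta) -> bool:
        # Streaming model: only the changed flag travels over the wire.
        # Stale or duplicate versions are ignored, so replays are harmless.
        if delta.version <= self.version:
            return False
        if delta.value is None:
            self.flags.pop(delta.key, None)
        else:
            self.flags[delta.key] = delta.value
        self.version = delta.version
        return True


store = FlagStore()
store.replace_all({"new-checkout": False, "dark-mode": True}, version=1)
store.apply_delta(Delta(version=2, key="new-checkout", value=True))
store.apply_delta(Delta(version=2, key="new-checkout", value=False))  # same version, ignored
store.apply_delta(Delta(version=3, key="dark-mode", value=None))      # deletion
print(store.flags, store.version)
```

The version check is one simple way to keep a streamed cache consistent when messages are replayed; a real system would also need reconnection and resynchronization logic, which this sketch leaves out.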
A brief overview of some problems with distributed tracing, along with a suggestion of another way involving AI.
Larry Lancaster — Zebrium
- It raises two issues with distributed tracing and proposes a simple alternative.
○ Work required to yield results
○ Inadequacy of those results
This is Google’s post-incident report for their Google Classroom incident on July 7.
- Follow-up report on the Classroom outage at Google Cloud. Some users (20% at peak) of the iOS and Android apps could not access Classroom.
Uber has long been a champion of microservices. Now, with several years of experience, they share the lessons they’ve learned and how they deal with some of the pitfalls.
Adam Gluck — Uber
- It’s appearing again, but I will skip it too, since it was covered in KubeWeekly last week.
- It is featured in all three e-mail newsletters I cover on this blog, which shows how much attention this article has received.
This article opens with an interesting description of what the Cloudflare outage looked like from PagerDuty’s perspective.
Dave Bresci — PagerDuty
- The author refers to the most recent large-scale outage caused by a router misconfiguration (presumably the Cloudflare one) and describes how PagerDuty responds when a large-scale outage occurs. It also introduces the company’s documentation, which centers on Slack integration.
This post reflects on two distinct philosophies of safety:
○ the engineering design should ensure that the system is safe
○ design alone cannot ensure that the system is safe
- After attending Nancy Leveson’s session on system safety at the MIT STAMP workshop, the author contrasts her perspective with that of many others in the resilience engineering community to clarify his own thinking. Both perspectives are important to me, and I will keep them in mind.
○ Leveson believes that depending on human adaptation in the system is itself dangerous. If we’re depending on human adaptation to achieve system safety, then the design engineers have not done their jobs properly in controlling hazards.
○ The resilience engineering folks believe that depending on human adaptation is inevitable, because of the messy nature of complex systems.
You can’t use availability metrics to inform you about whether your system is reliable enough, because they can only tell you if you have a problem.
- The same author as above also attended Nancy Leveson’s next session at the MIT STAMP workshop, “Safety Assurance (Safety Case): Is it Possible? Feasible?”, and summarizes and comments on it.
- Leveson is skeptical about assessing the safety of a system after the fact. Instead, she argues that safety can be designed in by focusing on generating safety requirements at the design stage rather than performing post-design evaluation.
- I agree with her on some points about availability. However, it is painful if we only come across this idea after the operation phase has already started. Perhaps I don’t understand some key ideas of the author and Leveson. Is it right to understand that, rather than covering the problem in operation, we should return to the design phase and restart from there?
- The three closing points on her slide:
- If you are using hazard analysis to prove your system is safe, then you are using it wrong and your goal is futile
- Hazard analysis (using any method) can only help you find problems, it cannot prove that no problems exist
- The general problem is in setting the right psychological goal. It should not be “confirmation,” but exploration.
- Facebook, Instagram and WhatsApp
Also two PoP-specific incidents:
Full disclosure: Fastly is my employer.
KubeWeekly #228 August 7th
Editor’s pick of the highlights from the past week.
Amanda Katona, VMware
With hundreds of hours of programmed content for KubeCon + CloudNativeCon EU 2020 Virtual, it risks becoming an overwhelming (and occasionally numbing) experience. Don’t worry — the CNCF community is here to help! As you start to prepare for the event, Amanda shares her advice for making the most of the virtual experience. Read the blog here.
- The title says 4 tips, but it actually conveys the following 6 tips. The last tip is to eat in the style of the venue, Amsterdam, so I want to enjoy the atmosphere at home.
- Build your agenda
- Prepare questions for your top 5 sessions
- Take incredible notes
- Get your swag on
- Be extra social and extra positive
- Eat french fries with mayonnaise
KubeCon + CloudNativeCon EU Virtual Session Spotlight
The countdown to KubeCon + CloudNativeCon EU Virtual on August 17–20, 2020 is on! As we approach the event, we curated a few recommended sessions that we don’t want you to miss. Please see the feature for this week and be sure to register today!
Presented by Holly Cummins, Worldwide IBM Garage Developer Lead, IBM
The past five years have been the warmest since records began. Human activity, including the IT industry, is driving worrying climate change. Data centers alone consume 3% of the world’s energy, and more and more of that energy is being used by Kubernetes and workloads running on Kubernetes. Is k8s helping, or making things worse?
The beauty of the cloud is that it makes it easy to run code, virtualized, and scheduled for efficiency… but it doesn’t provide any guarantee that what’s running is useful. Even when the workload is high-value and efficient, Kube sprawl can lead to low utilization, unsatisfactory elasticity, and high costs — but mega-mono-clusters have their own problems around isolation, security, and management. How should these competing requirements be balanced? This talk discusses some of the trade-offs and provides a roadmap to figuring out the right thing.
- KubeCon + CloudNativeCon EU Virtual 8/19 Keynote. Schedule: Wednesday, August 19th 15:58–16:14 CEST (Central European Summer Time).
- It seems that the session will cover the following themes.
○ “The beauty of the cloud is that it makes it easy to run code, virtualized and scheduled for efficiency… but it doesn’t provide any guarantee that what’s running is useful.”
○ “Even when the workload is high-value and efficient, Kube sprawl can lead to low utilisation, unsatisfactory elasticity, and high costs — but mega-mono-clusters have their own problems around isolation, security, and management.”
○ “How should these competing requirements be balanced?”
○ “This talk discusses some of the trade-offs and provides a roadmap to figuring out the right thing.”
ICYMI: CNCF Webinars
You can view all CNCF recorded and upcoming webinars here.
Minghua Tang, Infrastructure Engineer @PingCAP
- It describes the TiKV architecture, the reason for introducing “Follower Read”, and how to implement it.
- TiKV is a strongly consistent key-value database built on the Raft algorithm.
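As a rough illustration of the idea behind a consistent follower read in a Raft-based store, the sketch below has the follower ask the leader for its current commit index and serve the read locally only once it has applied the log up to that index. This is a toy, single-process model and not TiKV's code: the `Leader` and `Follower` classes, the instant-commit assumption, and the synchronous catch-up are all simplifications made for illustration.

```python
from typing import Dict, List, Optional, Tuple

LogEntry = Tuple[str, str]  # (key, value)


class Leader:
    """Toy leader: every write is assumed to be committed immediately."""

    def __init__(self) -> None:
        self.log: List[LogEntry] = []
        self.commit_index = 0  # number of committed log entries

    def write(self, key: str, value: str) -> int:
        self.log.append((key, value))
        self.commit_index = len(self.log)
        return self.commit_index

    def read_index(self) -> int:
        # The follower only needs this number, not the data itself.
        return self.commit_index


class Follower:
    def __init__(self, leader: Leader) -> None:
        self.leader = leader
        self.applied_index = 0
        self.state: Dict[str, str] = {}

    def replicate(self, upto: int) -> None:
        # Apply committed entries in log order up to `upto`.
        limit = min(upto, self.leader.commit_index)
        while self.applied_index < limit:
            key, value = self.leader.log[self.applied_index]
            self.state[key] = value
            self.applied_index += 1

    def read(self, key: str) -> Optional[str]:
        # 1. Ask the leader how far the committed log extends.
        index = self.leader.read_index()
        # 2. Catch up to that point (a real follower would wait for replication).
        if self.applied_index < index:
            self.replicate(index)
        # 3. Serve the read from local state.
        return self.state.get(key)


leader = Leader()
follower = Follower(leader)
leader.write("k", "v1")
follower.replicate(1)
leader.write("k", "v2")        # the follower has not applied this yet
stale = follower.state["k"]    # a naive local read would see the old value
fresh = follower.read("k")     # the index check forces a catch-up first
print(stale, fresh)
```

The point of the design is that only a small index travels to the leader while the data is served by the follower, which spreads read load without giving up consistency.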
Roko Kruze, Solutions Engineer @Flowmill and Jonathan Perry, CEO @Flowmill
- It compares and contrasts service meshes such as Istio / Envoy with eBPF as ways to monitor traffic between microservices.
- It also discusses the types of visibility that each approach can provide, compares their performance implications, and describes how to deploy them in complementary ways.
Chris Splinter, Sr. Product Manager — Developer Solutions @DataStax and Patrick McFadin, VP of Developer Relations @DataStax
- It describes how Kubernetes and Apache Cassandra work together to solve the following two issues.
- Modern application stacks require that the data serving infrastructure be as flexible as all other layers with minimal tradeoffs.
- Companies need to quickly build and deliver their next app
- The topics are as follows.
- Data considerations when modernizing your stack with Kubernetes and microservices
- Examples of best practices that users are deploying to deal with these complexities
- Our experiences of building and using a Kubernetes Operator for Cassandra at scale
Neeraj Poddar, Co-founder and Chief Architect @Aspen Mesh and John Howard, Software Engineer @Google
- Debugging in production with Istio focuses on the following topics:
○ How to debug and diagnose issues with your sidecar proxy Envoy
○ How to monitor and debug the Istio control plane
○ How to use operational tools like “istioctl” to understand issues with your configuration
○ Using profiling to identify bottlenecks
○ Recommendations for a production ready secure Istio deployment
Ryan Allen, Senior Software Engineer @Chronosphere
- It introduces “M3”, an open source distributed metric engine, and details some of the performance optimizations.
- It focuses on how contributors worked to identify bottlenecks, investigate potential solutions, benchmark results, and test defects.
Tutorials, tools, and more that take you on a deep dive into the code.
Alina Ryan, Red Hat and Mohammad Saif Shaikh, Red Hat
- It describes how stateful applications can run on the various cloud platforms supported by OpenShift.
- It covers dynamically provisioned storage and the use of persistent volumes across mixed-node (Windows and Linux) clusters. I want to try it hands-on, so I will keep it.
Kevin Crawley, Containous
- Before explaining ingress controllers, it states that “Out of the complexities that developers of cloud-native applications face, strategically utilizing Kubernetes ingress controllers is among the most difficult components to understand — and among the most important.” and then starts from the question “Why is the network important for the development workflow?”.
- It explains sandboxing and workload isolation. I will read this again later.
- He says “It seems to me like, for new designs, the basic menu of mainstream options today is:” and lists the following options.
○ Jailing otherwise-unmanaged Unix programs with nsjail or something like it.
○ Running unprivileged Docker containers, perhaps with a tighter seccomp profile than the default.
○ Going full gVisor.
○ Running Firecracker, either directly or, in a K8s environment, with something like Kata.
- The first half describes how to set up a 3-node etcd cluster on an Ubuntu 18.04 server. The second half focuses on securing the cluster using TLS.
- In addition to the components in the title, you will learn the following tools as well.
- Extensive commentary, CLI and options. There are 14 steps, and it feels quite large.
Jagdish Mirani and Adi Atzmony, JFrog
- A summary of the sessions at swampUP 2020. The video is embedded.
- A session for those who want to take their Kubernetes implementation to the next level, enterprise grade.
Alexey Ivanov and Oleg Guba, Dropbox
- It describes the old Nginx-based traffic infrastructure, its problems, and the benefits of migrating to Envoy.
- While reading it, I found other interesting articles, such as the one on Bandaid. I will read this again.
Carlos Camacho, Red Hat
- It uses KubeInit to deploy an OKD 4.5 cluster in about 30 minutes with a single command.
- He deploys 3 controllers, 1–10 workers, 1 service, and 1 bootstrap node.
Debugging tool for Kubernetes which tests and displays connectivity between nodes in the cluster.
- The GitHub page of Goldpinger, an OSS debugging tool for Kubernetes.
- It runs as a DaemonSet on Kubernetes and produces Prometheus metrics that can be scraped, visualized and alerted on.
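The idea of testing connectivity between every pair of nodes can be sketched in a few lines. This is not Goldpinger's code or API: the `connectivity_matrix` and `broken_links` helpers and the injected `can_reach` check are hypothetical, and the real tool performs actual network calls between its pods instead of the simulated check used here.

```python
from typing import Callable, Dict, List, Tuple

Matrix = Dict[str, Dict[str, bool]]


def connectivity_matrix(nodes: List[str],
                        can_reach: Callable[[str, str], bool]) -> Matrix:
    """Ask every node to check every other node and collect the results."""
    return {
        src: {dst: can_reach(src, dst) for dst in nodes if dst != src}
        for src in nodes
    }


def broken_links(matrix: Matrix) -> List[Tuple[str, str]]:
    """Return the (source, destination) pairs whose check failed."""
    return sorted(
        (src, dst)
        for src, row in matrix.items()
        for dst, ok in row.items()
        if not ok
    )


# Simulated cluster: traffic from node-a to node-c is blocked in one direction.
nodes = ["node-a", "node-b", "node-c"]
blocked = {("node-a", "node-c")}
matrix = connectivity_matrix(nodes, lambda src, dst: (src, dst) not in blocked)
print(broken_links(matrix))
```

Checking each direction separately matters because network faults are often asymmetric, as in the simulated case where only one direction of a pair fails.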
Tres Vance and Erik Jacobs, Red Hat
- I had forgotten about this collaboration, so I had to look at the title twice.
- I had not realized that Red Hat OpenShift is offered on GCP, Azure, and IBM Cloud as well as “AMAZON RED HAT OPENSHIFT”.
- The first article in the series. It sets up a very basic, somewhat functional Kubernetes cluster on one node.
- The next post will set up a multi-node cluster and get it up and running.
Articles, announcements, and more that give you a high-level overview of challenges and features.
Craig Box and Adam Glick, Kubernetes Podcast from Google
- Kubernetes Podcast by Google employees. The current co-hosts are Craig Box and Adam Glick.
- The guest is Thomas Strömberg of Google Cloud (manager of the Container DevEx team at Google, and a maintainer of Minikube).
- The topics of interest in News of the week are:
○ Spinnaker Operator is GA
○ GKE r25
○ Server side encryption for ECR
- A transcript of the “Livin’ on the Edge” podcast. This week’s guest is Katie Gamanji (Cloud Platform Engineer at American Express and TOC member of the CNCF).
- She explained the developer experience components associated with interacting with a Kubernetes cluster.
- They discuss tools ranging from kubectl to UI-driven ones such as k9s and Octant, and the evolution of tooling from ApplicationOps to GitOps.
- I think the guests and content of this podcast are great every time.
Matt Asay, AWS
- It focuses on Lili Cosic, a maintainer of “kube-state-metrics” and Software Engineer at Red Hat, and on how she balances her company and OSS hats.
Ernest Jones, The Enterprisers Project
- It answers the question “Why choose an enterprise Kubernetes platform, as opposed to assembling open source Kubernetes tools yourself?” with the following three points.
- Time savings / Time to value
Alex Williams and B. Cameron Gain, The New Stack
- A podcast summary article from The New Stack. Kunal Parmar (Director of Engineering at Box) is the guest. The podcast is embedded.
- As a case study, he describes the long and winding Kubernetes journey of Box, one of the first companies to adopt Kubernetes.
Diane Mueller, Red Hat
- A CNCF article that explains how to analyze the relationships that span the CNCF community.
- It focuses on the participants who act as “connectors” between the communities. It made me interested in approaches to digitalization and visualization. It also links to further information that may be of interest.
Upcoming CNCF webinars
You can check some Recorded Webinars and Upcoming Webinars here. The following are posted as Upcoming CNCF webinars at that moment.
Member Webinar: Hardware for Kubernetes, Peeling Back the Layers
Erik Reidel, SVP Compute & Storage Solutions @ITRenew
Aug 11, 2020 10:00 AM Pacific Time
Member Webinar: The Open-Source Observability Playbook
Hen Peretz, Head of Solutions Engineering @Epsagon
Aug 12, 2020 7:00 AM Pacific Time
Member Webinar: Migrating Real-Time Communication Applications to Kubernetes at Scale: Learnings from 8×8’s Experience
Michael Laws, Sr. Site Reliability Engineer/DevOps at 8×8
Pankaj Gupta, Sr. Director at Citrix
Aug 12, 2020 1:00 PM Pacific Time
Ambassador Webinar: Navigating the service mesh ecosystem
Lachie Evenson, Principal Program Manager @Azure & CNCF Ambassador
Aug 14, 2020 10:00 AM Pacific Time
Member Webinar: MLOps automation with Git Based CI/CD for ML
Yaron Haviv, Co-Founder and CTO, Iguazio
Aug 26, 2020 1:00 PM Pacific Time
Project Webinar: Kubernetes 1.19
Kubernetes release team
Aug 28, 2020 10:00 AM Pacific Time
Member Webinar: Getting started with container runtime security using Falco
Loris Degioanni, CTO and Founder @Sysdig
Sept 2, 2020 1:00 PM Pacific Time
What did you think of these articles? Did any of them interest you?
There is some content I cannot digest at this stage, so I will use this aide-memoire and these links to catch up myself.