SRE / DevOps / Kubernetes Weekly Collection #67 (Week 19, 2021)

Yoshiki Fujiwara
May 14, 2021
  • In this blog post series, I collect items from the following three weekly mailing lists I subscribe to, adding brief comments and useful links as an aide-memoire.
  • I have already published the same content on my Japanese blog and am catching up in English in this series.
  • I hope it serves as a useful reference for people browsing this kind of information.

DEVOPS WEEKLY ISSUE #541 May 9th, 2021
SRE Weekly Issue #269 May 9th, 2021
KubeWeekly #262 May 21st, 2021 ← Due to KubeCon + CloudNativeCon Europe 2021, KubeWeekly is paused for 2 weeks and will resume on May 21st.

DEVOPS WEEKLY ISSUE #541 May 9th, 2021

News

Are developer portals (like those powered by Backstage) an anti-pattern? This post argues yes.

  • The title is “Developer Portals Are an Anti-Pattern”.
  • After explaining what Spotify’s open source “Backstage” (covered last week) is, the post gives its opinion and explains why it sees developer portals as a step in the “wrong direction”.

The full playlist of talks from last week’s GitOps Con event with case studies and deep dive technical sessions.

  • A playlist published on YouTube for GitOps Con 2021.

An interesting post on hosting SQLite databases on static hosting, in this case GitHub Pages. SQLite databases are just files, and the post shows how to compile SQLite itself to WASM for serving.

  • The title is “Hosting SQLite databases on Github Pages”.
  • As mentioned above, the author shows how to use a SQLite database on a static website with a tool they wrote.
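To make the “SQLite databases are just files” point concrete, here is a minimal local sketch in Python. It is not the tooling from the linked post: it only builds a small database with the standard `sqlite3` module and inspects the resulting file. The table name `posts` and the file name `demo.sqlite3` are made up for illustration.

```python
import os
import sqlite3
import struct
import tempfile


def build_demo_db(path):
    """Create a tiny database file (hypothetical table, for illustration only)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO posts (title) VALUES ('hello')")
    conn.commit()
    conn.close()


def read_header(path):
    """Return the magic string and page size stored in the 100-byte file header."""
    with open(path, "rb") as f:
        header = f.read(100)
    magic = header[:16]
    # The page size is a 2-byte big-endian integer at offset 16.
    (page_size,) = struct.unpack(">H", header[16:18])
    if page_size == 1:
        # In the SQLite file format, the value 1 encodes a 65536-byte page.
        page_size = 65536
    return magic, page_size


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.sqlite3")
    build_demo_db(db_path)
    magic, page_size = read_header(db_path)
    file_size = os.path.getsize(db_path)

print(magic, page_size, file_size)
```

The whole database is one ordinary file made of fixed-size pages, so any host that can serve a file can serve it.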

AWS CDK now supports Java, allowing for native deployment tooling for another ecosystem. This post demonstrates deploying a Java app to Lambda using CDK.

  • The title is “Packaging and deploying AWS Lambda functions written in Java with AWS Cloud Development Kit”.
  • As mentioned above, it explains how to use the AWS CDK to build and package a Lambda function written in Java that has external dependencies.

Pyston is an alternative Python runtime (technically a fork of CPython) that purports to be 30% faster for common workloads. Originally developed at Dropbox, Pyston is now an open source project.

  • The title is “Pyston v2.2: faster and open source”.
  • It introduces Pyston v2.2, the latest version of Pyston, a faster implementation of Python.

Jobs

Gitpod.io is looking for senior engineers helping to build out our SRE team.

Want to work in open source and fully-remote with some of the world’s most talented K8s and developer tools engineers? You are obsessed with DevX and automating our workflows? We pioneered the concept of dev-environments-as-code and provision automated and ready-to-code development environments that blend in your existing workflow. We’d love to hear from you.

  • As mentioned above, these are job listings for multiple senior engineer positions.

SRE Weekly Issue #269 May 9th, 2021

Articles

Edgar: Solving Mysteries Faster with Observability

We built Edgar to ease this burden, by empowering our users to troubleshoot distributed systems efficiently with the help of a summarized presentation of request tracing, logs, analysis, and metadata.

Kevin Lew, Maulik Pandey, Narayanan Arunachalam, Dustin Haffner, Andrei Ushakov, Seth Katz, Greg Burrell, Ram Vaithilingam, Mike Smith and Elizabeth Carretto — Netflix

  • An article from 09/03/2020 that introduces “Edgar”, a self-service tool for troubleshooting Netflix’s distributed systems. At that time, the authors gave the following answer to a request to open source it:
  • Unfortunately, we don’t have any short-term plans to open source Edgar, but it’s on our radar as something to consider. A lot of Edgar is very Netflix-specific, and we’d have some work to do to make it abstract enough and consumable enough for open source. But maybe someday!

The Comprehensive Site Reliability Engineering (SRE) PDF

The PDF covers 5 main areas:

  1. Availability
  2. Performance
  3. Monitoring
  4. Incident Response
  5. Preparation

No account required or form to fill out to download the PDF.

Splunk/VictorOps

  • Drawing on the guidebook “Resilience First”, the explanation focuses on the “Core components of SRE”, which are the 5 main areas the editor lists above.
  • Other reference materials are introduced in the “Additional SRE resources” section at the end.

What are MTTx Metrics Good For? Let’s Find Out.

This one’s especially interesting for the section about what MTTx metrics aren’t good for, and the following section on how to improve them.

Emily Arnott — Blameless

  • As the title suggests, the article covers the following points about MTTx metrics, where MTTx stands for “Mean Time To x”.
    ○ What are common MTTx metrics and why are they used?
    ○ What are some problems with relying on MTTx metrics?
    ○ How can I make MTTx metrics more helpful?
    ○ How do I move away from shallow metrics?
    ○ How better metrics help build a blameless culture
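As a rough illustration of what an MTTx number is, the sketch below computes two common variants, mean time to detect and mean time to resolve, from a handful of incident timestamps. The incident data, the function names, and the choice to measure both from the incident start are assumptions made for this example, not definitions taken from the article.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (started, detected, resolved) timestamps.
INCIDENTS = [
    (datetime(2021, 5, 3, 10, 0), datetime(2021, 5, 3, 10, 5), datetime(2021, 5, 3, 10, 35)),
    (datetime(2021, 5, 5, 14, 0), datetime(2021, 5, 5, 14, 15), datetime(2021, 5, 5, 15, 45)),
    (datetime(2021, 5, 7, 9, 0), datetime(2021, 5, 7, 9, 10), datetime(2021, 5, 7, 9, 40)),
]


def mean_duration(durations):
    """Arithmetic mean of a collection of timedelta values."""
    durations = list(durations)
    return sum(durations, timedelta()) / len(durations)


def mean_time_to_detect(incidents):
    """Average time from incident start to detection."""
    return mean_duration(detected - started for started, detected, _ in incidents)


def mean_time_to_resolve(incidents):
    """Average time from incident start to resolution."""
    return mean_duration(resolved - started for started, _, resolved in incidents)


print("MTTD:", mean_time_to_detect(INCIDENTS))   # 0:10:00
print("MTTR:", mean_time_to_resolve(INCIDENTS))  # 1:00:00
```

In this made-up data the mean time to resolve is one hour, yet the individual incidents took 35, 105, and 40 minutes. A single average hides that spread, which is one reason to be careful about relying on MTTx values alone.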

Resiliency and Disaster Recovery with Kafka

If you’re interested in deploying Kafka in a multi-region configuration, eBay has put quite a bit of thought into this and has a lot to share.

Engin Yoeyen — eBay

  • It outlines technical scenarios that require ordered events, highlights some challenges, and describes possible solutions for multi-region Kafka setups.

What Chaos Engineering Is (and Isn’t)

Straight from someone who was there from the start. The “what chaos engineering is not” section is especially enlightening.

Casey Rosenthal — Verica

  • As the title suggests, it explains the historical background and key points of chaos engineering.

Heroku incident #2226 follow-up: Private Space apps experiencing domain to SSL cert mapping errors

The last paragraph regarding “unknown unknowns” is noteworthy.

Heroku

  • As the editor comments above, the “unknown unknowns” part is the highlight of this article.

Failover Conf follow-up: Your team and culture questions answered!

There are some great questions in here on blamelessness and full service ownership.

James Thigpen — Gremlin

  • As a follow-up to Failover Conf, it answers questions that the panelists could not get to due to time constraints, and touches on the theme of “evolution.”
  • The beautiful and colorful illustrations of each session are great.

KubeWeekly #262 May 21st, 2021 ← Due to KubeCon + CloudNativeCon Europe 2021, KubeWeekly is paused for 2 weeks and will resume on May 21st.

What did you think of these articles? Did any of them catch your interest?

There is some content here that I have not been able to digest yet, so I’ll use this aide-memoire and these links to catch up myself too.

Bye now!!

Yoshiki Fujiwara

・Cloud Solutions Architect - AWS@NetApp in Tokyo, Japan. #AWS Certified Solution Architect&DevOps Professional, #Kubernetes, ・Opinions are my own.