$ cat post/ping-with-no-reply-/-we-documented-nothing-then-/-the-daemon-still-hums.md
ping with no reply / we documented nothing then / the daemon still hums
Title: July 6, 2020 - A Year of SREs and Internal Portals
Today marks another milestone in the year that has seen so much change. It’s funny how a single day can feel like a chapter in a book when you look back over a whole year. This entry is going to be more personal, and I’ll throw in some tech stuff too.
A Day That Looks Back
It’s July 6, 2020. A few months ago, I wrote about the chaos of a distributed team working remotely for the first time. Now, just past the halfway mark of this unprecedented year, I’m reflecting on how much has changed and how much remains the same.
The Journey So Far
Last July, my company was just starting to formalize platform engineering roles. We were in the early stages of creating internal developer portals with Backstage, a tool that aims to simplify our lives as engineers by centralizing everything we need day to day. SREs (Site Reliability Engineers) had been around for a while, but this was the year their importance became undeniable.
Debugging Real-World Issues
This morning, I debugged an issue with eBPF. Our monitoring system had flagged one of our services acting up under heavy load. The root cause? A race condition in the Linux kernel, caused by improper handling of file descriptors. It’s in moments like these that you realize just how deep the rabbit hole goes, and how much we depend on open-source technologies.
The SRE Playbook
Speaking of SREs, our team has been working hard this year to establish best practices for incident response and maintenance. We’ve written a playbook that outlines roles, responsibilities, and procedures. It’s a bit like the Dungeons & Dragons rulebook, but in real life. The goal is clear: reduce downtime and improve system reliability.
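To make the playbook idea concrete, here is a minimal sketch of how roles, responsibilities, and procedures could be captured as data. It is not our actual playbook: the role titles, the procedure steps, and the `unstaffed` helper are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Role:
    """A named incident role and what its holder is responsible for."""
    title: str
    responsibilities: List[str]


@dataclass
class Playbook:
    """Roles plus an ordered list of procedure steps."""
    roles: List[Role] = field(default_factory=list)
    procedure: List[str] = field(default_factory=list)

    def unstaffed(self, staffed: Set[str]) -> List[str]:
        """Return role titles nobody has claimed yet, in playbook order."""
        return [role.title for role in self.roles if role.title not in staffed]


# Hypothetical example content; real roles and steps will differ per team.
PLAYBOOK = Playbook(
    roles=[
        Role("incident commander", ["coordinate the response"]),
        Role("communications lead", ["post status updates"]),
        Role("scribe", ["keep a timeline of events"]),
    ],
    procedure=["acknowledge the alert", "mitigate", "write the review"],
)
```

With this shape, `PLAYBOOK.unstaffed({"scribe"})` lists the roles still open at the start of an incident, which is the kind of quick check a written playbook makes possible.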
Remote Infrastructure Scaling
This year also saw us scale our remote infrastructure to support more teams working from home. We’ve had to ensure that everything remains secure and performant while everyone is spread out across the globe. It’s been a challenge, but also an opportunity to rethink our tools and processes.
Argo CD and Flux GitOps
Speaking of tools, Argo CD and Flux have become staples in our GitOps journey. We’ve seen how these tools can help us keep infrastructure up to date with minimal manual intervention. The transition hasn’t been smooth (there are still moments when I question my sanity after a particularly complex Kubernetes manifest), but overall, they’re making our lives easier.
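The core idea behind both tools is a reconciliation loop: compare the state declared in Git with what is actually running, then close the gap. The sketch below is a toy model of that loop in plain Python, not how either tool is implemented. The resource names and the `plan` and `reconcile` functions are assumptions made up for this example, and pruning is always on here for simplicity, whereas real tools typically make it configurable.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Action:
    """One step needed to move live state toward desired state."""
    verb: str  # "create", "update", or "delete"
    name: str  # resource name (hypothetical, e.g. "web")


def plan(desired: Dict[str, dict], live: Dict[str, dict]) -> List[Action]:
    """Diff desired state (as declared in Git) against live state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(Action("create", name))
        elif live[name] != spec:
            actions.append(Action("update", name))
    # Anything running that Git no longer declares gets pruned.
    for name in sorted(live.keys() - desired.keys()):
        actions.append(Action("delete", name))
    return actions


def reconcile(desired: Dict[str, dict], live: Dict[str, dict]) -> Dict[str, dict]:
    """Apply the plan to a copy of live state and return the result."""
    result = dict(live)
    for action in plan(desired, live):
        if action.verb == "delete":
            del result[action.name]
        else:
            result[action.name] = desired[action.name]
    return result


desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
live = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}
for action in plan(desired, live):
    print(action.verb, action.name)
```

Running the loop a second time against its own output produces an empty plan, which is the property that lets these tools run continuously without manual intervention.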
The News Around Us
While we were busy with internal ops and infrastructure, the tech world was also buzzing. For example, a significant hack on Twitter took place, highlighting just how fragile even the most well-known platforms can be. Meanwhile, the EARN IT Act stirred debate about privacy and security. These events made me think more deeply about the systems I work on and their potential impact.
Reflecting on Personal Growth
Looking at my own growth over this year, I’ve realized that there’s a lot of value in being honest and open with others. In one particularly heated argument, I had to admit that I hadn’t fully understood something until someone else pointed it out. That moment was both humbling and eye-opening.
Where We Are Now
As we move forward into the second half of 2020, I’m looking at a landscape where platform engineering is more formalized, internal developer portals are taking shape, and SRE practices are maturing. The road ahead will have its challenges, but with each step, I feel like I’m building something meaningful.
Conclusion
So here we are, midway through 2020. It’s been a crazy ride, full of ups and downs. But despite the chaos, there’s a lot to be proud of. And who knows what the rest of the year will bring? One thing is for sure: I’ll keep writing about it as I navigate this tech world.
That’s how July 6, 2020 felt. A day in the life of an engineer reflecting on the journey so far and looking forward to more.