Remote Ops: A Week in Quarantine

April 13, 2020 was a day that felt like the start of a new chapter in my career. The world had taken an unexpected turn, and suddenly, the lines between work and home were blurred by global pandemic. Our team at Platform Engineering was already moving towards remote-first practices, but now it was time to put those principles into action on a massive scale.

I remember logging into our internal developer portal, Backstage, just like any other day. But as I scrolled through the familiar dashboard, something struck me: the notifications and logs were still there, but my office chair was suddenly missing from the picture. It felt surreal, like stepping into a different dimension where infrastructure issues were now being solved without ever leaving your home.

One of the challenges we faced immediately was ensuring that our Kubernetes cluster remained stable under remote-first conditions. We had already begun using eBPF to monitor and tune our systems in real-time, but now we needed to scale this practice. The complexity fatigue mentioned in the industry news was real—every day seemed like a new challenge, from managing stateful workloads to optimizing network latency.

A key moment came when I sat down with my team to discuss how to handle an increase in remote traffic. We were using ArgoCD and Flux for GitOps, which had been maturing nicely over the past few months. But now, we needed to make sure these tools could scale without breaking under the pressure of a sudden influx of users working from home.

I spent some time arguing with myself about whether we should stick strictly to our local development environments or if we needed to set up more robust CI/CD pipelines. The truth was, both approaches had their merits, and deciding which path to take required careful consideration. In the end, I decided that a hybrid approach would work best—using local dev environments for rapid iteration while ensuring that our GitOps practices remained solid.

Debugging issues became a bit of an adventure in isolation. One particularly frustrating bug involved some strange network behavior between our on-prem and cloud-based resources. The logs were cryptic, and I found myself asking “what would the logs look like if I were actually there?” It was a humbling reminder that sometimes being present can make all the difference.

On this day, I also had to deal with the personal struggles of adjusting to remote work. My home office setup wasn’t ideal—there were days when my cat, Mocha, would jump onto my keyboard at just the wrong moment. But despite these small annoyances, there was a strange sense of productivity. Without the distractions of a physical office and the constant flow of people in and out, I found myself more focused on solving complex problems.

The tech world was buzzing with activity during this time. The GitHub team’s move to free-for-all teams felt like a game-changer, while the Google SRE book was providing a lot of new insights into reliability engineering. As for John Conway’s passing, it reminded me that the software giants we often look up to are made up of real people who face their own challenges.

In retrospect, April 13, 2020 wasn’t just another day in my career; it was a turning point where the old ways of working began to fade and new practices were born out of necessity. The technology landscape might have seemed unchanged at first glance, but under the surface, we were building a more resilient, distributed future—one that would serve us well as the world continued to adapt to this unprecedented time.