$ cat post/grep-through-the-dark-log-/-a-shell-history-of-years-/-uptime-was-the-proof.md

grep through the dark log / a shell history of years / uptime was the proof


Navigating the Pandemic’s Shadows in Ops


October 19, 2020. Just a few months ago, we were all buzzing about Kubernetes and platform engineering. Now, it’s a race to keep everything running while half of us are working from home with nothing but hope that our Wi-Fi holds up.

I’ve been dealing with the usual ops stuff (firewalls, load balancers, monitoring systems), but there’s this weird feeling in my bones. The tech world is still chugging along, but everyone seems a bit off-kilter. We’re scattered across continents, and I can’t help but wonder whether the tools and practices we’ve been talking about have actually made our work any easier.

The Day the Network Died

One of those days came last week. Our internal developer portal, built on Backstage, was down. Not just a little hiccup; it went completely offline for most of the day. Panic started to set in as developers tried to access their favorite tools. I had my team working around the clock, but something about this situation felt… different.

It turned out to be a firewall issue. We had updated some rules a few days earlier, during one of those late-night meetings when we thought everything would go smoothly. On the day of the outage, it wasn’t so smooth. The problem was that our internal network configuration hadn’t caught up with the changes yet. A couple of services were left exposed and couldn’t handle the traffic.

We eventually traced the issue to a misconfigured eBPF program that was supposed to manage traffic flow but ended up causing more harm than good. It’s moments like these when I realize how much we rely on these tools, even if they sometimes bite us in the butt.

SRE or DevOps?

As I sat there, staring at the logs, I couldn’t help but think about all the debates around SRE versus DevOps roles. In a remote-first world, does it matter? Or is everyone just going to call themselves an “Engineering Manager” and pretend they know what they’re doing?

I’ve been arguing for more standardized tooling, like making sure eBPF programs are thoroughly tested before they are deployed to production. But in the chaos of working from home, the line between dev and ops has blurred.
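To make "tested before production" a bit more concrete, here is a minimal sketch in Python of the kind of pre-deploy check I mean. It is not what we run, and it does not load or exercise a real eBPF program. It only models a rule set as an ordered, first-match-wins list and asks whether a handful of flows we care about would still be allowed after a change. Every name in it (`Rule`, `decide`, `broken_flows`, the CIDR ranges, the flow labels) is a made-up assumption for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One hypothetical firewall rule; rules are evaluated in order."""
    action: str            # "allow" or "deny"
    source: str            # source range in CIDR notation
    port: Optional[int]    # destination port, or None for any port


# A required flow: (label, source IP, destination port). All values are invented.
Flow = Tuple[str, str, int]


def decide(rules: List[Rule], src_ip: str, port: int, default: str = "deny") -> str:
    """Return the action of the first matching rule, or the default if none match."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_range = addr in ipaddress.ip_network(rule.source)
        port_matches = rule.port is None or rule.port == port
        if in_range and port_matches:
            return rule.action
    return default


def broken_flows(rules: List[Rule], required: List[Flow]) -> List[Flow]:
    """List the required flows that the rule set would NOT allow."""
    return [flow for flow in required if decide(rules, flow[1], flow[2]) != "allow"]


# Assumed "before" and "after" rule sets, purely illustrative.
OLD_RULES = [
    Rule("allow", "10.0.0.0/8", 443),
    Rule("deny", "0.0.0.0/0", None),
]
NEW_RULES = [
    Rule("deny", "10.20.0.0/16", None),   # the late-night addition
    Rule("allow", "10.0.0.0/8", 443),
    Rule("deny", "0.0.0.0/0", None),
]
REQUIRED_FLOWS = [
    ("vpn-dev-to-portal", "10.20.3.7", 443),
    ("office-to-portal", "10.1.4.9", 443),
]

if __name__ == "__main__":
    for label, ip, port in broken_flows(NEW_RULES, REQUIRED_FLOWS):
        print(f"would break: {label} ({ip} -> port {port})")
```

With these made-up rules, the old set allows both flows and the new set blocks the VPN one, because the broader deny sits ahead of the allow. A check like this could run in CI against every proposed rule change, so the surprise shows up in a pull request instead of in the middle of an outage.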

The Bright Spots

Despite the chaos, there were some bright spots. Our Kubernetes cluster had been scaling up nicely to support remote workers. We started using Argo CD more heavily, which really helped us manage deployments even when we weren’t in the same room. And Flux continued to grow in popularity for GitOps, though it’s still a bit of a beast to get right.

There was also this interesting debate about whether we should use AWS or not. One of our projects got forked by AWS and launched as its own service, which got me thinking: are cloud providers just becoming their own vendors? It’s a strange concept, but one that’s probably here to stay.

Life in Isolation

The hardest part isn’t the technical stuff, though. It’s the isolation. Working from home has made everything feel more distant. I miss the daily interactions with my team members—those quick hallway conversations or coffee chats where ideas flow organically. Now, most of our communication is via Slack and Zoom, which can be a bit clunky at times.

But here’s something unexpected: productivity actually went up in some ways. Without the distractions of a physical office, everyone seemed to focus more on their tasks. Maybe this remote setup isn’t as bad as I thought.

Looking Forward

As we move forward, I hope we continue to build better tools and practices that can help us stay productive while working remotely. We need to remember that our tech should make life easier, not harder.

The days of the office commute might be gone for good, but let’s ensure that our work culture adapts gracefully. Maybe in a few years, these strange times will just be remembered as an interesting chapter in tech history: the one where everything went virtual.

Until then, I’ll keep debugging and writing code from my home office, hoping for better days ahead.


That’s how it feels to me, at least. The world of ops is still full of challenges, but the remote-first setup adds a new layer of complexity. Let’s hope we navigate these changes with grace and resilience.