$ cat post/backstage-in-the-cloud:-a-year-of-platform-engineering.md
Backstage in the Cloud: A Year of Platform Engineering
December 6th, 2021. This feels like a good day to reflect on the past year. It’s been a whirlwind of changes and lessons, all set against the backdrop of platform engineering becoming more formalized.
The Era: Backstage and Beyond
Platform engineering has really started to take shape this year. I’ve spent quite a bit of time setting up an internal developer portal using Backstage. It’s amazing how much functionality it can provide, but boy, is there a learning curve! I’m still figuring out the best ways to integrate all our services and tools into one cohesive platform. The initial setup was a bit of a challenge, but now that I’ve got my feet wet, I’m starting to see the benefits.
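For anyone wondering what "integrating a service" looks like in practice: Backstage's catalog describes each service as an entity, usually kept in a `catalog-info.yaml` file next to the code. Here is a minimal sketch in Python of the fields a `Component` entity carries, plus a check for the ones that are easy to forget. The service name `payments-api` and owner `team-payments` are made up for illustration, and this is not how Backstage itself validates entities, just a quick pre-flight check.

```python
from typing import Dict, List

# Fields a Component's spec is expected to carry.
REQUIRED_SPEC_FIELDS = ("type", "lifecycle", "owner")


def component_entity(name: str, owner: str,
                     type_: str = "service",
                     lifecycle: str = "production") -> Dict:
    """Build the dict form of a Backstage Component descriptor."""
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Component",
        "metadata": {"name": name},
        "spec": {"type": type_, "lifecycle": lifecycle, "owner": owner},
    }


def missing_fields(entity: Dict) -> List[str]:
    """Return dotted names of required fields that are absent or empty."""
    problems = []
    for key in ("apiVersion", "kind"):
        if not entity.get(key):
            problems.append(key)
    if not entity.get("metadata", {}).get("name"):
        problems.append("metadata.name")
    spec = entity.get("spec", {})
    for field in REQUIRED_SPEC_FIELDS:
        if not spec.get(field):
            problems.append(f"spec.{field}")
    return problems


if __name__ == "__main__":
    # Hypothetical service and team names.
    entity = component_entity("payments-api", owner="team-payments")
    print(missing_fields(entity))
```

Running a check like this in CI before registering a repository catches the incomplete descriptors that otherwise only surface as catalog import errors.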
On the SRE front, more and more people are moving into dedicated Site Reliability Engineer roles. It’s great to see the recognition, as the role truly encapsulates what good infrastructure management is all about. But with more responsibility comes more pressure: every outage feels heavier now that there’s a team dedicated to ensuring stability.
Remote Work: A Double-Edged Sword
The pandemic forced us into remote-first infra scaling almost overnight. We’re now dealing with a different kind of monitoring and debugging, especially when everything is scattered across multiple environments. Tools like Prometheus and Grafana have become essential for real-time visibility, but they can only do so much when you don’t have physical access to the machines.
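As a rough illustration of the kind of check that replaces walking over to a rack, here is a small Python sketch around Prometheus' HTTP API. It builds an instant-query URL for the built-in `up` metric and picks out the targets whose last scrape failed. The host name and the sample response are hypothetical, and nothing here touches the network, so you would still need to fetch the URL with the HTTP client of your choice.

```python
from typing import Dict, List
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL (/api/v1/query) for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def down_targets(payload: Dict) -> List[str]:
    """List 'job/instance' for every series of an `up` query whose value is 0."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the sample value arrives as a string
        if float(value) == 0.0:
            labels = series["metric"]
            down.append(f"{labels.get('job', '?')}/{labels.get('instance', '?')}")
    return sorted(down)


# Hypothetical response, shaped like an instant-vector result for `up`.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"job": "node", "instance": "10.0.0.5:9100"},
             "value": [1638787200.0, "1"]},
            {"metric": {"job": "node", "instance": "10.0.0.7:9100"},
             "value": [1638787200.0, "0"]},
        ],
    },
}

if __name__ == "__main__":
    # Hypothetical Prometheus address.
    print(build_query_url("http://prometheus.example.internal:9090", "up"))
    print(down_targets(SAMPLE))
```

In day-to-day use an alerting rule on `up == 0` does this job better; the sketch is mostly useful for ad hoc scripts and for understanding what Grafana is querying underneath.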
Kubernetes Complexity: A Love-Hate Relationship
Kubernetes has been a game-changer, no doubt about it. But as we scale and add more services, the complexity starts to show. I’m still wrestling with how best to manage our stateful applications in a Kubernetes cluster. The community is moving fast, which is great, but sometimes that means breaking changes too.
eBPF: A Tool for the Future?
eBPF has started to gain more traction this year. I’ve been experimenting with it a bit, mainly for network tracing and performance work. Being able to run small, verifier-checked programs inside the kernel without writing a kernel module is incredibly powerful, but it’s also a double-edged sword when it comes to debugging and managing those tweaks.
Debugging Log4j RCE: A Reality Check
Speaking of double-edged swords, this year taught us all about the importance of properly vetting our dependencies. The Log4j RCE vulnerability (Log4Shell, CVE-2021-44228) hit us hard. It’s a stark reminder that we need to stay vigilant, know which libraries we actually ship, and keep our systems up to date with the latest security patches.
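A first step in that kind of triage is simply finding out where the library lives. Below is a minimal Python sketch that walks a directory tree, finds `log4j-core` jars by file name, and flags versions below a threshold. The threshold constant is an assumption: set it to whatever the Apache Log4j security page currently recommends. The sketch only looks at file names, so it will miss copies bundled inside fat or shaded jars, and a plain version comparison also flags the patched backport releases on older branches. Treat the output as a list to review, not a verdict.

```python
import re
from pathlib import Path
from typing import Iterator, Optional, Tuple

Version = Tuple[int, int, int]

# Assumption: the lowest release you are willing to accept. Verify it
# against the Apache Log4j security advisories before relying on it.
MINIMUM_ACCEPTED: Version = (2, 17, 1)

# Matches names such as log4j-core-2.14.1.jar or log4j-core-2.0-beta9.jar.
JAR_PATTERN = re.compile(r"^log4j-core-(\d+)\.(\d+)(?:\.(\d+))?.*\.jar$")


def parse_version(filename: str) -> Optional[Version]:
    """Extract (major, minor, patch) from a log4j-core jar name, else None."""
    match = JAR_PATTERN.match(filename)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def find_outdated(root: Path,
                  minimum: Version = MINIMUM_ACCEPTED
                  ) -> Iterator[Tuple[Path, Version]]:
    """Yield (path, version) for each log4j-core jar older than `minimum`."""
    for jar in sorted(root.rglob("log4j-core-*.jar")):
        version = parse_version(jar.name)
        if version is not None and version < minimum:
            yield jar, version


if __name__ == "__main__":
    import sys

    # Scan the directory given on the command line, or the current one.
    scan_root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for path, found in find_outdated(scan_root):
        print(f"{path}: {'.'.join(map(str, found))}")
```

Pointing it at a deployment directory or an unpacked container image gives a quick inventory while the proper dependency scanner catches up.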
A Year in Review
This year has been a rollercoaster of ups and downs, but I wouldn’t trade it for anything. From the stealth bombers appearing on Google Maps (I still can’t wrap my head around that one) to the ongoing debate about whether developers are really as busy as they say, every day brings something new.
And who knows what 2022 will bring? With GitOps tools like Argo CD and Flux, I’m excited to see how we can continue to streamline our infrastructure management. But for now, let’s just enjoy the holiday season and hope that next year brings a bit more stability.
Happy debugging!