$ cat post/the-blinking-cursor-/-the-deploy-went-sideways-fast-/-the-secret-rotated.md
the blinking cursor / the deploy went sideways fast / the secret rotated
Title: Debugging FinOps Nightmares in a World of LLMs
September 5, 2022. It’s been an interesting month for tech, but as I sit down to write this post, my mind is still reeling from the events of this summer. The ground seems to shift under our feet a little more every day: AI/ML models getting bigger and more powerful, platform engineering going mainstream, and FinOps pressures mounting like never before.
The AI/LLM Wave Hits
The past few months have seen a surge of attention around large language models (LLMs). GPT-3 turned out to be just the tip of the iceberg: every big tech company, from Google to Microsoft, is jockeying for position and pouring resources into these giant neural networks. As an engineer, I find it both exciting and a bit daunting. Exciting because there are new tools at our disposal; daunting because running infrastructure for these models is genuinely complex.
My Recent Debugging Adventure
A few weeks ago, I was knee-deep in one of those classic ops nightmares. Our team had just rolled out a new service that used an LLM to generate synthetic test data. It was supposed to be straightforward, just another API call here and there, until the costs started climbing fast.
It turned out our service wasn’t as efficient as we thought. We had inadvertently left some heavyweight background operations running that we didn’t actually need, quietly burning resources. The bill for the month was already over $5,000 and still climbing. It was a stark reminder of why FinOps is such an important discipline.
Embracing FinOps
FinOps isn’t just about saving money; it’s about understanding the true cost of running your services and making informed decisions. As more companies move to cloud-native architectures, the ability to manage costs effectively becomes even more crucial. I spent a lot of time reviewing our AWS billing reports, trying to figure out where the leaks were.
I ended up writing a script that automatically monitors our spend and alerts us when certain thresholds are hit. It’s not rocket science, but it does require vigilance. After all, every dollar saved on cloud costs is a dollar that can go into developing new features or improving existing ones.
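The core of that script is just threshold checks. Here’s a stripped-down sketch of the alert logic; the budgets, service names, and dollar figures below are made up for illustration, and in the real script the per-service spend comes from the AWS Cost Explorer API rather than a hard-coded dict:

```python
# Minimal sketch of a cost-threshold alerter. All budgets and service
# names here are hypothetical; real month-to-date figures would come
# from the AWS Cost Explorer API instead of a literal dict.

MONTHLY_BUDGETS = {  # hypothetical budgets, in USD
    "synthetic-data-svc": 500.0,
    "inference-gpu-pool": 2000.0,
}


def over_budget(costs: dict) -> list:
    """Return an alert message for every service past its budget."""
    alerts = []
    for service, spent in costs.items():
        budget = MONTHLY_BUDGETS.get(service)
        if budget is not None and spent > budget:
            alerts.append(
                f"{service}: ${spent:,.2f} spent vs ${budget:,.2f} budget"
            )
    return alerts


if __name__ == "__main__":
    # Hypothetical month-to-date spend per service.
    month_to_date = {"synthetic-data-svc": 812.40, "inference-gpu-pool": 1650.0}
    for line in over_budget(month_to_date):
        print("ALERT:", line)
```

From there it’s one more step to pipe the alert strings into Slack or email; the point is just that the check itself is trivial once you have the numbers in hand.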
Developer Experience and Platform Engineering
Speaking of platform engineering, I’ve been spending a lot of time thinking about developer experience lately. We’re building out internal tools to make it easier for developers to set up their environments and manage dependencies. It’s not just about making things faster; it’s about making sure everyone has the resources they need without worrying too much about infrastructure details.
One of the technologies I’m excited about is WebAssembly (Wasm) on the server side. The idea of running compiled code from languages like Rust, C, or Go in a fast, sandboxed environment is pretty compelling. It could shrink our footprint and improve performance, but there are still plenty of kinks to work out.
Self-Hosted Email and the Art of Giving Up
On a different note, I’ve been wrestling with my long-standing decision to self-host email. Running your own mail server is a lot of work: managing spam filters, keeping on top of security, and so on. After twenty-three years of doing it myself, I realized it was probably time to give up and move to a hosted service. The pain points were getting to me, and honestly, the benefits weren’t as significant anymore.
Wrapping Up
Looking back at the stories from Hacker News this month, they paint a picture of tech in transition—full of excitement but also filled with challenges. Whether it’s dealing with the financial pressures of cloud infrastructure or navigating the complex landscape of AI/ML models, there’s no shortage of work to be done.
For now, I’m just focusing on the tasks at hand: improving our platform tools, debugging cost overruns, and staying on top of our DORA metrics. The future is uncertain, but one thing isn’t: there will always be more to learn and improve upon.
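For anyone curious what “staying on top of DORA metrics” looks like in practice, two of the four keys (deployment frequency and lead time for changes) fall out of simple arithmetic over deploy records. A back-of-the-envelope sketch, with invented timestamps standing in for what you’d actually pull from your CI/CD system:

```python
# Back-of-the-envelope DORA numbers from deploy records. The records
# below are invented; in practice they'd come from your CI/CD system.
from datetime import datetime, timedelta

# (commit time, deploy time) pairs -- hypothetical data
deploys = [
    (datetime(2022, 8, 1, 9, 0), datetime(2022, 8, 1, 11, 30)),
    (datetime(2022, 8, 8, 14, 0), datetime(2022, 8, 9, 10, 0)),
    (datetime(2022, 8, 15, 8, 0), datetime(2022, 8, 15, 9, 15)),
]


def deploy_frequency_per_week(records, weeks):
    """DORA deployment frequency: deploys per week over the window."""
    return len(records) / weeks


def median_lead_time(records):
    """DORA lead time for changes: median commit-to-deploy duration."""
    waits = sorted(deployed - committed for committed, deployed in records)
    return waits[len(waits) // 2]


if __name__ == "__main__":
    print(deploy_frequency_per_week(deploys, 4))  # deploys per week
    print(median_lead_time(deploys))              # commit-to-deploy time
```

The other two keys (change failure rate and time to restore) need incident data on top of deploy data, but the shape of the calculation is just as mundane.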
Until next time,
Brandon