BIOS beep sequence / the socket never closed right / I wrote the postmortem
Title: Dealing with the AI Tsunami and Serverless Snags
August 21, 2023. The tech world is in a whirlwind of excitement as AI/LLM infrastructure explodes post-ChatGPT. It feels as if every service on the internet is being rebuilt around model-backed requests. As an engineering manager, I find myself constantly juggling keeping our services up and running with integrating the latest advancements.
Debugging the AI Inferno
One of the biggest challenges we’ve faced recently is the sheer volume of AI requests hitting our servers. With LLMs like Anthropic’s Claude and Alibaba’s Qwen becoming more mainstream, users now expect real-time responses to complex queries. Our infrastructure wasn’t designed for this kind of load, and I have spent countless nights trying to optimize our APIs.
We had bottlenecks in our caching layers, and latency under heavy load was unacceptable. It’s not just about making things faster; it’s about making them robust enough to absorb spikes without crashing. We’re now using a combination of Redis and Memcached for caching, along with AWS Lambda to offload some of the compute-intensive tasks. The trick is finding the right balance between cost and performance.
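To make the caching idea concrete, here is a minimal sketch of the cache-aside pattern with a time-to-live, not a description of our production setup. `TTLCache` is a hypothetical in-memory stand-in for a Redis or Memcached client, and `cached_fetch` and `loader` are illustrative names; a real deployment would swap in an actual client and tune the TTL per endpoint.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis/Memcached client."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, str]] = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def cached_fetch(
    cache: TTLCache,
    key: str,
    loader: Callable[[str], str],
    ttl_seconds: float = 30.0,
) -> str:
    """Cache-aside: try the cache first, fall back to the slow loader on a miss."""
    value = cache.get(key)
    if value is not None:
        return value
    value = loader(key)  # the expensive call, e.g. a model inference or DB query
    cache.set(key, value, ttl_seconds)
    return value
```

The TTL is where the cost and performance trade-off shows up: a longer TTL means fewer expensive calls but staler responses.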
Platform Engineering and FinOps
On the platform engineering side, we’ve been exploring ways to automate more of our infrastructure. With the rise of FinOps and cloud cost pressure, it’s crucial that we maintain a lean operation. We’re using Spinnaker for continuous delivery and Terraform for provisioning resources dynamically. The transition has been smooth overall, though not without its hiccups.
A recent argument on our team was whether to stick with Terraform or move to OpenTF, the community fork proposed after HashiCorp announced it was relicensing Terraform under the Business Source License. We weighed the pros and cons, considering factors like community support, stability, and long-term maintenance. In the end, we decided to stick with the original Terraform for now, but we’re keeping an eye on OpenTF’s progress.
Serverless and WebAssembly
Speaking of Lambdas, they’ve become a game-changer in our operations. We’re starting to experiment more with serverless architectures, especially with functions that handle short-lived tasks like email sending or generating thumbnails. The idea of “write once, deploy anywhere” is compelling, but we’re still figuring out the best practices for error handling and resource management.
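One error-handling practice we keep coming back to is separating failures that are worth retrying from failures that never will succeed. The sketch below shows the idea for a Python Lambda handler and is illustrative only. It assumes the function is fed by an SQS queue with partial batch responses enabled, which is an assumption of this example and not something stated above; `send_email` and `PermanentError` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict, List


class PermanentError(Exception):
    """Bad input that will never succeed; retrying would only waste invocations."""


def send_email(message: Dict[str, Any]) -> None:
    """Hypothetical task body; a real one would call an email provider."""
    if "to" not in message:
        raise PermanentError("missing recipient")


def handler(
    event: Dict[str, Any],
    context: Any,
    task: Callable[[Dict[str, Any]], None] = send_email,
) -> Dict[str, List[Dict[str, str]]]:
    """Process each queued message; report only retryable failures back to SQS."""
    failures: List[Dict[str, str]] = []
    for record in event.get("Records", []):
        try:
            task(json.loads(record["body"]))
        except (PermanentError, json.JSONDecodeError) as exc:
            # Permanent failure: log and drop so the message is not redelivered.
            print(f"dropping message {record['messageId']}: {exc}")
        except Exception as exc:
            # Anything else is treated as transient: ask SQS to redeliver it.
            print(f"will retry message {record['messageId']}: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning only the failed message IDs means one slow provider call does not force the whole batch to be reprocessed. In practice the dropped messages should go to a dead-letter queue or an alert instead of just a log line.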
One of the snags I encountered was with WebAssembly on the server side. It sounds promising for running code compiled from languages like Rust, C++, or Go in a sandboxed server environment, but the reality is that any performance gains come at the cost of complexity. We’re using Deno, which can load WebAssembly modules alongside JavaScript and TypeScript, as an entry point to explore this space, but it’s still early days. The community seems excited, but there are plenty of kinks we need to iron out.
Developer Experience and DORA Metrics
Developer experience has become a first-class citizen in our organization. We’re focusing on reducing the friction between idea and code by adopting tools like VS Code with extensions for debugging and testing. It’s not just about writing better code; it’s also about making sure that the tools we use are intuitive and efficient.
In line with this, we’ve been tracking DORA metrics to understand where we stand on deployment frequency, lead time for changes, mean time to recovery, and change failure rate. The goal is to continuously improve our release cycles and ensure that when things do go wrong, we can recover quickly.
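For readers who have not computed these before, here is a rough sketch of how the four metrics can be derived from a list of deployment records. The `Deployment` fields and the `dora_summary` function are hypothetical; in a real pipeline the data would come from the CI/CD system and the incident tracker, and the exact definitions (for example, median versus mean lead time) vary between teams.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, median
from typing import Dict, List, Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deployments: List[Deployment], window_days: int) -> Dict[str, Optional[float]]:
    """Summarize the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / window_days,
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": len(failures) / len(deployments),
        # None when nothing failed (or nothing has been restored yet).
        "mean_time_to_restore_hours": mean(restore_hours) if restore_hours else None,
    }
```

Even a crude version like this is useful as a trend line, as long as the definitions stay fixed from one reporting window to the next.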
Wrapping Up
August 2023 has been a whirlwind. As the tech landscape continues to evolve rapidly, it’s crucial for us to stay adaptable and agile. From dealing with AI infernos to navigating serverless snags, each day brings new challenges. But that’s what makes this job so rewarding: we keep learning and pushing boundaries.
If I were to sum up my reflections, I’d say we’re on the cusp of a major shift in how we build and maintain our infrastructure. The tools are here, but they require meticulous attention and thoughtful implementation. Here’s to hoping that next year’s tech landscape is just as exciting!