March 13, 2023 - AI's Big Bang & My Serverless Odyssey
March 13, 2023
It's barely March, and this tech year already feels like it's speeding up. I'm sitting in my home office, looking at my latest debugging session, this time wrestling with serverless functions and their quirks. It's 2023, AI is everywhere, and there are still plenty of lessons to be learned.
AI/LLM Infrastructure: The Big Bang
A few months ago, ChatGPT dropped like a meteor, lighting up the tech world. While everyone was busy marveling at its capabilities, I had to focus on something more down-to-earth: infrastructure. ChatGPT's success highlighted a critical issue we faced here: how do you scale and serve an LLM under real-time request load without breaking the bank?
One of my team’s recent projects involved building a platform for hosting conversational AI models. We decided to leverage serverless functions, like AWS Lambda and Google Cloud Functions, because they promised scalability and cost-efficiency. However, as anyone who has worked with serverless knows, it’s not all sunshine and rainbows.
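To make that concrete, here is a minimal sketch of what one of those serverless endpoints might look like: an AWS Lambda handler behind an API Gateway proxy integration that forwards chat requests to a model backend. The `generate_reply` function is a hypothetical stand-in for the real inference call, not anything from our actual codebase.

```python
import json

def generate_reply(prompt):
    # Hypothetical stand-in for the real model inference call
    # (in practice this would hit a hosted LLM endpoint).
    return f"Echo: {prompt}"

def lambda_handler(event, context):
    # API Gateway's proxy integration delivers the request body
    # as a JSON string under the "body" key.
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt", "")
    reply = generate_reply(prompt)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"reply": reply}),
    }
```

The appeal is obvious: no servers to patch, and you pay per invocation. The catch, as the next section shows, is everything that happens before your handler runs.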
Debugging Serverless Hell
Let me tell you about one of those dark moments. We deployed a batch of AI models to our serverless functions, only to find out that the cold start times were killing us. A cold start happens when the platform has to spin up a fresh execution environment and run all of your initialization code before the first request is served; it's like waking up from a deep sleep. That latency was unacceptable for our real-time chatbot.
I spent hours troubleshooting, checking logs, and profiling different aspects of our code. One day, I hit a breaking point and decided to just start over. It's funny how you can get so entrenched in a problem that you miss the simplest solutions. In this case, the fix was to do the expensive setup once and cache it across invocations, so only the first (cold) request paid the full warm-up cost. It's like when you finally realize the key is under the doormat.
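The caching trick relies on a property of Lambda's execution model: module-level state survives across warm invocations of the same execution environment. A sketch of the pattern, where `load_model` is a hypothetical stand-in for the expensive step (say, pulling weights from S3):

```python
import time

# Module scope survives across warm invocations of the same
# Lambda execution environment, so expensive setup belongs
# here, not inside the handler.
_model_cache = {}

def load_model(name):
    # Hypothetical expensive load (e.g. fetching weights from S3).
    time.sleep(0.01)
    return {"name": name, "loaded_at": time.time()}

def get_model(name):
    # Only the first (cold) invocation pays the load cost;
    # warm invocations reuse the cached object.
    if name not in _model_cache:
        _model_cache[name] = load_model(name)
    return _model_cache[name]
```

This doesn't make the cold start itself faster, but it means the penalty is paid once per environment instead of once per request, which was the difference that mattered for us.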
FinOps and Cloud Cost Pressure
Speaking of costs, we’re feeling the pressure from our FinOps team. DORA metrics are in full swing, and our service-level objectives (SLOs) need to be met while staying within budget. It’s a delicate balancing act between performance and cost, especially when you’re running a serverless platform.
One particularly frustrating day, I was trying to optimize our Lambda functions for both execution time and cold start times. Our FinOps team pulled me into a meeting where they laid out the financial impact of our current setup. The truth is harsh: we need to get more efficient or face budget cuts. It’s a stark reminder that while technology can solve many problems, it also brings new challenges.
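Meetings like that go better when you show up with numbers. A back-of-envelope Lambda cost model is simple enough to write down: compute cost scales with GB-seconds (invocations × duration × memory), plus a per-request charge. The default prices below are illustrative figures, not a quote from any bill:

```python
def monthly_lambda_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_gb_second=0.0000166667,
                        price_per_million_requests=0.20):
    """Back-of-envelope Lambda bill. Prices are illustrative defaults,
    not authoritative; check your region's current rate card."""
    gb_seconds = (invocations
                  * (avg_duration_ms / 1000.0)
                  * (memory_mb / 1024.0))
    compute = gb_seconds * price_per_gb_second
    requests = (invocations / 1_000_000) * price_per_million_requests
    return compute + requests

# Example: 10M invocations/month, 200 ms average, 1 GB of memory.
cost = monthly_lambda_cost(10_000_000, 200, 1024)
```

The useful part isn't the exact dollar figure; it's that the formula makes the trade-off legible, since halving duration or memory halves the compute term, which is exactly the lever the FinOps team was asking us to pull.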
Learning and Reflecting
Reflecting on this journey, I’m reminded of how much I’ve learned about serverless in the past few months. The initial excitement of moving to serverless was real, but dealing with its quirks has been a humbling experience. Debugging cold starts and optimizing function layers is not just about writing better code; it’s about understanding the entire system.
This week also brought another wave of industry news. The FDIC takeover of Silicon Valley Bank (SVB) caught everyone off guard. It’s a stark reminder that even in the tech world, financial stability matters. I couldn’t help but think about how these events might affect our company’s plans and investments.
Wrapping Up
As I sit here reflecting on March 2023, it feels like the year is just getting started. AI is pushing boundaries, FinOps is keeping us accountable, and serverless continues to challenge my engineering skills. But that’s what makes this field so exciting—there’s always something new to learn.
Stay tuned for more adventures in tech!
P.S.: If you’ve got a tip on optimizing cold start times or managing serverless costs, I’m all ears.