$ cat post/telnet-to-nowhere-/-the-service-mesh-confused-us-all-/-it-ran-in-the-dark.md

telnet to nowhere / the service mesh confused us all / it ran in the dark


Title: Reflections on a Wild Month: AI Overload and FinOps Pressure


May 20, 2024. I’m sitting at my desk, sipping a lukewarm coffee, looking over the latest Hacker News stories as they scroll by like a feverish dream. GPT-4o, Daylight Computer’s e-paper tablet, TTE Terminal Text Effects—these are just more fuel for the AI fire that has been burning since ChatGPT first showed up. It feels like I’ve been living in a perpetual tech storm.

The AI Tsunami

Last month, I was deep into tuning our LLM (large language model) infrastructure at work. We’re trying to build something robust and scalable, but it’s a constant battle with the bleeding edge. GPT-4o might be a step up from GPT-3.5, but there are always new papers and frameworks popping up. Every day it feels like I need to learn another API or library. It’s exhausting, but exhilarating.

Just last week, we finally got our in-house model running on multiple GPUs without crashing the server. But then came the question: how do we manage this beast? Do we go full-on Kubernetes orchestration, or are there simpler tools that can get the job done with less overhead? The conversation around FinOps is heating up too—every dollar counts when it comes to AI infrastructure.
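We haven’t settled the orchestration question yet, but for a sense of what the Kubernetes route looks like: a minimal Deployment that asks the scheduler for GPU-equipped nodes. This is a hypothetical sketch — the names and image are illustrative, not our actual setup.

```yaml
# Hypothetical sketch: one inference pod requesting two GPUs.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-inference            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-inference
  template:
    metadata:
      labels:
        app: llm-inference
    spec:
      containers:
        - name: server
          image: registry.example.com/llm-server:latest  # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 2  # schedule onto a node exposing two GPUs
```

The `nvidia.com/gpu` extended resource is what makes Kubernetes GPU-aware here; simpler tools trade that scheduling logic for less YAML.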

Platform Engineering and Cost Control

Platform engineering has become a mainstream discipline. We’ve been working on optimizing our platform services, which means diving into WebAssembly (Wasm) for server-side functionality. It’s fascinating how this technology can bring more flexibility and performance, but the learning curve is steep. Plus, every optimization needs to be balanced against the operational costs.

We’re also wrestling with DORA metrics in our daily standups. How often are we deploying? How long does a change take to reach production, and how often does a deploy fail? The pressure is real; cloud providers like AWS are always pushing for more usage, and we can’t afford to overspend on infrastructure that isn’t being used efficiently. Our developers spend a lot of time arguing about the trade-offs between cutting-edge tech and practicality.

Real Work: Debugging and Shipping

Last week, I was working late trying to debug an issue with our GraphQL API. After six years, I’ve come to understand the pain points of GraphQL better than ever before. Schema evolution is a nightmare, and keeping track of mutations can be like chasing shadows. But we shipped a critical update on time, which felt good. It’s moments like these that remind me why I got into this field—seeing your hard work pay off.
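One schema-evolution tactic that takes some of the pain out of it is GraphQL’s built-in `@deprecated` directive: instead of breaking existing clients, you add the replacement field alongside the old one and mark the old one for removal. A hypothetical sketch — the type and field names are illustrative, not our actual schema:

```graphql
type User {
  id: ID!
  # Kept for existing clients; tooling will surface the deprecation.
  name: String @deprecated(reason: "Use displayName instead.")
  displayName: String!
}

type Mutation {
  updateUser(id: ID!, displayName: String!): User
}
```

The old field keeps resolving while client teams migrate on their own schedule, which beats coordinating a big-bang cutover.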

Personal Learning Journey

On the side, I’ve been trying to stay up-to-date with all the latest AI developments. The OpenAI documents leak was interesting; it gave us a peek behind the curtain of one of the most successful AI companies in the world. It’s sobering to see how they operate and manage their resources.

I also started reading more about WebAssembly for server-side applications, which has been growing in popularity. It’s exciting to think about what can be done with this technology—potentially reducing our dependencies on complex VMs or containers while still getting the performance benefits of compiled code.
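For a sense of how small the unit of deployment can be, here’s a toy module in WebAssembly’s text format exporting a single function — the kind of thing a server-side runtime such as Wasmtime can load and call. Purely illustrative:

```wat
;; Toy module: exports one function, add(a, b) -> a + b.
(module
  (func $add (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
```

Compared to shipping a container image, the artifact is a few hundred bytes and sandboxed by default, which is what makes the server-side story interesting.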

Wrapping Up

As I look out the window at the aurora lighting up the sky outside, it feels like we’re living through a tech revolution. The pace is relentless, and every day brings new challenges. But that’s why I love this job—there’s always something to learn, and there are never dull moments.

This month has been about debugging, shipping, and arguing about the best ways forward in a world where AI is everywhere and FinOps pressures loom large. It’s a wild ride, but one I wouldn’t trade for anything else.


That’s my take on how things are shaping up right now. Time to hit the books again; there’s always more to learn.