$ cat post/the-swap-filled-at-last-/-the-pipeline-hung-on-step-three-/-the-log-is-silent.md

the swap filled at last / the pipeline hung on step three / the log is silent


Title: Kubernetes Complexity Fatigue and the Quest for Simplicity


May 10, 2021 - Today’s reflection is a bit of an internal monologue. I’ve been wrestling with the same thought loop lately: how to make our Kubernetes setup less complex without compromising on stability or functionality.

The Setup

Our platform has grown organically over several years, and it’s showing. We have multiple clusters, custom scripts for cluster creation and upgrades, a dozen Helm charts scattered across various repositories, and a confusing array of CI/CD pipelines. Every new hire spends their first few months deciphering the tangled web we call our Kubernetes ecosystem.

The Problem

The main issue is complexity fatigue. Developers are increasingly frustrated by the steep learning curve, the number of moving parts, and the lack of clear, consistent documentation. They feel like they’re constantly jumping through hoops to deploy new features or fix issues.

The Search for Simplicity

I’ve been looking into various tools and best practices that could help us streamline our setup. Here are a few I’ve considered:

  • ArgoCD: We use it in some clusters but not consistently across all of them. It’s great for GitOps, but integrating it properly with our existing CI/CD pipelines is non-trivial.
  • Kustomize: Template-free configuration management for Kubernetes resources, built around shared bases and per-environment overlays. I’m still unsure whether it would actually reduce our complexity or just relocate it into another layer of abstraction.
  • E2E Testing Tools: We’re already using some basic testing tools, but more comprehensive E2E testing could help catch issues early and make our development process smoother.
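For context, the Kustomize model I’ve been evaluating looks roughly like this: a shared base plus a small per-environment overlay. This is a sketch only; the directory names, image name, and tag are illustrative, not our actual layout.

```yaml
# overlays/staging/kustomization.yaml -- hypothetical layout
resources:
  - ../../base          # shared manifests (Deployment, Service, etc.)
patches:
  - path: replica-count.yaml   # staging-only tweak, e.g. fewer replicas
images:
  - name: registry.example.com/my-feature
    newTag: v1.2.3             # pin the image tag per environment
```

The appeal is that the base stays plain YAML you can read on its own; the worry is that every overlay is one more place a new hire has to look.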

The Experiment

I decided to experiment with a minimal setup for a new feature. Instead of creating an elaborate Helm chart or deploying everything via a custom script, I set up the bare minimum using kubectl directly from within our CI/CD pipeline.
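The whole pipeline step boiled down to two commands: apply the manifests checked into the repo, then wait for the rollout. Sketched here in GitLab CI syntax purely for illustration; the job name, paths, deployment name, and variables are all hypothetical.

```yaml
# Hypothetical CI job: no Helm, no templating, just kubectl.
deploy-feature:
  stage: deploy
  script:
    # Apply whatever is in the repo, into the target namespace.
    - kubectl apply -f deploy/manifests/ --namespace "$FEATURE_NAMESPACE"
    # Block until the rollout finishes (or fail the job after 2 minutes).
    - kubectl rollout status deployment/my-feature --namespace "$FEATURE_NAMESPACE" --timeout=120s
```

That’s it. Anyone on the team can read this job top to bottom without knowing our custom tooling.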

What Worked

  • Simplicity: By using just kubectl, we reduced the number of steps required to deploy and manage resources.
  • Flexibility: We retained full control over our deployments without adding any unnecessary complexity.
  • Consistency: This approach let us write more straightforward, consistent manifests and pipeline steps, which are easier for everyone on the team to understand.

What Didn’t Work

  • Manual Errors: With fewer tools to rely on, we had to be extra careful. A single typo in a kubectl command can cause significant issues.
  • Documentation: The approach is simpler in theory, but documenting every manual step proved challenging. We need better practices for making sure each step stays well-documented.
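One mitigation we’re considering for the manual-error risk is a validation step that runs before the real apply: a server-side dry run plus a diff, so a typo’d manifest fails the pipeline instead of the cluster. Same hypothetical CI syntax as before; names and paths are illustrative.

```yaml
# Hypothetical guard job: catch bad manifests before they reach the cluster.
validate-manifests:
  stage: test
  script:
    # Server-side dry run: the API server validates the manifests
    # without persisting anything.
    - kubectl apply -f deploy/manifests/ --dry-run=server --namespace "$FEATURE_NAMESPACE"
    # Show what would change; kubectl diff exits non-zero when there
    # are differences, so don't let that alone fail the job.
    - kubectl diff -f deploy/manifests/ --namespace "$FEATURE_NAMESPACE" || true
```

It doesn’t eliminate human error, but it turns “a typo took down staging” into “a typo failed a CI job,” which is a much better place to be.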

The Future

I’m still not sure if this minimal approach will scale or be sustainable as our platform grows. However, it has sparked a conversation among the team about the trade-offs between simplicity and robustness. We’re now considering whether we should standardize on certain tools and practices to strike a better balance.

As I type this, I can hear my colleagues discussing the pros and cons of different approaches in the background. It’s clear that there isn’t a one-size-fits-all solution here. But at least we’re having these discussions openly, which is progress.

The Conclusion

For now, I think our goal should be to find a balance between keeping things simple enough for everyone to use comfortably and ensuring that our infrastructure remains robust and scalable. It’s a journey, and there will be setbacks along the way. But every step forward brings us closer to a more sustainable and maintainable setup.

Stay tuned for updates on how this experiment unfolds. Until then, keep your tools sharp and your curiosity alive!