$ cat post/sudo-bang-bang-run-that-script-still-runs-somewhere-deep-disk-full-on-impact.md

sudo bang bang run / that script still runs somewhere deep / disk full on impact


September 18, 2017 - Platform Engineering in the Kubernetes Era


September 18, 2017, found me in my home office late at night, surrounded by open browser tabs: links to news stories that read like a time capsule of internet outrage and tech drama. The day had been quiet after a long week, but my mind was buzzing with thoughts about the state of platform engineering.

Platform Engineering in 2017

The term “platform engineering” was starting to gain traction around this time. I remember debating the definition with colleagues: is it just DevOps at scale, or is it something more specific, a discipline built around automation and infrastructure as code?

Kubernetes had clearly emerged as the winner of the container wars, and its adoption was exploding. We were moving our development teams over to k8s clusters, but that meant we needed robust tooling around it. Helm for packaging charts, Istio for the service mesh, and Envoy for network proxying: all of these were becoming essential parts of the Kubernetes ecosystem.
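
This was the Helm 2 era, when Tiller still ran inside the cluster and did the actual deploying. For flavor, installing a chart from the stable repo looked roughly like this; the release name and namespace here are made up:

```bash
# Helm 2: Tiller lives in-cluster and performs the deploys.
helm init            # install Tiller into the cluster
helm repo update     # refresh the stable chart index

# Install nginx-ingress from the stable repo under a release name of our choosing.
helm install stable/nginx-ingress --name edge-proxy --namespace ingress

helm list            # what does Tiller think is deployed?
```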

I spent a good portion of my time integrating our CI/CD pipelines with Kubernetes, making sure developers could deploy their code into production without breaking anything. It was rough at first; Kubernetes has a lot of moving parts, and we had to learn the nuances of every one of them.
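
Once cluster access was sorted out, the deploy stage of a pipeline ended up surprisingly small. Here is a minimal sketch of the shape such a step tends to take; the registry, deployment, and container names are hypothetical, and it assumes the CI system exports the commit SHA as GIT_COMMIT:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names; substitute your own registry and deployment.
IMAGE="registry.example.com/myapp:${GIT_COMMIT}"

# Build and push an image for this exact commit.
docker build -t "$IMAGE" .
docker push "$IMAGE"

# Point the existing Deployment at the new image; Kubernetes
# performs a rolling update of the pods.
kubectl set image deployment/myapp myapp="$IMAGE"

# Block until the rollout succeeds or fails, so the pipeline's
# status reflects what actually happened in the cluster.
kubectl rollout status deployment/myapp
```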

GitOps: The New Frontier

Then came the buzz about GitOps: the idea that the desired state of your infrastructure should live in version control, just like application code, with tooling that continuously reconciles the cluster against it. It felt like a no-brainer, but implementing it was far from easy. We started using tools like Spinnaker for deployments and Flux for syncing our cluster state with a Git repository.
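
Stripped of everything Flux actually handles for you (authentication, drift reporting, image automation), the reconciliation loop at the heart of GitOps is conceptually tiny. A toy sketch, with a made-up checkout path:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Toy GitOps reconciler: Git is the source of truth, and the
# cluster is continuously converged toward it. Flux does this
# properly; this only illustrates the concept.
REPO_DIR=/var/lib/gitops/cluster-state   # hypothetical checkout

cd "$REPO_DIR"
while true; do
  git pull --ff-only          # fetch the desired state
  # 'kubectl apply' is idempotent: unchanged manifests are a
  # no-op, changed ones are moved toward what Git says.
  kubectl apply -f manifests/
  sleep 60
done
```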

One of the biggest challenges we faced was making sure everyone understood what GitOps meant in practice. We went through several iterations of documentation and training sessions. It wasn’t just about moving code around; it was about changing how people think about infrastructure, treating it as a first-class citizen alongside application logic.

Terraform and the Hype Train

On the infrastructure side, we were still dealing with Terraform 0.x, a pre-1.0 version number that belied how much production infrastructure already depended on it. We had to juggle different versions across projects while trying to work out best practices for using it effectively. The community was growing quickly, but the tool itself was evolving just as fast, which made stability a constant concern.
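
Pinning the exact binary per project was the only way to stay sane. In CI, that amounted to something like the following; the version is illustrative, and the download URL follows HashiCorp’s standard release scheme:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Pin the exact Terraform version per project; 0.x releases were
# not always compatible with each other's syntax or state.
TF_VERSION=0.10.8   # illustrative: whatever the project was last tested against

curl -fsSLO "https://releases.hashicorp.com/terraform/${TF_VERSION}/terraform_${TF_VERSION}_linux_amd64.zip"
unzip -o "terraform_${TF_VERSION}_linux_amd64.zip" -d /usr/local/bin

terraform version   # fail fast if the wrong binary won
```

Terraform also supports a required_version constraint in the terraform block, which catches the same mismatch inside the configuration itself.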

One of my favorite moments from this period was arguing with one of our senior engineers about whether we should use Terraform or Ansible for infrastructure changes. He believed in the Ansible way; I thought Terraform would give us more control over how and when changes were applied to our clusters. The debate got quite heated, but ultimately we settled on a hybrid approach: Terraform for critical changes, Ansible for less risky tasks.
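
The “control” argument boiled down to Terraform’s plan/apply split: you materialize exactly what will change as a reviewable artifact, then apply that artifact and nothing else. In shell terms:

```bash
# Materialize the proposed changes as a reviewable artifact.
terraform init
terraform plan -out=release.tfplan

# ... a human (or a CI gate) reviews the plan output here ...

# Apply exactly the plan that was reviewed, nothing more.
terraform apply release.tfplan
```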

Prometheus + Grafana: The New Watchmen

On the monitoring side, we had fully embraced Prometheus and Grafana. Our old Nagios setup felt outdated by comparison, and the difference was night and day: we could now get real-time insight into our systems at a far finer grain than anything we had before.

However, setting up these new tools wasn’t without its quirks. I spent a fair amount of time debugging strange anomalies in our monitoring dashboards: discrepancies between what the metrics showed and what was actually happening in production. It took some trial and error to get things just right.
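
Most of that debugging reduces to asking Prometheus directly and comparing its answer with the dashboard’s, which tells you whether the discrepancy lives in the data or in the Grafana query. The hostname and metric below are placeholders:

```bash
# Query Prometheus directly, bypassing Grafana. -G turns the
# urlencoded data into a GET query string.
curl -sG 'http://prometheus.internal:9090/api/v1/query' \
  --data-urlencode 'query=rate(http_requests_total{job="myapp"}[5m])'
```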

The Hype and the Reality

Looking back, September 2017 was the crest of a wave of hype around Kubernetes, Helm, Istio, and Envoy. There was plenty of excitement about serverless computing too, but our team wasn’t quite ready for that yet. We focused on getting k8s clusters up and running, building out CI/CD pipelines, and implementing GitOps practices.

The news stories from the time reflect a broader sense of upheaval and uncertainty in the tech community. The React licensing controversy, privacy concerns around Facebook, and the ongoing drama around Firefox all felt like distant distractions compared with the everyday work of moving our infrastructure forward.

In the end, it was a mix of excitement and frustration: excitement about what was possible with Kubernetes and the new tooling, frustration at the relentless learning curve. I remember feeling grateful for those experiences, knowing they would shape how we approached platform engineering for years to come.


That’s where I left off on September 18, 2017. The path forward was clear, yet full of unknowns, just like it is today.