I moved my sites from Google Kubernetes Engine to Netlify and saved $1000 / year (plus numerous hours of maintenance)
This month I moved from Google Cloud’s hosted Kubernetes (Google Kubernetes Engine or GKE) to Netlify. My migration onto GKE was one of my big projects last year and this move to Netlify was a big move this year so I wanted to take the time to sit down and reflect on the journey - the decisions that were made and the resultant outcomes.
Last year I migrated all my sites to Google Kubernetes Engine from running ad hoc on a machine from DigitalOcean. I did this for a few reasons, but the biggest ones I can remember were:
- I wanted more structure and less dealing with individual machines
- I wanted to learn about the hot shit that was Kubernetes
- I wanted experience with a mainstream cloud provider
Given these goals, the experience was valuable and served its purpose. I did learn a ton about k8s and do prefer the code-first configurations it provides over configuring individual machines. It’s saved me lots of time and lots of headaches.
I get k8s at a basic level - I understand why its approach is so scalable both from a performance and maintenance standpoint.
I got experience with Google Cloud and can see how one could scale their creation to hundreds of thousands of users should the need arise.
moving away from GKE
As the post title suggests, though this experience was extremely valuable for me, it wasn’t enough to keep me there.
gke was overkill for my sites
I think my sites are cool and valuable (which is part of the reason I keep building them) but they aren’t really that popular. Last month, one of my posts hit the front page of Hacker News bringing in the most traffic I’d ever gotten - ~1,500 visitors in one day. This brought my total monthly visitors up to 3,801.
A screenshot of what that spike looked like in my stats
This was obviously a lot of traffic for me. But if we sit down to do the math real quick, 3,800 visitors each month means just 0.001446 per second, 0.08676 per minute, or 5.205 per hour (see the Wolfram Alpha calculation). Which isn’t much at all.
A modern laptop could easily handle that. A 5-year-old laptop probably wouldn’t have trouble. I think even a new Raspberry Pi could do it with no problem (assuming you keep it from overheating).
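For anyone who wants to check the math without Wolfram Alpha, here is a minimal Python sketch of the same conversion. It assumes an average month of 365 / 12 days (about 30.42), which is what reproduces the figures above; the function name `visitor_rates` is made up for this example.

```python
# Average traffic rate implied by a monthly visitor count.
# Assumption: an "average" month of 365 / 12 days (~30.42 days).
DAYS_PER_MONTH = 365 / 12


def visitor_rates(monthly_visitors: float) -> dict[str, float]:
    """Convert visitors per month into per-hour, per-minute, per-second rates."""
    hours_per_month = DAYS_PER_MONTH * 24
    per_hour = monthly_visitors / hours_per_month
    return {
        "per_hour": per_hour,
        "per_minute": per_hour / 60,
        "per_second": per_hour / 3600,
    }


rates = visitor_rates(3800)
for unit, value in rates.items():
    print(f"{unit}: {value:.4g}")
# per_hour: 5.205
# per_minute: 0.08676
# per_second: 0.001446
```

Note that these are averages. A front-page day like the ~1,500 visitor spike is burstier than this, but even that works out to roughly one visitor a minute.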
The point of all this is that while my setup was good, it was also way overkill for what I was actually using it for. We can see this in my CPU utilization for the last two weeks in November (the same time period where my Hacker News traffic spike occurred) where my utilization didn’t spike despite the extra load.
My Kubernetes cluster CPU utilization for the last half of November
It’s great that my cluster could support such a high load without even trying, but when things like that happen you have to wonder - am I doing too much?
gke was expensive for what I needed
To the question of whether I was doing too much, my answer is: probably.
You can play with cloud provider pricing using DigitalOcean’s pricing calculator but basically they’re all in the same range until you move up from the starter tier.
For me, I was running on 3 N1 standard machines, as that was listed as the recommended minimum for running k8s in the docs. I don’t know if they’ve changed that, but I’m now positive I could’ve run on a lower tier of machines while supporting my performance constraints - but I wasn’t sure then and just wanted to get up and running.
My 2019 Google Cloud expenses for my sites
So in 2019 I spent approximately $1,000 to host my sites, with ~$600 of that coming out of pocket after the credits and discounts I had accrued. In 2019, my sites had about 9,400 visitors which means that each visitor cost ~$0.106 to serve. Which is a bit ridiculous when you consider that the average amount of money one makes per visitor from ads is just $0.0025 (as calculated in a previous post comparing ad revenue to browser-based crypto mining revenue).
Was I doing too much? Yes.
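The per-visitor numbers are plain division, but it helps to see them side by side. A rough sketch using only the figures quoted in this post (the variable names are mine, and the ad revenue figure is the $0.0025 estimate mentioned above):

```python
# Figures quoted in the post (all approximate).
total_cost_usd = 1000             # total 2019 hosting spend
out_of_pocket_usd = 600           # portion actually paid after credits/discounts
visitors_2019 = 9400              # total visitors in 2019
ad_revenue_per_visitor = 0.0025   # estimated ad revenue per visitor

cost_per_visitor = total_cost_usd / visitors_2019
out_of_pocket_per_visitor = out_of_pocket_usd / visitors_2019
cost_to_revenue_ratio = cost_per_visitor / ad_revenue_per_visitor

print(f"cost per visitor:          ${cost_per_visitor:.3f}")
print(f"out-of-pocket per visitor: ${out_of_pocket_per_visitor:.3f}")
print(f"cost vs. ad revenue:       {cost_to_revenue_ratio:.1f}x")
```

In other words, serving a visitor cost roughly 40 times what that visitor could plausibly have earned in ad revenue.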
gke is not maintenance free
The final thing for me - and really the straw that broke the Bantha’s back - was that Kubernetes is not quite a set-it-and-forget-it system. I learned a lot about k8s through this process, but I would still say I’m an amateur. As such, when things go wrong I rarely have the ability to diagnose and fix them quickly and independently.
I’d guesstimate that I spent somewhere between 20 and 40 hours in 2019 troubleshooting problems in my cluster - from why pods weren’t updating to why pods would die once every quarter or so to why my automatic ssl renewal didn’t go through (the most recent of my trials). This isn’t that much in the grand scheme of my side projects - I love creating projects and try to work on them each and every day - but it is an opportunity cost in the projects I can work on. I think most people would agree with me that building a new project is more exciting than troubleshooting why cert-manager spontaneously combusted in the past month and the recommended path of purging and re-installing fails cert renewal silently, particularly when you had a cool new project you were really excited to work on instead.
Had I needed the extra oomph that k8s could provide for me, then this extra maintenance wouldn’t have been a big deal - it’s a cool project and the work was worth it - but once I realized that I really didn’t need this, the extra work became meaningless and inefficient (read: ripe for the chopping block).
So I moved to Netlify. The main reasons I did this were 1) I knew it was simple to set up, 2) I knew it worked (both from online chatter and personal experience in the building of Mine for Good), and 3) I figured that my use cases would fit squarely within the free tier.
So far I’ve been right about all of them. I was able to move off of GKE in < 2 hours (including builds, deploys, subdomain routing, and SSL cert renewal - though your routing may vary ofc) and haven’t had any problems - in fact I think my sites might actually be a little faster now, maybe due to the extra cacheability of my different sites now that they aren’t all sitting behind a reverse proxy on a single IP, but idk.
From their pricing page, I have two small concerns. The first is the bandwidth constraint (100GB), but it’s $20 for another 100GB so I can deal with that when the time comes (and it’s still less than what I was paying all the time on GKE). The second is that they don’t list a 99.99% SLA as available for us commoners, but I’m not sure that really matters for me and I can always revisit if it becomes a problem.
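To put the bandwidth worry in perspective, here is a small sketch of what overage might cost compared to the old GKE bill. It uses only the numbers from this post and makes two assumptions that are mine, not Netlify’s: that overage is billed in whole 100GB blocks, and that the ~$1,000 / year GKE spend was spread evenly across months. The function name `monthly_bandwidth_cost` is made up for illustration.

```python
import math

INCLUDED_GB = 100            # free-tier bandwidth quoted in the post
OVERAGE_BLOCK_GB = 100       # size of each extra block, per the post
OVERAGE_BLOCK_USD = 20       # price of each extra block, per the post
GKE_MONTHLY_USD = 1000 / 12  # assumption: ~$1,000 / year spread evenly


def monthly_bandwidth_cost(used_gb: float) -> int:
    """Overage cost in USD, assuming billing in whole 100GB blocks."""
    extra_gb = max(0, used_gb - INCLUDED_GB)
    blocks = math.ceil(extra_gb / OVERAGE_BLOCK_GB)
    return blocks * OVERAGE_BLOCK_USD


for used_gb in (50, 150, 500, 501):
    cost = monthly_bandwidth_cost(used_gb)
    verdict = "cheaper than" if cost < GKE_MONTHLY_USD else "more than"
    print(f"{used_gb:>4} GB -> ${cost:>3} ({verdict} GKE at ~${GKE_MONTHLY_USD:.0f}/month)")
```

Under those assumptions, bandwidth would have to pass 500GB in a month before the overage alone cost more than the cluster did.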
The big implicit win for me moving to Netlify is that the infra is almost totally managed. I don’t have to worry if the infra goes down - their engineers will handle it. This is a slight paradigm shift from what I typically like to do - I really, really like the infrastructure of big architectures, how they’re built and maintained - but for my personal projects where the focus isn’t on the infra but what the projects actually do, this is a huge win for my focus and, logically, for my projected output.
My main takeaway from this experience is that it’s okay to optimize for the needs of the here and now, but for that to work you also need a recurring habit of re-evaluating your choices so you don’t stick to an old optimization past its usefulness. I chose k8s for good reasons last year, and when I began to realize that the decision no longer served me as well as it had, I revisited it and changed course - by doing so, I’ve probably saved myself ~$1,000 in 2020.
It’s only been a few days since I moved off of GKE to Netlify but, like I said, it’s been smooth sailing thus far and I’m saving about $60 each month. If something comes up with Netlify, particularly with bandwidth and / or uptime limits, I’ll report back here and likely undertake another move.
Until then, thanks for reading!