
Diary of a GKE Journeyman - Running your personal Kubernetes cluster for (almost) free on GKE

Mehdi El Gueddari

22 Mar 2019 • 8 min read

The first thing you'll need when you start developing apps on Kubernetes is your own Kubernetes cluster to play with, learn on and experiment with. Docker Desktop makes this easy by shipping with a pre-configured Kubernetes cluster that you can enable by ticking a checkbox in its preferences pane.

Running Kubernetes locally comes at a (large) cost, however. Docker and Kubernetes are CPU and battery hogs, even when sitting idle. Expect to lose well over half of your battery life to them.

Thankfully, there's another way: running your personal Kubernetes cluster in the cloud. On Google Kubernetes Engine (GKE), it turns out to be surprisingly cheap.

If your production app will be hosted on GKE, developing on a GKE cluster rather than locally will make the behaviour of your development environment much closer to that of the production one - always a good thing. It also lets you take advantage of all the tools GCP offers to manage and troubleshoot your app, such as fully-managed Istio or Stackdriver. Cherry on the cake: getting used to these tools during development means that when the time comes to troubleshoot a production issue, you'll be ready.

Creating your personal K8S cluster on GKE

If you haven't got a GCP account yet, go create a trial account. You'll get $300 of free credit valid for a year. Then install and initialize the Google Cloud SDK on your machine. You'll also need to enable billing for your GCP project. You can use the default project GCP created for you or create your own; apart from the name of the project, it makes no difference. Don't worry - billing will only use up your free $300 credit. Your card will never be charged.

With this done, time to create your cluster:

# Install the gcloud beta commands if not already done
gcloud components install beta

# Enable the GKE API if not already done
gcloud services enable container.googleapis.com

# Create a personal cluster
gcloud beta container clusters create personal \
  --zone=europe-west1-b \
  --preemptible \
  --num-nodes=1 \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=6 \
  --maintenance-window=03:00 \
  --enable-ip-alias \
  --enable-autoupgrade \
  --enable-autorepair \
  --enable-stackdriver-kubernetes \
  --no-enable-basic-auth

# Check that it worked
kubectl get nodes

In detail: digging into the configuration of our cluster

Our goal is to create a cluster that:

Just Works.
Is as close to free as possible.
Is fully-managed and 100% no-ops.
Can be trivially torn down and re-created in a single gcloud command. Experimentation and learning involve breaking many things. You want it to be quick & easy to start from scratch again.
Doesn't involve any additional tooling to create or manage.
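To make the tear-down part concrete, here's a minimal sketch of what deleting the cluster might look like. The run helper and the DRY_RUN variable are my own illustrative additions (they aren't part of gcloud): by default the script only prints the command, and setting DRY_RUN=0 runs it for real. The cluster name and zone are assumed to match the create command above.

```shell
# Hypothetical helper: print the command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

CLUSTER=personal
ZONE=europe-west1-b

# Delete the cluster and everything running on it.
# --quiet skips gcloud's interactive confirmation prompt.
teardown() {
  run gcloud container clusters delete "$CLUSTER" --zone="$ZONE" --quiet
}

teardown
```

Once the cluster is gone, re-running the create command from earlier gives you a fresh one.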

Our goal is not to create a hardened, highly-available or production-ready cluster (don't use this cluster to host public-facing apps!).

This is how we do it:

Beta features

We're using gcloud beta to enable Stackdriver Kubernetes Monitoring on our cluster. This allows the logs and metrics generated by both GKE and your own apps to be automatically shipped to Stackdriver. This is one less thing you'll have to worry about.

Stackdriver Kubernetes Monitoring will become the default and only option in future versions of GKE but is currently only available as a beta feature.

Zonal cluster

GKE clusters can be created as either zonal clusters (running in a single compute zone) or regional clusters (highly-available across multiple zones in a given region).

In our case, in order to minimize costs, we'll restrict ourselves to a single zone. This means that in the unlikely event of this zone going down, our cluster will be unavailable. That shouldn't be an issue in practice: just create a new cluster in a different zone.

For minimal latency, you'll want to pick the zone closest to you. You'll find the list of zones here or by running gcloud compute zones list.

Preemptible VMs

Preemptible VMs are up to 80% cheaper than standard VMs, making them ideal for our use-case. The downside is that they run for a maximum of 24 hours and may shut down at any time without notice.

Only one node by default

Running a cluster with a single node means that whenever that node goes down because of planned maintenance (e.g. a Kubernetes update) or an unplanned outage, our cluster will be unavailable. Potentially annoying. But cheaper than having multiple nodes. We can always increase the minimum number of nodes later if it becomes an issue in practice.

Auto-scaling

Auto-scaling may sound like it goes against our goal of keeping costs down. But it actually perfectly fits our goal of creating a cluster that Just Works while keeping costs as low as possible.

When creating pods on your cluster, each container will request a slice of CPU and a certain amount of RAM. On GKE, this defaults to 0.1 CPUs and 0MB RAM per container. You can increase or decrease these resource requests in your pod's manifest. Keep in mind that some system containers will be running on your nodes and using up resources in addition to your app's containers.
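As an illustration of what overriding those defaults might look like, here's a minimal sketch that writes a pod manifest with explicit resource requests. The pod name, the nginx image, the file name and the 250m / 64Mi values are placeholders I picked for the example, not recommendations.

```shell
# Write a minimal pod manifest with explicit resource requests.
# "hello", the nginx image and the request values are placeholders.
cat > pod-with-requests.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:
  containers:
  - name: hello
    image: nginx
    resources:
      requests:
        cpu: 250m      # a quarter of a vCPU instead of the 0.1 default
        memory: 64Mi
EOF

# To create the pod on the cluster:
#   kubectl apply -f pod-with-requests.yaml
```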

What this means in practice is that once you have a certain number of containers running on your cluster, new deployments will start failing due to lack of resources.

Our single node cluster is running a standard machine type with 1 vCPU. This means that deployments will start failing when we have just 10 containers with the default resource allocation running.
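The arithmetic behind that number can be sketched in a few lines of shell. This only counts CPU requests, in millicores (1000m = 1 vCPU). The 300m reserved for system containers in the second call is a made-up figure to show the effect, not a measured one.

```shell
# How many containers fit on a node, counting CPU requests only.
# All arguments are in millicores (1000m = 1 vCPU):
#   $1 = CPU on the node
#   $2 = CPU already requested by system containers
#   $3 = CPU request per container
containers_per_node() {
  echo $(( ($1 - $2) / $3 ))
}

containers_per_node 1000 0 100     # prints 10: the upper bound for a 1 vCPU node
containers_per_node 1000 300 100   # prints 7: with a hypothetical 300m used by system containers
```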

Auto-scaling solves this problem nicely by automatically provisioning new nodes as and when needed. But it works the other way around too: when you delete pods that you're no longer using on your cluster, auto-scaling will automatically de-provision nodes that are no longer needed, ensuring that you're not charged for nodes that you don't use. No-ops and cost optimization at its finest.

There is of course always a risk of forgetting to delete unused pods from your cluster, leading to charges for the nodes that were provisioned to host these pods. Using a tool like Skaffold, which makes it easy to delete an entire app from a cluster and can even automatically delete your app from your cluster at the end of each development session, can help mitigate this.

Maintenance window

Since we have a single-node cluster, setting a maintenance window at 3am UTC (the middle of the night for me) helps minimize downtime during the working day.

Auto-upgrade

With auto-upgrade, Google will take care of keeping the version of Kubernetes on our cluster up-to-date. That's one less thing to worry about.

Auto-repair

Auto-repair means that Google will continuously monitor the health of the VMs in our K8S node pool and automatically repair them if anything goes wrong. What's not to love?

Stackdriver Kubernetes Monitoring

With Stackdriver Kubernetes Monitoring, the logs and metrics of both your GKE cluster and your apps running on it will be automatically shipped to Stackdriver. This means that you'll get logging, metrics and error reporting for free out-of-the-box.

Disable Basic Auth

There's no reason to have Basic Auth enabled. It's an unnecessary risk. Basic Auth will be disabled by default in future versions of GKE. For now, you'll need to explicitly disable it.

Alias IPs

Alias IPs is arguably not strictly necessary for a personal, development cluster (until you need it that is). It doesn't cost anything though. And it may make things easier once you dig deeper into the entrails of GCP networking. These articles will get you started on your journey:

Upgrades and downgrades

More nodes for more reliability

If you find that your cluster is unavailable more often than you'd like because your single node is down, you can increase --num-nodes and --min-nodes. There will be additional costs for each additional node of course.
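If the cluster already exists, the autoscaler's minimum can also be raised in place rather than by re-creating the cluster. Here's a sketch of how that might look, assuming the cluster name and zone used earlier and the node pool name GKE uses when none is specified (default-pool). The run helper and DRY_RUN variable are illustrative additions of mine: the command is only printed unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: print the command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

CLUSTER=personal
ZONE=europe-west1-b

# Set the autoscaler's minimum node count ($1) on the existing cluster,
# keeping the maximum of 6 from the create command.
raise_min_nodes() {
  run gcloud container clusters update "$CLUSTER" \
    --zone="$ZONE" \
    --node-pool=default-pool \
    --enable-autoscaling \
    --min-nodes="$1" \
    --max-nodes=6
}

raise_min_nodes 2
```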

Even cheaper with smaller VMs

As we didn't specify a machine type when creating our cluster, our K8S nodes will use a standard machine type with one vCPU and 3.75GB of RAM.

If you'd like to cut costs even further, you could instead use a small or even micro machine type, which comes with as little as 0.2 vCPU and 0.6GB of RAM.

In practice, the inconvenience of frequently running out of resources on these small machines may outweigh the cost savings. But if you'd really like to do it, use this option when creating your cluster:

--machine-type=g1-small

Istio

If you'd like to use Istio, add this option when creating your cluster:

--addons=HorizontalPodAutoscaling,HttpLoadBalancing,Istio

(The HorizontalPodAutoscaling and HttpLoadBalancing add-ons are enabled by default and aren't related to Istio. But since we're specifying a custom list of add-ons in order to add Istio, we have to explicitly include the default ones.)

This will install Google's fully-managed Istio add-on on your cluster. With this add-on, Google will:

Install and configure Istio for you.
Keep Istio up-to-date.
When upgrading either Istio or Kubernetes, ensure that the versions of Istio and Kubernetes on your cluster are compatible with each other.

How much does it cost?

The Kubernetes control plane on GKE is always free.
The Istio add-on, if you use it, is free.
Each node (if using preemptible standard machine types) costs between $7 and $11 / month depending on the zone you chose (see here). You could make it free by using a non-preemptible micro instance in certain US regions instead (see here).
Stackdriver Monitoring: GCP metrics collected by default are free. If your apps record custom metrics, you'll be charged a small fee (see here).
Stackdriver Logging: free as long as you don't log more than 50GB / month (logs on Stackdriver are only retained for 30 days). See here.
If you want to expose your GKE-hosted apps to the public, you'll need a public IP address and a load balancer (see here for the different options available to expose apps to the public on GKE). That will cost between $18 and $28 / month depending on your cluster's region (see the Load Balancing section here).
If you only need to access your apps from your local machine, there is no need for a public IP address or load balancer. You can instead use kubectl port-forward to forward connections from a local port to your GKE-hosted app, which will let you access your app via localhost. See here for more details.
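For the port-forward option, a minimal sketch might look like this. The Service name hello and the ports are hypothetical. The run helper and DRY_RUN variable are again my own additions so that the script only prints the command unless DRY_RUN=0 is set.

```shell
# Hypothetical helper: print the command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

# Forward a local port to a port of a Service running on the cluster.
#   $1 = Service name, $2 = local port, $3 = Service port
forward() {
  run kubectl port-forward "service/$1" "$2:$3"
}

# "hello" is a placeholder Service listening on port 80.
forward hello 8080 80
```

While the forward is running, the app should answer on http://localhost:8080. Stop it with Ctrl+C.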

TOTAL: about $8 / month in an average region for a single node cluster. Or $29 / month if you also want to expose your GKE-hosted apps to the public.
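Here's the back-of-the-envelope arithmetic as a small shell sketch, in whole dollars. The $8 node price is the average figure used above, and the $21 load balancer price is simply the difference between the two totals. Both are assumptions that will vary with your region. The four-node line is a hypothetical case where three extra auto-scaled nodes stay up for a whole month.

```shell
# Rough monthly bill in whole dollars.
#   $1 = number of nodes
#   $2 = price per node per month
#   $3 = load balancer price per month (0 if none)
monthly_cost() {
  echo $(( $1 * $2 + $3 ))
}

# Whole months the $300 trial credit would last at a given monthly cost.
credit_months() {
  echo $(( 300 / $1 ))
}

monthly_cost 1 8 0     # prints 8: single node, no public exposure
monthly_cost 1 8 21    # prints 29: single node plus load balancer
monthly_cost 4 8 0     # prints 32: four nodes for a full month
credit_months 29       # prints 10
```

At $8 / month, the 12-month trial period ends long before the credit runs out. At $29 / month, the credit would run out after about 10 months.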

Will I get a surprise bill one day if I set this up in a trial account and then forget about it?

If you create your GKE cluster in a GCP trial account and either use up your $300 of free credit or reach the end of your 12-month trial period, all paid-for GCP resources will be stopped by Google. Google will not automatically start to charge you - even if you provided your credit card details when you signed up. You'll have to go to the GCP Console and explicitly request to have your credit card charged if you wish to start paying.

With a GCP trial account, there is no risk of accidentally spending real money at the end of your trial.

I have my personal GKE cluster. Now what?

A few example apps to try out your GKE cluster.

They use Skaffold to make the build and deployment to your cluster a one-liner affair. Once you're done with them, run skaffold delete from the root of their git repo and all traces of them will be removed from your cluster.

.NET and Go Hello World apps with GCP Stackdriver integration. A good starting place for your own apps.
The Google Microservice Demo app. Made of 10 microservices, it's a great way to see auto-scaling in action. Install it on your GKE cluster. Then check your K8S nodes: kubectl get nodes. GKE should have automatically provisioned an additional 3 nodes to allow hosting the app.
Delete the app with skaffold delete, wait a few minutes and watch auto-scaling automatically remove the now unused extra nodes.