Workload container for autoscaling test with kubernetes


The Idea

Every now and then you want to test your installation, your server or your setup. Specially when you want to test auto scaling functionalities. Kubernetes has an out of the box auto scaler and the official descriptions recommends a test docker container for testing with a apache and php installation. This is really great for testing a web application where you have some workload for a relatively short time frame. But I would also like to test a scenario where the workload runs for a longer time in the kubernetes setup and generates way more cpu workload then a web application. Therefore I hacked a nice docker container based on a c program load generator.

The docker container

The docker container is basically a very very simple Flask server with only one entry point “/”. The workload itself can be configured via two parameters:

  • percentage How much cpu load will be generated
  • seconds How long will the workload be active

The docker container itself uses nearly no CPU cycles as Flask is the only python process being active and waits for calls to start using CPU cycles.


I use a very nice open source tool called lookbusy from Devin Carraway which consumes memory and cpu cycles based on command line parameters. Unfortunately the program has no parameter to configure the time span it shout run. Therefore I call it the unix command timeout to terminate its execution after the given amount of seconds.

The Flask python wrapper

The only program is a python Flask one, very short and only takes the get call to its root folder, checks for the two parameters and starts a thread with the subprocess. The get call immediately returns as it also supports long run workload simulations.

The Dockerfile

The docker container is based on python latest (at this time 3.6.4). I put all the curl, make, install and rm calls into a single line in order to have a minimal footprint for the docker layer as we do not need the source code any more. As Flask is the only requirements I also call it directly without the requirements.txt file. The “-u” parameter for the python call is necessary to prevent python from buffering the output. Otherwise it can be quite disturbing when trying to read the debug log file.

Building and pushing the docker container

Building and pushing it to is straightforward and nothing special.

Testing it on a kubernetes cluster

I have chosen the IBM cloud to test my docker container.

Requesting a kubernetes cluster

Requesting a kubernetes cluster can be done after login with

This command uses the bluemix CLI with the cluster plugin to control and configure kubernetes on the IBM infrastructure. The parameters are

  • –name to give your cluster a name (will be very important later on)
  • –location which datacenter to use (in this case dallas). Use “bx cs locations” to get your possible locations for the chosen region
  • –workers how many worker nodes are requested
  • –kube-version which kubernetes version should be used. Use “bx cs kube-versions” to get the available versions. “(default)” is not part of the parameter call.
  • –private-vlan which vlan for the private network should be used. Use “bx cs vlans <location>” to get the available public and private vlans
  • –public-vlan see private vlan
  • –machine-type which kind of underlying configuration you want to use for your worker node. Use “bx cs machine-types <location>” to get the available machine types. The first number after the “.” is the amount of cores and one after “x” the the amount of RAM in GB.

This command takes some time (~1h) to generate the kubernetes cluster. BTW my bluemix cli docker container has all necessary tools and also a nice script called “” to query all parameters and start a new cluster. After the cluster is up and running we can get the kubernetes configuration with

Starting a pod and replica set

We start the pod and replica set without a yaml file because the request is very straight forward. Important here is the parameter “–requests“. Without it the autoscaler can not measure the cpu load and it never triggers.

Exposing the http port

Again because the call is so simple we directly call kubectl without a yaml file to expose the Port 80. We can check for the public IP with

In case the cloud runs out of public IP addresses and the “EXTERNAL_IP” is still pending after several minutes we can use one of the workers public ip addresses and the dynamic assigned port. The port is visible with “kubectl get svc” at the “PORTS” section. The syntax is as always in docker internalport:externalport. The workers public IP can be checked with

So instead of calling our service with a official public ip address on port 80 we can use


Kubernetes has a build in horizontal autoscaler which can be started with

In this case it measures the cpu load and starts new pods when the load is over 50%. The autoscaler in this configuration never starts more than 10 and never less than 2 pods. The current measurements and parameters can be checked with

So right now the cpu load is 0 and only one replica is running.


Time to get call our container and start the load test. Depending on the URL we an use curl to start the test with

and check the result after some time with

As we see the load increases and autoscaler kicks in. More details can obtained with the “kubectl proxy” command.

Deleting the kubernetes cluster

To clean up we could either delete all pods and replica sets and services but we could also delete the complete cluster with