Image Recognition with Tensorflow classification on OpenWhisk

The big picture

As described in a previous article, we (Niklas and I) are going to use Tensorflow to classify images into pre-trained categories. The previous article was about how to train a model with Tensorflow on Kubernetes. This article describes how to use the pre-trained model, which is stored on Object Storage. As with the training, we will use Docker to host our program, but this time we will use OpenWhisk as the platform.

As in the first part, I use the Google tutorial Tensorflow for Poets. This time I do not use the code itself; instead I copied the important classification parts from their script into my Python file.

OpenWhisk with Docker

OpenWhisk is an open source implementation of a so-called serverless computing platform. It is hosted by Apache and maintained by many companies. IBM offers OpenWhisk on the IBM Cloud, and for testing and playing around with it, use is free. Besides Python and JavaScript, OpenWhisk can also run Docker containers. Internally, all Python and JavaScript code is executed in Docker containers anyway. So we will use the same official Tensorflow Docker image that we used to build our training container.

Internally, OpenWhisk has three stages for Docker containers. When we register a new action, only the execution instruction is stored in a database. As soon as the first call reaches OpenWhisk, the Docker image is pulled from the repository, the container is initialised by a REST call to ‘/init’, and the action is then executed by calling the REST interface ‘/run’. The container stays active, and each subsequent call executes only the ‘/run’ part. After some time of inactivity the container is destroyed and must be initialised with ‘/init’ again. After even more inactivity the image itself is removed and needs to be pulled again.
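
The three stages can be illustrated with a toy model. This is only a sketch of the behaviour described above, not OpenWhisk code; the class and method names are made up for illustration.

```python
# Toy model of the container lifecycle described above (not OpenWhisk code).
class Container:
    """Tracks how often each lifecycle stage is executed."""

    def __init__(self):
        self.image_pulled = False
        self.initialised = False
        self.calls = {"pull": 0, "init": 0, "run": 0}

    def invoke(self):
        if not self.image_pulled:      # image is not on the host yet
            self.calls["pull"] += 1
            self.image_pulled = True
        if not self.initialised:       # container has not been started yet
            self.calls["init"] += 1    # corresponds to POST /init
            self.initialised = True
        self.calls["run"] += 1         # corresponds to POST /run

    def destroy_container(self):
        """Some time of inactivity: the container is destroyed."""
        self.initialised = False

    def remove_image(self):
        """Even longer inactivity: the image is removed as well."""
        self.initialised = False
        self.image_pulled = False


c = Container()
c.invoke()              # cold start: pull + init + run
c.invoke()              # warm call: run only
c.destroy_container()
c.invoke()              # init + run, but no pull
print(c.calls)          # {'pull': 1, 'init': 2, 'run': 3}
```

The point of the model is that the expensive stages run rarely, while the run stage is executed on every call.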

The setup

The code itself is stored on GitHub. Let’s first look at how we build the Docker container:


FROM tensorflow/tensorflow:1.4.0-py3

WORKDIR /tensorflow
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt


CMD python -u

As you can see, this Dockerfile is now really simple. It basically installs the Python requirements to access the Swift Object Store and starts the Python program. The Python program keeps running until the OpenWhisk system decides to stop the container.

We make heavy use of the idea of having an init and a run part in the executed code. So the Python program has two main parts: the first one is init and the second is run. Let’s have a look at the init part first, which basically sets the stage for the classification itself.


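import flask
import tensorflow as tf
from swiftclient.client import Connection

# app is the module-level flask.Flask instance

@app.route('/init', methods=['POST'])
def init():
    global graph, labels   # reused by the run part

    message = flask.request.get_json(force=True, silent=True)

    if message and not isinstance(message, dict):
        flask.abort(404)

    try:
        # Connect to the Object Store (credentials are placeholders).
        # authurl/auth_version are not part of this listing and have to be
        # supplied for a real connection.
        conn = Connection(key='xxxxx',
                          os_options={"project_id": 'xxxxxx',
                                      "user_id": 'xxxxxx',
                                      "region_name": 'dallas'})

        # Read the retrained graph directly into memory
        obj       = conn.get_object("tensorflow", "retrained_graph.pb")
        graph     = tf.Graph()
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(obj[1])
        with graph.as_default():
            tf.import_graph_def(graph_def)

        # Read the labels, one name per line
        obj    = conn.get_object("tensorflow", "retrained_labels.txt")
        labels = []
        for i in obj[1].decode("utf-8").split():
            labels.append(i)

    except Exception as e:
        print("Error in downloading content")
        response = flask.jsonify({'error downloading models': str(e)})
        response.status_code = 512
        return response

    return ('OK', 200)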

Unfortunately, it is not so easy to configure the init part dynamically with parameters from outside. So for this demo we need to hard-code the Object Store credentials in our source code. It doesn’t feel right, but for a demo it is OK. In a later article I will describe how to change the flow and inject the parameters dynamically. So what are we doing here?

  1. The Connection(...) call sets up a connection to the Object Store as described here.
  2. The lines from conn.get_object("tensorflow", "retrained_graph.pb") through graph.as_default() read the pre-trained Tensorflow graph directly into memory. tf is a global variable.
  3. The second conn.get_object(...) call and the for loop read the labels, which are basically a string of names separated by line breaks. The labels are in the same order as the categories in the graph.

By doing all this in the init part we only need to do it once, and the run part can concentrate on classifying the images without any further time-consuming loading.
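
The label parsing from the init part can be tried in isolation. The sketch below uses a made-up byte string in place of the object downloaded from the Object Store; the label names are the ones that appear in the sample result later in this article, and the multi-word label is purely hypothetical.

```python
# Stand-in for obj[1], the raw bytes returned for retrained_labels.txt.
raw = b"daisy\ndandelion\nroses\nsunflowers\ntulips\n"

# Same approach as the init part: decode, then split on whitespace.
labels = raw.decode("utf-8").split()
print(labels)  # ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

# split() breaks on any whitespace, so a label such as "african daisy"
# would become two entries; splitlines() keeps one label per line.
raw_multiword = b"african daisy\ntulips\n"
print(raw_multiword.decode("utf-8").split())       # ['african', 'daisy', 'tulips']
print(raw_multiword.decode("utf-8").splitlines())  # ['african daisy', 'tulips']
```

For single-word labels both variants give the same list, so the choice only matters if category names contain spaces.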

Tensorflow image manipulation and classification

import base64
import numpy as np

@app.route('/run', methods=['POST'])
def run():

    def error():
        response = flask.jsonify({'error': 'The action did not receive a dictionary as an argument.'})
        response.status_code = 404
        return response

    message = flask.request.get_json(force=True, silent=True)

    if message and not isinstance(message, dict):
        return error()

    # OpenWhisk passes the action parameters under the key "value"
    args = message.get('value', {}) if message else {}

    if not isinstance(args, dict):
        return error()

    if "payload" not in args:
        return error()

    # Write the base64 encoded image to a file so Tensorflow can read it back
    with open("/test.jpg", "wb") as f:
        f.write(base64.b64decode(args['payload']))

    file_reader      = tf.read_file("/test.jpg", "file_reader")
    #file_reader      = tf.decode_base64(args['payload'])
    image_reader     = tf.image.decode_jpeg(file_reader, channels=3, name='jpeg_reader')
    float_caster     = tf.cast(image_reader, tf.float32)
    dims_expander    = tf.expand_dims(float_caster, 0)
    resized          = tf.image.resize_bilinear(dims_expander, [224, 224])
    normalized       = tf.divide(tf.subtract(resized, [128]), [128])
    input_operation  = graph.get_operation_by_name("import/input")
    output_operation = graph.get_operation_by_name("import/final_result")
    tf_picture       = tf.Session().run(normalized)

    with tf.Session(graph=graph) as sess:
        # Run the network and flatten the nested result to a 1-D array
        results = np.squeeze(sess.run(output_operation.outputs[0],
                                      {input_operation.outputs[0]: tf_picture}))
        index   = results.argsort()
        answer  = {}

        # Map each probability to its label name
        for i in index:
            answer[labels[i]] = float(results[i])

        response = flask.jsonify(answer)
        response.status_code = 200

    return response

How to get the image

The image is transferred base64 encoded as part of the request. Part of the dictionary is the key payload. I chose this name because Node-RED uses the same name for its most important key. Tensorflow has a function to consume base64 encoded data as well, but I could not get it to run with the image encoding I use (possibly because tf.decode_base64 expects the web-safe base64 alphabet). So I took a little extra step here: I write the image to a file and read it back later. By consuming it directly I think we could save some milliseconds of processing time.
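
The round trip from request to file can be sketched with the standard library alone. The byte string below is only a stand-in for a real JPEG, and the temporary path replaces the /test.jpg used in the action; the "value" wrapper mirrors what the run part reads with message.get('value', {}).

```python
import base64
import json
import os
import tempfile

# Stand-in for real JPEG bytes; any byte string works for the round trip.
image_bytes = b"\xff\xd8\xff\xe0 not a real image \xff\xd9"

# What arrives at /run: the parameters wrapped in a "value" dictionary,
# with the image as base64 text under the key "payload".
request_body = json.dumps(
    {"value": {"payload": base64.b64encode(image_bytes).decode("ascii")}}
)

# What the run part does with it: unwrap, decode, write to a file.
message = json.loads(request_body)
args = message.get("value", {}) if message else {}

path = os.path.join(tempfile.mkdtemp(), "test.jpg")  # the action uses /test.jpg
with open(path, "wb") as f:
    f.write(base64.b64decode(args["payload"]))

with open(path, "rb") as f:
    restored = f.read()
print(restored == image_bytes)  # True
```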

Transform the image

  • tf.read_file reads the image back from the file
  • tf.image.decode_jpeg decodes the JPEG into an internal representation
  • tf.cast casts the values to a float32 array
  • tf.expand_dims adds a new dimension at the beginning of the array
  • tf.image.resize_bilinear resizes the image to 224 × 224 to match the size of the training data
  • tf.divide and tf.subtract normalize the image values
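
The arithmetic behind the normalization step and the shape changes of the other steps can be checked without Tensorflow. This is a plain-Python sketch; the 480 × 640 input size is a hypothetical example, not something taken from the training data.

```python
def normalize(pixel, mean=128, std=128):
    """Same arithmetic as tf.divide(tf.subtract(x, [128]), [128])."""
    return (pixel - mean) / std


print(normalize(0))    # -1.0
print(normalize(128))  # 0.0
print(normalize(255))  # 0.9921875

# Shape bookkeeping for the other steps, without Tensorflow:
decoded_shape = (480, 640, 3)            # hypothetical JPEG: height, width, channels
expanded_shape = (1,) + decoded_shape    # expand_dims(..., 0) adds a batch axis
resized_shape = (expanded_shape[0], 224, 224, expanded_shape[3])
print(resized_shape)   # (1, 224, 224, 3)
```

So the pixel values 0 to 255 end up roughly in the range -1 to 1, and the network receives a batch containing a single 224 × 224 RGB image.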

Classify the image

  • graph.get_operation_by_name gets the input and output layers and stores them in variables
  • tf.Session().run(normalized) loads the image into Tensorflow
  • The np.squeeze line inside the tf.Session block is where the magic happens. Tensorflow processes the CNN with the input and output layers connected and consumes the Tensorflow image. Furthermore, numpy squeezes out all array nesting to a single array.
  • results then holds an array with the probabilities for each category, and results.argsort() orders the category indices by probability.

Map the result to labels

The missing last step is now to map the label names to the results, which is done in the for loop at the end.
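
The mapping step can be reproduced in plain Python. The scores below are rounded, hypothetical values in label order, and sorted() stands in for numpy's argsort, which also returns indices in ascending order of value.

```python
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
# Hypothetical, rounded scores in label order (one value per category).
results = [0.9998, 0.00007, 0.0000004, 0.00003, 0.00000000005]

# Pure-Python equivalent of results.argsort(): indices, lowest score first.
index = sorted(range(len(results)), key=lambda i: results[i])

answer = {}
for i in index:
    answer[labels[i]] = float(results[i])

best = labels[index[-1]]   # ascending order, so the winner comes last
print(index)               # [4, 2, 3, 1, 0]
print(best)                # daisy
```

Note that the sample response shown later lists the labels alphabetically rather than by score, so the order produced by the sort does not appear to survive the JSON serialisation. A client that needs the best match should pick the maximum itself.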

Build and deploy it in OpenWhisk

The Docker container can be built with

docker build -t <namespace>/tensorflow-openwhisk-classify:latest .

and pushed with

docker push <namespace>/tensorflow-openwhisk-classify:latest

Run it in OpenWhisk

After configuring the command line tool wsk the action itself can be created with

wsk action create tensorflow-classify --docker <namespace>/tensorflow-openwhisk-classify:latest

For testing we need a base64 encoded image stored as a file on our local hard disk. Then we can invoke the action with

wsk action invoke --result tensorflow-classify --param payload `cat test.base64`
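
The test.base64 file used in the command above could be produced with a few lines of Python. This is a sketch only: the helper name write_base64_file is made up, and the demo encodes a stand-in file instead of a real JPEG.

```python
import base64
import os
import tempfile


def write_base64_file(image_path, out_path):
    """Encode a binary file as a single line of base64 text."""
    with open(image_path, "rb") as src:
        encoded = base64.b64encode(src.read())   # output contains no line breaks
    with open(out_path, "wb") as dst:
        dst.write(encoded)
    return encoded


# Demo with a stand-in file; replace it with a real JPEG such as test.jpg.
workdir = tempfile.mkdtemp()
sample = os.path.join(workdir, "test.jpg")
with open(sample, "wb") as f:
    f.write(bytes(range(256)) * 4)

encoded = write_base64_file(sample, os.path.join(workdir, "test.base64"))
print(b"\n" in encoded)   # False
```

Keeping the encoded text on a single line matters here, because the backtick substitution in the shell would otherwise split the file content into several arguments.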

The first execution will take up to 15 seconds because the Docker image has to be pulled from Docker Hub and the graph has to be loaded from the Object Store. Later calls should take around 150 milliseconds of processing time. The parameter --result forces OpenWhisk to wait for the function to end and shows you the result on your command line.

{
    "daisy": 0.9998985528945923,
    "dandelion": 0.00007187054143287241,
    "roses": 4.515387388437375E-7,
    "sunflowers": 0.000029122467822162434,
    "tulips": 4.63972159303605E-11
}

If you want to get the log file and also an exact execution time try this command:

wsk activation get `wsk activation list | grep tensorflow-classify | cut -f 1 -d " " |head -n 1`

  • The first call results in “duration”: 3805. The call itself took far longer, because 3805 only covers the execution of the Docker container (including init), not the time it took OpenWhisk to pull the image from Docker Hub.
  • The second call results in “duration”: 156.

Build a web UI

Well, UI is nothing I can talk about. But have a look at Niklas’ blog post on how to build a web UI. A test installation can be found here:
