Python model operator in a streams flow
The Python model operator provides a simple way to run Python models for real-time prediction and scoring. You can select a model to be loaded from IBM Cloud Object Storage or IBM Watson Machine Learning.
How it works
Loading from Cloud Object Storage
File objects are the external file resources that are required by your code at execution time. Your Machine Learning process() function can expect these resources to be available on the runtime-local file system before it is invoked for the first time.
You can specify more than one file object, such as when you want to also use a tokenizer or a dictionary for text analysis. For each file object, you specify its location path in Cloud Object Storage and a typically short reference name that is used in its callback function. Clicking Generate Callbacks appends a callback function stub to your code, for each file object.
When the flow starts running, each specified file object is downloaded from Cloud Object Storage and placed at a unique location on the runtime-local file system. At that point, your callback function is called with the runtime-local file path as an argument. Your callback function then instantiates and keeps the respective object for usage in subsequent processing.
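The start-up sequence described above can be imitated locally. The following is a minimal sketch, not the actual streams flow runtime: simulate_startup() and the temporary file that stands in for the downloaded object are hypothetical, and only the callback shape (the state plus a runtime-local path) follows the description above.

```python
import os
import pickle
import tempfile


def load_dictionary(state, path_dictionary):
    # Callback: deserialize the file object and keep it in the shared state.
    with open(path_dictionary, "rb") as f:
        state["dictionary"] = pickle.load(f)


def simulate_startup(callbacks, objects):
    """Hypothetical stand-in for the runtime: place each object in a unique
    local file, then call its callback with that local path."""
    state = {}
    workdir = tempfile.mkdtemp()
    for name, obj in objects.items():
        local_path = os.path.join(workdir, name + ".pkl")
        with open(local_path, "wb") as f:
            pickle.dump(obj, f)  # stands in for the Cloud Object Storage download
        callbacks[name](state, local_path)
    return state


state = simulate_startup(
    {"dictionary": load_dictionary},
    {"dictionary": {"good": 1, "bad": -1}},
)
print(state["dictionary"]["good"])
```

After the callbacks have run, process() would find the deserialized object under state["dictionary"].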
All specified file objects must be available on Cloud Object Storage before your process() function is called for the first time. Until then, any incoming events are held back.
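The hold-back behavior can be pictured as a gate in front of process(). This is only an illustrative sketch under assumptions: the HoldBackGate class and its method names are hypothetical, and the real operator may buffer events differently.

```python
from collections import deque


class HoldBackGate:
    """Hypothetical illustration: buffer events until every required
    file object has been loaded, then release them in arrival order."""

    def __init__(self, required_names, process):
        self.missing = set(required_names)
        self.process = process
        self.state = {}
        self.pending = deque()
        self.results = []

    def on_loaded(self, name, obj):
        # Called once a file object is available and deserialized.
        self.state[name] = obj
        self.missing.discard(name)
        if not self.missing:
            self._drain()

    def on_event(self, event):
        # Events always enter the queue; they leave it only when nothing is missing.
        self.pending.append(event)
        if not self.missing:
            self._drain()

    def _drain(self):
        while self.pending:
            self.results.append(self.process(self.pending.popleft(), self.state))


def process(event, state):
    event["known"] = event["word"] in state["dictionary"]
    return event


gate = HoldBackGate(["dictionary"], process)
gate.on_event({"word": "good"})             # held back: nothing is loaded yet
held = len(gate.results)                    # no results so far
gate.on_loaded("dictionary", {"good": 1})   # releases the buffered event
gate.on_event({"word": "bad"})              # processed immediately
```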
Cloud Object Storage is scanned continually to check whether a file object was updated. If so, the file is reloaded to the runtime-local file system and its callback function is called again. The callback deserializes the object again and updates the state with the new model object, without restarting the flow.
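The reload cycle can be sketched as a polling step that calls the callback again only when the file changed. How the service actually detects an update is not described here; comparing a content hash, as below, is an assumption, and fingerprint() and check_and_reload() are hypothetical helper names.

```python
import hashlib
import os
import pickle
import tempfile


def fingerprint(path):
    # Content hash used as the change marker (an assumption for this sketch).
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def load_model(state, path_model):
    with open(path_model, "rb") as f:
        state["model"] = pickle.load(f)


def check_and_reload(state, path, callback, last_seen):
    """One polling step: call the callback again only if the file changed."""
    current = fingerprint(path)
    if current != last_seen:
        callback(state, path)
    return current


path = os.path.join(tempfile.mkdtemp(), "model.pkl")
state = {}

with open(path, "wb") as f:
    pickle.dump({"version": 1}, f)
seen = check_and_reload(state, path, load_model, None)   # first load

with open(path, "wb") as f:
    pickle.dump({"version": 2}, f)
seen = check_and_reload(state, path, load_model, seen)   # update is picked up
print(state["model"])
```

Because the callback overwrites the entry in state, the next call to process() uses the new object without any restart.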
The Python objects that you load into Cloud Object Storage must be created with the same version of packages that are used in the streams flow. To see the list of preinstalled and user-installed packages, go to the canvas, click , and then click Runtime.
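One way to make the version check less error-prone is to record the package versions in the environment where you create the objects and compare them with the list shown under Runtime. The sketch below uses the standard importlib.metadata module; the package names and the runtime values are placeholders, not the actual contents of any streams flow runtime.

```python
import sys
from importlib import metadata


def package_versions(names):
    """Return the installed version of each named package (None if absent)."""
    versions = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_mismatches(training_env, runtime_env):
    """List packages whose versions differ between the two environments."""
    return {
        name: (version, runtime_env.get(name))
        for name, version in training_env.items()
        if runtime_env.get(name) != version
    }


# Package names are examples only; list the ones your objects depend on.
training_env = package_versions(["scikit-learn", "numpy"])

# Placeholder values standing in for what you copy from the Runtime package list.
runtime_env = dict(training_env, numpy="0.0.0")
print(find_mismatches(training_env, runtime_env))
```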
Goal: Run predictive analysis by using a vectorizer and a classifier that were uploaded to Cloud Object Storage. After you define the file objects and click Generate Callbacks, stubs are generated for your load_classifier() and load_vectorizer() callback functions.
import pickle

def init(state):
    pass

def process(event, state):
    text = event['tweet']
    text_t = state['vectorizer'].transform([text])
    y_pred = state['classifier'].predict(text_t)  # predicts classes as numbers, one per input
    labels = ['irrelevant', 'negative', 'neutral', 'positive']
    # predict() returns one entry per input text; take the first (and only) one
    event['sentiment'] = labels[int(y_pred[0])]
    return event

def load_classifier(state, path_classifier):
    # Called with the runtime-local path of the classifier file object
    with open(path_classifier, 'rb') as f:
        state['classifier'] = pickle.load(f)

def load_vectorizer(state, path_vectorizer):
    # Called with the runtime-local path of the vectorizer file object
    with open(path_vectorizer, 'rb') as f:
        state['vectorizer'] = pickle.load(f)
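Before uploading real objects, the process() logic can be exercised locally with stand-ins. The following is a sketch under assumptions: StubVectorizer and StubClassifier are hypothetical classes that only mimic the transform() and predict() calls used in the example, and the state is filled in directly instead of through the load callbacks. It takes the first element of the prediction explicitly, because predict() receives a one-element batch.

```python
class StubVectorizer:
    """Hypothetical stand-in: maps each text to a one-element feature list."""

    def transform(self, texts):
        return [[len(t.split())] for t in texts]


class StubClassifier:
    """Hypothetical stand-in: more than three words is class 3, otherwise class 2."""

    def predict(self, rows):
        return [3 if row[0] > 3 else 2 for row in rows]


LABELS = ['irrelevant', 'negative', 'neutral', 'positive']


def process(event, state):
    features = state['vectorizer'].transform([event['tweet']])
    y_pred = state['classifier'].predict(features)
    event['sentiment'] = LABELS[int(y_pred[0])]
    return event


state = {'vectorizer': StubVectorizer(), 'classifier': StubClassifier()}
result = process({'tweet': 'streams flows are really quite nice'}, state)
print(result['sentiment'])
```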
Loading from Watson Machine Learning
When you select Watson Machine Learning as your source for the model to be loaded, all Python models from all associated Machine Learning instances are listed. Select the model you want to load.
The load_model function is called when the model is loaded for the first time. Watson Machine Learning is checked continually for updates to the model; when the model is updated, the load_model function is invoked again with the updated model. The model is then part of the state and can be used later in the process function.
Until the model is loaded for the first time, any incoming events are held back.
Ensure that the Python packages used in this streams flow are compatible with the packages that you used to create the model. To see the list of preinstalled and user-installed packages, go to the canvas, click , and then click Runtime.
The following example shows Python model code for the Watson Machine Learning option.
import sys

def init(state):
    pass

def process(event, state):
    image = event['image']
    model = state['model']
    event['prediction'] = model.predict(image)
    return event

def load_model(state, model):
    state['model'] = model
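The effect of a model update can be tried out locally by calling load_model() twice. This is a minimal sketch: ConstantModel is a hypothetical stand-in for whatever object Watson Machine Learning passes to load_model(), and the calls are made by hand rather than by the runtime.

```python
class ConstantModel:
    """Hypothetical stand-in for a model object passed to load_model()."""

    def __init__(self, label):
        self.label = label

    def predict(self, image):
        return self.label


def process(event, state):
    event['prediction'] = state['model'].predict(event['image'])
    return event


def load_model(state, model):
    state['model'] = model


state = {}
load_model(state, ConstantModel('cat'))       # first load
first = process({'image': [0, 1]}, state)
load_model(state, ConstantModel('dog'))       # updated model arrives
second = process({'image': [0, 1]}, state)
print(first['prediction'], second['prediction'])
```

Because process() reads the model from the state on every call, events that arrive after the second load_model() call are scored with the updated model.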