SPSS Model

The SPSS Model operator in the streams flow canvas applies a predictive model to your streaming data.

Here, a predictive model is the prepared scoring branch of an SPSS Modeler flow in IBM Watson Machine Learning. The scoring branch can contain trained instances of a specific predictive model algorithm, plus any other processing that is required to generate analytics.

The SPSS Model operator has input and output ports.

Important

The accuracy of the predictive model can change over time. Refresh the SPSS model in Watson Machine Learning periodically, and then rerun your streams flow with the updated model.


Dates

The SPSS Model operator in a streams flow converts all timestamps to Coordinated Universal Time (UTC). Therefore, when you train an SPSS Modeler flow in Watson Machine Learning, all dates must be in UTC.
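As a minimal sketch of preparing training data, a timezone-aware timestamp can be normalized to UTC with the Python standard library. The helper name `to_utc` and the sample UTC+02:00 offset are assumptions made for illustration; they are not part of the streams flow or Watson Machine Learning APIs.

```python
import datetime

def to_utc(ts):
    """Return ts converted to UTC (hypothetical helper, not a streams flow API).

    A naive datetime carries no offset, so this sketch rejects it instead of guessing.
    """
    if ts.tzinfo is None or ts.utcoffset() is None:
        raise ValueError("timestamp has no time zone; cannot convert to UTC")
    return ts.astimezone(datetime.timezone.utc)

# Example: 14:00 at UTC+02:00 corresponds to 12:00 UTC.
local = datetime.datetime(2008, 12, 31, 14, 0,
                          tzinfo=datetime.timezone(datetime.timedelta(hours=2)))
utc = to_utc(local)
```

Rejecting naive timestamps is a design choice in this sketch: silently assuming a zone would hide exactly the kind of mismatch that the UTC requirement is meant to prevent.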

The following table shows how date and time data types in an SPSS model map to data types in a streams flow:

| Data type in SPSS model in Watson Machine Learning | Meaning | Data type in SPSS Model operator in streams flow |
|---|---|---|
| Time | Number of seconds since midnight | Number |
| Date | Date at midnight | Date |
| Timestamp | Date + time | Date |
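Because a Time value travels through the streams flow as a plain number, it can help to see the arithmetic spelled out. The following is a small sketch; the helper names `seconds_since_midnight` and `time_from_seconds` are illustrative assumptions, not functions provided by the product.

```python
import datetime

def seconds_since_midnight(t):
    """Convert a datetime.time to whole seconds since midnight (illustrative helper)."""
    return t.hour * 3600 + t.minute * 60 + t.second

def time_from_seconds(seconds):
    """Inverse mapping: whole seconds since midnight back to a datetime.time."""
    if not 0 <= seconds < 24 * 3600:
        raise ValueError("seconds must fall within a single day")
    minutes, sec = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return datetime.time(hour, minute, sec)

# 01:00:00 is 3600 seconds after midnight.
one_am = seconds_since_midnight(datetime.time(1, 0, 0))
```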

In a streams flow, input to the SPSS Model operator can come from a Code operator. The Code operator can convert numbers and dates to the time, date, and timestamp values that are consumed by the SPSS model that you select in the SPSS Model operator.

The following code snippet in the Code operator shows how to convert numbers and dates to time, date, and timestamp for the SPSS Model operator:

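import datetime

event = {
    "time": 3600,  # number of seconds since midnight (01:00:00)
    "date": datetime.datetime.strptime('31122008', "%d%m%Y"),  # date at midnight
    "timestamp": datetime.datetime.strptime('31/12/2008 14:00', '%d/%m/%Y %H:%M')  # date + time
}

# submit is the callback that the Code operator passes to your function
submit(event)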

For more information about how streams flow handles dates, see Dates.

   


Example

Goal: You need to create a streams flow that ingests streaming data in the columns that the SPSS model expects, and then applies the SPSS model to that data.

 

To create a streams flow that uses the SPSS Model operator, do these steps:

1. Go to the Project page of the project that contains the model, and then click Assets > Watson Machine Learning.

2. Click the name of the SPSS model to show the information about the model itself and its input schema.

3. Scroll down to the Input Schema section, and then check the column names and data types.

This information lists what the model expects as input. The schema can be viewed in table format and in JSON format. Suppose that the data contains the following health metrics for patients: age, sex, blood pressure (BP), sodium (Na), potassium (K), cholesterol, and drug. The Input Schema shows the following fields.

[Figure: Input Schema Modeler fields]

4. Drag a Code operator (under Sources) to the canvas to simulate streaming data.

The Code operator sends data to the SPSS Model operator to run the predictive model that you created in Watson Machine Learning. The Code operator must send data in the format that the SPSS model expects, so add the following code.

    # YOU MUST EDIT THE SCHEMA and add all attributes that you are returning as output.
    #
    # Preinstalled Python packages can be viewed from the Settings pane.
    # In the Settings pane you can also install additional Python packages.

    import sys
    import time
    import pandas as pd 

    # init() function will be called once on pipeline initialization
    # @state a Python dictionary object for keeping state. The state object is passed to the produce function
    def init(state):
        # do something once on pipeline initialization and save in the state object
        pass

    # produce() function will be called when the job starts to run.
    # It is called on a background thread, and it will typically invoke the 'submit()' callback
    # whenever a tuple of data is ready to be emitted from this operator.
    # This allows for using asynchronous data services as well as synchronous data generation or retrieval.
    # @submit a Python callback function that takes one argument: a dictionary representing a single tuple.
    # @state a Python dictionary object for keeping state
    # You must declare all output attributes in the Edit Schema window.

    def produce(submit, state):
        while True:
            # Read from the raw file URL; the GitHub blob page returns HTML, not CSV
            df = pd.read_csv('https://raw.githubusercontent.com/pmservice/drug-selection/master/data/drug_batch_data.csv')
            for i, Age in enumerate(df.Age):
                #"Age","Sex","BP","Cholesterol","Na","K"
                dd = { "Age" : df.Age.iloc[i],
                    "Sex" : df.Sex.iloc[i],
                    "BP" : df.BP.iloc[i],
                    "Cholesterol" : df.Cholesterol.iloc[i],
                    "Na" : df.Na.iloc[i],
                    "K" : df.K.iloc[i]}

                submit(dd)
                time.sleep(0.5) # Simulates a delay of 0.5 seconds between emitted events

5. Click Edit Output Schema, and then add the column names that you saw in the input schema of your model. Make sure that the columns and their data types match what the model expects as input.

[Figure: Output Schema fields]

6. Drag the SPSS Model operator to the canvas, and then connect the two operators.

The SPSS Model operator applies the predictive analytics of the model to the streaming data from the Code operator.
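Before you run the flow, it can help to check that each event carries exactly the columns and types that the model's input schema lists. The following is a rough sketch of such a check. The `EXPECTED_SCHEMA` mapping, its Python types, the `check_event` helper, and the sample row are all assumptions made for illustration; use the column names and data types that your own Input Schema shows.

```python
# Hypothetical mapping of column name to Python type; copy the real
# names and types from the Input Schema of your own model.
EXPECTED_SCHEMA = {
    "Age": int,
    "Sex": str,
    "BP": str,
    "Cholesterol": str,
    "Na": float,
    "K": float,
}

def check_event(event, schema=EXPECTED_SCHEMA):
    """Return a list of problems found in event; an empty list means it matches."""
    problems = []
    # Every column that the schema lists must be present.
    for name in schema:
        if name not in event:
            problems.append("missing column: " + name)
    # Every column in the event must be known and have the expected type.
    for name, value in event.items():
        if name not in schema:
            problems.append("unexpected column: " + name)
        elif not isinstance(value, schema[name]):
            problems.append("wrong type for " + name + ": " + type(value).__name__)
    return problems

# Made-up sample row, not taken from the real data set.
sample = {"Age": 43, "Sex": "M", "BP": "HIGH",
          "Cholesterol": "NORMAL", "Na": 0.65, "K": 0.05}
problems = check_event(sample)
```

A check like this could be called on each dictionary just before `submit()` in the Code operator, so that a mismatch surfaces in the Code operator rather than in the model.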

   
