Known issues and limitations for Watson OpenScale

The following list contains the limitations and known issues for IBM Watson OpenScale.

 

Limitations

  • Watson OpenScale does not support models where the data type of the model prediction is binary. You must change such models so that the data type of their prediction is a string or integer data type.
  • Support for the XGBoost framework has the following limitations for classification problems: For binary classification, Watson OpenScale supports the binary:logistic logistic regression function with an output as a probability of True. For multiclass classification, Watson OpenScale supports the multi:softprob function where the result contains the predicted probability of each data point belonging to each class.
  • Fairness and drift metrics are not supported for unstructured (image or text) data types.
  • Having an equals sign (=) in the column name of a dataset causes an issue with explainability and generates the following error message: Error: An error occurred while computing feature importance. Do not use an equals sign (=) in a column name. It is not supported.
  • The database and IBM Watson Machine Learning instance must be deployed in the same account.
  • Watson OpenScale uses a PostgreSQL or Db2 database to store model-related data (feedback data, scoring payload) and calculated metrics. Lite Db2 plans are not currently supported.
  • The free Lite plan database is not GDPR-compliant. If your model processes personally identifiable information (PII), you must purchase a new database or use an existing database that does conform to GDPR rules.

 

Known issues

Watson OpenScale has the following known issues:

 

There’s a difference between Spark and Python payload tables

The payload table for a Spark classification model is different from the payload table for a Python model. An Apache Spark payload table has three columns for the predicted results (prediction, probability, and prediction_probability), while the Python payload table has only two columns (prediction and probability). For the Spark engine, the probability field receives an array, such as [0.1,0.9], stored as a string column. The prediction_probability field requires a numeric value, such as 0.9, and is most similar to the Python probability field.
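
As a rough illustration of how the two layouts relate, the following Python sketch parses a Spark-style probability string and derives a single numeric value comparable to prediction_probability. It assumes that the string is a JSON-style array and that the value of interest is the highest class probability. Both points are assumptions made for illustration, not documented Watson OpenScale behavior, and the function name is hypothetical.

```python
import json

def spark_probability_to_scalar(probability: str) -> float:
    """Parse a Spark-style probability string such as "[0.1,0.9]" and
    return the largest entry (assumed here to belong to the predicted class)."""
    values = json.loads(probability)
    if not isinstance(values, list) or not values:
        raise ValueError("expected a non-empty array of probabilities")
    return max(float(v) for v in values)

print(spark_probability_to_scalar("[0.1,0.9]"))  # prints 0.9
```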

Unexpected data type causes automatic payload logging to fail

If your model output includes a field with a probability value, that field must be a vector. Otherwise, automatic payload logging is disabled.

 

Limit on the number of features for a model

Scoring payloads for a model must fit within the maximum width allowed for the table that payload logging creates in the datamart database (with some buffer for the internal-use columns that Watson OpenScale itself adds). Independent of the width, there is also a hard-coded limit of 1012 features.

The following table summarizes what this means for models with different sizes of features:

Table 1: Feature column limits

Feature type                              Maximum number of features
int64, float64, or string length 1-64     1012
string length 65-2048                     444
string length 2049-32K                    28

Because many models have features of mixed types, the following counting rules can be used for planning purposes:

  • For int64, float64, or strings of length 64 or less, count as 64.
  • For strings of length 65 to 2048, count as 2048.
  • For strings of length 2049 to 32K, count as 32K.
  • The total counted length of all features should be no more than ~900K.
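
The planning rules above can be sketched as a small Python check. This is only an estimate under stated assumptions: the source gives the total as "~900K", which the sketch reads as 900 * 1024, and the function names are hypothetical. It does not account for the internal-use columns that Watson OpenScale adds.

```python
MAX_FEATURES = 1012                 # hard-coded feature limit from this section
APPROX_TOTAL_BUDGET = 900 * 1024    # assumption: "~900K" read as 900 * 1024

def counted_width(max_string_length=None):
    """Return the planning width for one feature.
    Pass None for int64/float64 features, or the maximum string length."""
    if max_string_length is None or max_string_length <= 64:
        return 64
    if max_string_length <= 2048:
        return 2048
    if max_string_length <= 32 * 1024:
        return 32 * 1024
    raise ValueError("strings longer than 32K are outside the documented ranges")

def fits_feature_limits(string_lengths):
    """string_lengths holds one entry per feature (None for numeric features)."""
    total = sum(counted_width(n) for n in string_lengths)
    return len(string_lengths) <= MAX_FEATURES and total <= APPROX_TOTAL_BUDGET

# Example: 400 numeric features plus 20 long text features.
print(fits_feature_limits([None] * 400 + [30000] * 20))  # prints True
```

A result of True means only that the model passes this rough planning estimate; the actual limit depends on the table that payload logging creates.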

 

Not all Db2 instances function identically

Watson OpenScale supports the Db2 Warehouse add-on, the Db2 Advanced Enterprise Server Edition add-on, and a Db2 Enterprise Server Edition (v. 11.5.1 or later) installation that is accessible to the cluster. Be aware of the following limitation:

  • Watson OpenScale requires a tablespace with a page size of 32k or larger.

 

Drift configuration errors prevent configuration of drift monitor

The flexibility of the model configuration screen can also lead to problems later on when you want to configure monitors, such as the drift detection monitor. Because you can choose the data types, you must ensure that your choices match the input schema of the model. The following error may occur if the prediction column type is not properly selected:

error: AIQDD2003E:
Message: "The {0} model predictions are different from class names in the {1} training data for the {2} subscription of the {3} datamart and the {4} service binding."

The following cases are the most likely causes:

  • The class label is of string type, but the prediction modeling_role is assigned to a prediction column of double type because that is how the output data schema is defined.
  • You select a prediction column of double type in the UI, which does not restrict this choice.
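
Before you configure the drift monitor, a quick comparison of predicted values against the training class labels can reveal this kind of mismatch. The following Python sketch is a generic pre-check, not the logic that Watson OpenScale uses to raise AIQDD2003E. The function name and the sample labels are hypothetical.

```python
def unexpected_predictions(training_labels, predictions):
    """Return predicted values that never appear as class labels in the training data.
    Values are compared with their types intact, so 1.0 does not match "1.0"."""
    known = set(training_labels)
    return sorted({p for p in predictions if p not in known}, key=repr)

labels = ["Risk", "No Risk", "Risk"]

# Double-typed predictions do not match string class labels.
print(unexpected_predictions(labels, [0.0, 1.0, 1.0]))      # prints [0.0, 1.0]

# String predictions drawn from the same label set match.
print(unexpected_predictions(labels, ["Risk", "No Risk"]))  # prints []
```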

Payload formats

Watson OpenScale does not support column names that contain double quotation marks (") in the payload, because they prevent proper processing of payload analytics. This affects both scoring payload and feedback data in CSV and JSON formats.
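
Column names can be checked before any payload or feedback data is sent. The following Python sketch flags names that contain a double quotation mark, and also the equals sign (=) described under Limitations. The function name and the sample column names are hypothetical.

```python
# Characters that this document lists as unsupported in column names.
FORBIDDEN_CHARS = ('"', '=')

def invalid_column_names(columns):
    """Return the column names that contain a double quotation mark or an equals sign."""
    return [c for c in columns if any(ch in c for ch in FORBIDDEN_CHARS)]

print(invalid_column_names(["age", 'loan"amount', "rate=fixed"]))
# prints ['loan"amount', 'rate=fixed']
```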

 

Microsoft Azure ML Studio

  • Of the two types of Azure Machine Learning web services, only the New type is supported by Watson OpenScale. The Classic type is not supported.

  • Default input name must be used: In the Azure web service, the default input name is "input1". Currently, Watson OpenScale requires this field and does not work if it is missing.

    If your Azure web service does not use the default name, change the input field name to "input1", then redeploy your web service and reconfigure your OpenScale machine learning provider settings.

  • If calls to Microsoft Azure ML Studio to list the machine learning models time out, for example when you have many web services, you must increase the timeout values. You can work around this issue by changing the settings in /etc/haproxy/haproxy.cfg:

    • Log in to the load balancer node and update /etc/haproxy/haproxy.cfg to change the client and server timeouts from 1m to 5m:

        timeout client           5m
        timeout server           5m
      
    • Run systemctl restart haproxy to restart the HAProxy load balancer.

Note: If you are using a load balancer other than HAProxy, you might need to adjust its timeout values in a similar fashion.

 

Amazon SageMaker

  • BlazingText algorithm is not supported: The Amazon SageMaker BlazingText algorithm input payload format is not supported in the current release of Watson OpenScale.

 

Custom machine learning service instance

  • Explainability does not currently work in the Watson OpenScale Python Client SDK for the Custom serve engine. The Custom serve engine requires a numerical prediction in the response data, which is not included with the module script.

 

Browser support

The Watson OpenScale service tooling requires the same level of browser software as is required by IBM Cloud. See the IBM Cloud Prerequisites topic for details.

 

Drift configuration is started but never finishes

Drift configuration starts but never finishes and continues to show the spinner icon. If the spinner runs for more than 10 minutes, the system might be in an inconsistent state. To work around this behavior, edit the drift configuration and then save it. The system might come out of this state and complete the configuration. If reconfiguring drift does not rectify the situation, contact IBM Support.

 

Next steps