Score

A PMML-compliant scoring engine supports:

  • PMML-compliant models (version 4.2 and earlier) produced by various vendors, except for Baseline Model, ScoreCard Model, Sequence Model, and Text Model (the Text Model is deprecated and effectively unused). For the list of supported models, refer to the website of the Data Mining Group (DMG): http://www.dmg.org/.
  • Non-PMML models produced by IBM SPSS products: Discriminant and Bayesian networks.
  • All PMML 4.2 transformations.
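
Because support is limited to PMML 4.2 and earlier, it can be useful to check a model file's version before scoring. The following is a minimal sketch and is not part of the SPSS API: pmmlVersion and isSupportedVersion are hypothetical helpers that read the version attribute of the root PMML element with the JDK XML parser. It assumes that patch releases such as 4.2.1 count as 4.2.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import scala.util.Try

// Hypothetical helper: returns the `version` attribute of the root element,
// or None if the text is not well-formed XML or the attribute is missing.
def pmmlVersion(xml: String): Option[String] = Try {
  val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
  val doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
  doc.getDocumentElement.getAttribute("version")
}.toOption.filter(_.nonEmpty)

// Hypothetical helper: true when major.minor is 4.2 or earlier.
// Assumption: only the first two components are compared.
def isSupportedVersion(version: String): Boolean =
  version.split("\\.").toList.map(p => Try(p.toInt).toOption) match {
    case Some(major) :: Some(minor) :: _ => major < 4 || (major == 4 && minor <= 2)
    case _                               => false
  }

// Usage: check a small in-memory document.
val sample = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2"></PMML>"""
val supported = pmmlVersion(sample).exists(isSupportedVersion)
```

A check like this only inspects the declared version. It does not verify that the model type itself is one the engine supports.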

Different kinds of models can produce various scoring results, for example:

  • Classification models (those with a categorical target: Bayes Net, General Regression, Mining, Naive Bayes, k-Nearest Neighbor, Neural Network, Regression, Ruleset, Support Vector Machine, Tree) produce predicted values, probabilities, and confidence values.
  • Regression models (those with a continuous target: General Regression, Mining, k-Nearest Neighbor, Neural Network, Regression, Tree) produce predicted values; some also produce standard errors.
  • Cox regression (in General Regression) produces predicted survival probability and cumulative hazard values.
  • Tree models also produce Node ID.
  • Clustering models produce Cluster ID and Cluster affinity.
  • Anomaly Detection (represented as Clustering) produces anomaly index and top reasons.
  • Association models produce Consequent, Rule ID, and confidence values for the top matching rules.

Example code:

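import java.io.{BufferedInputStream, FileInputStream}
import scala.util.Try

import com.ibm.spss.ml.Score
// ZipContainer, IOUtils, and LocalContainerManager also come from the SPSS
// library; their imports are not shown here.

// Read a PMML file into a container; any failure is captured in the Try.
def loadContainer(file: String): Try[ZipContainer] = {
  Try { ZipContainer(IOUtils.toRawBytes(new BufferedInputStream(new FileInputStream(file)))) }
}

// .get throws if the file could not be loaded.
val con = loadContainer("linear.pmml").get
val localContainer = LocalContainerManager()
localContainer.exportContainers("InputPMML", List(con))

// Score the existing input DataFrame `data` with the exported model.
val score = Score(localContainer).setInputContainerKeys(List("InputPMML"))
val scoredDf = score.transform(data)
scoredDf.show()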
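
The example calls get on the Try returned by loadContainer, which throws if the file cannot be read. As a rough sketch of handling that failure explicitly, the following uses only the standard library. loadBytes is a hypothetical stand-in for loadContainer, and describeLoad is likewise a name introduced here for illustration.

```scala
import java.nio.file.{Files, Paths}
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for loadContainer: reads the raw bytes of a model file.
def loadBytes(file: String): Try[Array[Byte]] =
  Try(Files.readAllBytes(Paths.get(file)))

// Hypothetical helper: reports the outcome instead of throwing.
def describeLoad(file: String): String = loadBytes(file) match {
  case Success(bytes) => s"Loaded $file (${bytes.length} bytes)"
  case Failure(e)     => s"Could not load $file: ${e.getClass.getSimpleName}"
}

// Usage: a missing file yields a message rather than an exception.
val message = describeLoad("no-such-file.pmml")
```

The same pattern match could wrap the container export and scoring steps, so that scoring runs only in the Success branch.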