This section describes how to set data model attributes (measure, role, and
label) on pyspark.sql.StructField objects.
spss.datamodel.Role Objects
This class enumerates valid roles for each field in a data model.
BOTH: Indicates that this field can be either an antecedent or a consequent.
FREQWEIGHT: Indicates that this field is to be used as a frequency weight; this
isn't displayed to the user.
INPUT: Indicates that this field is a predictor or an antecedent.
NONE: Indicates that this field is not used directly during modeling.
TARGET: Indicates that this field is predicted or a consequent.
PARTITION: Indicates that this field identifies the data partition.
RECORDID: Indicates that this field identifies the record ID.
SPLIT: Indicates that this field splits the data.
spss.datamodel.Measure Objects
This class enumerates measurement levels for fields in a data model.
UNKNOWN: Indicates that the measure type is unknown.
CONTINUOUS: Indicates that the measure type is continuous.
NOMINAL: Indicates that the measure type is nominal.
FLAG: Indicates that the field value is one of two values.
DISCRETE: Indicates that the field value should be interpreted as a collection
of values.
ORDINAL: Indicates that the measure type is ordinal.
TYPELESS: Indicates that the field can have any value compatible with its
storage.
pyspark.sql.StructField Objects
Represents a field in a StructType. A StructField object
comprises four fields:
name (string): the name of the field
dataType (pyspark.sql.DataType): the data type of the field
nullable (bool): whether the values of the field can be None
metadata (dictionary): a Python dictionary that stores the optional
attributes
You can use the metadata dictionary to store the measure, role, or label attribute for
a specific field. The dictionary keys for these attributes are:
measure: the key for the measure attribute
role: the key for the role attribute
displayLabel: the key for the label attribute
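Example:
from pyspark.sql.types import StructField, StringType
from spss.datamodel.Role import Role
from spss.datamodel.Measure import Measure

# Collect the optional attributes under their reserved keys.
_metadata = {}
_metadata['measure'] = Measure.TYPELESS
_metadata['role'] = Role.NONE
_metadata['displayLabel'] = "field label description"

# Attach the attributes to a non-nullable string field.
StructField("userName", StringType(), nullable=False,
            metadata=_metadata)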
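When several fields need attributes, the three metadata keys can be collected in one small helper. The following is a minimal sketch, not part of the spss or pyspark APIs: build_field_metadata is a hypothetical function, and plain strings stand in for Measure and Role members so that the snippet runs outside SPSS Modeler. It also assumes that an attribute you do not want to set can simply be left out of the dictionary.

```python
def build_field_metadata(measure=None, role=None, label=None):
    """Hypothetical helper: return a metadata dict holding only the supplied attributes."""
    candidates = {
        "measure": measure,      # key for the measure attribute
        "role": role,            # key for the role attribute
        "displayLabel": label,   # key for the label attribute
    }
    # Drop attributes that were not supplied (assumption: omitted keys are allowed).
    return {key: value for key, value in candidates.items() if value is not None}


# Placeholder strings are used here; inside SPSS Modeler you would pass
# members such as Measure.CONTINUOUS and Role.INPUT instead.
age_metadata = build_field_metadata(measure="CONTINUOUS", role="INPUT",
                                    label="Age in years")
label_only = build_field_metadata(label="Free-text comment")
```

The returned dictionary can then be passed as the metadata argument of StructField, as in the example above.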