What is instantiation?
Instantiation is the process of reading or specifying information about a data field, such as its storage type and values. To conserve system resources, instantiation is user-directed: you tell the software to read values by running data through a Type node.
- Data with unknown types is referred to as uninstantiated. A field whose storage type and values are unknown is displayed as Typeless in the Measure column of the Type node settings.
- When you have some information about a field's storage, such as string or numeric, the data is called partially instantiated. Categorical or Continuous are partially instantiated measurement levels. For example, Categorical specifies that the field is symbolic, but you don't know whether it's nominal, ordinal, or flag.
- When all of the details about a type are known, including the values, a fully instantiated measurement level (nominal, ordinal, flag, or continuous) is displayed in the Measure column. Note that the continuous type is used for both partially instantiated and fully instantiated data fields. Continuous data can be either integers or real numbers.
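The three states above can be sketched as a small lookup. This is a toy illustration only: the function name `displayed_measure`, the storage strings, and the rule that a two-valued string field is a flag are assumptions made for the example, not the product's actual logic. Ordinal is left out because it is assumed here to be something a user specifies rather than something inferred from the data.

```python
from typing import Optional, Sequence


def displayed_measure(storage: Optional[str], values: Optional[Sequence]) -> str:
    """Return an illustrative Measure label for a field (hypothetical helper)."""
    if storage is None:
        # Nothing is known about the field: uninstantiated.
        return "Typeless"
    numeric = storage in ("integer", "real")
    if values is None:
        # Storage is known but values are not: partially instantiated.
        return "Continuous" if numeric else "Categorical"
    # Storage and values are both known: fully instantiated.
    if numeric:
        return "Continuous"
    # Simplifying assumption: exactly two distinct values means a flag.
    return "Flag" if len(set(values)) == 2 else "Nominal"


print(displayed_measure(None, None))               # Typeless
print(displayed_measure("string", None))           # Categorical
print(displayed_measure("string", ["a", "b", "c"]))  # Nominal
print(displayed_measure("integer", [1, 2, 3]))     # Continuous
```

Note how Continuous appears in both the partially and the fully instantiated branch, which mirrors the point made in the last bullet.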
When a data flow with a Type node runs, uninstantiated types immediately become partially instantiated, based on the initial data values. After all of the data passes through the node, every field becomes fully instantiated unless its values are set to Pass. If the flow run is interrupted, the data remains partially instantiated.
After the Types settings are instantiated, the values of a field are static at that point in the flow. Upstream changes therefore do not affect the values of that field, even if you rerun the flow. To change or update the values based on new data or added manipulations, edit them in the Types settings or set the value for the field to Read or Extend.
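The static behavior described above can be illustrated with a minimal sketch. The class `TypeNodeField`, its `run` method, and the internal "Static" mode are hypothetical names invented for this example and are not part of any product API. The sketch also assumes that Read replaces the stored values and that Extend adds newly seen values to the stored ones.

```python
class TypeNodeField:
    """Toy model of one field's values at a Type node (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.mode = "Read"   # "Read", "Extend", "Pass", or "Static" (illustrative)
        self.values = set()

    def run(self, column):
        """Simulate one flow run that passes `column` through the node."""
        if self.mode == "Pass":
            return self.values                # nothing is read
        if self.mode == "Read":
            self.values = set(column)         # replace the stored values
        elif self.mode == "Extend":
            self.values = self.values | set(column)  # add to the stored values
        # Once instantiated, values stay fixed until the mode is changed again.
        self.mode = "Static"
        return self.values


region = TypeNodeField("region")
print(sorted(region.run(["north", "south"])))          # ['north', 'south']

# Upstream data changed, but the values are static, so a rerun changes nothing.
print(sorted(region.run(["north", "south", "east"])))  # ['north', 'south']

# Setting the field to Extend picks up the new value on the next run.
region.mode = "Extend"
print(sorted(region.run(["north", "south", "east"])))  # ['east', 'north', 'south']
```

Setting `region.mode = "Read"` before a run would instead discard the stored values and keep only what the new data contains.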
When to instantiate
Instantiating in a separate Type node is useful in the following situations:
- The dataset is large, and the flow filters a subset prior to the Type node
- Data has been filtered in the flow
- Data has been merged or appended in the flow
- New data fields are derived during processing