Last updated: Feb 11, 2025
The Data Audit node provides a comprehensive first look at the data,
including summary statistics, histograms and distribution for each field, as well as information on
outliers, missing values, and extremes. Results are displayed in an easy-to-read matrix that can be
sorted and used to generate full-size graphs and data preparation nodes.
Example
# Get the current stream and the existing source node to audit
stream = modeler.script.stream()
sourcenode = stream.findByID("id46WRP1285C")
# Create the Data Audit node and connect it to the source node
node = stream.createAt("dataaudit", "My node", 196, 100)
stream.link(sourcenode, node)
# Audit only the listed fields
node.setPropertyValue("custom_fields", True)
node.setPropertyValue("fields", ["Age", "Na", "K"])
# Choose which graphs and statistics to display
node.setPropertyValue("display_graphs", True)
node.setPropertyValue("basic_stats", True)
node.setPropertyValue("advanced_stats", True)
node.setPropertyValue("median_stats", False)
node.setPropertyValue("calculate", ["Count", "Breakdown"])
# Detect outliers and extreme values by standard deviation
node.setPropertyValue("outlier_detection_method", "std")
node.setPropertyValue("outlier_detection_std_outlier", 1.0)
node.setPropertyValue("outlier_detection_std_extreme", 3.0)
# Send the output to the screen
node.setPropertyValue("output_mode", "Screen")
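The example above sets each property with its own call. As a minimal sketch, the same settings can be kept in one list and applied in a loop. `apply_properties` and `StubNode` are hypothetical helpers, not part of the Modeler scripting API. `StubNode` only stands in for a real node so that the sketch runs outside Modeler. Inside a Modeler script you would pass the node returned by `stream.createAt(...)` instead.

```python
class StubNode(object):
    """Hypothetical stand-in for a Modeler node; it only records property values."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value

    def getPropertyValue(self, name):
        return self.properties[name]


def apply_properties(node, settings):
    """Call setPropertyValue on the node for each (name, value) pair, in order."""
    for name, value in settings:
        node.setPropertyValue(name, value)
    return node


# Same property names and values as the example above.
AUDIT_SETTINGS = [
    ("custom_fields", True),
    ("fields", ["Age", "Na", "K"]),
    ("display_graphs", True),
    ("basic_stats", True),
    ("advanced_stats", True),
    ("median_stats", False),
    ("calculate", ["Count", "Breakdown"]),
    ("outlier_detection_method", "std"),
    ("outlier_detection_std_outlier", 1.0),
    ("outlier_detection_std_extreme", 3.0),
    ("output_mode", "Screen"),
]

node = apply_properties(StubNode(), AUDIT_SETTINGS)
```

Keeping the settings in a list preserves the order in which they are applied, which can matter when one property (such as `custom_fields`) enables another (such as `fields`).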
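`dataauditnode` properties | Data type | Property description |
---|---|---|
`custom_fields` | flag | |
`fields` | [field1 … fieldN] | |
`overlay` | field | |
`display_graphs` | flag | Used to turn the display of graphs in the output matrix on or off. |
`basic_stats` | flag | |
`advanced_stats` | flag | |
`median_stats` | flag | |
`calculate` | `Count`, `Breakdown` | Used to calculate missing values. Select either, both, or neither calculation method. |
`outlier_detection_method` | `std`, `iqr` | Used to specify the detection method for outliers and extreme values. |
`outlier_detection_std_outlier` | number | If `outlier_detection_method` is `std`, specifies the number to use to define outliers. |
`outlier_detection_std_extreme` | number | If `outlier_detection_method` is `std`, specifies the number to use to define extreme values. |
`outlier_detection_iqr_outlier` | number | If `outlier_detection_method` is `iqr`, specifies the number to use to define outliers. |
`outlier_detection_iqr_extreme` | number | If `outlier_detection_method` is `iqr`, specifies the number to use to define extreme values. |
`use_output_name` | flag | Specifies whether a custom output name is used. |
`output_name` | string | If `use_output_name` is true, specifies the name to use. |
`output_mode` | `Screen`, `File` | Used to specify target location for output generated from the output node. |
`output_format` | `Formatted` (.tab), `Delimited` (.csv), `HTML` (.html), `Output` (.cou) | Used to specify the type of output. |
`paginate_output` | flag | When the `output_format` is `HTML`, causes the output to be separated into pages. |
`lines_per_page` | number | When used with `paginate_output`, specifies the lines per page of output. |
`full_filename` | string | |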
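To make the outlier and extreme thresholds more concrete, the sketch below labels values by how many standard deviations they lie from the mean, which is how `outlier_detection_std_outlier` and `outlier_detection_std_extreme` are commonly read. It is an illustration only and not Modeler's implementation. The function name `classify_by_std` is hypothetical, and the use of the sample standard deviation is an assumption.

```python
import math


def classify_by_std(values, outlier_k, extreme_k):
    """Label each value 'normal', 'outlier', or 'extreme' according to its
    distance from the mean, measured in standard deviations."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    mean = sum(values) / float(n)
    # Sample standard deviation (divides by n - 1); this choice is an assumption.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    labels = []
    for v in values:
        distance = abs(v - mean) / sd if sd > 0 else 0.0
        if distance > extreme_k:
            labels.append("extreme")
        elif distance > outlier_k:
            labels.append("outlier")
        else:
            labels.append("normal")
    return labels


# Thresholds match the values used in the example script: 1.0 and 3.0.
labels = classify_by_std([10] * 9 + [100], outlier_k=1.0, extreme_k=3.0)
```

With these thresholds, a value more than 1.0 standard deviations from the mean is labeled an outlier, and one more than 3.0 standard deviations away is labeled extreme.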