Last updated: Jan 12, 2024
This topic describes the experiment settings used for data imputation in time series experiments.
Data imputation methods
Apply one of these data imputation methods in experiment settings to supply missing values in a data set.
Imputation method | Description |
---|---|
FlattenIterative | Time series data is first flattened, then missing values are imputed with the Scikit-learn iterative imputer. |
Linear | Linear interpolation method is used to impute the missing value. |
Cubic | Cubic interpolation method is used to impute the missing value. |
Previous | Missing value is imputed with the previous value. |
Next | Missing value is imputed with the next value. |
Fill | Missing value is imputed with a user-specified value, the sample mean, or the sample median. |
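To make the simpler methods in the table concrete, the following is a minimal pure-Python sketch of what Previous, Next, Linear, and Fill do to a single series. It is an illustration only and not the AutoAI implementation: the function names (`impute_previous`, `impute_next`, `impute_linear`, `impute_fill`) are hypothetical, missing values are assumed to be marked with `None`, and FlattenIterative and Cubic are left out because they depend on external libraries.

```python
from statistics import mean, median


def impute_previous(values):
    """Replace each missing value (None) with the nearest earlier observed value."""
    out, last = [], None
    for v in values:
        if v is None:
            out.append(last)  # stays None when there is no earlier observation
        else:
            last = v
            out.append(v)
    return out


def impute_next(values):
    """Replace each missing value with the nearest later observed value."""
    return impute_previous(values[::-1])[::-1]


def impute_linear(values):
    """Fill gaps between two observed points on a straight line.

    Leading and trailing gaps are left untouched in this sketch.
    """
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out


def impute_fill(values, fill_type="value", fill_value=0):
    """Fill every gap with one constant: a given value, the mean, or the median."""
    observed = [v for v in values if v is not None]
    if fill_type == "mean":
        fill = mean(observed)
    elif fill_type == "median":
        fill = median(observed)
    elif fill_type == "value":
        fill = fill_value
    else:
        raise ValueError(f"unknown fill type: {fill_type!r}")
    return [fill if v is None else v for v in values]


series = [1.0, None, 3.0, None, None, 6.0]
print(impute_previous(series))      # [1.0, 1.0, 3.0, 3.0, 3.0, 6.0]
print(impute_next(series))          # [1.0, 3.0, 3.0, 6.0, 6.0, 6.0]
print(impute_linear(series))        # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(impute_fill(series, "median"))  # [1.0, 3.0, 3.0, 3.0, 3.0, 6.0]
```

On the same input, Previous and Next differ only in which neighbor they copy, while Linear uses both neighbors and Fill ignores position entirely.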
Input Settings
Use these settings to configure data imputation for time series experiments in a notebook.
Name | Description | Values | Default value |
---|---|---|---|
use_imputation | Flag for switching imputation on or off. | True or False | True |
imputer_list | List of imputer names (strings) to search. If a list is not specified, all the default imputers are searched. If an empty list is passed, all imputers are searched. | "FlattenIterative", "Linear", "Cubic", "Previous", "Fill", "Next" | "FlattenIterative", "Linear", "Cubic", "Previous" |
imputer_fill_type | Fill strategy used by the "Fill" imputer. | "mean"/"median"/"value" | "value" |
imputer_fill_value | A single numeric value used to fill all missing values. Applies only when "imputer_fill_type" is "value". Ignored if "imputer_fill_type" is "mean" or "median". | (Negative Infinity, Positive Infinity) | 0 |
imputation_threshold | Threshold for imputation. The ratio of missing values in any one column must not be greater than the threshold; otherwise, an error is raised. | (0,1) | 0.25 |
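As a rough illustration of how the values and defaults in this table fit together, the sketch below merges user settings with the documented defaults and checks a data set against imputation_threshold. The helper names (`resolve_settings`, `columns_over_threshold`) are hypothetical and not part of any AutoAI API, and the sketch assumes that the missing value ratio is the fraction of `None` entries in a column.

```python
# Defaults and allowed values copied from the settings table.
DEFAULTS = {
    "use_imputation": True,
    "imputer_list": ["FlattenIterative", "Linear", "Cubic", "Previous"],
    "imputer_fill_type": "value",
    "imputer_fill_value": 0,
    "imputation_threshold": 0.25,
}
ALL_IMPUTERS = ["FlattenIterative", "Linear", "Cubic", "Previous", "Fill", "Next"]
FILL_TYPES = ("mean", "median", "value")


def resolve_settings(user_settings=None):
    """Merge user settings with the documented defaults and validate them."""
    settings = dict(DEFAULTS)
    settings.update(user_settings or {})

    # No list given: default imputers. Empty list given: all imputers.
    imputers = list(settings["imputer_list"]) or list(ALL_IMPUTERS)
    unknown = [name for name in imputers if name not in ALL_IMPUTERS]
    if unknown:
        raise ValueError(f"unknown imputers: {unknown}")
    settings["imputer_list"] = imputers

    if settings["imputer_fill_type"] not in FILL_TYPES:
        raise ValueError("imputer_fill_type must be 'mean', 'median', or 'value'")
    if not 0 < settings["imputation_threshold"] < 1:
        raise ValueError("imputation_threshold must lie in the open interval (0, 1)")
    return settings


def columns_over_threshold(columns, threshold):
    """Return {column name: missing ratio} for columns whose ratio exceeds the threshold.

    `columns` maps a column name to a list of values; None marks a missing value.
    """
    over = {}
    for name, values in columns.items():
        ratio = sum(v is None for v in values) / len(values)
        if ratio > threshold:
            over[name] = ratio
    return over


settings = resolve_settings({"imputer_list": []})
data = {"a": [1, None, 3, 4], "b": [None, None, 3, 4]}
print(settings["imputer_list"])
print(columns_over_threshold(data, settings["imputation_threshold"]))
```

In this example, column `a` has a missing ratio of exactly 0.25, which is not greater than the default threshold, so only column `b` is reported.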
Notes for use_imputation usage
- If the use_imputation method is specified as True and the input data has missing values:
  - imputation_threshold takes effect.
  - imputer candidates in imputer_list would be used to search for the best imputer.
  - If the best imputer is Fill, imputer_fill_type and imputer_fill_value are applied; otherwise, they are ignored.
- If the use_imputation method is specified as True and the input data has no missing values:
  - imputation_threshold is ignored.
  - imputer candidates in imputer_list are used to search for the best imputer. If the best imputer is Fill, imputer_fill_type and imputer_fill_value are applied; otherwise, they are ignored.
- If the use_imputation method is specified as False but the input data has missing values:
  - use_imputation is turned on with a warning, then the method follows the behavior for the first scenario.
- If the use_imputation method is specified as False and the input data has no missing values, then no further processing is required.
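The four scenarios in these notes can be summarized as a small decision function. This is a sketch of the documented behavior under stated assumptions, not AutoAI code: the function name `plan_imputation` and the keys of the returned dictionary are hypothetical.

```python
import warnings


def plan_imputation(use_imputation, has_missing_values):
    """Return which imputation steps apply for the given flag and data state."""
    if not use_imputation and not has_missing_values:
        # Scenario 4: nothing to do.
        return {"use_imputation": False, "apply_threshold": False, "search_imputers": False}

    if not use_imputation and has_missing_values:
        # Scenario 3: the flag is switched on with a warning,
        # then the first scenario applies.
        warnings.warn("Input data has missing values; use_imputation is turned on.")

    # Scenarios 1 and 2: the imputers in imputer_list are searched either way,
    # but imputation_threshold only takes effect when values are missing.
    return {
        "use_imputation": True,
        "apply_threshold": has_missing_values,
        "search_imputers": True,
    }


print(plan_imputation(True, True))
print(plan_imputation(True, False))
print(plan_imputation(False, False))
```

The Fill-specific settings are not modeled here, because they depend on which imputer the search selects.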
For example, the following pipeline definition turns imputation on with the use_imputation parameter:

    "pipelines": [
      {
        "id": "automl",
        "runtime_ref": "hybrid",
        "nodes": [
          {
            "id": "automl-ts",
            "type": "execution_node",
            "op": "kube",
            "runtime_ref": "automl",
            "parameters": {
              "del_on_close": true,
              "optimization": {
                "target_columns": [2, 3, 4],
                "timestamp_column": 1,
                "use_imputation": true
              }
            }
          }
        ]
      }
    ]