Linear Overview

In addition to fitting a standard linear regression model, the Linear node offers a boosting option to enhance model accuracy and a bagging option to enhance model stability:

  • Boosting produces a succession of “component models,” each of which is built on the entire dataset. Before each successive component model is built, the records are weighted based on the previous component model’s residuals. Records with large residuals are given relatively higher analysis weights so that the next component model focuses on predicting them well. Together these component models form an ensemble model. The ensemble model scores new records using the weighted median of the component model predictions.

  • Bagging (bootstrap aggregation) produces replicates of the training dataset by sampling with replacement from the original dataset. Each bootstrap sample is the same size as the original dataset. A “component model” is then built on each replicate. Together these component models form an ensemble model. The ensemble model scores new records using either the mean or the median of the predictions from the component models.
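
To make the two ensemble options concrete, the following Python sketch illustrates the ideas described above for a single predictor. It is not the Linear node's actual implementation. All function names are hypothetical, and the boosting reweighting rule (an AdaBoost.R2-style update with a linear loss) is an assumption chosen for illustration; the text above only states that records with large residuals receive higher weights and that scoring uses a weighted median.

```python
import math
import random
import statistics


def fit_line(xs, ys, ws=None):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    ws = ws if ws is not None else [1.0] * len(xs)
    total = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / total
    my = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0  # flat line if x has no spread
    return my - slope * mx, slope


def predict(model, x):
    intercept, slope = model
    return intercept + slope * x


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value
    return max(values)


def fit_bagging(xs, ys, n_models=10, seed=0):
    """Fit one component model per bootstrap replicate of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Sample n record indices with replacement (same size as original).
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x, combine="mean"):
    """Score a record with the mean or median of component predictions."""
    preds = [predict(m, x) for m in models]
    return statistics.mean(preds) if combine == "mean" else statistics.median(preds)


def fit_boosting(xs, ys, n_models=10):
    """Assumed AdaBoost.R2-style loop; every model sees the full dataset."""
    n = len(xs)
    ws = [1.0 / n] * n
    models, model_weights = [], []
    for _ in range(n_models):
        model = fit_line(xs, ys, ws)
        residuals = [abs(y - predict(model, x)) for x, y in zip(xs, ys)]
        largest = max(residuals)
        if largest == 0.0:  # perfect fit: keep the model and stop
            models.append(model)
            model_weights.append(1.0)
            break
        losses = [r / largest for r in residuals]  # linear loss in [0, 1]
        avg_loss = sum(w * l for w, l in zip(ws, losses))
        if avg_loss >= 0.5:  # too weak to help; keep it only if it is the first
            if not models:
                models.append(model)
                model_weights.append(1.0)
            break
        beta = avg_loss / (1.0 - avg_loss)
        models.append(model)
        model_weights.append(math.log(1.0 / beta))
        # Well-predicted records shrink by up to beta; records with large
        # residuals keep their weight, so they gain relative weight.
        ws = [w * beta ** (1.0 - l) for w, l in zip(ws, losses)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return models, model_weights


def predict_boosting(models, model_weights, x):
    """Score a record with the weighted median of component predictions."""
    return weighted_median([predict(m, x) for m in models], model_weights)
```

In this sketch, bagging varies the data each component model sees while treating all records equally, whereas boosting keeps the data fixed and varies the record weights. That difference is why the bagged ensemble can combine predictions with a plain mean or median, while the boosted ensemble weights each component model's vote.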

With any of the three approaches (standard regression, boosting, or bagging), automatic data preparation (ADP) may be invoked. ADP attempts to prepare the dataset in ways that generally improve the training speed, predictive power, and robustness of models fit to the prepared data.

Next steps

Like your model? Why not deploy it? For more information, see Deploy a model.