Handling missing values
You should decide how to treat missing values in light of your business or domain knowledge. To reduce training time and improve accuracy, you may want to remove blanks from your data set. On the other hand, the presence of blank values may itself point to new business opportunities or additional insights.
In choosing the best technique, you should consider the following aspects of your data:
- Size of the data set
- Number of fields containing blanks
- Amount of missing information
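The three aspects listed above can be measured before you choose a technique. The following is a minimal sketch in plain Python, not a product feature: the sample records, the field names, and the missing_summary helper are hypothetical, and None is assumed to stand for a blank value.

```python
# Hypothetical sample data; None marks a blank value.
records = [
    {"age": 34, "income": 52000, "region": "north"},
    {"age": None, "income": 61000, "region": "south"},
    {"age": 45, "income": None, "region": None},
    {"age": 29, "income": 48000, "region": "east"},
]


def missing_summary(rows):
    """Report data set size, blank counts per field, and the overall blank fraction."""
    fields = list(rows[0].keys())
    # Count blanks in each field across all records.
    blanks_per_field = {f: sum(1 for r in rows if r[f] is None) for f in fields}
    total_cells = len(rows) * len(fields)
    total_blanks = sum(blanks_per_field.values())
    return {
        "n_records": len(rows),
        "fields_with_blanks": [f for f, n in blanks_per_field.items() if n > 0],
        "blanks_per_field": blanks_per_field,
        "fraction_missing": total_blanks / total_cells,
    }


summary = missing_summary(records)
print(summary)
```

A small data set with many blank fields usually argues against dropping records, while a large data set with only a few blanks can often afford it.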
In general terms, there are two approaches you can follow:
- You can exclude fields or records with missing values
- You can impute, replace, or coerce missing values using a variety of methods
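As a rough illustration of the two approaches, the sketch below first excludes a sparse field and incomplete records, then imputes a numeric field with the mean of its observed values. The data, the helper names, and the 0.5 threshold are assumptions made for the example, and mean imputation is only one of many possible methods.

```python
from statistics import mean

# Hypothetical sample data; None marks a blank value.
rows = [
    {"age": 34, "income": 52000, "notes": None},
    {"age": None, "income": 61000, "notes": None},
    {"age": 45, "income": None, "notes": "vip"},
    {"age": 29, "income": 48000, "notes": None},
]


def drop_sparse_fields(rows, max_blank_fraction):
    """Approach 1a: exclude fields whose share of blanks exceeds the threshold."""
    keep = [
        f for f in rows[0]
        if sum(r[f] is None for r in rows) / len(rows) <= max_blank_fraction
    ]
    return [{f: r[f] for f in keep} for r in rows]


def drop_incomplete_records(rows):
    """Approach 1b: exclude records that contain any blank value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute_mean(rows, field):
    """Approach 2: replace blanks in a numeric field with the observed mean."""
    fill = mean(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


reduced = drop_sparse_fields(rows, max_blank_fraction=0.5)  # drops "notes"
complete = drop_incomplete_records(reduced)                 # keeps fully populated records
imputed = impute_mean(reduced, "age")                       # fills the blank age
print(len(complete), imputed[1]["age"])
```

Exclusion shrinks the data set, whereas imputation keeps every record at the cost of introducing estimated values, so the choice depends on how much information you can afford to lose.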