Data quality dimensions

Data quality dimensions describe measurable characteristics of data and help you define data quality requirements. Use data quality dimensions to determine the expected results of a data quality assessment, whether an initial assessment or ongoing monitoring.

The state that you want your data to be in can usually be described as fit for use, free of defects, conforming to specification, or meeting expectations and requirements. When you measure data quality, you compare the actual state of your data to this desired state. The standards, expectations, and requirements that are important to your business processes are expressed as characteristics or dimensions of the data.

The Data Management Association (DAMA) International published a paper that describes six core dimensions of data quality:

Accuracy
Data values are as close as possible to real values.
Predefined data quality checks that identify issues associated with this dimension: none
Completeness
All required data values are present.
Predefined data quality checks that identify issues associated with this dimension: Unexpected missing values
Consistency
Data values within a column comply with a rule.
Predefined data quality checks that identify issues associated with this dimension: Inconsistent capitalization, Inconsistent representation of missing values, Suspect values
Timeliness
Data represents reality as of the required point in time.
Predefined data quality checks that identify issues associated with this dimension: none
Uniqueness
Distinct values appear only once.
Predefined data quality checks that identify issues associated with this dimension: Unexpected duplicated values
Validity
Data conforms to the format, type, or range of its definition.
Predefined data quality checks that identify issues associated with this dimension: Data class violations, Data type violations, Format violations, Values out of range
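As a rough illustration of how some of these dimensions can be expressed as numbers, the following sketch scores a single column for completeness, uniqueness, and validity. It is not the product's implementation of the predefined checks. The function names, the treatment of empty strings as missing values, and the range bounds are all assumptions made for this example.

```python
from collections import Counter

# Assumption for this sketch: None and the empty string count as missing.
MISSING = (None, "")


def present_values(values):
    """Return only the values that are not considered missing."""
    return [v for v in values if v not in MISSING]


def completeness(values):
    """Fraction of all values that are present."""
    if not values:
        return 1.0
    return len(present_values(values)) / len(values)


def uniqueness(values):
    """Fraction of present values that occur exactly once."""
    present = present_values(values)
    if not present:
        return 1.0
    counts = Counter(present)
    return sum(1 for v in present if counts[v] == 1) / len(present)


def validity(values, low, high):
    """Fraction of present values that are numbers within [low, high]."""
    present = present_values(values)
    if not present:
        return 1.0
    ok = [v for v in present
          if isinstance(v, (int, float)) and low <= v <= high]
    return len(ok) / len(present)


# Hypothetical "age" column with two missing values, one duplicate,
# and one value outside the plausible range.
ages = [34, 51, None, 34, 212, 18, "", 47]

print(completeness(ages))      # 6 of 8 values are present
print(uniqueness(ages))        # 4 of 6 present values occur once
print(validity(ages, 0, 120))  # 5 of 6 present values are in range
```

Accuracy and timeliness are left out of the sketch because they cannot be scored from the column alone: both need an external reference, such as the real-world value or the required point in time.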

In addition to these core dimensions, which are evaluated by running data quality checks, IBM Match 360 (if deployed) contributes the Entity confidence dimension. This dimension indicates how confident the system is that the entity matches within your data are correct. The dimension score represents the percentage of entities of a particular entity type that have no member records with potential match issues.
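The Entity confidence score described above is a simple percentage. The following is a minimal sketch of that arithmetic, assuming each entity is represented as a list of member records and each record carries a hypothetical boolean flag named potential_match_issue. This data shape is invented for illustration and is not the IBM Match 360 data model.

```python
def entity_confidence(entities):
    """Percentage of entities whose member records have no potential match issues."""
    if not entities:
        # Assumption: with no entities there is nothing to flag.
        return 100.0
    clean = sum(
        1 for records in entities
        if not any(r["potential_match_issue"] for r in records)
    )
    return 100.0 * clean / len(entities)


# Four hypothetical entities of one entity type; the second has a
# member record with a potential match issue.
person_entities = [
    [{"potential_match_issue": False}, {"potential_match_issue": False}],
    [{"potential_match_issue": True}, {"potential_match_issue": False}],
    [{"potential_match_issue": False}],
    [{"potential_match_issue": False}],
]

print(entity_confidence(person_entities))  # 75.0
```

In this sketch, a single flagged member record is enough to exclude the whole entity from the count, which follows the wording "no member records with potential match issues".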
