Sample SQL statements for data quality rules
These samples demonstrate how you can set up your SQL rules.
You can copy the provided SQL statements into your own data quality rules and adjust them as needed.
Address doesn't exist
Returns the client ID and name for all records where an address does not exist (null and blank value check).
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where ADDRESS is null or length(trim(ADDRESS))=0
Address type is missing or not in the reference list
Returns the client ID and name for all records where an address type does not exist (null and blank value check) or is not in the reference list.
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where ADDRESSTYPE is null or length(trim(ADDRESSTYPE))=0 or rtrim(ADDRESSTYPE) not in ('S','R','B','L','G','U','O')
Age is outside the range
Returns the client ID and name for all records where the derived age is less than the AGE value or exceeds 125. The derived age is calculated as the absolute value of the difference between the current date and date of birth.
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where abs(DAYS(CURRENT DATE) - DAYS(DATEOFBIRTH))/365 < AGE OR abs(DAYS(CURRENT DATE) - DAYS(DATEOFBIRTH))/365 > 125
Incorrect marital status of children
Returns the client ID and name for all records where the person is a child (derived age below is 18) and the marital status is not no. The derived age is calculated as the absolute value of the difference between the current date and date of birth.
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where abs(DAYS(CURRENT DATE) - DAYS(DATEOFBIRTH))/365 < 18 AND MARITAL_STATUS <> 'N'
Credit card number doesn't match the format
Returns the client ID and name for all records where the credit card number doesn't match the specified format.
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where NOT regexp_like(CREDIT_CARD, '^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|6(?:22|4[4-9])[0-9]{13}|65[0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\\d{3})\\d{11})$')
SSN doesn't match the format
Returns the client ID and name for all records where a social security number (SSN) does not match the specified format.
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where NOT regexp_like(SSN, '^[0-9]{3}-[0-9]{2}-[0-9]{4}$')
Substring doesn't match entries in reference list
Returns the client ID and name for all records where the substring of a text value starting at position 3 for length 3 is not in the provided reference list.
select CLIENT_ID, NAME from SCHEMA.CUSTOMERS where substring(TEXTFIELD, 3, 3) not in ('AAA','AAB','BAA','CCC')
Parent topic: Creating SQL-based data quality rules