Data class details
This information describes the legacy behavior for governance artifacts. See Moving to the new version of governance artifacts.
For unstructured data assets, data classes are assigned at the asset level and represent the data classes selected by the analysis, or manually, as the best match for the entire asset.
For structured data assets, data classes are assigned at the asset level and at the column level. At asset level, the assigned data classes represent the data classes selected by the analysis, or manually, as the best match for a column.
At column level, data classes are assigned based on column name (scope column) or on column data (scope value) depending on the data class definition:
Scope column: Classification is based only on parsing and analysis of the column name, that is, on metadata alone. Depending on how well the column name matches the data class, the classifier returns a confidence value between 0.0 and 1.0, where 0.0 means no match and 1.0 means a perfect match. Alternatively, the classifier can return false instead of 0.0 and true instead of 1.0.
Scope value: Classification is based on analysis of the data values found in a column. For each value, the classifier returns true or false depending on whether the value matches the data class. When all values are evaluated, the percentage of values in the column that are not null and match the data class is the confidence of the data class for the column.
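The confidence calculation for scope value can be illustrated with a minimal Python sketch. It is not the product's implementation. The function name `value_scope_confidence` and the sample column are hypothetical, and the sketch assumes that the percentage is taken over all values in the column, including nulls. The regular expression is the one listed for the Alaska State Driver's License data class.

```python
import re
from typing import Callable, Optional, Sequence


def value_scope_confidence(
    values: Sequence[Optional[str]],
    matches: Callable[[str], bool],
) -> float:
    """Return the share of column values that are not null and match the class.

    Assumption: the denominator is the total number of values, nulls included.
    """
    if not values:
        return 0.0
    hits = sum(1 for v in values if v is not None and matches(v))
    return hits / len(values)


# Regular expression taken from the Alaska State Driver's License row.
alaska_license = re.compile(r"^\d{7}$")

# Hypothetical column content: two matches, one null, one non-matching value.
column = ["1234567", "7654321", None, "ABC"]

confidence = value_scope_confidence(
    column, lambda v: alaska_license.search(v) is not None
)
print(confidence)  # 2 matching values out of 4 -> 0.5
```

In this sketch the null value lowers the confidence because it is counted in the total but can never match.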
Each predefined data class uses a specific type of classifier to apply classification logic:
- A Java classifier can evaluate the column name or the data values of a column to determine the data class (scope column or scope value).
- A Regex classifier evaluates the data values of a column by applying a regular expression to determine whether each value belongs to the data class (scope value).
- A Value list classifier evaluates the data values of a column based on a given list of valid values to determine the data class (scope value).
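The Regex and Value list classifier types can be pictured as per-value checks that return true or false. The following Python sketch is an illustration only, not the product's classifier code. The helper names `regex_classifier` and `value_list_classifier` and the `case_sensitive` parameter are hypothetical. The regular expression comes from the Texas State Driver's License data class and the value list from the Gender data class. Treating the Gender list as case-insensitive is an assumption made for the example.

```python
import re
from typing import Callable, Iterable

# A value-scope classifier is modeled here as a predicate on a single value.
ValueClassifier = Callable[[str], bool]


def regex_classifier(pattern: str) -> ValueClassifier:
    """Build a check that applies a regular expression to each value."""
    compiled = re.compile(pattern)
    return lambda value: compiled.search(value) is not None


def value_list_classifier(
    valid_values: Iterable[str], case_sensitive: bool = True
) -> ValueClassifier:
    """Build a check that tests each value against a list of valid values."""
    if case_sensitive:
        allowed = frozenset(valid_values)
        return lambda value: value in allowed
    # Case-insensitive lists are compared on case-folded text.
    allowed = frozenset(v.casefold() for v in valid_values)
    return lambda value: value.casefold() in allowed


# Regular expression from the Texas State Driver's License row.
is_texas_license = regex_classifier(r"^\d{8}$")

# Value list from the Gender row; case-insensitivity is assumed here.
is_gender = value_list_classifier(["M", "F", "Male", "Female"], case_sensitive=False)

results = [
    is_texas_license("12345678"),
    is_texas_license("1234-5678"),
    is_gender("female"),
    is_gender("unknown"),
]
print(results)  # [True, False, True, False]
```

A Java classifier is not sketched because its logic is contained in the named Java class and is not described in this topic.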
Details of each data class
Find the classification type and scope, evaluation criteria, and an example for each predefined data class.
Data class | Scope | Matching criteria | Sample data value |
---|---|---|---|
Account number | Value | RegularExpression: . Column name filter: ^acc(ount)?([ _-])?(num(ber)?|id|no(.)?){1}$ |
123456 |
Address Line 1 | Column | Java class: com.ibm.infosphere.classification.impl.AddressLineClassifier Data type: string Data minimum length: 4 Data maximum length: 100 Column name filter: addr.{0,15}(1|one)$ |
|
Address Line 2 | Column | Java class: com.ibm.infosphere.classification.impl.AddressLineClassifier Data type: string Data minimum length: 4 Data maximum length: 100 Column name filter: addr.{0,15}(2|two)$ |
|
Address Line 3 | Column | Java class: com.ibm.infosphere.classification.impl.AddressLineClassifier Data type: string Data minimum length: 4 Data maximum length: 100 Column name filter: addr.{0,15}(3|three)$ |
|
Airport Code | Value | List of airport codes; case-sensitive Data type: string Data minimum length: 3 Data maximum length: 3 |
|
Alabama State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{6}$|^\d{7}$ Data type: string Data minimum length: 7 Data maximum length: 7 |
1234567 |
Alaska State Driver’s License | Value | RegularExpression: ^\d{7}$ Data type: string Data minimum length: 7 Data maximum length: 7 |
1234567 |
Alberta Province Driver’s License | Value | RegularExpression: ^\d{6}[-]?\d{3}$ Data type: numeric, string Data minimum length: 9 Data maximum length: 10 |
123456-123 |
American Express Card | Value | Java class: com.ibm.infosphere.classification.impl.AMEXClassifier Data type: numeric, string Data minimum length: 16 Data maximum length: 18 |
3400-000000-00009 |
Arizona State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[abdyABDY]\d{8}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A12345678 |
Arkansas State Driver’s License | Value | RegularExpression: ^\d{9}$ Data type: string Data minimum length: 9 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
012345678 |
BIC | Value | Java class: com.ibm.infosphere.classification.impl.BICClassifier Data type: string Data minimum length: 9 Data maximum length: 9 |
DEUTDEDBDUE |
Boolean | Value | List of values: 0, 1, True, False, Yes, No Data type: numeric or string |
True |
British Columbia Province Driver’s License | Value | RegularExpression: ^\d{7}$ Data type: numeric, string Data minimum length: 7 Data maximum length: 7 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234567 |
California State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}[0-9]{7}$ Data type: string Data minimum length: 8 Data maximum length: 8 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
C1234567 |
Canada Post Code | Value | RegularExpression: ^[A-Z]\d[A-Z][ -]?\d[A-Z]\d$ Data type: string Data minimum length: 6 Data maximum length: 6 |
H3A 0B1 |
Canada Province Code | Value | List of Canada province codes; case-sensitive Data type: string Data minimum length: 2 Data maximum length: 2 |
QC |
Canada Province Name | Value | List of Canada province names; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 25 |
Quebec |
Canadian Social Insurance Number (SIN) | Value | Java class: com.ibm.infosphere.classification.impl.CanadianSINClassifier Data type: numeric, string Data minimum length: 9 Data maximum length: 20 |
046-454-286 |
City | Value | List of city names; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 58 |
Los Angeles |
Code | Column | Java class: com.ibm.infosphere.classification.impl.CodeClassifier | |
Colorado State Driver’s License | Value | RegularExpression: (?:(^[0-9]{2}-?[0-9]{3}-?[0-9]{4}$)|(^[a-zA-Z]{1}[0-9]{3,6}$)) Data type: string Data minimum length: 4 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
12-345-2222 |
Color | Value | List of colors; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 41 |
Blue |
Commercial and Government Entity Code | Value | Java class: com.ibm.infosphere.classification.impl.CAGECodeClassifier Data type: string Data minimum length: 5 Data maximum length: 5 Column name filter: cage|fscm|nscm|entity|code |
1ASDY |
Computer Host Name | Value | Java class: com.ibm.infosphere.classification.impl.HostNameClassifier Data type: string Data minimum length: 4 Data maximum length: 255 |
www.example.com |
Connecticut State Driver’s License | Value | RegularExpression: ^[0-9]{9}$ Data type: numeric, string Data minimum length: 9 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456789 |
Country Code | Value | List of country codes; case-sensitive Data type: string Data minimum length: 2 Data maximum length: 3 |
USA |
Country Name | Value | List of country names; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 50 |
India |
Credit Card Expiration Date | Column | Java class: com.ibm.infosphere.classification.impl.CreditCardExpDateAndValidationNumberClassifier | 12/2018 |
Credit Card Validation Number | Column | Java class: com.ibm.infosphere.classification.impl.CreditCardExpDateAndValidationNumberClassifier | 1234 |
Currency | Value | Java class: com.ibm.infosphere.classification.impl.CurrencyClassifier Data type: string Data minimum length: 2 Data maximum length: 25 |
$12,345.67 |
Current Procedural Terminology | Value | Java class: com.ibm.infosphere.classification.impl.CPTClassifier Data type: string Data minimum length: 5 Data maximum length: 5 Column name filter: CPT|medical procedure code|medical procedure|medicalcode|current procedural terminology |
|
Customer Number | Value | RegularExpression: . Column name filter: ^cust(omer)?([ _-])?(num(ber)?|id|no(.)?){1}$ |
3141596 |
Date | Value | Java class: com.ibm.infosphere.classification.impl.DateTimeClassifier | 12-30-2015 |
Date of Birth | Value | Java class: com.ibm.infosphere.classification.impl.DOBClassifier Column name filter: dob$|birth(day)?|geburtsdatum|na(issance|cimiento|scita)|urodzenia|(生ま(れた日)?|誕生日)|出生(年月)? |
12-30-2015 |
Delaware State Driver’s License | Value | RegularExpression: ^[0-9]{1,7}$ Data type: numeric, string Data minimum length: 1 Data maximum length: 7 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234567 |
Diners Club Card | Value | Java class: com.ibm.infosphere.classification.impl.DinersClubClassifier Data type: numeric, string Data minimum length: 15 Data maximum length: 18 |
5520111111111121 |
Discover Card | Value | Java class: com.ibm.infosphere.classification.impl.DiscoverClassifier Data type: numeric, string Data minimum length: 17 Data maximum length: 18 |
6220264390045758 |
Driver’s License | Value | RegularExpression: ^[ a-zA-Z0-9-]{1,19}$ Data type: string Data minimum length: 1 Data maximum length: 19 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234567 |
DUNS | Value | RegularExpression: ^(\d{2})([ -]?)(\d{3})([ -]?)(\d{4})$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: duns|universal number|universal_number |
12-345-6789 |
Email Address | Value | Java class: com.ibm.infosphere.classification.impl.EmailClassifier Data type: string Data minimum length: 6 Data maximum length: 254 |
[email protected] |
Employment Status | Value | List of employment statuses; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 50 |
employee |
Ethnicity | Value | List of ethnicities; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 22 |
Hispanic |
Eye Color | Value | List of eye colors; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 14 Column name filter: eye|eye_color|eyecolor |
Hazel |
First Name | Column | Java class: com.ibm.infosphere.classification.impl.GNMFirstNameClassifier Data type: string Column name filter: ^(?i)(given|f(irst)?)([ .-])?name$ |
James |
Florida State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}[0-9]{3}-[0-9]{3}-[0-9]{2}-[0-9]{3}-[0-1]{1}$|^[a-zA-Z]{1}[0-9]{12}$ Data type: string Data minimum length: 13 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
X123-123-33-229-0 |
Fortune 1000 Company | Value | Java class: com.ibm.infosphere.classification.impl.Fortune1000Classifier Data type: string Data minimum length: 2 Data maximum length: 50 |
|
French INSEE Number | Value | Java class: com.ibm.infosphere.classification.impl.FranceINSEEClassifier Data type: numeric, string Data minimum length: 15 Data maximum length: 15 |
151022A10204375 |
Gender | Value | List of values: M, F, Male, Female Data type: string Data minimum length: 1 Data maximum length: 6 |
F |
Geographic Coordinates | Value | Java class: com.ibm.infosphere.classification.impl.GeographicCoordinatesClassifier Data type: string Data minimum length: 3 Data maximum length: 44 |
49° 13" N; 1°10’00.012" E |
Georgia State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[0-9]{7,9}$ Data type: numeric, string Data minimum length: 7 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123-45-6789 |
Germany Vehicle Registration Number | Value | Java class: com.ibm.infosphere.classification.impl.GermanyCarClassifier Data type: string Data minimum length: 4 Data maximum length: 9 |
BB-XY1066 |
Hair Color | Value | List of hair colors; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 14 Column name filter: hair|hair_color|haircolor |
Black |
Hawaii State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[hH]{1}[0-9]{8}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
H12345678 |
Health Insurance Claim Number | Value | Java class: com.ibm.infosphere.classification.impl.HICNClassifier Data type: string Data minimum length: 6 Data maximum length: 15 |
WD-000-00-0000 |
Hobby/Leisure Activity | Value | List of hobbies; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 38 Column name filter: hobb(y$|ies$)|leisure([ _])activit(y$|ies$) |
Tennis |
Honorific | Value | List of honorifics; case-insensitive Data type: string Data minimum length: 1 Data maximum length: 38 |
Mr |
IBAN | Value | Java class: com.ibm.infosphere.classification.impl.IBANClassifier Data type: string Data minimum length: 14 Data maximum length: 42 |
GB87 BARC 2065 8244 9716 55 |
ICD-10 | Value | Java class: com.ibm.infosphere.classification.impl.ICD10Classifier Data minimum length: 3 Data maximum length: 7 |
D36.7 |
Idaho State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[a-zA-Z]{2}[0-9]{6}[a-zA-Z]{1}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
AA123456X |
Identifier | Column | Java class: com.ibm.infosphere.classification.impl.IdentifierClassifier | |
Illinois State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}[0-9]{3}-[0-9]{4}-[0-9]{4}$|^[a-zA-Z]{1}[0-9]{11}$ Data type: string Data minimum length: 12 Data maximum length: 14 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A123-4567-8999 |
INCO Terms (International Commercial Terms) | Value | List of INCO terms; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 3 |
FCA |
Indiana State Driver’s License | Value | RegularExpression: ^[0-9]{4}-[0-9]{2}-[0-9]{4}$|^[a-zA-Z]{1}[0-9]{9}$|^[0-9]{10}$ Data type: string Data minimum length: 10 Data maximum length: 12 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234-56-7890 |
Indicator | Column | Java class: com.ibm.infosphere.classification.impl.IndicatorClassifier | |
Individual Taxpayer Identification Number (ITIN) | Value | RegularExpression: ^(9\d{2})([ -]?)(?!93|89)([789][0-9])([ -]?)(\d{4})$ Data minimum length: 9 Data maximum length: 11 Column name filter: itin|tax |
913-72-2222 |
International Mobile Equipment Identity (IMEI) | Value | Java class: com.ibm.infosphere.classification.impl.IMEINumberClassifier Data type: string Data minimum length: 15 Data maximum length: 15 Column name filter: imei |
490154203237518 |
International Securities Identification Number (ISIN) | Value | Java class: com.ibm.infosphere.classification.impl.ISINClassifier Data type: numeric, string Data minimum length: 10 Data maximum length: 20 |
GB0002634946 |
International Standard Book Number (ISBN) | Value | Java class: com.ibm.infosphere.classification.impl.ISBNClassifier Data type: numeric, string Data minimum length: 10 Data maximum length: 20 |
978 0 306 40615 7 |
International Standard Industrial Classification | Value | Java class: com.ibm.infosphere.classification.impl.ISICClassifier Data type: string Data minimum length: 5 Data maximum length: 5 Column name filter: ^(unsic)$|^(isic)$|^((industr(ial|ies|y))[ ._-]?(code))$ |
C3319 |
Internet Protocol Address | Value | RegularExpression: ^\s(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\s$ Data type: string Data minimum length: 7 Data maximum length: 15 |
127.127.127.002 |
Internet Protocol Version 6 Address | Value | RegularExpression: ^\s((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s$ Data type: string Data minimum length: 3 Data maximum length: 39 |
fe80:0:0:0:204:61ff:fe9d:f156 |
Iowa State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[0-9]{3}[a-zA-Z]{2}[0-9]{4}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123XX4567 |
Ireland Eircode | Value | RegularExpression: ^[ACD-FHKNPRTV-Y]\d[\dW][ -]?[\dACD-FHKNPRTV-Y]{4}$ Data type: string Data minimum length: 7 Data maximum length: 8 |
D02 NY52 |
ISO 3166-2 Code | Value | List of ISO 3166-2 codes of states or provinces; case-insensitive Data type: string Data minimum length: 4 Data maximum length: 6 |
IN-KA |
Italian Fiscal Code | Value | RegularExpression: ^([A-Z]{3})([ -]?)([A-Z]{3})\2([0-9L-NP-V]{2})([A-EHLMPRST])([0-9LNP-V]{2})\2([A-ILMZ][0-9L-NP-V]{3})([A-Z])$ Data type: string Data minimum length: 16 Data maximum length: 16 |
MRTMTT25D09F205Z |
Japan Credit Bureau (JCB) | Value | Java class: com.ibm.infosphere.classification.impl.JapanCBClassifier Data type: numeric, string Data minimum length: 17 Data maximum length: 18 |
35283095185620637 |
Kansas State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[kK]{1}[0-9]{2}-[0-9]{2}-[0-9]{4}$|^[kK]{1}[0-9]{8}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
K94-12-3456 |
Kentucky State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}[0-9]{2}-[0-9]{3}-[0-9]{3}$|^[a-zA-Z]{1}[0-9]{8}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A23-145-678 |
Language Code or Name | Value | List of languages; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 37 Column name filter: lang|locale|language |
EN |
Last Name | Column | Java class: com.ibm.infosphere.classification.impl.GNMLastNameClassifier Data type: string Column name filter: ^l(ast)?([ -])?name$|^surname$|^family(([ -])?)name$ |
Smith |
Latitude | Value | Java class: com.ibm.infosphere.classification.impl.LatitudeClassifier Data type: numeric, string Data minimum length: 1 Data maximum length: 20 Column name filter: ^lat$|^lat_|_lat$|latitud(ine|e|o)?|breitengrad|breddekreds|breedtegraad|breiddegrad|breiddargráða|enlem|πλάτος|широт(ы|a)|קו רוחב|عرض جغرافي|緯度|纬度 |
49° 13" |
Legal Marital/Civil Status | Value | List of marital statuses; case-insensitive Data type: string Data minimum length: 6 Data maximum length: 23 |
Single |
Longitude | Value | Java class: com.ibm.infosphere.classification.impl.LongitudeClassifier Data type: numeric, string Data minimum length: 1 Data maximum length: 21 Column name filter: ^long$|^long_|_long$|longitud(ine|e|o)?|längengrad|laengengrad|længdekreds|laengdekreds|lengtegraad|lengdegrad|lengdargráða|boylam|μήκος|долгот(ы|a)|קו אורך|طول جغرافي|経度|经度 |
1°10’00.012" E |
Louisiana State Driver’s License | Value | RegularExpression: ^00[0-9]{7}$ Data type: string Data minimum length: 9 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
003456789 |
Mac Address | Value | RegularExpression: ^(?:[0-9A-Fa-f]{2}([:-]))(?:[0-9A-Fa-f]{2}\1){4}[0-9A-Fa-f]{2}$|(?:^([0-9A-Fa-f]{4}.){2}[0-9A-Fa-f]{4}$ Data minimum length: 14 Data maximum length: 17 |
12:34:56:78:9F |
Maine State Driver’s License | Value | RegularExpression: ^[0-9]{7}$ Data type: numeric, string Data minimum length: 7 Data maximum length: 7 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234567 |
Manitoba Province Driver’s License | Value | RegularExpression: ^\d{9}$ Data type: numeric, string Data minimum length: 9 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456789 |
Maryland State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}-[0-9]{3}-[0-9]{3}-[0-9]{3}-[0-9]{3}$|^[a-zA-Z]{1}[0-9]{12}$ Data type: string Data minimum length: 13 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A-123-456-789-999 |
Massachusetts State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[sS]{1}[0-9]{8}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
S12345678 |
Master Card | Value | Java class: com.ibm.infosphere.classification.impl.MasterCardClassifier Data type: numeric, string Data minimum length: 17 Data maximum length: 18 |
5285696282092972 |
Michigan State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{12}$|^[a-zA-Z]{1} [0-9]{3} [0-9]{3} [0-9]{3} [0-9]{3}$ Data type: string Data minimum length: 13 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A012345678912 |
Middle Name | Column | Java class: com.ibm.infosphere.classification.impl.GNMFirstNameClassifier Data type: string Column name filter: ^m(iddle)?([ _.-])?name$ |
James |
Minnesota State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{12}$|^[a-zA-Z]{1}-[0-9]{3}-[0-9]{3}-[0-9]{3}-[0-9]{3}$ Data type: string Data minimum length: 13 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A012345678912 |
Missouri State Driver’s License | Value | RegularExpression: ^\d{9}$|^[a-zA-Z]{1}[0-9]{5,9}$ Data type: string Data minimum length: 6 Data maximum length: 10 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
012345678 |
Montana State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{9}$|^\d{13}$ Data type: string Data minimum length: 9 Data maximum length: 13 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
0123456789123 |
Month | Value | Java class: com.ibm.infosphere.classification.impl.MonthClassifier Data type: numeric, string Data minimum length: 1 Data maximum length: 10 |
January |
Name Suffix | Value | List of name suffixes; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 16 |
PhD |
Nebraska State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}[0-9]{3,8}$ Data type: string Data minimum length: 4 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
C12345678 |
Nevada State Driver’s License | Value | RegularExpression: ^[xX]{1}\d{8}$|^\d{10}$|^\d{12}$ Data type: string Data minimum length: 9 Data maximum length: 12 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
0123456789 |
New Brunswick Province Driver’s License | Value | RegularExpression: ^[0-9]{1,7}$ Data type: string Data minimum length: 1 Data maximum length: 7 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
0123456 |
New Hampshire State Driver’s License | Value | RegularExpression: ^\d{2}[a-zA-Z]{3}\d{5}$ Data type: string Data minimum length: 10 Data maximum length: 10 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
01ABC56789 |
New Jersey State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{14}$|^[a-zA-Z]{1}\d{4} \d{5} \d{5}$|^[a-zA-Z]{1}\d{4}-\d{5}-\d{5}$ Data type: string Data minimum length: 15 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A01234567891234 |
New Mexico State Driver’s License | Value | RegularExpression: ^\d{9}$ Data type: string Data minimum length: 9 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
012345678 |
New York State Driver’s License | Value | RegularExpression: ^\d{9}$|^[a-zA-Z]{1}\d{18}$|^\d{3} \d{3} \d{3}$ Data type: string Data minimum length: 9 Data maximum length: 19 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
012345678 |
Newfoundland and Labrador Province State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{9}$ Data type: string Data minimum length: 10 Data maximum length: 10 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A012345678 |
NoClassDetected | - | Neither the column name nor the column values match any of the available data classes. | - |
North Carolina State Driver’s License | Value | RegularExpression: ^[0-9]{1,12}$ Data type: string Data minimum length: 1 Data maximum length: 12 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456789999 |
North Dakota State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[a-zA-Z]{3}-[0-9]{2}-[0-9]{4}$|^[a-zA-Z]{3}[0-9]{6}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
XYZ-11-2222 |
Nova Scotia Province Driver’s License | Value | RegularExpression: ^[a-zA-Z]{2}[0-9]{6}$ Data type: string Data minimum length: 8 Data maximum length: 8 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
XY123456 |
Ohio State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{2}[0-9]{6}$ Data type: string Data minimum length: 8 Data maximum length: 8 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
XY123456 |
Oklahoma State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([ -.]?)(\d{2})\2(\d{4})$|^[a-zA-Z]{1}[0-9]{9}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
X123456789 |
Ontario Province Driver’s License | Value | RegularExpression: ^([a-zA-Z]{1}\d{4})([-]?)(\d{5})\2(\d{5})$ Data type: string Data minimum length: 15 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A1234-56123-99999 |
Oregon State Driver’s License | Value | RegularExpression: ^\d{1,9}$ Data type: string Data minimum length: 1 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
012345678 |
Organization Name | Value | Java class: com.ibm.infosphere.classification.impl.GNMOrganizationClassifier Data type: string |
IBM |
Passport Number | Value | RegularExpression: ^[A-Z0-9<]{9}[0-9]{1}[A-Z]{3}[0-9]{7}[A-Z]{1}[0-9]{7}[A-Z0-9<]{14}[0-9]{2}$ Data type: string Data minimum length: 6 Data maximum length: 254 |
L898902C<3UTO6908061F9406236ZE184226B<<<<<14 |
Pennsylvania State Driver’s License | Value | RegularExpression: ^\d{8}$|^\d{2} \d{3} \d{3}$ Data type: string Data minimum length: 8 Data maximum length: 10 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
01234567 |
Percentage | Value | RegularExpression: ^(?[+-]? ?[0-9]{1,10}[,.]?[0-9]{0,10} ?(%|percent|pct))?$ Data type: string Data minimum length: 2 Data maximum length: 25 |
45% |
Person Name | Column | Java class: com.ibm.infosphere.classification.impl.GNMFullNameClassifier Data type: string Column name filter: _?name|नाम|名称|nom|nome|όνομα|nomine|имя|이름|име|naam |
John Doe |
Political Party | Value | List of political parties; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 99 Column name filter: politic |
PDP |
Prince Edward Island Province State Driver’s License | Value | RegularExpression: ^\d{6}$ Data type: numeric, string Data minimum length: 6 Data maximum length: 6 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456 |
Quantity | Column | Java class: com.ibm.infosphere.classification.impl.QuantityClassifier | 100 |
Quebec Province Driver’s License | Value | RegularExpression: ^([a-zA-Z]{1}\d{4})([-]?)(\d{6})(\2)(\d{2})$ Data type: string Data minimum length: 13 Data maximum length: 15 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A1234-222222-00 |
Relationship | Value | List of relationship types; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 26 |
Friendship |
Religion | Value | List of religions; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 19 |
Christian |
Rhode Island State Driver’s License | Value | RegularExpression: ^\d{7}$|^[vV]{1}\d{6}$ Data type: string Data minimum length: 7 Data maximum length: 7 |
0123456 |
Routing Transit Number | Value | Java class: com.ibm.infosphere.classification.impl.RTNClassifier Data type: numeric, string Data minimum length: 9 Data maximum length: 9 |
121000358 |
Saskatchewan Province State Driver’s License | Value | RegularExpression: ^\d{8}$ Data type: string Data minimum length: 8 Data maximum length: 8 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
12345678 |
South Carolina State Driver’s License | Value | RegularExpression: ^\d{9}$ Data type: string Data minimum length: 9 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456789 |
South Dakota State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([-.]?)(\d{2})\2(\d{4})$|^\d{6}$|^\d{8}$ Data type: string Data minimum length: 6 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456 |
Spanish Fiscal Identification Number | Value | RegularExpression: (X?)[ -]?(\d{7})[ -]?([A-HJ-NP-TV-Z]) Data type: string Data minimum length: 8 Data maximum length: 11 |
3124124N |
State/Province Name | Value | List of state and province names; case-insensitive Data type: string Data minimum length: 2 Data maximum length: 43 |
San Salvador |
Temperature | Value | Java class: com.ibm.infosphere.classification.impl.TemperatureClassifier Data type: string Data minimum length: 2 Data maximum length: 25 |
20°C |
Tennessee State Driver’s License | Value | RegularExpression: ^\d{8,9}$ Data type: string Data minimum length: 8 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456789 |
Texas State Driver’s License | Value | RegularExpression: ^\d{8}$ Data type: string Data minimum length: 8 Data maximum length: 8 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
12345678 |
Text | Column | Java class: com.ibm.infosphere.classification.impl.TextClassifier | Put your TV viewing into overdrive with scenes that jump off your screen when you add 3D HDTVs to your home-theater system. |
UK National Insurance Number | Value | RegularExpression: ^([A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z])([ -.]?)(\d{2})\2?(\d{2})\2?(\d{2})(\2([ABCD]))?$ Data type: string Data minimum length: 8 Data maximum length: 13 |
CA 123456 A |
UK Post Code | Value | RegularExpression: ^[A-Z][A-Z]?\d[A-Z\d]?[ -]?\d[ABD-HJLNP-UW-Z]{2}$ Data type: string Data minimum length: 5 Data maximum length: 8 |
L1 8LW |
UK Province Code | Value | List of UK province codes; case-sensitive Data type: string Data minimum length: 2 Data maximum length: 3 |
HAW |
Uniform Resource Locator | Value | Java class: com.ibm.infosphere.classification.impl.URLClassifier Data type: string Data minimum length: 10 Data maximum length: 1000 |
https://www.ibm.com/us-en/products/category/technology |
Universal Product Code (UPC) | Value | Java class: com.ibm.infosphere.classification.impl.UPCClassifier Data type: numeric, string Data minimum length: 12 Data maximum length: 12 |
123456789999 |
US County | Value | List of US county names; case-insensitive Data type: string Data minimum length: 3 Data maximum length: 35 |
Adams |
US Employer Identification Number | Value | RegularExpression: ^(0[1-6]|1[0-6]|2[0-7]|[35][0-9]|[468][0-8]|7[1-7]|9[0-589])[ -]?\d{7}$ Data type: string Data minimum length: 9 Data maximum length: 10 Column name filter: EMPLOYER|EIN |
99-1234567 |
US National Drug Code | Value | Java class: com.ibm.infosphere.classification.impl.USNDCClassifier Data type: string Data minimum length: 10 Data maximum length: 20 Column name filter: DRUG|NDC |
1234-5678-90 |
US Phone Number | Value | RegularExpression: ^(\+?1\s*[-\/.]?)?(\((\d{3})\)|(\d{3}))\s*[-\/.]?\s*(\d{3})\s*[-\/.]?\s*(\d{4})\s*(([xX]|[eE][xX][tT])\.?\s*(\d+))*$ Data type: string Data minimum length: 9 Data maximum length: 16 |
1 (234) 567-8901 |
US Social Security Number | Value | RegularExpression: ^([1-578]\d{2}|0[1-9]\d|00[1-9]|6[0-57-9]\d|66[0-57-9])([ -.]?)([1-9]\d|0[1-9])\2([1-9]\d{3}|0[1-9]\d{2}|00[1-9]\d|000[1-9])$ Data type: numeric, string Data minimum length: 9 Data maximum length: 11 |
123-45-6789 |
US Social Security Number Last 4 | Value | RegularExpression: ^([1-9]\d{3}|0[1-9]\d{2}|00[1-9]\d|000[1-9])$ Data minimum length: 4 Data maximum length: 4 Column name filter: ssn(4)?$|(ssn|social(.?security)?|socsec)(.4)? |
|
US Standard Industrial Classification | Value | Java class: com.ibm.infosphere.classification.impl.USSICClassifier Data minimum length: 3 Data maximum length: 4 Column name filter: SIC|USSIC|Standard Industrial Classification |
1234 |
US State Capital Name | Value | List of US state capital names; case-insensitive Data type: string Data minimum length: 5 Data maximum length: 14 |
Montgomery |
US State Code | Value | List of US state codes; case-sensitive Data type: string Data minimum length: 2 Data maximum length: 2 |
DE |
US State Name | Value | List of US state names; case-insensitive Data type: string Data minimum length: 4 Data maximum length: 20 |
Massachusetts |
US Street Name | Value | Java class: com.ibm.infosphere.classification.impl.StreetClassifier Data type: string Data minimum length: 7 Data maximum length: 50 |
8475 NW St |
US Zip Code | Value | Java class: com.ibm.infosphere.classification.impl.USZipCodeClassifier Data type: numeric, string Data minimum length: 5 Data maximum length: 10 |
02201-1020 |
Utah State Driver’s License | Value | RegularExpression: ^\d{4,9}$ Data type: numeric, string Data minimum length: 4 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456789 |
Vehicle Identification Number (VIN) | Value | Java class: com.ibm.infosphere.classification.impl.VehicleIdNumber Data type: string Data minimum length: 17 Data maximum length: 17 |
1JCCM85E5BT001312 |
Vermont State Driver’s License | Value | RegularExpression: ^\d{8}$|^\d{7}A$ Data type: string Data minimum length: 8 Data maximum length: 8 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234567A |
Virginia State Driver’s License | Value | RegularExpression: ^([0-6]\d{2}|7[0-6]\d|77[0-2])([-.]?)(\d{2})\2(\d{4})$|^[A-Za-z]{1}\d{8}$ Data type: string Data minimum length: 9 Data maximum length: 11 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A12345678 |
VISA Card | Value | Java class: com.ibm.infosphere.classification.impl.VisaClassifier Data type: numeric, string Data minimum length: 17 Data maximum length: 18 |
4024007121595481 |
Washington DC State Driver’s License | Value | RegularExpression: ^\d{7}$|^\d{9}$ Data type: numeric, string Data minimum length: 7 Data maximum length: 9 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
1234567 |
Washington State Driver’s License | Value | RegularExpression: ^(?=.[A-Za-z]{2})([a-zA-Z]{2}[A-Za-z]{5}\d{3}[A-Za-z0-9]{2})$ Data type: string Data minimum length: 12 Data maximum length: 12 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
DOE**MJ501A1 |
West Virginia State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{6}$|^\d{7}$ Data type: string Data minimum length: 7 Data maximum length: 7 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
A123456 |
Wisconsin State Driver’s License | Value | RegularExpression: ^[a-zA-Z]{1}\d{3}-\d{4}-\d{4}-\d{2}$ Data type: string Data minimum length: 17 Data maximum length: 17 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
Z123-1234-5678-12 |
Wyoming State Driver’s License | Value | RegularExpression: ^\d{6}-\d{3}$ Data type: string Data minimum length: 10 Data maximum length: 10 Column name filter: d(.)?l(.)?([ -]?(number|no(.)?))?$|driv(ing|er(s|'s)?)[ -]license|license |
123456-123 |
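The following minimal sketch illustrates how a scope value classification with a Regex classifier might arrive at a confidence for a column. It uses the regular expression and length limits from the US Social Security Number row. The function names `matches_class` and `value_confidence` are hypothetical and are not part of the product. The sketch assumes that the confidence is the number of non-null matching values divided by the total number of values, which is one reading of the description at the top of this topic. It also uses Python's `re` module, so regular expression details might differ from the engine that the product uses.

```python
import re
from typing import Iterable, Optional

# Regular expression and length limits copied from the
# "US Social Security Number" row. re.ASCII restricts \d to 0-9.
SSN_REGEX = re.compile(
    r"^([1-578]\d{2}|0[1-9]\d|00[1-9]|6[0-57-9]\d|66[0-57-9])([ -.]?)"
    r"([1-9]\d|0[1-9])\2([1-9]\d{3}|0[1-9]\d{2}|00[1-9]\d|000[1-9])$",
    re.ASCII,
)
SSN_MIN_LEN, SSN_MAX_LEN = 9, 11


def matches_class(value: Optional[str], pattern: re.Pattern,
                  min_len: int, max_len: int) -> bool:
    """Hypothetical helper: True if one value fits the length limits and the regex."""
    if value is None:
        return False  # null values never count as a match
    text = str(value)
    if not (min_len <= len(text) <= max_len):
        return False
    return pattern.search(text) is not None


def value_confidence(values: Iterable[Optional[str]], pattern: re.Pattern,
                     min_len: int, max_len: int) -> float:
    """Hypothetical helper: share of values that are not null and match.

    Assumption: the denominator is the total number of values in the column.
    """
    values = list(values)
    if not values:
        return 0.0
    hits = sum(matches_class(v, pattern, min_len, max_len) for v in values)
    return hits / len(values)


# Two of the four values are not null and match the pattern.
column = ["123-45-6789", "123456789", "000-12-3456", None]
print(value_confidence(column, SSN_REGEX, SSN_MIN_LEN, SSN_MAX_LEN))  # 0.5
```

In this sketch, `000-12-3456` is rejected because the pattern does not allow an area number of `000`, and the null value lowers the confidence without counting as a match.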