Getting started with Master Data Management (Beta)
This tutorial walks you through setting up and using Master Data Management (Beta) to onboard, match, and explore your master data.
Complete the following sections to get started using the Master Data Management service.
- Step 1. Creating a master data configuration asset
- Step 2. Loading your first data asset into Master Data Management
- Step 3. Customizing your data model (optional)
- Step 4. Mapping your data into the data model
- Step 5. Setting up and running matching
- Step 6. Exploring your master data
Tip: The page titles double as a navigation menu. Click on a page title to open the navigation menu and see a list of other pages within the current area of Master Data Management (configure or explore). Choose an item from the list to navigate to that page. For example, if you are in the configuration space, click on the title to see the list of configuration pages, including Overview, Data setup, and Matching setup.
Step 1. Creating a master data configuration asset
The first step in setting up Master Data Management is to create your master data configuration asset. The configuration asset is where you will onboard data sources, map your data into the system, customize your data model, and set up and tune the matching algorithm.
- Start Master Data Management.
Click Set up master data to create your configuration asset.
You must have the correct privileges to be able to create and configure a configuration asset.
- Review the service instance name. Optionally, rename it to be more descriptive. Click Next.
- Select an existing Cloud Pak for Data as a Service project to use with this Master Data Management service instance or create a new one by clicking +. Click Next.
Optionally, you can associate your Master Data Management instance with a catalog. Choose a catalog from your associated Watson Knowledge Catalog instance, or create a new one by clicking +.
It is only possible to create a catalog under the following conditions:
- You have a Watson Knowledge Catalog lite plan and haven’t yet created the one catalog you’re allowed.
- You belong to an account that has a Watson Knowledge Catalog standard or professional plan and you have the Watson Knowledge Catalog service Administrator role assigned.
- Click Finish.
You’ve now created your master data configuration asset. Now it’s time to get it set up and match some data!
Step 2. Loading your first data asset into Master Data Management
In this step, you will add data into your system, such as from a flat data file in CSV or TSV format. If you have a data file containing record data already, you can use that.
If you don’t have a data file ready to go but want to get started using Master Data Management, you can skip this step and load the provided sample data and model instead. From the master data home page, go to the Master data tile, then click Publish sample model. After the model loads, click Publish sample data.
- From the master data home page, click Configuration to open the Data setup screen. Click Start with data assets.
- Click Add data or the Find and add data icon in the action bar at the top of the screen.
- From the Data panel that opens, choose whether to add data by upload, from the project, or from the catalog. For this tutorial, choose Load to upload a data file.
- On your local machine, select a flat data file in CSV or TSV format and drag it into the Data panel. When the file finishes uploading, it is added to your assets summary list.
- Review the details of your newly added asset. If your asset does not have any information in the Asset record type column, you must define the record type.
- Select your asset in the assets summary list.
- Click Assign record type and select the correct record type from the list. If the appropriate record type is not in the list, then you might have to customize your data model.
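If you want a small file of your own to try the upload with, the following is a minimal sketch of producing a flat file in CSV or TSV format with Python's standard csv module. The column names, the sample values, and the presence of a header row are assumptions made for illustration; they are not requirements stated by Master Data Management.

```python
import csv
import io

# Hypothetical person records; the column names and values are illustrative only.
ROWS = [
    {"record_id": "1", "first_name": "Ana", "last_name": "Silva", "email": "ana.silva@example.com"},
    {"record_id": "2", "first_name": "Raj", "last_name": "Patel", "email": "raj.patel@example.org"},
    {"record_id": "3", "first_name": "Mei", "last_name": "Chen", "email": "mei.chen@example.net"},
]


def write_flat_file(rows, delimiter=","):
    """Serialize the rows as delimited text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = write_flat_file(ROWS)                  # comma-separated (CSV)
tsv_text = write_flat_file(ROWS, delimiter="\t")  # tab-separated (TSV)
```

Writing either string to a file, for example with open("people.csv", "w"), gives you a flat file that you can drag into the Data panel.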
Step 3. Customizing your data model (optional)
When you onboard your first data asset, Master Data Management automatically generates the data model using a combination of industry standard model attributes and embedded Watson technology. When you upload additional data, the model adjusts itself to accommodate newly populated attributes and fields. You can always customize the model to match your organization’s requirements by adding new record types, attributes, and fields.
- On the Data setup screen, click the Modeling tab.
- Review the current model’s record types and attribute types.
- From here, you can:
- View or edit existing record types or create new ones. By default, the data model includes definitions for Person and Organization record types.
- View or edit existing attribute types or create new ones. You can add or remove fields in each attribute type to reflect your organization’s data model requirements.
- When you are done, click the publish model icon in the action bar at the top of the screen.
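To picture how record types, attribute types, and fields relate to each other, the following is a rough sketch in Python. It is only a mental model: the class names, the address attribute type, and its field names are hypothetical, and the service's internal representation is not described in this tutorial. Only the Person and Organization record types come from the default model mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AttributeType:
    """A reusable attribute definition made up of named fields."""
    name: str
    fields: List[str] = field(default_factory=list)


@dataclass
class RecordType:
    """A kind of record, holding attributes keyed by attribute name."""
    name: str
    attributes: Dict[str, AttributeType] = field(default_factory=dict)


# Hypothetical attribute type with three fields.
address = AttributeType("address", ["line1", "city", "postal_code"])

# The two record types that the default data model includes.
person = RecordType("Person", {"home_address": address})
organization = RecordType("Organization")

# Customizing the model: add a field to an existing attribute type.
address.fields.append("country")
```

In this picture, editing an attribute type changes it for every record type that uses it, which is one reason to review the model before you publish it.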
Step 4. Mapping your data into the data model
Each data source or asset must be mapped and loaded into the data model before it can be used in Master Data Management functions such as matching. Master Data Management includes a powerful automapping capability that removes the need for data engineers to manually map each column of data into the model. The automapping feature detects, analyzes, and categorizes each column of data to the corresponding attributes or fields in the data model. Before you can run automapping, you must profile your data.
- On the Data setup screen, click the Mapping tab.
From the Asset list in the left panel, select the data source that you want to map into the system. The data from the file displays in tabular format with a number of rows and columns. Each column represents an attribute that must be mapped to a corresponding attribute type in the data model. When you first open a data source or asset, each column is marked with a Not Mapped tag.
You can manually map each column if you choose, but you can greatly speed up the mapping process by taking advantage of the automapping feature.
To enable automapping for this asset, you must first profile the data. Click Profile. Profiling analyzes and classifies your data to enable the automapping process to take place. Profiling can take some time to complete, so it runs in the background to allow you to continue working. You might want to start reviewing and manually mapping some columns.
Automapping will never overwrite any manual mapping that you have done.
- When profiling completes, click Auto map. Master Data Management analyzes your data and automatically maps as many columns as possible into the data model. Even if it cannot map a given column, the automap function can suggest some of the most likely mapping selections.
- Review the automapping. If any of the mappings are incorrect, or if a column remains unmapped, then manually map it correctly. Alternatively, if a given column is not required, you can exclude it from your Master Data Management data load.
To manually map a column, select it, then use the Mapping targets panel on the right to search for and select the appropriate attribute or field from the data model. Click Map and save to data model.
Scroll right and left through the columns to ensure that every column in your data source is mapped.
- When you’ve finished mapping the data source, you’re ready to publish the data into the system.
- If your data model is new or has changed, you’ll need to publish your model first by clicking the publish model icon in the action bar. Wait for the publish job to complete.
- To publish your data, click the publish data icon in the action bar. Wait for the publish job to complete.
- Return to the Data setup Overview page by clicking the Data setup page title and selecting Overview from the list.
- On the Overview page, confirm that you have at least one data source or asset added and mapped.
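The idea behind automapping, classifying each column from its values while leaving manual mappings untouched, can be illustrated with a small sketch. This is not the algorithm that Master Data Management uses; the regular expressions, the attribute names, and the function names below are all hypothetical and chosen only to show the principle.

```python
import re

# Hypothetical value patterns, checked in order. A real profiler would use far richer signals.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "postal_code": re.compile(r"^\d{5}$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,}$"),
}


def suggest_mapping(values):
    """Return the first attribute whose pattern matches every non-empty value, else None."""
    samples = [v.strip() for v in values if v.strip()]
    if not samples:
        return None
    for attribute, pattern in PATTERNS.items():
        if all(pattern.match(sample) for sample in samples):
            return attribute
    return None


def auto_map(columns, manual):
    """Map every column that has no manual mapping; manual mappings are never overwritten."""
    mapping = dict(manual)
    for name, values in columns.items():
        if name in mapping:
            continue  # keep the manual choice
        mapping[name] = suggest_mapping(values)  # None means the column stays unmapped
    return mapping


columns = {
    "col_a": ["ana@example.com", "raj@example.org"],
    "col_b": ["+1 555 0100", "555-0199"],
    "col_c": ["10001", "94105"],
    "notes": ["prefers email", ""],
}
manual = {"col_b": "mobile_phone"}  # mapped by hand before automapping ran
result = auto_map(columns, manual)
```

Here result keeps col_b as mobile_phone, maps col_a and col_c from their values, and leaves notes unmapped for manual review or exclusion.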
Step 5. Setting up and running matching
When your data is mapped and published into the Master Data Management system, you can run the matching process on the data. The matching process analyzes your data to determine whether it contains duplicate records. Suspected duplicate records are merged into master data entities to establish a single, trusted, 360-degree view of your customers. Each entity contains one or more records.
Before running matching, ensure that you have loaded, mapped, and published your data sources and data model into the Master Data Management service.
- From the master data home page, click Go to configuration. The master data configuration space opens and displays an overview of your current configuration.
- Click the navigation menu and select Matching setup to open the matching setup page.
- Go to the Match settings tab to select the attributes to use in matching data. The first time you navigate to this tab, the Master Data Management service automatically generates some suggested attributes from your data model to use in matching.
- Review the list of matching attributes. These attributes will be used as the basis of comparison to match records and create master data entities. To add or remove attributes from the list, click Select attributes then select or deselect attributes as needed.
- When you are satisfied with your matching attributes, click the run matching icon in the action bar. The matching process will take a while to complete. It will run in the background so that you can continue working. You’ll be notified when it’s complete.
When matching is complete, go to the Match results tab to see a dashboard of statistics and visualizations to provide insight about your master data.
You can adjust your matching algorithm at any time by editing your matching attributes.
As you add more data sources and assets to your Master Data Management service instance and rerun matching, the new data will be matched both within itself and against the existing data in the system. In this way, you can build a unified, single, 360-degree view of your customers across your entire enterprise.
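As a simplified illustration of how records become entities, the sketch below groups records whose selected matching attributes are equal after basic normalization. It is an assumption-laden stand-in, not the service's matching algorithm, which this tutorial does not describe; the function names, attribute names, and sample records are hypothetical.

```python
def normalize(value):
    """Lowercase the value and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def match_records(records, attributes):
    """Group record IDs into entities when all selected attributes agree after normalization."""
    entities = {}
    for record in records:
        key = tuple(normalize(record.get(attr, "")) for attr in attributes)
        entities.setdefault(key, []).append(record["record_id"])
    return list(entities.values())


source_a = [
    {"record_id": "A1", "name": "Ana Silva", "email": "ana@example.com"},
    {"record_id": "A2", "name": "ana  silva", "email": "ANA@example.com"},
    {"record_id": "A3", "name": "Raj Patel", "email": "raj@example.org"},
]
matching_attributes = ["name", "email"]

# First run: A1 and A2 are suspected duplicates and form one entity.
entities = match_records(source_a, matching_attributes)

# Rerun after adding a second source: new records are compared with existing ones too.
source_b = [{"record_id": "B1", "name": "RAJ PATEL", "email": "raj@example.org"}]
entities_after_rerun = match_records(source_a + source_b, matching_attributes)
```

Changing matching_attributes changes which records are grouped together, which mirrors how editing the matching attributes adjusts the results of a matching run.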
Step 6. Exploring your master data
After a data engineer has configured the Master Data Management service instance, loaded and mapped data, and run matching, a business analyst or data steward user can explore the master data to search, view, and analyze it.
- From the master data home page, click Search master data to open the master data explorer.
- Search your master data to find the entities or records that you want to explore. You can choose whether to search for entities or records, and you can run either a simple text search or an advanced search using rules.
- From your search results list, you can:
- Click a row to see details of the entity or record.
- Use the row’s three-dot menu or Explore icon to select an entity or record for further exploration in the Explore tab. When you send an entity or record to the Explore tab, you can more closely review its details and compare it to any other entities or records in the Explore tab.
- Choose the Explore tab to review and compare the details of any entities or records that you selected for exploration.
- Select any of the entities or records in the Entity explorer panel to view their detailed attributes.
You can also export master data from the master data explorer.