Data governance tutorial: Protect your data

Take this tutorial to learn how to protect your data with the Data governance use case of the data fabric trial. Complete the Curate high quality data tutorial before you start. Your goal is to control access to data across catalogs in the data fabric.

Quick start: If you did not already create the sample project for this tutorial, access the Data governance sample project in the Resource hub.

The story for the tutorial is that Golden Bank has several departments that need access to high-quality customer mortgage data. As a Data Steward on the governance team, you will create data protection rules to protect confidential mortgage data.

The following animated image provides a quick preview of what you will accomplish by the end of this tutorial: you will create data protection rules that define how to deny access to confidential information and how to mask personal information.

Animated image

Preview the tutorial

In this tutorial, you will complete these tasks:

  • Complete the Curate high quality data tutorial
  • Create a data protection rule to deny access to confidential data
  • Create a data protection rule to mask Social Security Numbers

Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.




  • Use the video picture-in-picture

    Tip: Start the video. As you scroll through the tutorial, the video moves to picture-in-picture mode, so you can follow the video while you complete the tasks. For the best experience with picture-in-picture, close the video table of contents. Click the timestamps for each task to follow along.

    The following animated image shows how to use the video picture-in-picture and table of contents features:

    How to use picture-in-picture and chapters

    Get help in the community

    If you need help with this tutorial, you can ask a question or find an answer in the Cloud Pak for Data Community discussion forum.

    Set up your browser windows

    For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.

    Side-by-side tutorial and UI

    Tip: If you encounter a guided tour while completing this tutorial in the user interface, click Maybe later.

  • Complete the Curate high quality data tutorial

    To preview this task, watch the video beginning at 00:43.

    Complete the Curate high quality data tutorial to import and enrich data assets and publish them to a catalog. In addition, to have your masked assets reviewed by other users, you must add category collaborators by completing the steps to set up the prerequisites in the Curate high quality data tutorial.


  • Create a data protection rule to deny access to confidential data

    To preview this task, watch the video beginning at 01:01.

    A data protection rule defines how to control access to data in a data asset. You can mask data, deny access to data, or filter out rows in the data asset. Follow these steps to create a data protection rule to deny access to confidential information for some of the mortgage data assets:

    1. From the Cloud Pak for Data navigation menu, choose Catalogs > View all catalogs.

    2. Open the Mortgage Approval Catalog.

    3. Click the CREDIT_SCORE data asset. Notice that it contains the confidential tag. You will create a rule to deny access to the data.

    4. From the Cloud Pak for Data navigation menu, choose Governance > Rules.

    5. Click Add rule > New data protection rule.

    6. For the Name, copy and paste the following text:

      Confidential Information
      
    7. For the Business definition, copy and paste the following text:

      Rule to prevent unauthorized users from accessing data in data assets that have been tagged as confidential
      
    8. Click Next.

    9. For When does this rule apply?, select the following options.

      1. Select Tag.

      2. Select contains any.

      3. Type the tag name, confidential.

    10. For What does this rule do?, select Deny access to data.

    11. Click Create. This rule now denies access for anyone trying to access the data in data assets that are tagged as confidential. This rule applies in the Catalog Preview, Catalog Download, Data Refinery, and Project Asset preview. The rule doesn’t apply to the person who added the asset to a catalog.

    12. Watch the video at 02:20 to see what other users see when they try to access the CREDIT_SCORE data asset.

    Check your progress

    The following image shows the data protection rule to deny access. This rule takes effect immediately.

    Deny access rule

    The following image shows what the user sees when this rule is in effect. In this case, the user is denied access to the asset.

    Deny access to asset
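
The condition and action of this deny rule can be pictured in a few lines of Python. This is only an illustrative sketch of the logic described in the steps above, not how Cloud Pak for Data implements or exposes rules. The `Asset` class, the `contains_any` and `is_access_denied` functions, and the user names `steward` and `analyst` are all hypothetical, and the case-sensitive tag comparison is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalog data asset, reduced to the fields this sketch inspects."""
    name: str
    tags: set[str] = field(default_factory=set)
    added_by: str = ""  # hypothetical field: user who added the asset to the catalog


def contains_any(asset_tags: set[str], rule_tags: set[str]) -> bool:
    """'contains any' condition: true if the asset carries at least one rule tag."""
    return bool(asset_tags & rule_tags)


def is_access_denied(asset: Asset, user: str, rule_tags: set[str]) -> bool:
    """Return True if the deny rule blocks this user from the asset's data."""
    if user == asset.added_by:
        # The rule does not apply to the person who added the asset to the catalog.
        return False
    return contains_any(asset.tags, rule_tags)


# Hypothetical users; only the asset name and tag come from the tutorial.
credit_score = Asset("CREDIT_SCORE", tags={"confidential"}, added_by="steward")
print(is_access_denied(credit_score, "analyst", {"confidential"}))
print(is_access_denied(credit_score, "steward", {"confidential"}))
```

In this model the rule has no effect on assets that lack the tag, which is why only tagged assets such as CREDIT_SCORE are blocked for other users.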


  • Create a data protection rule to mask Social Security Numbers

    To preview this task, watch the video beginning at 02:28.

    Some of the mortgage data assets include personally identifiable information, which you need to protect, but the rest of the columns contain valuable information that is beneficial to a broader audience. That is where data masking comes in handy. Follow these steps to create a data protection rule that masks columns with a US Social Security Number:

    1. From the Cloud Pak for Data navigation menu, choose Catalogs > View all catalogs.

    2. Click Mortgage Approval Catalog.

    3. In the catalog, click the MORTGAGE_APPLICANTS_TRUST data asset.

    4. Click the Asset tab to preview the data. Notice that one of the columns contains Social Security Numbers.

    5. Return to the Overview tab to see more metadata about the columns. In the list of columns, search for the Social Security Number column to see that this column was auto-assigned the Social Security Number business term. You will create a rule to mask this column.

    6. Click Close to return to the asset preview.

    7. From the Cloud Pak for Data navigation menu, choose Governance > Rules.

    8. Click Add rule > New data protection rule.

    9. For the Name, copy and paste the following text:

      Redact Social Security Number
      
    10. For the Business definition, copy and paste the following text:

      Rule to redact Social Security Number
      
    11. Click Next.

    12. For When does this rule apply?, select the following options:

      1. Select Business term.

      2. Select contains any.

      3. Start typing social, and then select Social Security Number.

    13. For What does this rule do?, select Redact columns. Business term and Social Security Number are filled in for you. This option replaces the data with Xs. You can hover over each masking option to see an example of data masked with that option.

    14. Click Create. This rule redacts columns with US Social Security Numbers in data assets.

    15. Watch the video at 03:49 to see what other users see when they access the MORTGAGE_APPLICANTS data asset.

    Check your progress

    The following image shows the data protection rule to mask data. This rule takes effect immediately.

    Mask data rule

    The following image shows what the user sees when this rule is in effect. In this case, the Social Security Number column is masked using the redact method.

    Masked asset
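
The effect of the redact rule on a table can be sketched in Python as follows. This is a simplified model for illustration, not the product's masking implementation. The function names `redact` and `apply_redact_rule`, the column names `SSN` and `LOAN_AMOUNT`, and the sample values are hypothetical. Replacing each character with one X is also an assumption, so the exact output format in Cloud Pak for Data may differ.

```python
def redact(value: object) -> str:
    """Replace every character of the value with 'X' (assumed masking format)."""
    return "X" * len(str(value))


def apply_redact_rule(rows, column_terms, rule_term):
    """Return new rows in which every column assigned rule_term is redacted."""
    # Columns whose assigned business terms include the rule's term get masked.
    masked = {col for col, terms in column_terms.items() if rule_term in terms}
    return [
        {col: redact(val) if col in masked else val for col, val in row.items()}
        for row in rows
    ]


# Hypothetical column metadata and a made-up sample row.
column_terms = {"SSN": {"Social Security Number"}, "LOAN_AMOUNT": set()}
rows = [{"SSN": "123-45-6789", "LOAN_AMOUNT": 250000}]

for row in apply_redact_rule(rows, column_terms, "Social Security Number"):
    print(row)
```

Because the rule is keyed on the business term rather than on a column name, the same rule would mask any column in any asset that was assigned the Social Security Number term, while the remaining columns stay readable.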


As a Data Steward on the governance team, you learned how to create data protection rules to protect confidential mortgage data.

Next steps

You are now ready to consume your data by evaluating, sharing, shaping, and analyzing data in the data fabric. See the Consume your data tutorial.
