Quick start tutorials
Take quick start tutorials to learn how to perform specific tasks, such as refining data or building a model. Each tutorial focuses on one task or a set of related tasks.
If you want to learn how to implement specific use cases, consider taking the use case tutorials. The use case tutorials help you try out data fabric use cases, such as Data integration, or use cases for building and governing AI, such as Data Science and MLOps.
The quick start tutorials are categorized by task:
- Preparing data
- Analyzing and visualizing data
- Building, deploying, and trusting models
- Curating and governing data
Each tutorial requires one or more service instances, and some services are included in multiple tutorials. You can start with any task. Each tutorial provides a description of the tool, a video, the instructions, and additional learning resources.
The tags for each tutorial describe the level of expertise and the amount of coding required.
Prerequisite
The prerequisite for all tutorials is to sign up for or join a Cloud Pak for Data as a Service account.
After completing these tutorials, see the Other learning resources section to continue your learning.
Preparing data
To get started with preparing, transforming, and integrating data, understand the overall workflow, choose a tutorial, and check out other learning resources for working on the platform.
Your data preparation workflow has these basic steps:
- Create a project.
- If necessary, create the service instance that provides the tool you want to use and associate it with the project.
- Add data to your project. You can add data files from your local system, data from a remote data source that you connect to, data from a catalog, or sample data from the Resource hub.
- Choose a tool to analyze your data. Each of the tutorials describes a tool.
- Run or schedule a job to prepare your data.
Tutorials for preparing data
Each of these tutorials provides a description of the tool, a video, the instructions, and additional learning resources:
Tutorial | Description | Expertise for tutorial |
---|---|---|
Refine and visualize data with Data Refinery | Prepare and visualize tabular data with a graphical flow editor. | Select operations to manipulate data. |
Transform data with DataStage | Design a data integration flow to filter and sort tables with a graphical flow editor. | Drop data and operation nodes on a canvas and select properties. |
Virtualize data | Create a virtualized table by joining two tables. | Select tables and connect the primary key columns. |
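The tutorials in this table perform their operations through graphical tools. For readers who find code easier to follow, the following minimal Python sketch shows the same kinds of operations: filtering rows, sorting rows, and joining two tables on a key column. It uses only the standard library and made-up data. The function names (`filter_rows`, `sort_rows`, `join_on_key`) and the sample tables are assumptions for illustration; they are not part of Data Refinery, DataStage, or the data virtualization service.

```python
# Illustrative only: plain-Python versions of filter, sort, and join.
# None of these names come from Data Refinery, DataStage, or any IBM API.

def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def sort_rows(rows, column, descending=False):
    """Return a new list of rows ordered by the value in one column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


def join_on_key(left, right, key):
    """Inner-join two tables (lists of dicts) on a shared key column.

    Assumes the key is unique in the right table, as a primary key would be.
    """
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Hypothetical sample tables.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cy"},
]
orders = [
    {"customer_id": 1, "total": 250.0},
    {"customer_id": 2, "total": 75.5},
]

joined = join_on_key(customers, orders, "customer_id")      # Cy has no order
large = filter_rows(joined, lambda row: row["total"] >= 100)
ranked = sort_rows(joined, "total", descending=True)
```

In the graphical tools, each of these functions corresponds roughly to one operation or node that you select and configure instead of writing code.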
Analyzing and visualizing data
To get started with analyzing and visualizing data, understand the overall workflow, choose a tutorial, and check out other learning resources for working with other tools.
Your analyzing and visualizing data workflow has these basic steps:
- Create a project.
- If necessary, create the service instance that provides the tool you want to use and associate it with the project.
- Add data to your project. You can add data files from your local system, data from a remote data source that you connect to, data from a catalog, or sample data from the Resource hub.
- Choose a tool to analyze your data. Each of the tutorials describes a tool.
Tutorials for analyzing and visualizing data
Each of these tutorials provides a description of the tool, a video, the instructions, and additional learning resources:
Tutorial | Description | Expertise for tutorial |
---|---|---|
Analyze data in a Jupyter notebook | Load data, run, and share a notebook. | Understand generated Python code. |
Refine and visualize data with Data Refinery | Prepare and visualize tabular data with a graphical flow editor. | Select operations to manipulate data. |
Building, deploying, and trusting models
To get started with building, deploying, and trusting models, understand the overall workflow, choose a tutorial, and check out other learning resources for working on the platform.
The different stages involved in the AI lifecycle are as follows:
- Define scope: Start by defining the scope of your project by identifying the key objectives and requirements.
- Prepare data: Collect and prepare data for use with machine learning algorithms.
- Build model: Develop and refine the AI model to solve the defined problem by training the model with prepared data.
- Deploy model: Deploy the model to production after the building process is complete.
- Automate pipeline: Automate the path to production by automating parts of the AI lifecycle.
- Monitor performance: Evaluate your model's performance for fairness, quality, drift, and explainability.
Diagram: The stages of the AI lifecycle.
Your workflow to build, deploy, and trust models has these basic steps:
- Create a project.
- If necessary, create the service instance that provides the tool you want to use and associate it with the project.
- Choose a tool to build, deploy, and trust models. Each of the tutorials describes a tool.
Tutorials for building, deploying, and trusting models
Each tutorial provides a description of the tool, a video, the instructions, and additional learning resources:
Tutorial | Description | Expertise for tutorial |
---|---|---|
Build and deploy a machine learning model with AutoAI | Automatically build model candidates with the AutoAI tool. | Build, deploy, and test a model without coding. |
Build and deploy a machine learning model in a notebook | Build a model by updating and running a notebook that uses Python code and the Watson Machine Learning APIs. | Build, deploy, and test a scikit-learn model that uses Python code. |
Build and deploy a machine learning model with SPSS Modeler | Build a C5.0 model that uses the SPSS Modeler tool. | Drop data and operation nodes on a canvas and select properties. |
Build and deploy a Decision Optimization model | Automatically build scenarios with the Modeling Assistant. | Solve and explore scenarios, then deploy and test a model without coding. |
Evaluate a machine learning model | Deploy a model, configure monitors for the deployed model, and evaluate the model. | Run a notebook to configure the models and use Watson OpenScale to evaluate. |
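The evaluation tutorial uses Watson OpenScale monitors. The sketch below does not call that service. It only illustrates, with made-up predictions, two kinds of checks that the terms quality and fairness can refer to: accuracy, and a disparate impact ratio (the favorable-outcome rate of a monitored group divided by the rate of a reference group). All function names, group labels, and data are hypothetical.

```python
# Illustrative only: hand-rolled quality and fairness checks.
# These functions are not part of Watson OpenScale or any IBM API.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def favorable_rate(y_pred, groups, group, favorable=1):
    """Share of one group's predictions that are the favorable outcome."""
    selected = [p for p, g in zip(y_pred, groups) if g == group]
    if not selected:
        raise ValueError(f"No records for group {group!r}")
    return sum(1 for p in selected if p == favorable) / len(selected)


def disparate_impact(y_pred, groups, monitored, reference, favorable=1):
    """Favorable rate of the monitored group divided by the reference group's."""
    reference_rate = favorable_rate(y_pred, groups, reference, favorable)
    if reference_rate == 0:
        raise ValueError("Reference group has no favorable outcomes")
    return favorable_rate(y_pred, groups, monitored, favorable) / reference_rate


# Hypothetical labels, predictions, and group membership for eight records.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

model_accuracy = accuracy(y_true, y_pred)
impact_ratio = disparate_impact(y_pred, groups, monitored="b", reference="a")
```

A ratio well below 1 means the monitored group receives the favorable outcome less often than the reference group. The threshold at which that counts as unfair is a policy choice that this sketch does not make.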
Curating and governing data
To get started with curating and governing data, understand the overall workflows, choose a tutorial, and check out other learning resources for working in Cloud Pak for Data as a Service.
If you are working with your own IBM Knowledge Catalog Lite plan, you can create two catalogs with 50 assets, five business terms, and one data protection rule. You are also the owner of the only category for organizing governance artifacts like data classes and business terms.
If you are working in your organization's account with an IBM Knowledge Catalog Standard, Enterprise, or Professional plan, you must have specific roles and permissions for curating data and creating governance artifacts like data classes and business terms.
Your data curation workflow has these basic steps:
- Add data assets to a catalog in one of these ways:
  - Add data assets one at a time in a project and then publish them to a catalog.
  - Add all data assets from a connection in a project by importing metadata, and then publish them to a catalog.
  - Add data assets one at a time from within a catalog.
- Enrich the data assets by assigning governance artifacts, such as business terms.
Your governing data workflow has these basic steps:
- For a data protection rule, specify how to identify the type of data to mask and the masking method. The rule is enforced immediately.
- For all other types of governance artifacts:
  - Create the draft governance artifacts in a category.
  - Publish the governance artifacts.
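To make the idea of a data protection rule more concrete, the following Python sketch models a rule as two parts: the type of data to identify (here, a data class assigned to a column) and the masking method to apply. It is a conceptual illustration only. The `MaskingRule` class, the `apply_rule` function, the two masking methods, and the sample data are all assumptions, and none of them represent the IBM Knowledge Catalog API or its rule engine.

```python
import hashlib
from dataclasses import dataclass


# Illustrative only: a toy model of "identify the data, then mask it".
# Nothing here is an IBM Knowledge Catalog API.
@dataclass(frozen=True)
class MaskingRule:
    data_class: str  # which type of data the rule targets
    method: str      # "redact" or "substitute" (hypothetical method names)


def mask_value(value, method):
    """Mask a single value with the chosen method."""
    text = str(value)
    if method == "redact":
        # Replace every character so that nothing is readable.
        return "X" * len(text)
    if method == "substitute":
        # The same input always yields the same token, which keeps joins usable.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()[:10]
    raise ValueError(f"Unknown masking method: {method}")


def apply_rule(rows, column_classes, rule):
    """Return masked copies of the rows; the original rows are not modified."""
    targets = {
        column for column, data_class in column_classes.items()
        if data_class == rule.data_class
    }
    return [
        {
            column: mask_value(value, rule.method) if column in targets else value
            for column, value in row.items()
        }
        for row in rows
    ]


# Hypothetical data and column classifications.
rows = [
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "Ben", "email": "ben@example.com"},
]
column_classes = {"name": "Person Name", "email": "Email Address"}

rule = MaskingRule(data_class="Email Address", method="redact")
masked = apply_rule(rows, column_classes, rule)
```

Keeping the identification step (`column_classes`) separate from the masking step mirrors the two decisions that the workflow describes: what to mask and how to mask it.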
Tutorials for curating and governing data
Choose a data fabric tutorial in the Data governance use case.
Other learning resources
Guided tutorials
Access the Build an AI model sample project to follow a guided tutorial in the Resource hub. After you create the sample project, the readme provides instructions:
- Choose Explore and prepare data to remove anomalies in the data with Data Refinery.
- Choose Build a model in a notebook to build a model with Python code.
- Choose Build and deploy a model to automate building a model with the AutoAI tool.
Watch this video series to see how to work with the assets in the sample project.
Videos
- A comprehensive set of videos that show many common tasks in Cloud Pak for Data as a Service.
Samples
- Resource hub provides sample notebooks, data sets, and projects that you can import.
- Industry accelerators provide sample projects with end-to-end solutions that solve specific business problems.
- Knowledge Accelerators provide industry-specific sets of ready-to-use governance artifacts.
Training
- Take a data fabric tutorial to try out a data fabric use case, such as AI governance, Data Science and MLOps, Data governance, or Data integration.
- The "Take control of your data with Watson Studio" learning path consists of step-by-step tutorials that explain how to work with data in Watson Studio.