The platform provides collaborative workspaces and tools, and you provide the content to the platform, in the form of assets. An asset is an item that contains information about data, other valuable information, or code that works with
data.
You add assets by importing them or creating them with tools. You work with assets in collaborative workspaces. The workspace that you use depends on your tasks.
Projects: Where you collaborate with others to work with data and create assets. Most tools are in projects, and you run assets that contain code in projects. For example, you can import data, prepare data, analyze data, or create models in projects. See Projects.
Catalogs: Where you store assets to share with your organization or go to find assets that you need to work with. You can copy assets from catalogs into projects, or publish assets from projects into the catalog. You can edit asset properties and metadata in a catalog, but you can't run assets. See Catalogs.
Deployment spaces: Where you deploy and run assets that are ready for testing or production. You move assets from projects into deployment spaces and then create deployments from those assets. You monitor and update deployments as necessary. See Deployment spaces.
You can find any asset in any of the workspaces for which you are a collaborator by searching for it from the global search bar. See Searching for assets across the platform.
You can create many different types of assets, but all assets have some common properties:
To create most types of assets, you must use a specific tool. Most tools are provided by one or more services. The tools to create data assets and connection assets are provided by the platform and do not require any specific services.
To see which services you need for which tools, open the tools and services map.
The following table lists the types of assets that you can create, the tools you need to create them, and the workspaces where you can add them.
Assets accumulate information in properties when you create them, use them, or when they are updated by automated processes. Some properties are provided by users and can be edited by users. Other properties are automatically provided by the
system. Most system-provided properties can't be edited by users.
The Last modified field for an asset tracks both user actions and system actions. System actions often occur in the background and might involve only changes to the asset's internal metadata.
Common properties for assets everywhere
Most types of assets have the properties that are listed in the following table in all the workspaces where those asset types exist.
Common properties for assets
| Property | Description | Editable? |
| --- | --- | --- |
| Name | The asset name. Can contain up to 255 characters. Supports multibyte characters. Cannot be empty, contain Unicode control characters, or contain only blank spaces. Asset names do not need to be unique within a project or deployment space. Whether asset names must be unique in a catalog depends on the duplicate handling method set for the catalog. | Yes |
| Description | Optional. Supports multibyte characters and hyperlinks. | Yes |
| Creation date | The timestamp of when the asset was created or imported. | No |
| Creator or Owner | The username or email address of the person who created or imported the asset. | No |
| Last modified date | The timestamp of when the asset was last modified. | No |
| Last editor | The username or email address of the person who last modified the asset. | No |
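The naming rules for the Name property can be checked before you create or rename an asset. The following Python function is a minimal sketch of such a check, not the platform's own validation. It assumes that the 255-character limit counts Unicode code points and that "Unicode control characters" means characters in the general category Cc; the function name is hypothetical.

```python
import unicodedata

MAX_NAME_LENGTH = 255  # limit stated in the Name property description


def is_valid_asset_name(name: str) -> bool:
    """Check a candidate asset name against the documented naming rules."""
    # The name cannot be empty and can contain up to 255 characters.
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # The name cannot contain only blank spaces.
    if name.strip() == "":
        return False
    # Assumption: control characters are those in Unicode category "Cc".
    return not any(unicodedata.category(ch) == "Cc" for ch in name)
```

Because uniqueness depends on the workspace and, in catalogs, on the duplicate handling method, this sketch deliberately does not check for duplicate names.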
Common properties for assets in catalogs
In addition to the common properties that all assets have, assets in catalogs have the properties and pages that are listed in the following table.
| Property or page | Description | Editable? |
| --- | --- | --- |
| | Set to public by default. This setting can restrict access to an asset in a catalog when set to private. Only the owner and members of the asset can view and use private assets. | Yes |
| Access page | The owner and members of the asset. By default, the asset owner is the user who added the asset to the catalog. The asset members can view and use the asset when it is marked private. See Controlling access to an asset. | Yes |
| Ratings page | Optional. Catalog collaborators can rate and review assets. | Yes |
| Tags | Optional. Text labels that catalog collaborators create to simplify searching. A tag consists of one string of up to 255 characters. It can contain spaces, letters, numbers, underscores, dashes, and the symbols # and @. | Yes |
| Relationships | Optional. Relationships that appear in the Related items section of the asset Overview page are informational and do not have other effects on the asset. Can be between assets in the same workspace or different workspaces. For example, you can add a relationship between an asset in a catalog and an asset in a project. Can be between an asset and an artifact. For example, you can add a relationship between an asset and a policy. Administrators can create custom relationships for assets. See Adding asset relationships. | Yes |
| Governance artifacts | Optional. The business terms and classification that users assigned to the asset. These assignments can affect the asset. For example, an assigned business term can trigger the enforcement of a data protection rule. | Yes |
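The tag rule in the preceding table (one string of up to 255 characters that can contain spaces, letters, numbers, underscores, dashes, and the symbols # and @) maps naturally onto a regular expression. The following Python sketch is illustrative only. It assumes that "letters" and "numbers" include non-ASCII characters (which is what `\w` matches for Python strings) and that a tag needs at least one character; the function name is hypothetical.

```python
import re

# \w covers letters, digits, and underscore; the class adds space, dash, # and @.
# Assumption: a tag has at least 1 and at most 255 characters.
TAG_PATTERN = re.compile(r"[\w \-#@]{1,255}")


def is_valid_tag(tag: str) -> bool:
    """Return True if the whole string fits the documented tag rule."""
    return TAG_PATTERN.fullmatch(tag) is not None
```

For example, `is_valid_tag("#pii_2024")` and `is_valid_tag("team@emea-north")` pass, while `is_valid_tag("bad!tag")` fails because of the exclamation mark.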
You can create custom properties for asset types. Custom properties are shown in the Details section on the asset's Overview tab in the catalog. See Custom properties and relationships.
Some assets are associated with running a tool. For example, an AutoAI experiment asset runs in the AutoAI tool. Assets that run in tools are also known as operational assets. Every time that you run assets in tools, you start a job. You can
monitor and schedule jobs. Jobs use compute resources. Compute resources are measured in capacity unit hours (CUH) and are tracked. Depending on your service plans, you can have a limited amount of CUH per month, or pay for the CUH that
you use every month.
For many assets that run in tools, you have a choice of the compute environment configuration to use. Typically, larger and faster environment configurations consume compute resources faster.
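To get a feel for how environment size affects consumption, you can estimate CUH for a set of job runs. The following Python sketch assumes that the CUH for a run equals an hourly rate for the environment multiplied by the runtime in hours. The environment names and rates are hypothetical placeholders, not published values; check your service plan for the actual rates.

```python
# Hypothetical CUH rates per hour for three environment sizes (not real values).
HYPOTHETICAL_RATES = {"small": 0.5, "medium": 2.0, "large": 8.0}


def cuh_consumed(rate_per_hour: float, runtime_hours: float) -> float:
    """Capacity unit hours for one job run, assuming rate multiplied by runtime."""
    if rate_per_hour < 0 or runtime_hours < 0:
        raise ValueError("rate and runtime must be non-negative")
    return rate_per_hour * runtime_hours


def remaining_cuh(monthly_allowance: float, runs: list[tuple[str, float]]) -> float:
    """Subtract the CUH of each (environment, hours) run from a monthly allowance."""
    used = sum(cuh_consumed(HYPOTHETICAL_RATES[env], hours) for env, hours in runs)
    return monthly_allowance - used
```

With these placeholder rates, a half-hour run on the "large" environment uses as much CUH as eight hours on the "small" one, which is the trade-off to weigh when you choose a configuration.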
In addition to basic properties, most assets that run in tools contain the following types of information in projects:
Properties for assets in projects
| Properties | Description | Editable? | Workspaces |
| --- | --- | --- | --- |
| Environment definition | The environment template, hardware specification, and software specification for running the asset. See Environments. | Yes | Projects, Spaces |
| Settings | Information that defines how the asset is run. Specific to each type of asset. | Yes | Projects |
| Associated data assets | The data that the asset is working on. | Yes | Projects |
| Jobs | Information about how to run the asset, including the environment definition, schedule, and notification options. See Jobs. | Yes | Projects, Spaces |
Data asset types and their properties
Data asset types contain metadata and other information about data, including how to access the data.
How you create a data asset depends on where your data is:
If your data is in a file, you upload the file from your local system to a project, catalog, or deployment space.
If your data is in a remote data source, you first create a connection asset that defines the connection to that data source. Then, you create a data asset by selecting the connection, the path or other structure, and the table
or file that contains the data. This type of data asset is called a connected data asset.
For data sources that support SQL queries, you can also create dynamic views, which are data assets of the type Query. To create such an asset, select the connection and provide an SQL query that retrieves only
the data that you need.
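The three cases above differ in what each data asset needs: a file asset needs nothing else, a connected data asset needs a connection and a path, and a dynamic view needs a connection to a source that supports SQL plus a query. The following Python sketch models that distinction. All class, field, and function names are hypothetical and are not part of any platform API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Connection:
    """Stand-in for a connection asset (hypothetical fields)."""
    name: str
    supports_sql: bool


@dataclass(frozen=True)
class DataAsset:
    name: str
    kind: str  # "file", "connected", or "query"
    connection: Optional[Connection] = None
    path: Optional[str] = None
    query: Optional[str] = None


def from_file(name: str) -> DataAsset:
    # Data from an uploaded file: no connection asset is involved.
    return DataAsset(name=name, kind="file")


def connected(name: str, connection: Connection, path: str) -> DataAsset:
    # Data in a remote source: requires a connection and a path to the table or file.
    return DataAsset(name=name, kind="connected", connection=connection, path=path)


def dynamic_view(name: str, connection: Connection, query: str) -> DataAsset:
    # Dynamic views (type Query) apply only to sources that support SQL queries.
    if not connection.supports_sql:
        raise ValueError("dynamic views need a data source that supports SQL queries")
    return DataAsset(name=name, kind="query", connection=connection, query=query)
```

Keeping the connection as a separate object mirrors the text: one connection asset can back many connected data assets and dynamic views.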
The following graphic illustrates how data assets from files point to uploaded files in Cloud Object Storage. Connected data assets require a connection asset and point to data in a remote data source.
You can create the following types of data assets in a project, catalog, or deployment space:
Data asset from a file: Represents a file that you uploaded from your local system. The file is stored in the object storage container on the IBM Cloud Object Storage instance that is associated with the workspace. The contents
of the file can include structured data, unstructured textual data, images, and other types of data. You can create a data asset with a file of any format. However, you can perform more actions on CSV files than on other file types. See Properties of data assets.
You can create a data asset from a file by uploading a file in a workspace. You can also create data files with tools and convert them to assets. For example, you can create data assets from files with the Data Refinery, Jupyter notebook,
and RStudio tools.
Connected data asset: Represents a table, file, or folder that is accessed through a connection to a remote data source. The connection is defined in the connection asset that is associated with the connected data asset.
You can create a connected data asset for every supported connection. When you access a connected data asset, the data is dynamically retrieved from the data source. See Properties of data assets.
You can import connected data assets from a data source with the connected data tool in a workspace. If you want to import sets of connected data assets, for example an entire database schema, use the metadata import tool in projects. You
can create virtual tables that compile data from multiple data sources with Data Virtualization in the Data virtualization workspace.
In projects, you can create dynamic views that contain filtered data from one or more tables in a data source by using the query data-access tool.
Folder asset: Represents a folder in IBM Cloud Object Storage. A folder data asset is a special case of a connected data asset. You create a folder data asset by specifying the path to the folder and the IBM Cloud Object Storage
connection asset. You can view the files and subfolders that share the path with the folder data asset. The files that you can view within the folder data asset are not themselves data assets. For example, you can create a folder data
asset for a path that contains news feeds that are continuously updated. See Properties of data assets.
You can import folder assets from IBM Cloud Object Storage with the connected data tool in a workspace.
Connection asset: Contains the information necessary to create a connection to a data source. See Properties of connection assets.
You can create connections with the connection tool in a workspace.
Learn more about creating and importing data assets:
Properties of data assets from files and connected data assets
In addition to basic properties and common catalog properties, data assets from files and connected data assets have the properties or pages that are listed in the following table.
Properties of data assets from files and connected data assets
| Property or page | Description | Editable? | Workspaces |
| --- | --- | --- | --- |
| Columns | A summary of the properties of the columns in the data asset. Includes the quality score, description, assigned data classes, and assigned business terms for each column. The assigned data classes and business terms can affect the asset. For example, an assigned business term can trigger the enforcement of a data protection rule. Primary key and key relationship information: • A column that is set as primary key is identified by a key icon. A primary key is also shown in the asset side panel. • If key relationships exist for the asset, you can click the View key relationships link. On the Parent of tab, you see all relationships for the primary key. On the Child of tab, you see all relationships for which the asset contains a foreign key. | No | Catalogs |
| Tags | Optional. Text labels that users create to simplify searching. A tag consists of one string of up to 255 characters. It can contain spaces, letters, numbers, underscores, dashes, and the symbols # and @. | Yes | Projects, Catalogs |
| Format | The MIME type of a file. Automatically detected. | Yes | Projects, Catalogs, Spaces |
| Asset details | Information about the size of the data, the number of columns and rows, and the asset version. In projects, the table type of relational data is also shown. | No | Projects, Catalogs, Spaces |
| Source | Information about the data file in storage or the data source and connection. | No | Catalogs, Spaces |
| Query | SQL query that generates the asset. Dynamic views only. | Yes | Projects |
| Connection details | For connected data assets, the path, the connection name, the type of connector, and the connection owner. For dynamic views, only the connection name and the connector type are shown. | No | Projects |
| Activities pane | The history of actions performed on the asset in all workspaces. See Activities. | No | Projects, Catalogs |
| Preview asset or Asset page | A preview of the data that includes a limited set of columns and rows from the original data source. See Asset contents or previews. | No | Projects, Catalogs, Spaces |
| Profile page | Metadata and statistics about the content of the data. For example, when an enriched asset is published to a catalog, the expanded metadata is also published, and the Display name and Description, which can be AI-generated or edited versions, are shown on this page. This information is also surfaced on the Overview page. See Profile. | Yes | Projects, Catalogs |
| Data quality page | Information about the data quality of an asset and its columns, and the data quality checks that were applied. See Data quality. | | |
| | Charts and graphs that users create to understand the data. See Visualizations. | Yes | Projects |
| Feature group page | Information about which columns in the data asset are used as features in models. See Managing feature groups. | Yes | Projects, Catalogs, Spaces |
Properties of connection assets
The properties of connection assets depend on the data source that you select when you create a connection. See Connection types. Connection assets for most data sources have the properties that
are listed in the following table.
Properties of connection assets
| Properties | Description | Editable? | Workspaces |
| --- | --- | --- | --- |
| Connection details | The information that identifies the data source. For example, the database name, hostname, IP address, port, instance ID, bucket, endpoint URL, and so on. | Yes | Projects, Catalogs, Spaces |
| Credential setting | Whether the credentials are shared across the platform (default) or each user must enter their personal credentials. Not all data sources support personal credentials. | Yes | Projects, Catalogs, Spaces |
| Authentication method | The format of the credentials information. For example, an API key or a username and password. | Yes | Projects, Catalogs, Spaces |
| Credentials | The username and password, API key, or other credentials, as required by the data source and the specified authentication method. | Yes | Projects, Catalogs, Spaces |
| Certificates | Whether the data source port is configured to accept SSL connections and other information about the SSL certificate. | | |
Use this interactive map to learn about the relationships between your tasks, the tools you need, the services that provide the tools, and where you use the tools.
Some tools perform the same tasks but have different features and levels of automation.
Jupyter notebook editor
Prepare data
Visualize data
Build models
Deploy assets
Create a notebook in which you run Python, R, or Scala code to prepare, visualize, and analyze data, or build a model.
AutoAI
Build models
Automatically analyze your tabular data and generate candidate model pipelines customized for your predictive modeling problem.
SPSS Modeler
Prepare data
Visualize data
Build models
Create a visual flow that uses modeling algorithms to prepare data and build and train a model, using a guided approach to machine learning that doesn’t require coding.
Decision Optimization
Build models
Visualize data
Deploy assets
Create and manage scenarios to find the best solution to your optimization problem by comparing different combinations of your model, data, and solutions.
Data Refinery
Prepare data
Visualize data
Create a flow of ordered operations to cleanse and shape data. Visualize data to identify problems and discover insights.
Orchestration Pipelines
Prepare data
Build models
Deploy assets
Automate the model lifecycle, including preparing data, training models, and creating deployments.
RStudio
Prepare data
Build models
Deploy assets
Work with R notebooks and scripts in an integrated development environment.
Federated learning
Build models
Create a federated learning experiment to train a common model on a set of remote data sources. Share training results without sharing data.
Deployments
Deploy assets
Monitor models
Deploy and run your data science and AI solutions in a test or production environment.
Catalogs
Catalog data
Governance
Find and share your data and other assets.
Metadata import
Prepare data
Catalog data
Governance
Import asset metadata from a connection into a project or a catalog.
Metadata enrichment
Prepare data
Catalog data
Governance
Enrich imported asset metadata with business context, data profiling, and quality assessment.
Data quality rules
Prepare data
Governance
Measure and monitor the quality of your data.
Masking flow
Prepare data
Create and run masking flows to prepare copies of data assets that are masked by advanced data protection rules.
Governance
Governance
Create your business vocabulary to enrich assets and rules to protect data.
Data lineage
Governance
Track data movement and usage for transparency and determining data accuracy.
AI factsheet
Governance
Monitor models
Track AI models from request to production.
DataStage flow
Prepare data
Create a flow with a set of connectors and stages to transform and integrate data. Provide enriched and tailored information for your enterprise.
Data virtualization
Prepare data
Create a virtual table to segment or combine data from one or more tables.
OpenScale
Monitor models
Measure outcomes from your AI models and help ensure the fairness, explainability, and compliance of all your models.
Data replication
Prepare data
Replicate data to target systems with low latency, transactional integrity and optimized data capture.
Master data
Prepare data
Consolidate data from the disparate sources that fuel your business and establish a single, trusted, 360-degree view of your customers.
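The tool list above is effectively a lookup table from tools to the tasks they support. As a rough sketch of how you might query it outside the interactive map, the following Python snippet copies the tool-to-task pairs from that list into a dictionary and inverts the lookup. The variable and function names are hypothetical.

```python
# Tool-to-task pairs transcribed from the list above.
TOOL_TASKS = {
    "Jupyter notebook editor": ["Prepare data", "Visualize data", "Build models", "Deploy assets"],
    "AutoAI": ["Build models"],
    "SPSS Modeler": ["Prepare data", "Visualize data", "Build models"],
    "Decision Optimization": ["Build models", "Visualize data", "Deploy assets"],
    "Data Refinery": ["Prepare data", "Visualize data"],
    "Orchestration Pipelines": ["Prepare data", "Build models", "Deploy assets"],
    "RStudio": ["Prepare data", "Build models", "Deploy assets"],
    "Federated learning": ["Build models"],
    "Deployments": ["Deploy assets", "Monitor models"],
    "Catalogs": ["Catalog data", "Governance"],
    "Metadata import": ["Prepare data", "Catalog data", "Governance"],
    "Metadata enrichment": ["Prepare data", "Catalog data", "Governance"],
    "Data quality rules": ["Prepare data", "Governance"],
    "Masking flow": ["Prepare data"],
    "Governance": ["Governance"],
    "Data lineage": ["Governance"],
    "AI factsheet": ["Governance", "Monitor models"],
    "DataStage flow": ["Prepare data"],
    "Data virtualization": ["Prepare data"],
    "OpenScale": ["Monitor models"],
    "Data replication": ["Prepare data"],
    "Master data": ["Prepare data"],
}


def tools_for_task(task: str) -> list[str]:
    """Return the tools that list the given task, sorted alphabetically."""
    return sorted(tool for tool, tasks in TOOL_TASKS.items() if task in tasks)


print(tools_for_task("Monitor models"))
# ['AI factsheet', 'Deployments', 'OpenScale']
```

Several tools cover the same task, so the result is a starting point for comparison; the descriptions above explain how the tools differ in features and level of automation.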
Services you can use
Services add features and tools to the platform.
watsonx.ai Studio
Develop powerful AI solutions with an integrated collaborative studio and industry-standard APIs and SDKs. Formerly known as Watson Studio.
watsonx.ai Runtime
Quickly build, run and manage generative AI and machine learning applications with built-in performance and scalability. Formerly known as Watson Machine Learning.
IBM Knowledge Catalog
Discover, profile, catalog, and share trusted data in your organization.
DataStage
Create ETL and data pipeline services for real-time, micro-batch, and batch data orchestration.
Data Virtualization
View, access, manipulate, and analyze your data without moving it.
Watson OpenScale
Monitor your AI models for bias, fairness, and trust with added transparency on how your AI models make decisions.
Data Replication
Provide efficient change data capture and near real-time data delivery with transactional integrity.
Match360 with Watson
Improve trust in AI pipelines by identifying duplicate records and providing reliable data about your customers, suppliers, or partners.
Manta Data Lineage
Increase data pipeline transparency so you can determine data accuracy throughout your models and systems.
Where you'll work
Collaborative workspaces contain tools for specific tasks.
Project
Where you work with data.
> Projects > View all projects
Catalog
Where you find and share assets.
> Catalogs > View all catalogs
Space
Where you deploy and run assets that are ready for testing or production.
> Deployments
Categories
Where you manage governance artifacts.
> Governance > Categories
Data virtualization
Where you virtualize data.
> Data > Data virtualization
Master data
Where you consolidate data into a 360-degree view.