IBM
Cloud Pak for Data
Overview
What's new
Service plan changes and deprecations
2020 What's new
2019 What's new
2018 What's new
2017 What's new
Overview of IBM Cloud Pak for Data as a Service
Relationships between services
Asset types and properties
Searching for assets
Profiles
Previews
Activities
Feature differences between Cloud Pak for Data deployments
Offering plans
Known issues and limitations
FAQ
Browser support
Language support
APIs for services
Notices
Accessibility
Get help
Services and integrations
IBM Cloud services
Creating services
Watson Studio
Watson Studio plans
Watson Knowledge Catalog
Watson Knowledge Catalog plans
Watson Machine Learning
Watson Machine Learning plans
Analytics Engine
Cloudant
Cloud Object Storage
Cognos Dashboard Embedded
Databases for EDB
Databases for Elasticsearch
Databases for MongoDB
Databases for PostgreSQL
DataStage
DataStage plans
Data Replication
Db2
Db2 Warehouse
Event Streams
IBM Match 360 with Watson (Beta)
Secure Gateway
SQL Query
Watson Assistant
Watson Discovery
Watson Natural Language Understanding
Watson Language Translator
Watson Natural Language Classifier
Watson OpenScale
Watson Personality Insights
Watson Query
Provisioning Watson Query
Watson Query plans
Watson Speech to Text
Watson Text to Speech
Watson Tone Analyzer
Regional availability
Regional limitations
Integrations with other cloud platforms
Integrating with AWS
Integrating with Azure
Integrating with Google
Getting started
Signing up for Cloud Pak for Data as a Service
Signing up for the data fabric trials
Setting up the platform for administrators
Setting up the IBM Cloud account
Adding users to the account
Roles
Setting up Cloud Object Storage
Setting up Watson Studio and Machine Learning
Setting up Watson Knowledge Catalog
User roles and permissions
Creating the catalog for platform connections
Setting up Watson Query
Configuring inbound access rules for firewalls
Configuring inbound firewall access for DataStage
Configuring inbound firewall access to Redshift
Configuring firewall access for a connection to Cloud Object Storage
Getting started with preparing data
Refine data
Transform data
Virtualize data
Getting started with analyzing and visualizing data
Analyze data in a Jupyter notebook
Tell a story with a dashboard
Getting started with building, deploying, and trusting models
Build and deploy a machine learning model with AutoAI
Build and deploy a machine learning model in a Jupyter notebook
Build and deploy a machine learning model with SPSS Modeler
Build, run, and deploy a Decision Optimization model
Getting started with curating and governing data
Curate data
Protect data
Data fabric tutorials
Build and deploy a model tutorial
Test and validate the model tutorial
Integrate data tutorial
Customer 360 preview
Trust your data tutorial
Protect your data tutorial
Know your data tutorial
End-to-end examples: Industry accelerators
Video library
Projects
Creating a project
Object storage
Importing a project
Administering projects
Managing collaborators
Project collaborator roles
Adding associated services
Exporting a project
Managing assets in projects
Downloading data assets
Choosing a tool
Choosing compute resources for tools
Compute options for the notebook editor
Compute options for Data Refinery
Compute options for DataStage
Compute options for RStudio
Compute options for AutoAI
Compute options for Decision Optimization
Managing compute resources
Creating environments
Customizing environments
Examples of customizations
Runtime usage
Creating and managing jobs
Creating jobs in the Notebook editor
Creating jobs in Data Refinery
Creating jobs in DataStage
Creating jobs with Data Privacy
Adding catalog assets to a project
Leaving a project
Markdown cheatsheet
Preparing data
Adding data to a project
Adding very large files to a project
Adding connections to projects
Controlling access to COS buckets
Securing connections
Adding data from a connection
Adding a folder asset from a connection
Connection types
Amazon RDS for MySQL connection
Amazon RDS for Oracle connection
Amazon RDS for PostgreSQL connection
Amazon Redshift connection
Amazon S3 connection
Setting up temporary credentials or a Role ARN for Amazon S3
Analytics Engine HDFS connection
Apache Cassandra connection
Apache Derby connection
Apache HDFS connection
Apache Hive connection
Box connection
Cloudant connection
Cloudera Impala connection
Compose for MySQL connection
Databases for DataStax connection
Databases for MongoDB connection
Databases for PostgreSQL connection
Dropbox connection
Elasticsearch connection
FTP connection
Generic S3 connection
Google BigQuery connection
Google Cloud Pub/Sub connection
Google Cloud Storage connection
Greenplum connection
HTTP connection
IBM Cloud Object Storage connection
IBM Cloud Object Storage (infrastructure) connection
IBM Cognos Analytics connection
IBM Data Virtualization connection
IBM Data Virtualization Manager for z/OS connection
IBM Db2 connection
IBM Db2 Big SQL connection
IBM Db2 Event Store connection
IBM Db2 for i connection
IBM Db2 for z/OS connection
IBM Db2 Hosted connection
IBM Db2 on Cloud connection
IBM Db2 Warehouse connection
IBM Informix connection
IBM MQ connection
IBM Netezza Performance Server connection
IBM Planning Analytics connection
Looker connection
MariaDB connection
Microsoft Azure Blob Storage connection
Microsoft Azure Cosmos DB connection
Microsoft Azure Data Lake Store connection
Microsoft Azure File Storage connection
Microsoft Azure SQL Database connection
Microsoft SQL Server connection
MongoDB connection
MySQL connection
OData connection
ODBC connection
Oracle connection
PostgreSQL connection
Salesforce.com connection
SAP ASE connection
SAP IQ connection
SAP OData connection
Snowflake connection
SQL Query connection
Tableau connection
Teradata connection
Adding platform connections
Refining data
Adding data to Data Refinery
Validating your data
Visualizing your data
Managing Data Refinery flows
GUI operations
Interactive code templates
Supported data sources for Data Refinery
Tutorial: Shape raw data
Curating data
Importing metadata
Enriching your data assets
Reviewing enrichment results
Data quality score
Data quality violations
Term assignment
Publishing enrichment results
Default enrichment settings
DataStage
Adding data to IBM DataStage
Creating a DataStage flow
Downloading and importing a DataStage flow and its dependencies
Creating, scheduling, running, and monitoring jobs
DataStage connectors
Asset browser
Data set
File set
Input tab
Output tab
Lookup file set
Sequential file
Defining data definitions
Making parts of your job design reusable
Local subflows
Subflows
Making jobs adaptable
Creating and using parameters and parameter sets
Inserting parameters and parameter sets as properties
Migrating DataStage jobs
DataStage stages
Aggregator
Fast path
Stage tab
Calculation and recalculation dependent properties
Input tab
Output tab
Bloom Filter
Stage tab
Input tab
Output tab
Change Apply
Example data
Fast path
Change Capture
Fast path
Stage tab
Input tab
Output tab
Checksum
Adding a Checksum column to your data
Properties for Checksum Stage
Mapping output columns
Specifying execution options
Column Export
Fast path
Stage tab
Input tab
Format section
Output tab
Column Generator
Fast path
Stage tab
Input tab
Output tab
Column Import
Examples
Fast path
Stage tab
Input tab
Output tab
Output link format section
Combine Records
Examples
Example 1
Example 2
Fast path
Stage tab
Properties section
Outputs section
Combine keys section
Options section
Advanced section
NLS Locale section
Input tab
Partitioning and collecting data
Output tab
Compare
Fast path
Stage tab
Input tab
Output tab
Compress
Fast path
Stage tab
Input tab
Output tab
Copy
Fast path
Stage tab
Input tab
Output tab
Decode
Fast path
Input tab
Output tab
Difference
Fast path
Stage tab
Input tab
Output tab
Encode
Fast path
Stage tab
Input tab
Output tab
Expand
Fast path
Stage tab
Input tab
Partitioning and collecting data
Output tab
External Filter
Fast path
Stage tab
Input tab
Output tab
Filter
Specifying the filter
Input data columns
Supported Boolean expressions and operators
Order of association
String comparison
Fast path
Stage tab
Input tab
Partitioning and collecting data
Output tab
Funnel
Fast path
Stage tab
Link Ordering section
Input tab
Output tab
Mapping output
Generic
Fast path
Stage tab
Input tab
Output tab
Head
Fast path
Stage tab
Input tab
Output tab
Hierarchical data
Using the Hierarchical Data stage
Adding a Hierarchical Data stage to a DataStage flow
Configuring runtime properties for the Hierarchical Data stage
The assembly
Input step
Output step
Assembly Editor
Opening the Assembly Editor
Mapping data
Working with the mapping table
Determining mapping candidates
Configuring how mapping candidates are determined
XML Composer step
XML Composer validation rules
XML Parser step
XML Parser validation rules
Setting default values for types
JSON transformation
Schema management
Opening the Schema Library Manager
Working with libraries and resources
Creating a JSON schema in the schema library
JSON Parser step
JSON Parser validation rules
JSON Composer step
JSON Composer validation rules
REST web services in DataStage
REST step pages
General
Security
Request
Response
Mappings
Output schema of the REST step
Passing multiple rows from an XML or JSON file
Transformation steps for the Hierarchical Data stage
Aggregate step
H-Pivot step
HJoin step
Order Join step
Regroup step
Sort step
Union step
V-Pivot step
Join
Join versus lookup
Fast path
Stage tab
Input tab
Output tab
Lookup
Lookup versus Join
Fast path
Properties
Stage tab
Input tab
Output tab
Make Subrecord
Examples
Fast path
Stage tab
Properties section
Input section
Output section
Options section
Advanced section
Input tab
Partitioning and collecting data
Output tab
Make Vector
Examples
Example 1
Example 2
Fast path
Stage tab
Properties section
Options section
Advanced section
Input tab
Partitioning and collecting data
Output tab
Merge
Fast path
Stage tab
Input tab
Output tab
Modify
Fast path
Stage tab
Input tab
Output tab
Peek
Fast path
Stage tab
Input tab
Output tab
Pivot Enterprise
Specifying a horizontal pivot operation
Specifying a horizontal pivot operation and mapping output columns
Example of horizontally pivoting data
Specifying a vertical pivot operation
Specifying a vertical pivot operation and mapping output columns
Example of vertically pivoting data
Properties tab
Specifying execution options
Specifying where the stage runs
Specifying partitioning or collecting methods
Specifying a sort operation
Promote Subrecord
Examples
Example 1
Example 2
Fast path
Stage tab
Properties tab
Options section
Advanced tab
Input tab
Partitioning and collecting data
Output tab
Remove Duplicates
Fast path
Stage tab
Input tab
Output tab
Row Generator
Fast path
Stage tab
Output tab
Sample
Fast path
Stage tab
Link Ordering
Input tab
Partitioning and collecting data
Output tab
Mapping
Sort
Fast path
Stage tab
Input tab
Output tab
Split Subrecord
Examples
Fast path
Stage tab
Properties tab
Options section
Advanced tab
Input tab
Partitioning and collecting data
Output tab
Split Vector
Examples
Example 1
Example 2
Fast path
Stage tab
Properties tab
Options section
Advanced tab
Input tab
Partitioning and collecting data
Output tab
Surrogate Key Generator
Creating the key source
Deleting the key source
Updating the state file
Generating surrogate keys
Switch
Example
Fast path
Stage tab
Input tab
Output tab
Tail
Fast path
Stage tab
Input tab
Output tab
Transformer
Basic concepts
Properties
Stage variables
Loop variables
Entering expressions
Loop example: converting a single row to multiple rows
Loop example: multiple repeating values in a single field
Loop example: generating new rows
Loop example: aggregating data
Surrogate Key tab
Link ordering
Advanced
Input tab
Output tab
Runtime column propagation
System variables
Evaluation sequences for transformer expressions, stage variables, and loop variables
Reserved words
Wave Generator
Stage tab
Properties
Input tab
Output tab
Write Range Map
Fast path
Stage tab
Input tab
QualityStage stages
Investigate
Stage tab
Input tab
Output tab
Standardize
Fast path
Partitioning and collecting data
SQL Properties
Using Before/After SQL Statements
Before SQL
Before SQL (node)
After SQL
After SQL (node)
Sharing DataStage artifacts with all IBM Cloud Object Storage containers
High availability and disaster recovery
DataStage command-line tools
Virtualizing data
Connecting to data sources
Supported data sources
Connecting to Amazon S3
Connecting to Ceph
Connecting to Cloud Object Storage
Connecting to Google BigQuery
Connecting to Snowflake
Status of data sources
Creating virtual objects
Creating a virtualized table from a single data source
Creating a virtualized table from multiple data sources
Creating a virtualized table from files in Cloud Object Storage
Creating schemas for virtual objects
Joining virtual objects
Managing access to virtual objects
Managing access to virtual objects for user roles
Revoking access to virtual objects for user roles
Managing visibility of virtual objects
Monitoring data access
Governing virtual data
Publishing virtual data to the catalog
Virtualizing data with business terms
Enabling strict mode
Enforcing business terms to virtualize data
Governing virtual data with data protection rules
Enabling enforcement of data protection rules
Masking virtual data
Managing caches and queries
Adding data caches
Cache recommendations
Configuring cache recommendations
Finding cache recommendations
Viewing query history
Restrictions for caching
Improving query performance
Collecting statistics
Enabling caching
Monitoring and exploring the service
Monitoring integrated databases
Exploring integrated databases
Administering users and roles
Connecting to the service
Managing roles for users
Assigning roles to users
Modifying user roles
Scaling the service
Scaling your service
SQL interface
Running SQL
Watson Query procedures
removeCosConn stored procedure
removeRdbcX stored procedure
setCosConn stored procedure
setRdbcX stored procedure
setRdbcX stored procedure
Watson Query views
LISTNODES view
LISTRDBC view
Limitations and known issues
Masking data with Data Privacy
Creating masking flows
Running masking flow jobs
Managing job performance
Managing master data (beta)
Creating a master data configuration asset
Giving users access to IBM Match 360
Entities and records in IBM Match 360
IBM Match 360 matching algorithms
Configuring master data
Customizing your data model
Adding data and mapping it to your data model
Adding master data from InfoSphere MDM
Matching your data to create master data entities
Customizing and strengthening your matching algorithm
Managing IBM Match 360 jobs
Exploring master data
Defining the way records and attributes are displayed
Exploring master data entities and records
Adding and editing individual records
Exporting master data
APIs available in IBM Match 360
Tutorial: Onboarding and matching data
Data replication (beta)
Tutorial: Get started with IBM Data Replication
Creating a Data Replication service instance
Configuring a project for Data Replication
Replicating data
Goal - copying a schema
Supported Data Replication connections
Replicating Amazon RDS for PostgreSQL data
Replicating Db2 on Cloud data
Replicating IBM Db2 Warehouse data
Monitoring Data Replication
Securing your data in Data Replication
High availability and disaster recovery
Analyzing data and building models
Notebooks
Creating notebooks
Parts of a notebook
Jupyter kernels and notebook environments
Libraries and scripts
Installing custom libraries
Geospatio-temporal library
Data skipping library
Parquet encryption
Key management by application
Key management by KMS
Time series library
Using the time series library
Time series key functionality
Time series functions
Time series lazy evaluation
Time reference system
Importing scripts into a notebook
Coding and running notebooks
Loading and accessing data in a notebook
Support for loading data in a notebook
Accessing data in an AWS S3 bucket
Manually adding the project access token
Using project-lib for Python
Using Python functions to work with Cloud Object Storage
Using project-lib for R
Watson Natural Language Processing library (beta)
Library block catalog
Syntax analysis
Noun phrase extraction
Keyword extraction and ranking
Entity extraction
Sentiment extraction
Tone classification
Emotion classification
Creating your own models
Detecting entities with a custom dictionary
Detecting entities with regular expressions
Classifying text with a custom classification model
SPSS predictive analytics algorithms
Data preparation
Classification and regression
Clustering
Forecasting
Survival analysis
Score
Sharing and publishing notebooks
Sharing notebooks
Hiding code in a notebook
Publishing notebooks on GitHub
Publishing notebooks as a gist
RStudio
Using Spark in RStudio
Cognos Dashboards
AutoAI
AutoAI tutorial
Building an experiment with one data source
Build an experiment from sample data
Build a text analysis experiment
Building an experiment with joined data
Tutorial: Build and deploy a data join model
Tutorial: Build a multiclass data join model
Join feature engineering details
Building a time series experiment
Training a univariate time series experiment
Training a multivariate time series experiment
Time series experiment implementation details
Scoring a time series model
Saving an AutoAI generated notebook
Using autoai-lib for Python
Selecting an AutoAI model
AutoAI implementation details
Data imputation in AutoAI
Imputation details for time series experiments
Evaluating AutoAI experiments for fairness
AutoAI feature comparison
Troubleshooting AutoAI experiments
Federated learning
Federated Learning tutorial and samples
Federated Learning TensorFlow 2 tutorial
Federated Learning TensorFlow 2 samples
Federated Learning XGBoost tutorial
Federated Learning XGBoost samples
Creating the Federated Learning Experiment
Choosing your framework, fusion method, and hyperparameters for Federated Learning
Additional details for Federated Learning implementation
Decision Optimization
Ways to use Decision Optimization
Sample models and notebooks
Decision Optimization notebooks
Decision Optimization experiments
Views and scenarios
Visualization view
Modeling Assistant models
Selecting a Decision domain in the Modeling Assistant
Formulating and running a model: house construction scheduling
Adding multi-concept constraints and custom decisions: shift assignment
Creating advanced custom constraints with Python
Python DOcplex models
Input and output data
Solving and analyzing a model: the diet problem
Create new scenario
Working with multiple scenarios
Generating multiple scenarios
OPL models
Run parameters and Environment
SPSS Modeler
Tutorials
Introduction to modeling
Building the flow
Browsing the model
Evaluating the model
Scoring records
Summary
Automated modeling for a flag target
Historical data
Building the flow
Generating and comparing models
Summary
Automated modeling for a continuous target
Training data
Building the flow
Comparing the models
Summary
Automated data preparation
Building the flow
Comparing the models
Drug treatment - exploratory graphs
Reading in text data
Creating a distribution chart
Creating a scatterplot
Creating a web chart
Creating advanced visualizations
Deriving a new field
Building a model
Browsing the model
Using an Analysis node
Screening predictors
Building the flow
Building the models
Reducing input data string length
Reclassifying the data
Classifying telecommunications customers
Building the flow
Browsing the model
Telecommunications churn
Building the flow
Browsing the model
Forecasting bandwidth utilization
Forecasting with the Time Series node
Creating the flow
Examining the data
Defining the dates
Defining the targets
Setting the time intervals
Creating the model
Examining the model
Summary
Forecasting catalog sales
Creating the flow
Examining the data
Exponential smoothing
ARIMA
Summary
Making offers to customers (self-learning)
Building the flow
Browsing the model
Retail sales promotion
Examining the data
Learning and testing
Condition monitoring
Examining the data
Data preparation
Learning
Testing
Hotel satisfaction example for Text Analytics
Text Mining node
Using the Text Analytics Workbench
Building and deploying the model
Text Link Analysis node
Nodes palette
Import
Data Asset node
User Input node
Sim Gen node
Extension Import node
Record Operations
Select node
Sample node
Sort node
Balance node
Distinct node
Aggregate node
Merge node
Append node
Streaming Time Series node
SMOTE node
RFM Aggregate node
Space-Time-Boxes node
Extension Transform node
Streaming TCM node
CPLEX Optimization node
Field Operations
Auto Data Prep node
Type node
Viewing and setting information about types
Measurement levels
Geospatial measurement sublevels
Converting continuous data
What is instantiation?
Data values
Setting options for values
Specifying values and labels for continuous data
Specifying values and labels for nominal and ordinal data
Specifying values for a flag
Specifying values for collection data
Specifying values for geospatial data
Defining missing values
Checking type values
Setting the field role
Setting field format options
Filter node
Derive node
Filler node
Reclassify node
Binning node
RFM Analysis node
Ensemble node
Partition node
Set to Flag node
Restructure node
Transpose node
Field Reorder node
History node
Time Intervals node
Anonymize node
Reproject node
Graphs
Charts node
Plot node
Multiplot node
Time Plot node
Distribution node
Histogram node
Collection node
Web node
Evaluation node
Modeling
Auto Classifier node
Continuous machine learning
Auto Numeric node
Auto Cluster node
TCM node
Bayes Net node
C5.0 node
C&R Tree node
CHAID node
QUEST node
Tree-AS node
Random Trees node
Random Forest node
Decision List node
Time Series node
GenLin node
GLMM node
GLE node
Linear node
Linear-AS node
Regression node
LSVM node
Logistic node
Neural Net node
KNN node
Cox node
PCA/Factor node
SVM node
Feature Selection node
Discriminant node
SLRM node
Spatio-Temporal Prediction (STP) node
Spatio-Temporal Prediction (STP) model nugget
Association Rules node
Apriori node
CARMA node
Sequence node
Kohonen node
Anomaly node
K-Means node
TwoStep cluster node
TwoStep-AS cluster node
Isotonic-AS node
XGBoost-AS node
K-Means-AS node
XGBoost Tree node
XGBoost Linear node
Gaussian Mixture node
KDE node
One-Class SVM node
MultiLayerPerceptron-AS node
HDBSCAN node
Extension Model node
Extension model nugget
Text Analytics
About text mining
How extraction works
How categorization works
Language Identifier node
Text Link Analysis node
Expert options
TLA node output
Text Mining node
Text Mining model nuggets
Text Analytics Workbench
The Concepts tab
The Text links tab
The Categories tab
The Resource editor tab
Setting options
Advanced linguistic settings
Advanced frequency settings
Generating a model nugget
Outputs
Table node
Matrix node
Analysis node
Data Audit node
Transform node
Statistics node
Means node
Report node
Set Globals node
Sim Fit node
Sim Eval node
KDE Simulation node
Extension Output node
Export
Data Asset Export node
Extension Export node
Extension nodes
R scripts
Python for Spark scripts
Scripting with Python for Spark
Data metadata
Date, time, timestamp
Exceptions
Examples
Reference information
Tips and shortcuts
Scripting and automation
Scripting overview
Types of scripts
Flow scripts
Flow script example: Training a neural net
Jython code size limits
Running and interrupting scripts
Scripting differences
The scripting language
Python and Jython
Python scripting
Operations
Lists
Strings
Remarks
Statement syntax
Identifiers
Blocks of code
Passing arguments to a script
Examples
Mathematical methods
Using non-ASCII characters
Object-oriented programming
Defining a class
Creating a class instance
Adding attributes to a class instance
Defining class attributes and methods
Hidden variables
Inheritance
Scripting in SPSS Modeler
Flows, SuperNode streams, and diagrams
Flows
SuperNode flows
Diagrams
Running a flow
The scripting context
Referencing existing nodes
Finding nodes
Setting properties
Creating nodes and modifying flows
Creating nodes
Linking and unlinking nodes
Importing, replacing, and deleting nodes
Traversing through nodes in a flow
Getting information about nodes
The scripting API
Example: Searching for nodes using a custom filter
Metadata: Information about data
Parameters
Global values
Scripting tips
Looping through nodes
Accessing flow run results
Table content model
XML Content Model
JSON Content Model
Column statistics content model and pairwise statistics content model
Properties reference overview
Syntax for properties
Structured properties
Abbreviations
Node and flow property examples
Node properties overview
Common node properties
Flow properties
Data Asset Import node properties
Import node common properties
dataassetimport properties
extensionimportnode properties
simgennode properties
userinputnode properties
Record Operations node properties
appendnode properties
aggregatenode properties
balancenode properties
cplexoptnode properties
derive_stbnode properties
distinctnode properties
extensionprocessnode properties
mergenode properties
rfmaggregatenode properties
samplenode properties
selectnode properties
sortnode properties
streamingtimeseries properties
Field Operations node properties
anonymizenode properties
autodataprepnode properties
astimeintervalsnode properties
binningnode properties
derivenode properties
ensemblenode properties
fillernode properties
filternode properties
historynode properties
partitionnode properties
reclassifynode properties
reordernode properties
reprojectnode properties
restructurenode properties
rfmanalysisnode properties
settoflagnode properties
transposenode properties
typenode properties
Modeling node properties
Common modeling node properties
anomalydetectionnode properties
apriorinode properties
associationrulesnode properties
autoclassifiernode properties
Setting algorithm properties
autoclusternode properties
autonumericnode properties
bayesnetnode properties
c50node properties
carmanode properties
cartnode properties
chaidnode properties
coxregnode properties
decisionlistnode properties
discriminantnode properties
extensionmodelnode properties
factornode properties
featureselectionnode properties
genlinnode properties
glmmnode properties
gle properties
kmeansnode properties
kmeansasnode properties
knnnode properties
kohonennode properties
linearnode properties
linearasnode properties
logregnode properties
lsvmnode properties
neuralnetworknode properties
questnode properties
randomtrees properties
regressionnode properties
sequencenode properties
slrmnode properties
stpnode properties
svmnode properties
tcmnode properties
ts properties
treeas properties
twostepnode properties
twostepAS properties
Model nugget node properties
applyanomalydetectionnode properties
applyapriorinode properties
applyassociationrulesnode properties
applyautoclassifiernode properties
applyautoclusternode properties
applyautonumericnode properties
applybayesnetnode properties
applyc50node properties
applycarmanode properties
applycartnode properties
applychaidnode properties
applycoxregnode properties
applydecisionlistnode properties
applydiscriminantnode properties
applyextension properties
applyfactornode properties
applyfeatureselectionnode properties
applygeneralizedlinearnode properties
applyglmmnode properties
applygle properties
applygmm properties
applykmeansnode properties
applyknnnode properties
applykohonennode properties
applylinearnode properties
applylinearasnode properties
applylogregnode properties
applylsvmnode properties
applyneuralnetworknode properties
applyocsvmnode properties
applyquestnode properties
applyrandomtrees properties
applyregressionnode properties
applyselflearningnode properties
applysequencenode properties
applysvmnode properties
applystpnode properties
applytcmnode properties
applyts properties
applytreeas properties
applytwostepnode properties
applytwostepAS properties
applyxgboosttreenode properties
applyxgboostlinearnode properties
hdbscannugget properties
kdeapply properties
Graph node properties
collectionnode properties
distributionnode properties
dvcharts properties
evaluationnode properties
histogramnode properties
multiplotnode properties
plotnode properties
timeplotnode properties
webnode properties
Output node properties
analysisnode properties
dataauditnode properties
extensionoutputnode properties
kdeexport properties
matrixnode properties
meansnode properties
reportnode properties
setglobalsnode properties
simfitnode properties
statisticsnode properties
tablenode properties
transformnode properties
Export node properties
dataassetexport properties
extensionexportnode properties
Python node properties
gmm properties
hdbscannode properties
kdemodel properties
kdeexport properties
ocsvmnode properties
rfnode properties
smotenode properties
xgboostlinearnode properties
xgboosttreenode properties
Spark node properties
isotonicasnode properties
kmeansasnode properties
multilayerperceptronnode properties
xgboostasnode properties
SuperNode properties
CLEM (legacy) language reference
Building CLEM (legacy) expressions
About CLEM
CLEM examples
Values and data types
Expressions and conditions
Working with strings
Handling blanks and missing values
Working with numbers
Working with times and dates
Summarizing multiple fields
Working with multiple-response data
The Expression Builder
Accessing the Expression Builder
Creating expressions
Selecting functions
Database functions
Selecting fields
Viewing or selecting values
Checking CLEM expressions
Find
CLEM datatypes
Integers
Reals
Characters
Strings
Lists
Fields
Dates
Time
CLEM operators
Functions reference
Conventions in function descriptions
Information functions
Conversion functions
Comparison functions
Logical functions
Numeric functions
Trigonometric functions
Probability functions
Spatial functions
Bitwise integer operations
Random functions
String functions
SoundEx functions
Date and time functions
Converting date and time values
Sequence functions
Global functions
Functions handling blanks and null values
Special fields
SPSS algorithms
Working with your data
Missing data values
Handling missing values
Handling records with missing values
Handling fields with missing values
Handling records with system missing values
Functions available for missing values
Temporary storage
Supported data sources for SPSS Modeler
Adding comments and annotations
Deploying models
Flow scripting
Flow scripting example
Flow and SuperNode parameters
SQL optimization
How does SQL pushback work?
Tips for maximizing SQL pushback
Nodes supporting SQL pushback
CLEM expressions and operators supporting SQL pushback
Generating SQL from model nuggets
Disabling or caching nodes in a flow
Disabling nodes in a flow
Caching options for nodes
Importing an SPSS Modeler stream
Setting properties for flows
Expression Builder
Selecting functions
Deploying and managing models
Machine Learning essentials
Creating a service instance
Authentication
Service endpoints
Managing frameworks and software specifications
Compute options for model training and scoring
Supported deployment frameworks
Software specifications and hardware specifications for deployments
Requirements for using custom components in models
Creating a custom software specification in a project
Managing outdated software specifications or frameworks
Deploying assets
Deployment spaces
Deployment spaces dashboard
Collaborator permissions for spaces
Creating deployment spaces
Adding data assets to a deployment space
Promoting assets to a deployment space
Importing models to a deployment space
Exporting deployment spaces
Deleting deployment spaces
Creating an online deployment
Creating a batch deployment
Data sources for scoring batch deployments
Batch deployment input details by framework
Batch deployment input details for AutoAI models
Batch deployment input details for Decision Optimization models
Batch deployment input details for Python functions
Batch deployment input details for PyTorch models
Batch deployment input details for scikit-learn and XGBoost models
Batch deployment input details for Spark models
Batch deployment input details for SPSS models
Batch deployment input details for TensorFlow models
Deploying an SPSS model with multiple inputs
Deploying Python functions
Writing deployable Python functions
Creating a deployment job
Getting the deployment endpoint URL
Managing deployments
Managing deployment jobs
Updating a deployment
Scaling a deployment
Deleting a deployment
Training and deploying machine learning models in notebooks
Watson Machine Learning Python client example notebooks
Watson Machine Learning REST API example notebooks
Managing AI Lifecycle with ModelOps
ModelOps use case
Model management and activity tracking
Watson Studio Pipelines
Getting started with Pipelines (beta)
Exploring the built-in sample pipeline (beta)
Exploring the Gallery sample pipeline (beta)
Creating a pipeline (beta)
Configuring pipeline components (beta)
Configuring global objects (beta)
Programming a pipeline (beta)
Pipeline limits and troubleshooting (beta)
Decision Optimization
Deploying a model using the user interface
Deployment steps
Model deployment
Model execution
Model input and output data file formats
Model input and output data adaptation
Output data definition
Solve parameters
Running jobs
REST API example
Changing Python version in a deployed model with REST API
Python client examples
Delegating the CPLEX engine solve to Watson Machine Learning
Migrating
Migrating from COS or Db2 to a connection asset
Migrating from Watson Machine Learning API V4 Beta
Migrating Python code for Decision Optimization with Machine Learning-v2 instances
Migrating from Decision Optimization on Cloud (DOcplexcloud)
Watson OpenScale
Provisioning and launching Watson OpenScale
Setup options for Watson OpenScale
Getting started with auto setup
Getting started with manual setup
Installing a Python module to set up Watson OpenScale
FAQs
APIs, SDKs, and tutorials
Metrics computation using Python SDK
Python notebook advanced tutorial
Updating notebooks from V1 to V2 Python SDK
Supported machine learning engines, frameworks, and models
IBM Watson Machine Learning
Microsoft Azure ML Studio frameworks
Microsoft Azure ML Service frameworks
Amazon SageMaker frameworks
Custom ML frameworks
Integrating 3rd-party ML engines with Watson OpenScale
Configuring Watson OpenScale
Creating credentials for Watson OpenScale
Selecting deployments to monitor
Specifying a database
Payload logging for non-IBM Watson Machine Learning service instances
Sending a scoring request
Payload and feedback logging
Automating payload logging
Understanding how de-biasing works
Indirect bias
Configure asset deployments using JSON configuration files
Formatting and uploading feedback data
Formatting and uploading training data
Defining the input and output schema by using the Python Client or REST API
Integrating Watson OpenScale with Watson Assistant
Upgrading Watson OpenScale from a lite to a paid plan
Deleting the Watson OpenScale service instance and data
Setting up alerts
Prepare models for monitoring
Configuring the quality monitor
Configuring the fairness monitor
Configuring the explainability monitor
Configuring the drift detection monitor
Configuring the endpoint monitor
Working with unstructured text models
Creating custom monitors and metrics
Get model insights
Viewing data for a deployment
Visualizing data for a specific hour
Debiasing options
Explaining transactions
Model risk management and model governance
Configure model risk management and model governance
Configure Watson OpenScale for model risk management
Set up model governance with IBM OpenPages MRG
Manage model risk
End-to-end model governance tutorial
Fairness metrics overview
Disparate impact
Quality metrics overview
Area under ROC
Area under PR
Accuracy
True positive rate (TPR)
False positive rate (FPR)
Recall
Precision
F1-Measure
Logarithmic loss
Proportion explained variance
Mean absolute error
Mean squared error
R squared
Root of mean squared error
Weighted true positive rate
Weighted false positive rate
Weighted recall
Weighted precision
Weighted F1-Measure
Drift detection overview
Drop in accuracy
Drop in data consistency
Performance metrics overview
Throughput
Analyzing the scoring payload
Predictions by confidence
Chart builder
Known issues and limitations
High availability and disaster recovery
Information security
Watson OpenScale Identity and Access Management
Configuring Identity and Access Management
Securing your connection to Watson OpenScale
Securing your data in Watson OpenScale
Catalogs
Administering a catalog
Creating a catalog
Duplicate asset handling
Managing access to a catalog
Catalog collaborator permissions
Changing catalog settings
Deleting a catalog
Setting up reporting for Watson Knowledge Catalog
Description of Db2 SQL tables used for Watson Knowledge Catalog reporting
Description of Postgres SQL tables used for Watson Knowledge Catalog reporting
Catalog assets
Finding and viewing an asset in a catalog
Adding assets to a catalog
Adding a file
Adding a connection
Adding data from a connection
Adding a folder asset from a connection
Publishing an asset from a project
Adding COBOL copybook assets
Downloading data assets
Editing asset properties
Adding asset relationships
Controlling access to an asset
Profiling an asset
Removing an asset
Governance
Governance artifacts
Preparing for governance
Finding and viewing governance artifacts
Tags
Governance artifact properties
Managing governance artifacts
Importing governance artifacts
Exporting governance artifacts
Workflows for governance artifacts
Managing workflows for governance artifacts
Categories
Predefined categories
Designing categories
Managing categories
Managing category collaborators
Category collaborator roles
Importing or exporting categories
Policies
Managing policies
Authoring policies
Governance rules
Managing governance rules
Authoring governance rules
Data location rules (experimental)
Designing data location rules
Data location rules enforcement
Managing data location rules
Data protection rules
Designing data protection rules
Data protection rules enforcement
Managing data protection rules
Advanced data masking
Redacting data method
Obfuscating data method
Preserve format method
Identifier masking method
Business terms
Managing business terms
Authoring business terms
Classifications
Managing classifications
Authoring classifications
Predefined classifications
Data classes
Managing data classes
Adding matching methods to data classes
Predefined data classes
Predefined data classes details
Reference data
Creating reference data sets
Importing files for reference data sets
Relationships between reference data sets
Predefined reference data sets
Knowledge Accelerators
Notices
IBM Knowledge Accelerator for Energy and Utilities
IBM Knowledge Accelerator for Financial Services
IBM Knowledge Accelerator for Healthcare
IBM Knowledge Accelerator for Insurance
Components of Knowledge Accelerators
Use of governance artifacts in Knowledge Accelerators
Category areas and subcategories
Business terms
Relationships
Custom attributes
Classifications
Business Core Vocabulary
Core area categories
Concept terms
Property terms
Relationship terms
Business Performance Indicators
Performance area categories
Performance analysis terms
Measures
Industry Alignment Vocabularies
Alignment area categories
Alignment topic categories
Alignment terms
Business Scopes
Scope area categories
Reference data sets
Getting started with Knowledge Accelerators
Viewing available Knowledge Accelerators items
Knowledge Accelerator for Energy and Utilities items
Knowledge Accelerator for Financial Services items
Knowledge Accelerator for Healthcare items
Knowledge Accelerator for Insurance items
Using Knowledge Accelerators
Customizing Knowledge Accelerators inline
Creating a separate client-specific category structure
Creating data classes based on reference data sets
Model inventory
Tracking models in an inventory
Registering models for tracking
Registering external models for tracking
Viewing facts for tracked models
Customizing facts for tracked models
Troubleshooting
Watson Knowledge Catalog
Data Refinery
DataStage
Watson Machine Learning
IBM Match 360 with Watson
IBM Cloud Status
IBM Cloud Object Storage for projects
Watson OpenScale
Watson Query
Troubleshooting general issues
Cannot grant users access to a view
Troubleshooting virtualization issues
Error message when you try to use an unsupported file format in Cloud Object Storage
Speed up loading of tables when you virtualize
Reveal hidden tables when you virtualize
Listing of virtual objects is slow
Viewing of objects in your cart is slow
Troubleshooting governance issues
Access to a table is denied by policies
Cannot access assets in the catalog
Cannot access assets with masked data
Cannot enforce policies and data protection rules
Publishing data to the catalog fails
Troubleshooting data source connections
Cannot push down string functions with string units on Db2 remote data source
Snowflake connection times out
Cannot connect to data source
Cannot push down join views
Errors when you delete a connection
Troubleshooting data caches and queries
Cannot delete schemas or virtual objects
Cannot see updated information in the cache dashboard
Errors when finding cache recommendations
Troubleshooting queries
SQL messages
Queries on virtualized flat files fail with incorrect results
View-related actions are not available
Error SQL20478 when you run a query
Error SQL0727N when you query view results
Error SQL1822N when you run a query
Concurrent queries are slow or fail
Incorrect query results for Db2 remote data sources
Data type STRING in Hive tables is assigned CLOB data type
Data types STRING, TEXT, and VARCHAR in Snowflake tables are assigned CLOB data type
Performance issues in queries with subqueries
SUM() or AVG() function returns an error
Table statistics are not collected
Administration
Managing Cloud Pak for Data as a Service
Monitoring account resource usage
Managing authorized users for Watson Studio
Managing account settings
Upgrading your account and services
Activating the Hybrid Subscription Advantage
Stop using Cloud Pak for Data as a Service
Security
Network security
Enterprise security
Account security
Data security
Coll