Watson Machine Learning key terms

Here are key terms related to IBM Watson Machine Learning and IBM Watson Studio.

Table 1. Key terms
Term: Description and related terms

IBM Cloud

The IBM cloud computing platform: online infrastructure that enables you to use software as a service without installing it on your computer. See: IBM Cloud platform overview

Related terms:

  • IBM Cloud account - You need this account to access services in IBM Cloud.
  • IBMid - What you use to log in to IBM Cloud; the user ID for your IBM Cloud account.
  • IBM Cloud catalog - There are 140+ services on IBM Cloud. You can browse and purchase services from the catalog.
  • IBM Cloud dashboard - A web GUI to manage your IBM Cloud service instances.

IBM Cloud service

In the IBM Watson Machine Learning and Watson Studio documentation, the term service refers to a service in IBM Cloud, such as IBM Watson Machine Learning or Cloud Object Storage.

Related terms:

  • Service instance - When you purchase a service on IBM Cloud, an instance of that service is provisioned for you to use. When you don't need to use a service anymore, you can delete your instance of that service.
  • Service credentials - Not the same thing as an IBMid. Service credentials are authentication credentials associated with a service, not with a specific user. Apps and other tools (like the IBM Cloud command line) use service credentials to interact with a service. Sometimes you must take action (in the IBM Cloud dashboard, for example) to generate new credentials for one of your services.
  • Region - IBM Cloud has servers all over the world. A region is a geographic area where your services are hosted. You can find out which region your service instance is in by looking at your IBM Cloud dashboard. See also: IBM Cloud regions
  • API endpoint - When you log in to IBM Cloud using the command line, you specify which region you will be working in by selecting an API endpoint (sometimes referred to as the endpoint URL). See also: IBM Cloud regions and endpoints

Command line interface (CLI)

A mechanism for working with IBM Cloud services from a command prompt on your local computer.

Related terms:

  • IBM Cloud CLI - Enables you to create, manage, and use IBM Cloud services from a command line on your computer.
  • IBM Watson Machine Learning commands - A collection of IBM Cloud CLI commands that are specific to using the IBM Watson Machine Learning service. Implemented in an IBM Cloud CLI plugin named "machine-learning".
  • Environment variables - The Watson Machine Learning commands read information from these environment variables on your computer: ML_ENV, ML_USERNAME, ML_PASSWORD, and ML_INSTANCE. You need to set these environment variables before using Watson Machine Learning commands with a particular Watson Machine Learning service instance.
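
As a minimal sketch, a short Python check can confirm that the four variables are set before you run Watson Machine Learning commands. The helper name missing_ml_variables is hypothetical and is not part of the IBM Cloud CLI; only the variable names come from the list above.

```python
import os

# Environment variable names read by the Watson Machine Learning commands.
REQUIRED_VARIABLES = ("ML_ENV", "ML_USERNAME", "ML_PASSWORD", "ML_INSTANCE")


def missing_ml_variables(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of the IBM Cloud CLI.
    Pass a dictionary to check it instead of the process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


# Example: report which variables still need to be set in this shell.
missing = missing_ml_variables()
print("Missing variables:", ", ".join(missing) if missing else "none")
```

Running a check like this before a long session can catch an unset variable early, rather than partway through a sequence of commands.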

Repository

Storage associated with your Watson Machine Learning service for saving artifacts, such as training definitions and models. Note that the repository is not the same thing as the cloud object storage where you store your training data or training results.

Related terms:

  • Object identifier - When you create or store an object (such as a training definition, a model, or a deployment) in the repository, a unique identifier (sometimes called guid) is generated for the object. To work with an object that is in the repository, you must refer to the object by its object identifier.
  • Manifest file - For many Watson Machine Learning actions (such as storing objects in the repository, training models, and deploying trained models) you must specify metadata, including configuration and run-time details.  One way to pass that metadata is in JSON- or YAML-formatted files called manifest files.
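
To illustrate the idea of a manifest file, the following sketch builds a small metadata dictionary and serializes it as JSON text. Every key shown here (name, description, framework) is a hypothetical placeholder chosen for illustration; the fields that a given Watson Machine Learning action actually requires are defined in the service documentation, not here.

```python
import json


def build_manifest(name, description, framework_name, framework_version):
    """Return manifest metadata as a plain dictionary.

    All keys are hypothetical placeholders, not the service's real schema.
    """
    return {
        "name": name,
        "description": description,
        "framework": {"name": framework_name, "version": framework_version},
    }


def manifest_to_json(manifest):
    """Serialize the manifest dictionary to indented JSON text."""
    return json.dumps(manifest, indent=2, sort_keys=True)


# Example: print a manifest that could be saved to a .json file.
example_manifest = build_manifest(
    "my-training", "Example training definition", "tensorflow", "1.5"
)
print(manifest_to_json(example_manifest))
```

Keeping the metadata in a file, rather than passing each value as a separate parameter, makes the same configuration easy to reuse and to keep under version control.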

Training run

Training a model can be so computationally intensive that training it on your local computer or in a notebook might take too long or fail. For this reason, the Watson Machine Learning service provides a mechanism to upload your model-building code and then run the training on the Watson servers. In the Watson Machine Learning and Watson Studio documentation, the term training run is used in two related ways:

  • To refer to a job that is running on the Watson servers to train your model
  • To refer to an artifact that is automatically stored in the repository related to the training job

Related terms:

  • Training definition - Two things are required to run a training run:
    • Model-building code
    • Metadata about how to run the training

    When training a simple model with a single training run, you can start the training run by passing these two things as parameters. However, when running an experiment, you must specify the model-building code and training metadata in a training definition.

  • DATA_DIR - A variable used in your model-building code to refer to the bucket in your IBM Cloud Object Storage instance where your training data is located.
  • RESULT_DIR - A variable used in your model-building code to refer to the bucket in your IBM Cloud Object Storage instance where your training results should be saved.
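
The following sketch shows one way model-building code might look up the two locations. It assumes that DATA_DIR and RESULT_DIR are provided to the code as environment variables; the function name resolve_training_dirs is hypothetical.

```python
import os


def resolve_training_dirs(environ=None):
    """Return (data_dir, result_dir) read from DATA_DIR and RESULT_DIR.

    Assumption: both values are exposed as environment variables.
    Raises RuntimeError that names the first variable that is not set.
    """
    env = os.environ if environ is None else environ
    try:
        return env["DATA_DIR"], env["RESULT_DIR"]
    except KeyError as err:
        raise RuntimeError(f"{err.args[0]} is not set") from err


# Example with explicit values, so the sketch also runs outside a training run.
data_dir, result_dir = resolve_training_dirs(
    {"DATA_DIR": "/mnt/data", "RESULT_DIR": "/mnt/results"}
)
print("Read training data from:", data_dir)
print("Write training results to:", result_dir)
```

Referring to the locations through these variables, instead of hard-coding paths, lets the same model-building code run against different buckets without changes.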

Experiment

In the Watson Machine Learning and Watson Studio documentation, the term experiment is used in two related ways:

  • To refer to the act of running multiple training runs (possibly using hyperparameter optimization) to find the best-performing model
  • To refer to an artifact that you store in the repository

Related terms:

  • Experiment run - You can run an experiment multiple times, changing details to vary the outcome.  In the Watson Machine Learning and Watson Studio documentation, the term experiment run refers to an object that is created in the repository each time you run an experiment.

Model

In the Watson Machine Learning and Watson Studio documentation, the term model refers both to models that apply machine learning algorithms and to neural networks. With Watson Machine Learning and Watson Studio, you work with models in several ways: design, build, train, store, and deploy.

Related terms:

  • Design a model - Choose the algorithms, neural network layers, and nodes that make up the architecture of the model.
  • Build a model - Implement the architecture of the model, in Python code for example, or using tools like the model builder and the flow editor in Watson Studio.
  • Train a model - Iteratively adjust the variables of a model (for example, the weights on neurons in a neural network) until the model performance on training data is optimized.  Training a model can be computationally intensive, so instead of training a model on your local computer, you can submit a model and training data as a training run on the Watson Machine Learning service.
  • Store a trained model - Before you can deploy a model in your Watson Machine Learning service, you must save a copy of the trained model in your Watson Machine Learning repository.
  • Deploy a stored model - Deploying a model in your Watson Machine Learning service makes the model available for your tools or apps to use.

Deployment

A deployment is an artifact in your Watson Machine Learning service through which tools and apps can access trained models.

Related terms:

  • Web service deployment - When you deploy a model as a web service, an API endpoint is generated for your deployment so your tools and apps can use a REST API to send data to your deployed model for analysis.
  • Batch deployment - If you have a lot of data stored in cloud object storage or in a database, you can use a batch deployment to send that data to your deployed model for analysis all at once.
  • Streaming deployment - You can create a deployment that processes live, streaming data using the IBM Event Streams messaging service.
  • Score - Passing data to a deployment to analyze the data (for example, classifying the data or making a prediction from the data) is sometimes referred to as scoring (or scoring the model).
  • Payload - In the Watson Machine Learning and Watson Studio documentation, the term payload refers to data that is sent to a deployment for analysis (for example, to be classified or to be the basis of a prediction).
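
As a rough illustration of a payload, the sketch below assembles scoring data as a dictionary and serializes it to JSON, ready to be sent to a web service deployment. The layout, with a "fields" list of column names and a "values" list holding one list per row, is an assumption here, as are the example column names; check the documentation for your deployment to confirm the format it expects. No request is sent.

```python
import json


def build_payload(fields, rows):
    """Build a scoring payload dictionary.

    Assumption: the deployment accepts "fields" (column names) and
    "values" (one list of values per row to score).
    """
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("each row needs exactly one value per field")
    return {"fields": list(fields), "values": [list(row) for row in rows]}


# Example: two rows to score, with hypothetical column names.
payload = build_payload(["age", "income"], [[34, 52000], [51, 78000]])
body = json.dumps(payload)  # JSON text to place in the request body
print(body)
```

Checking the row lengths before sending helps catch a mismatch between the data and the model's expected inputs on the client side.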

Custom components

You can define your own transformers, estimators, functions, classes, and tensor operations for use in models you deploy in Watson Machine Learning.  In the Watson Machine Learning and Watson Studio documentation, the term custom components refers to transformers, estimators, and so on, that you create.

Related terms:

  • Library - In the context of using custom components, the term library (or custom library) is used to refer to the custom Python distribution package that you create and store in your Watson Machine Learning service to implement your custom components.
  • Runtime - To use custom components in models you deploy in your Watson Machine Learning service, you must create a runtime resource object (or simply, runtime) in your repository.  (To enable your model to use the custom components, the runtime metadata references your custom library, and your stored model metadata references the runtime.)
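
The chain of references described above (stored model metadata references the runtime, and runtime metadata references the custom library) can be pictured with the following sketch. The class names, attribute names, and identifiers are all hypothetical and only model the relationships; they are not the metadata schema of the Watson Machine Learning service.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Library:
    identifier: str
    package_name: str  # the custom Python distribution package


@dataclass
class Runtime:
    identifier: str
    library_ids: List[str]  # runtime metadata references custom libraries


@dataclass
class StoredModel:
    identifier: str
    runtime_id: str  # stored model metadata references the runtime


def libraries_for_model(model: StoredModel,
                        runtimes: Dict[str, Runtime],
                        libraries: Dict[str, Library]) -> List[Library]:
    """Follow model -> runtime -> library references and return the libraries."""
    runtime = runtimes[model.runtime_id]
    return [libraries[lib_id] for lib_id in runtime.library_ids]


# Example repository contents with made-up identifiers.
libraries = {"lib-1": Library("lib-1", "my_custom_transformers")}
runtimes = {"rt-1": Runtime("rt-1", ["lib-1"])}
model = StoredModel("model-1", "rt-1")

resolved = libraries_for_model(model, runtimes, libraries)
print([lib.package_name for lib in resolved])
```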