Creating jobs in deployment spaces
Last updated: Oct 09, 2024

A job is a way of running a batch deployment, or a self-contained asset such as a script, notebook, code package, or flow, in Watson Machine Learning. You select the input and output for your job and choose whether to run it manually or on a schedule. From a deployment space, you can create, schedule, run, and manage jobs.

Creating a batch deployment job

Follow these steps to create a batch deployment job. For an equivalent programmatic approach, see the sketch after these steps.

  1. From the Deployments tab, select your deployment.
  2. Click New job.
  3. Enter a job name and description.
  4. Select a hardware specification.
  5. Optional: If you are deploying a Python script, an R script, or a notebook, then you can enter environment variables to pass parameters to the job.
  6. Optional: Configure retention options.
    To avoid consuming resources by retaining all historical job metadata, you can set thresholds for saving a set number of job runs and associated logs, or set a time threshold for saving artifacts for a specified number of days.
  7. Optional: Schedule your job.
    If you don't specify a schedule, the job runs immediately.
  8. Optional: Select the job-related events that trigger notifications in Watson Machine Learning. You can select from three event types: success, warning, and failure.
  9. In the Input pane, from the Data asset menu, select your input data type:
    • To enter the payload in JSON format, select Inline data. For an example of a JSON payload, refer to Example JSON payload for inline data.
    • To specify an input data source, select Data asset, click Select data source, and then specify your asset.
  10. In the Output pane, from the Data asset menu, select your output data type:
    • To write your job results to a new output file, select Create new and then provide a name and optional description.
    • To write your job results to an existing data asset, select Data asset, click Select data source, and then specify the asset to overwrite.
  11. Click Create.
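
If you prefer to create the job programmatically, the following minimal sketch uses the ibm-watson-machine-learning Python client. The credentials, space ID, and deployment ID are placeholders, and calls such as get_job_uid can vary across client versions, so treat this as a starting point rather than a definitive recipe.

from ibm_watson_machine_learning import APIClient

# Placeholder credentials; substitute your own endpoint and API key.
wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": "<your API key>",
}

client = APIClient(wml_credentials)

# Work in the deployment space that contains the batch deployment.
client.set.default_space("<your space id>")

# Inline payload with the same fields as the JSON example later in this topic.
job_payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [{
        "fields": ["AGE", "SEX", "BP", "CHOLESTEROL", "NA", "K"],
        "values": [[47, "M", "LOW", "HIGH", 0.739, 0.056]],
    }]
}

# Create (run) the batch deployment job with the inline input data.
job_details = client.deployments.create_job("<deployment id>", meta_props=job_payload)
print(client.deployments.get_job_uid(job_details))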

Notes:

  • Scheduled jobs display on the Jobs tab of the deployment space.

  • Results of job runs are written to the specified output file and saved as a space asset.

  • A data asset can be a data source file that you promoted to the space, a connected data source, or tables from databases and files from file-based data sources.

  • If you exclude certain weekdays in your job schedule, the job might not run as you expect. This happens because of a discrepancy between the time zone of the user who creates the schedule and the time zone of the main node where the job runs.

  • When you create or modify a scheduled job, an API key is generated. Future runs will use this API key.

Example JSON payload for inline data

{
  "deployment": {
    "id": "<deployment id>"
  },
  "space_id": "<your space id>",
  "name": "test_v4_inline",
  "scoring": {
  "input_data": [{
    "fields": ["AGE", "SEX", "BP", "CHOLESTEROL", "NA", "K"],
    "values": [[47, "M", "LOW", "HIGH", 0.739, 0.056], [47, "M", "LOW", "HIGH", 0.739, 0.056]]
    }]
  }
}
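
As a sketch of how you might submit this payload over the Watson Machine Learning REST API, the following Python snippet posts it to the deployment jobs endpoint with the requests library. The regional URL, version date, and bearer token shown here are assumptions; check the API reference for the values that apply to your instance.

import requests

# Assumed regional endpoint and version date; adjust for your instance.
url = "https://us-south.ml.cloud.ibm.com/ml/v4/deployment_jobs"
headers = {
    "Authorization": "Bearer <IAM token>",  # placeholder bearer token
    "Content-Type": "application/json",
}
payload = {
    "deployment": {"id": "<deployment id>"},
    "space_id": "<your space id>",
    "name": "test_v4_inline",
    "scoring": {
        "input_data": [{
            "fields": ["AGE", "SEX", "BP", "CHOLESTEROL", "NA", "K"],
            "values": [[47, "M", "LOW", "HIGH", 0.739, 0.056]],
        }]
    },
}

response = requests.post(url, params={"version": "2020-09-01"}, headers=headers, json=payload)
response.raise_for_status()
print(response.json())  # includes the metadata for the queued job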

Queuing and concurrent job executions

The maximum number of concurrent jobs for each deployment is handled internally by the deployment service. By default, two batch deployment jobs can run concurrently. Any job request for a batch deployment that already has two running jobs is placed in a queue and run later. When one of the running jobs completes, the next job in the queue runs. The queue has no size limit.
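
Because a job submitted past the concurrency limit waits in the queue, it is often useful to poll its state until it finishes. The following sketch assumes the Python client from the earlier example and that get_job_status returns a dictionary with a state field; your client version may differ.

import time

def wait_for_job(client, job_id, poll_seconds=30):
    """Poll a deployment job until it leaves the queued and running states."""
    while True:
        status = client.deployments.get_job_status(job_id)
        state = status.get("state")
        print(f"Job {job_id} state: {state}")
        if state in ("completed", "failed", "canceled"):
            return state
        time.sleep(poll_seconds)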

Retention of deployment job metadata

Job-related metadata is persisted and can be accessed until the job and its deployment are deleted.

Parent topic: Managing predictive deployments
