Scheduling a notebook

You can create a job to run your notebook at periodic intervals.

Notebook jobs can run in environments that are started in Cloud Pak for Data as a Service.

Note that notebook scheduling is not supported if you run your notebook in an IBM Analytics Engine service.

Creating a notebook job

To schedule a notebook job:

  1. Click the jobs icon from the notebook's menu bar and select Create a job.
  2. Define the job details by entering a name and a description (optional).
  3. On the Configure page, select:

    • A notebook version. The most recently saved version of the notebook is used by default. If no version of the notebook exists, you must create a version by clicking the versions icon from the notebook action bar.
    • A runtime. By default, the job uses the same environment definition that was selected for the notebook.
    • Advanced configuration to add environment variables and select the job run retention settings.

      • Add environment variables. These are passed to the notebook when the job starts and affect how the notebook runs.

        Declare each variable on its own line in the format VAR_NAME=foo.

        For example, to determine which data source to access if the same notebook is used in different jobs, you can set the variable DATA_SOURCE to DATA_SOURCE=jdbc:db2://db2.server.com:1521/testdata in the notebook job that trains a model and to DATA_SOURCE=jdbc:db2://db2.server.com:1521/productiondata in the job where the model runs on real data. In another example, the variables BATCH_SIZE, NUM_CLASSES, and EPOCHS that are required for a Keras model can be passed to the same notebook with different values in separate jobs.

      • Select the job run result output. You can select:

        • Log & notebook to store the output files of specific runs, the log file, and the resulting notebook. This is the default for all new jobs. Select this option:

          • To compare the results of different job runs in more detail than the log file allows. By keeping the output files of specific job runs, you can compare the results of job runs to fine-tune your code. For example, by configuring different environment variables when the job is started, you can change the way the code in the notebook behaves and then compare the differences (including graphics) step by step between runs.

            Note:

            • The job run retention value is set to 5 by default to avoid creating too many run output files. This means that the last 5 job run output files will be retained. You need to adjust this value if you want to compare more run output files.
            • You cannot use the results of a specific job run to create a URL for "Share by URL". If you want to use a specific job run result as the source of what is shown via "Share by URL", you must create a new job and select Log & updated version.
          • To view the logs.
        • Log only to store only the log file. The resulting notebook is discarded. Select this option:

          • To view the logs.
        • Log & updated version to store the log file and update the output cells of the notebook version that you used as input to this job. Select this option:

          • To view the logs.
          • To share the result of a job run via "Share by URL".
    • Retention configuration to set how long to retain finished job runs and job run artifacts like logs or notebook results. You can select either the number of days to retain job runs or the number of most recent job runs to keep. The retention value is set to 5 by default (the last 5 job run output files are retained).

      Be mindful when changing the default, because too many job run files can quickly use up project storage.

  4. On the Schedule page, you can optionally add a one-time or repeating schedule.

    If you define a start date and time without selecting Repeat, the job runs exactly once at the specified date and time. If you define a start date and time and select Repeat, the job runs for the first time at the timestamp indicated in the Repeat section.

    You can't change the time zone; the schedule uses your web browser's time zone setting. If you exclude certain weekdays, the job might not run as you expect. This can happen when the time zone of the user who creates the schedule differs from the time zone of the compute node where the job runs.

  5. Optionally, set up notifications for the job. You can select the type of alerts to receive.
  6. Review the job settings. Then create the job and run it immediately, or create the job and run it later. All notebook code cells are run and all output cells are updated.

    The notebook job is listed under Jobs in your project.
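
The environment variables described in step 3 can be read inside the notebook with Python's standard os module. The following is a minimal sketch, not the product's own mechanism: the helper name read_job_settings and all fallback values are hypothetical and only keep the notebook runnable when it is opened interactively without a job.

```python
import os

def read_job_settings(env=os.environ):
    """Collect job settings from environment variables.

    The fallback values are illustrative assumptions for interactive runs;
    a job would normally supply its own values in Advanced configuration.
    """
    return {
        # Environment variable values are always strings.
        "data_source": env.get("DATA_SOURCE", "jdbc:db2://db2.server.com:1521/testdata"),
        # Numeric settings must be converted explicitly.
        "batch_size": int(env.get("BATCH_SIZE", "32")),
        "num_classes": int(env.get("NUM_CLASSES", "10")),
        "epochs": int(env.get("EPOCHS", "5")),
    }

settings = read_job_settings()
print(settings)
```

With this pattern, the same notebook can back a training job and a production job; only the variable declarations in each job's configuration differ.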

You can view the other jobs associated with your notebook and the job details:

  • From an opened notebook by clicking the jobs icon from the notebook's menu bar and selecting Save and view jobs. If more than one job exists for a notebook, select which job details to view.
  • From the Jobs tab of your project by clicking the jobs associated with your notebook.
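
The time zone caveat for schedules that exclude weekdays can be illustrated with a short sketch. It assumes a browser in the Europe/Berlin time zone and a compute node that uses UTC; both are assumptions chosen for illustration, and the helper weekday_in_zone is hypothetical.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def weekday_in_zone(moment, zone_name):
    """Return the weekday name of an aware datetime as seen in another time zone."""
    return DAY_NAMES[moment.astimezone(ZoneInfo(zone_name)).weekday()]

# Hypothetical schedule: Monday 00:30 in the browser's time zone (assumed Europe/Berlin).
scheduled = datetime(2024, 1, 8, 0, 30, tzinfo=ZoneInfo("Europe/Berlin"))

print(weekday_in_zone(scheduled, "Europe/Berlin"))  # weekday where the schedule was created
print(weekday_in_zone(scheduled, "UTC"))            # weekday on a compute node assumed to use UTC
```

Under these assumptions the same instant is a Monday for the user and a Sunday on the compute node, so a schedule that excludes Sundays might skip a run that the user expected on Monday.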

Editing a notebook job

You can edit job settings, for example, to change the schedule or pick another environment definition.

To edit a notebook job:

  1. Click the jobs icon from the notebook's menu bar and select Save and view jobs.
  2. Select the job and click Edit job to change job settings.

