An environment definition specifies the hardware and software configuration used to run tools such as notebooks, the model builder, or the flow editor, and to create jobs that run Data Refinery flows or schedule notebooks. With environments, you control the compute resources used by tools, jobs, or an IDE such as RStudio in Watson Studio.
You can choose from the following types of environments:
Default environments for notebooks
Use a default environment if, for example, you are working in notebooks and do not need Spark. Watson Studio provides a selection of preset default environment definitions with different hardware and software configurations, including free environments that do not consume capacity units. You can use these presets, or you can create your own environments, which you can customize to fit your needs.
You can create default environments for R and Python.
Spark environments
If your notebook includes Spark APIs, or you are working with big data sets and need to run distributed workloads across a cluster, you must associate the notebook with a Spark service or environment. Likewise, if you are creating machine learning models in the model builder, or model flows in the flow editor, you must run these tools in a Spark runtime.
Watson Studio provides Spark environment definitions for Python, R, and Scala with preset hardware and software configurations. You can use a preset environment definition, or you can create your own environment definitions, which you can customize to fit your needs.
RStudio environments
If you want to work with R in RStudio in Watson Studio, you can open RStudio within the context of a project. When you launch RStudio, you select the RStudio environment runtime in which to open it.
Environments for Data Refinery flows
To run Data Refinery flows, you can choose a Spark R environment.
GPU environments
If you want to build and train machine learning models directly in a notebook, and you need more computing power than a CPU environment offers to improve model training performance, you can associate your notebook with a GPU (Graphics Processing Unit) environment.
Watson Studio does not provide GPU environment definitions with preset hardware and software configurations. You must create your own GPU environment definition, which you can customize to fit your needs.
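The choice among the environment types listed above can be summarized as a small decision helper. The following is only an illustrative sketch in Python: the function name `suggest_environment_type`, its parameters, and the returned labels are hypothetical and are not part of any Watson Studio API. Checking Spark before GPU for notebooks is also an assumption, because the text does not say which takes precedence.

```python
def suggest_environment_type(tool, uses_spark=False, needs_gpu=False):
    """Return a label for the environment type described above (hypothetical helper)."""
    tool = tool.strip().lower()

    # The model builder and the flow editor must run in a Spark runtime.
    if tool in ("model builder", "flow editor"):
        return "spark"

    # Data Refinery flows can run in a Spark R environment.
    if tool == "data refinery":
        return "spark-r"

    # RStudio opens in an RStudio environment runtime.
    if tool == "rstudio":
        return "rstudio"

    if tool == "notebook":
        # Notebooks that use Spark APIs need a Spark service or environment.
        if uses_spark:  # assumption: Spark is checked before GPU
            return "spark"
        # Notebooks that need more power than a CPU environment can use a GPU one.
        if needs_gpu:
            return "gpu"
        # Otherwise a default environment is sufficient.
        return "default"

    raise ValueError(f"Unknown tool: {tool!r}")


if __name__ == "__main__":
    print(suggest_environment_type("notebook"))
    print(suggest_environment_type("notebook", uses_spark=True))
    print(suggest_environment_type("flow editor"))
```

The helper only restates the rules in prose form; the actual selection is made in the Watson Studio interface when you associate a tool or job with an environment.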
An environment definition references a runtime service. When you start a tool that is associated with an environment, for example a notebook, or when you create a Data Refinery job, the runtime service creates a runtime instance based on the environment configuration you specified.
Note: The memory size of an environment is the total size given to the runtime instance. Part of that memory is used by system resources, such as the Jupyter server for notebooks, so the specified memory is not entirely available for your use.
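To make the note concrete, the arithmetic can be sketched in Python as follows. The class `EnvironmentDefinition`, the function `usable_memory_gb`, the environment name, and all numbers are hypothetical. In particular, the 1 GB system overhead is an assumed placeholder, not a documented Watson Studio figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentDefinition:
    """Hypothetical model of an environment's hardware configuration."""
    name: str
    vcpus: int
    memory_gb: float  # total memory given to the runtime instance


def usable_memory_gb(env, system_overhead_gb):
    """Estimate the memory left for user code after system resources take their share."""
    if system_overhead_gb < 0 or system_overhead_gb > env.memory_gb:
        raise ValueError("overhead must be between 0 and the environment memory size")
    return env.memory_gb - system_overhead_gb


if __name__ == "__main__":
    # Assumed example values; the real overhead depends on the runtime.
    env = EnvironmentDefinition(name="example-python-env", vcpus=2, memory_gb=8.0)
    print(usable_memory_gb(env, system_overhead_gb=1.0))
```

When you size an environment for a memory-intensive notebook, leave headroom above the expected size of your data rather than matching it exactly.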