An environment definition specifies the hardware and software configuration of the runtime in which you run tools like notebooks, the model builder, or the flow editor, or in which you run Data Refinery flows in Watson Studio. With environments, you have dedicated resources and flexible options for the compute resources that the tools use.
Watson Studio offers the following types of environments:
You can use the default Anaconda-based environments if, for example, you are working in notebooks and don't require Spark. Watson Studio provides a selection of preset default Anaconda-based environment definitions with different hardware and software configurations, including a free environment that does not consume capacity units. You can use the defaults as they are, or you can create your own environment definitions based on the provided defaults and customize them to fit your needs.
Default environments can be created for the languages R and Python.
If your notebook includes Spark APIs, or you are working with big data sets and need to run distributed workloads across a cluster, you must associate the notebook with a Spark service or environment. Likewise, if you are creating machine learning models in the model builder, or model flows in the flow editor, you must run these tools in a Spark runtime.
Spark environments can be created for the languages Scala, R, and Python.
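If you are unsure whether your notebook's runtime has Spark available, a quick check like the following can help. This is a minimal sketch that assumes a Python kernel; the `pyspark` import only succeeds when the notebook is backed by a Spark runtime, and the function name is illustrative, not part of any Watson Studio API:

```python
def spark_is_available() -> bool:
    """Return True if the pyspark package can be imported,
    which indicates the kernel runs in a Spark runtime."""
    try:
        import pyspark  # noqa: F401  # present only in Spark runtimes
        return True
    except ImportError:
        return False


# In a notebook cell you might branch on the result:
if spark_is_available():
    print("Spark APIs can be used in this notebook.")
else:
    print("Associate the notebook with a Spark environment first.")
```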
If you want to work with R in RStudio in Watson Studio, you can open RStudio within the context of a project. When you launch RStudio, a default RStudio environment runtime is automatically activated.
Environments for Data Refinery flows
If you want to run Data Refinery flows, you can choose to run them in a Spark R environment.
If you want to build and train machine learning models directly in a notebook, and you need more computing power than you can get from a CPU environment to improve model training performance, you can associate your notebook with a GPU (Graphics Processing Unit) environment.
An environment definition references a runtime service. When you start a tool that is associated with an environment, for example a notebook, the runtime service creates a runtime instance based on the environment configuration that you specified.
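Conceptually, an environment definition pairs a software choice with a hardware sizing, and the runtime service instantiates a runtime from it. The following sketch is purely illustrative; all field names are hypothetical and do not reflect the actual Watson Studio format:

```yaml
# Hypothetical sketch only -- not the actual Watson Studio format.
name: my-python-environment
software:
  language: python       # e.g. python, r, scala
hardware:
  cpu_count: 2
  memory_gb: 8           # runtime memory, shared with system resources
```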
The memory size for each environment is the size given to the runtime instance, which means that some of that memory is also used by system resources, such as the Jupyter server for notebooks. Therefore, the specified memory is not entirely available for your use.
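From inside a Python notebook, you can inspect how much memory the runtime actually reports. This is a sketch using only the standard library and assumes a Linux-based runtime (as Jupyter runtimes typically are); the difference between this figure and the environment's nominal size reflects system overhead:

```python
import os


def runtime_memory_gib() -> float:
    """Total memory visible to the runtime, in GiB (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")    # bytes per memory page
    num_pages = os.sysconf("SC_PHYS_PAGES")   # number of physical pages
    return page_size * num_pages / (1024 ** 3)


print(f"Runtime reports {runtime_memory_gib():.1f} GiB total; "
      "part of this is used by system processes such as the Jupyter server.")
```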
- Default environments
- Spark environments
- RStudio environments
- GPU environments
- Spark environments for Data Refinery
- Create an environment definition
- Customize a default environment definition
- Create a notebook and use an environment
- Change the environment of a notebook
- Stop active runtimes when no longer needed
- Track capacity unit consumption of your runtimes