Migrating from project-lib for R to ibm-watson-studio-lib
The ibm-watson-studio-lib library is the successor of the project-lib library. Although you can continue using the project-lib API in your notebooks, you should consider migrating existing notebooks to the ibm-watson-studio-lib library.
Advantages of using ibm-watson-studio-lib include:
- The asset browsing API provides read-only access to all types of assets, not only those explicitly supported by the library.
- ibm-watson-studio-lib uses a consistent API naming convention that structures available functions according to their area of application.
The following sections describe the changes you need to make in existing R notebooks to start using the ibm-watson-studio-lib library.
Set up the library
You need to make the following changes in existing notebooks to start using ibm-watson-studio-lib:
In code using project-lib, change:
library(projectLib)
project <- projectLib::Project$new("<ProjectId>", "ProjectToken")
To the following using ibm-watson-studio-lib:
library(ibmWatsonStudioLib)
wslib <- access_project_or_space(list("token" = "ProjectToken"))
Set up the library in Spark environments
You need to make the following changes in existing notebooks to start using ibm-watson-studio-lib in Spark environments.
In code using project-lib, change:
library(projectLib)
project <- projectLib::Project$new(sc, "<ProjectId>", "ProjectToken")
To the following using ibm-watson-studio-lib:
library(ibmWatsonStudioLib)
wslib <- access_project_or_space(list("token" = "ProjectToken"))
wslib$spark$provide_spark_context(sc)
Library usage
The following sections describe the code changes that you need to make in your notebooks when migrating functions in project-lib to the corresponding functions in ibm-watson-studio-lib.
Get project information
To fetch project-related information programmatically, you need to change the following functions:
List data connections
In code using project-lib, change:
project$get_connections()
To the following using ibm-watson-studio-lib:
assets <- wslib$list_connections()
wslib$show(assets)
Alternatively, with ibm-watson-studio-lib, you can list connected data assets:
assets <- wslib$list_connected_data()
wslib$show(assets)
List data files
This function returns the list of the data files in your project.
In code using project-lib, change:
project$get_files()
To the following using ibm-watson-studio-lib:
assets <- wslib$list_stored_data()
wslib$show(assets)
Get name or description
In ibm-watson-studio-lib, you can retrieve metadata about the project, for example the name of a project or its description, via the entry point wslib$here.
In code using project-lib, change:
name <- project$get_name()
desc <- project$get_description()
To the following using ibm-watson-studio-lib:
name <- wslib$here$get_name()
desc <- wslib$here$get_description()
Get metadata
There is no replacement for get_metadata in project-lib:
project$get_metadata()
The wslib$here entry point in ibm-watson-studio-lib exposes parts of this information. The following functions are available in wslib$here:
- wslib$here$get_name(): Returns the project name
- wslib$here$get_description(): Returns the project description
- wslib$here$get_ID(): Returns the project ID
- wslib$here$get_storage(): Returns the storage metadata
Get storage metadata
In code using project-lib, change:
project$get_storage_metadata()
To the following using ibm-watson-studio-lib:
wslib$here$get_storage()
Fetch data
To access data in a file, you need to change the following functions.
In code using project-lib, change:
buffer <- project$get_file("MyAssetName.csv")
# or, without direct storage access:
buffer <- project$get_file("MyAssetName.csv", directStorage=FALSE)
# or:
buffer <- project$get_file("MyAssetName.csv", directOsRetrieval=FALSE)
To the following using ibm-watson-studio-lib:
buffer <- wslib$load_data("MyAssetName.csv")
Additionally, ibm-watson-studio-lib offers a function to download a data asset and store it in the local file system:
info <- wslib$download_file("MyAssetName.csv", "MyLocalFile.csv")
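The buffer returned by wslib$load_data usually needs to be parsed before it is useful. The following is a minimal sketch that assumes the buffer is a raw vector containing the bytes of a CSV file; that return type is an assumption, not something this page states. A hard-coded stand-in replaces the real call so the snippet runs outside a project.

```r
# Stand-in for: buffer <- wslib$load_data("MyAssetName.csv")
# Assumption: load_data returns the file content as a raw vector.
buffer <- charToRaw("id,value\n1,10\n2,20\n")

# Decode the bytes to text and parse the text as CSV
df <- read.csv(text = rawToChar(buffer))
print(df)
```

In a notebook, replace the stand-in line with the wslib$load_data call and keep the parsing step unchanged.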
Save data
To save data to a file, you need to change the following functions.
In code using project-lib, change (and for all variations of directStorage=FALSE and setProjectAsset=TRUE):
project$save_data("NewAssetName.csv", data)
project$save_data("MyAssetName.csv", data, overwrite=TRUE)
To the following using ibm-watson-studio-lib:
asset <- wslib$save_data("NewAssetName.csv", data)
wslib$show(asset)
asset <- wslib$save_data("MyAssetName.csv", data, overwrite=TRUE)
wslib$show(asset)
Additionally, ibm-watson-studio-lib offers a function to upload a local file to the project storage and create a data asset:
asset <- wslib$upload_file("MyLocalFile.csv", "MyAssetName.csv")
wslib$show(asset)
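The data argument in the save_data calls above is not defined on this page. The following sketch shows one way to build it from a data frame, assuming that save_data accepts the file content as a raw vector; that input type is an assumption. The wslib call is commented out so the snippet runs outside a project.

```r
# Example data frame to store (hypothetical content)
df <- data.frame(id = c(1, 2), value = c(10, 20))

# Render the data frame as CSV text in memory, without writing a local file
csv_lines <- capture.output(write.csv(df, row.names = FALSE))

# Join the lines and convert the text to bytes
data <- charToRaw(paste(csv_lines, collapse = "\n"))

# In a notebook with project access (assumes save_data accepts a raw vector):
# asset <- wslib$save_data("NewAssetName.csv", data)
# wslib$show(asset)
```

Building the CSV in memory avoids a temporary local file; if the data already exists as a local file, wslib$upload_file is the more direct route.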
Get connection information
To return the metadata associated with a connection, you need to change the following functions.
In code using project-lib, change:
connprops <- project$get_connection(name="MyConnection")
To the following using ibm-watson-studio-lib:
connprops <- wslib$get_connection("MyConnection")
Get connected data information
To return the metadata associated with a connected data asset, you need to change the following functions.
In code using project-lib, change:
dataprops <- project$get_connected_data(name="MyConnectedData")
To the following using ibm-watson-studio-lib:
dataprops <- wslib$get_connected_data("MyConnectedData")
Access asset by ID instead of name
You can return the metadata of a connection or connected data asset by accessing the asset by ID instead of by name.
In code using project-lib, change:
connprops <- project$get_connection(id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
# or:
connprops <- project$get_connection("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
# or:
dataprops <- project$get_connected_data(id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
# or:
dataprops <- project$get_connected_data("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
To the following using ibm-watson-studio-lib:
connprops <- wslib$by_id$get_connection("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
dataprops <- wslib$by_id$get_connected_data("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")
In project-lib, it is not possible to access files (stored data assets) by ID. You can only do this by name. The ibm-watson-studio-lib library supports accessing files by ID. See Using ibm-watson-studio-lib.
Fetch assets by asset type
When you retrieve the list of all project assets, you can pass the optional parameter assetType to the function get_assets, which allows you to filter assets by type. The accepted values for this parameter in project-lib are data_asset, connection and asset.
In code using project-lib, change:
project$get_assets()
# Or, for a supported asset type:
project$get_assets("<asset_type>")
# Or:
project$get_assets(assetType="<asset_type>")
To the following using ibm-watson-studio-lib:
assets <- wslib$assets$list_assets("asset")
wslib$show(assets)
# Or, for a specific asset type:
assets <- wslib$assets$list_assets("<asset_type>")
# Example, list all notebooks:
notebook_assets <- wslib$assets$list_assets("notebook")
wslib$show(notebook_assets)
To list the available asset types, use:
assettypes <- wslib$assets$list_asset_types()
wslib$show(assettypes)
Spark support
To work with Spark, you need to change the functions that enable Spark support and that retrieve the URL of a file.
Set up Spark support
To set up Spark support:
In code using project-lib, change:
# Provide SparkContext during setup
library(projectLib)
project <- projectLib::Project$new(sc, "<ProjectId>", "ProjectToken")
To the following using ibm-watson-studio-lib:
library(ibmWatsonStudioLib)
wslib <- access_project_or_space(list("token" = "ProjectToken"))
# provide SparkContext after initialization
wslib$spark$provide_spark_context(sc)
Retrieve URL to access a file from Spark
To retrieve a URL to access a file referenced by an asset from Spark via Hadoop:
In code using project-lib, change:
url = project$get_file_url("MyAssetName.csv")
# or
url = project$get_file_url("MyAssetName.csv", directStorage=FALSE)
# or
url = project$get_file_url("MyAssetName.csv", directOsRetrieval=FALSE)
To the following using ibm-watson-studio-lib:
url = wslib$spark$get_data_url("MyAssetName.csv")
Get file URL for usage with Spark
Retrieve a URL to access a file referenced by an asset from Spark via Hadoop.
In code using project-lib, change:
url = project$get_file_url("MyFileName.csv", directStorage=TRUE)
# or
url = project$get_file_url("MyFileName.csv", directOsRetrieval=TRUE)
To the following using ibm-watson-studio-lib:
wslib$spark$storage$get_data_url("MyFileName.csv")
Access project storage directly
You can fetch data from the project storage or save data to the project storage without synchronising the project assets.
Fetch data
To fetch data from the project storage:
In code using project-lib, change:
project$get_file("MyFileName.csv", directStorage=TRUE)
# Or:
project$get_file("MyFileName.csv", directOsRetrieval=TRUE)
To the following using ibm-watson-studio-lib:
wslib$storage$fetch_data("MyFileName.csv")
Save data
To save data to a file in the project storage:
In code using project-lib, change:
project$save_data("NewFileName.csv", data, directStorage=TRUE)
# Or:
project$save_data("NewFileName.csv", data, setProjectAsset=FALSE)
To the following using ibm-watson-studio-lib:
wslib$storage$store_data("NewFileName.csv", data)
In code using project-lib, change:
# Save (and overwrite if file exists) and do not create an asset in the project
project$save_data("MyFileName.csv", data, directStorage=TRUE, overwrite=TRUE)
# Or:
project$save_data("MyFileName.csv", data, setProjectAsset=FALSE, overwrite=TRUE)
To the following using ibm-watson-studio-lib:
wslib$storage$store_data("MyFileName.csv", data, overwrite=TRUE)
Additionally, ibm-watson-studio-lib provides a function to download a file from the project storage to the local file system:
wslib$storage$download_file("MyStorageFile.csv", "MyLocalFile.csv")
You can also register a file in the project storage as a data asset:
wslib$storage$register_asset("MyStorageFile.csv", "MyAssetName.csv")
Learn more
To use the ibm-watson-studio-lib library for R in notebooks, see ibm-watson-studio-lib for R.