IBM watsonx.data Presto connection
To access your data in IBM watsonx.data, create a connection asset for it. The connection asset includes information for connecting to a watsonx.data instance and to a query engine that is running on that instance.
IBM watsonx.data is an open, hybrid, and governed data lakehouse that is optimized by a query engine for all data and AI workloads.
Prerequisite
Set up an instance of watsonx.data.
You can connect to either a software instance or an as-a-Service instance:
- watsonx.data software on Cloud Pak for Data: See Installing watsonx.data on Cloud Pak for Data.
- watsonx.data as a Service on IBM Cloud: See Getting started with watsonx.data on IBM Cloud.
- watsonx.data stand-alone software: See Installing stand-alone watsonx.data.
Create a connection to watsonx.data
Your connection details vary between watsonx.data software and watsonx.data as a Service.
watsonx.data software
To create the connection asset, in the Connection details section of the Connect to a data source page, select Connect to watsonx.data on Cloud Pak for Data and provide these details:
- Hostname or IP address: Find this information in the URL of the watsonx.data web console, between https:// and /watsonx-data/: https://<hostname-or-IPaddress>/watsonx-data/#/<remainder-of-URL>.
- Port: The default port number is 443. If the connection uses a different port, you can find this number in the URL of the watsonx.data web console, after the final colon.
- Instance ID: Find this value in the watsonx.data console. Click Instance details from the navigation menu.
- Instance name: Find the instance name on the Cloud Pak for Data web client home page. Click Services > Instances from the navigation menu.
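As a minimal sketch of reading the hostname and port out of the console URL, the following Python function uses the standard library URL parser. The function name and the sample URL (including the /home remainder) are hypothetical and serve only as an illustration. The fallback to 443 follows the default port stated above.

```python
from urllib.parse import urlsplit


def console_host_and_port(console_url, default_port=443):
    """Return (hostname, port) taken from a watsonx.data web console URL."""
    parts = urlsplit(console_url)
    if parts.hostname is None:
        raise ValueError(f"no hostname found in {console_url!r}")
    # urlsplit reports the port as None when the URL has no explicit port,
    # so fall back to the documented default in that case.
    port = parts.port if parts.port is not None else default_port
    return parts.hostname, port


# Hypothetical console URL, for illustration only.
host, port = console_host_and_port(
    "https://cpd.myserver.example.com/watsonx-data/#/home"
)
print(host, port)
```

The same function also handles the as-a-Service URL form, because it reads only the part of the URL before the first path segment.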
watsonx.data as a Service
To create the connection asset, in the Connection details section of the Connect to a data source page, provide these details:
- Hostname or IP address: Find this information in the URL of the watsonx.data web console, between https:// and /#/: https://<hostname-or-IPaddress>/#/<remainder-of-URL>. For example, us-south.lakehouse.cloud.ibm.com.
- Port: The default port number is 443. Do not use the suggested port number that is shown in the field. If the connection uses a different port, you can find this number in the URL of the watsonx.data web console, after the final colon.
- Instance ID: Find this value in the watsonx.data console. Click Instance details from the navigation menu.
- Instance name: Find this value on the watsonx.ai Service instances page. Click Administration > Services > Service instances. For example, watsonx.data-aaa. Do not use the suggested instance name that is shown in the field.
- CRN (Cloud resource name): Find this value in the watsonx.data console. Click Instance details from the navigation menu.
Credentials
Your credentials vary between watsonx.data software and watsonx.data as a Service.
watsonx.data software
The username and password or the username and API key for the watsonx.data instance. The same credentials are also used for the engine.
You must select the authentication method:
- Username and password: The username and password that are used to access Cloud Pak for Data where the watsonx.data instance is located, or the username and password for stand-alone watsonx.data.
- Username and API key: The username and API key that are used to access Cloud Pak for Data where the watsonx.data instance is located, or the username and password for stand-alone watsonx.data. This authentication method is recommended if Cloud Pak for Data uses an Identity Management Service (IAM), for example, LDAP or SSO. The API key is located in the Profile and settings of the target Cloud Pak for Data cluster. For information on API keys, see Generating API keys for authentication.
watsonx.data as a Service
The username and password for the watsonx.data instance. The same credentials are also used for the engine.
- Username: The default username is ibmlhapikey_<cloud-account-email-address>. For example, [email protected].
- Password: The password is the user's API key. To create an API key, see IBM Cloud docs: Creating an API key in the console.
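The default username pattern above can be sketched in Python as follows. The function name and the sample email address are hypothetical. The "@" check is only a basic sanity check added for this illustration, not a rule stated by watsonx.data.

```python
USERNAME_PREFIX = "ibmlhapikey_"


def saas_username(cloud_account_email):
    """Build the default as-a-Service username from a cloud account email."""
    email = cloud_account_email.strip()
    # Basic sanity check only; this is an assumption of the sketch.
    if "@" not in email:
        raise ValueError(f"not an email address: {cloud_account_email!r}")
    return USERNAME_PREFIX + email


# Hypothetical account email, for illustration only.
print(saas_username("user@example.com"))
```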
Certificates
By default, the SSL is enabled option is selected. This setting is recommended for increased security. If you do not use SSL, the data might be subject to vulnerabilities such as data leakage. Although the database that is hosted in watsonx.data can also have an SSL certificate, the connection goes through the engine.
The SSL certificate must be in PEM format.
The SSL certificates information varies between watsonx.data software and watsonx.data as a Service.
watsonx.data software
If SSL is enabled on a watsonx.data instance on Cloud Pak for Data and the certificate is a self-signed certificate, you must enter the certificate in the SSL certificate field.
Ask your watsonx.data administrator if SSL is set up. To obtain the certificate of an IBM watsonx.data instance on Cloud Pak for Data, run this command:
openssl s_client -showcerts -connect <cpd_hostname>:<cpd_port>
For example:
openssl s_client -showcerts -connect cpd.myserver.example.com:443
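The openssl command prints the certificates mixed with other connection details, while the SSL certificate field expects PEM text. As a minimal sketch, the following Python function pulls the PEM blocks out of saved command output. The function name is hypothetical, and the sample output is shortened and fake.

```python
import re

# A PEM certificate is the text between these two marker lines, inclusive.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def extract_pem_certificates(s_client_output):
    """Return every PEM certificate block found in the text, in order."""
    return PEM_BLOCK.findall(s_client_output)


# Shortened, fake output for illustration only; real blocks hold base64 data.
sample_output = """\
CONNECTED(00000003)
 0 s:CN = cpd.myserver.example.com
-----BEGIN CERTIFICATE-----
MIIB...placeholder...
-----END CERTIFICATE-----
 1 s:CN = Example Root CA
-----BEGIN CERTIFICATE-----
MIIC...placeholder...
-----END CERTIFICATE-----
"""

certificates = extract_pem_certificates(sample_output)
print(len(certificates))
```

Which certificate from the chain belongs in the SSL certificate field depends on how the instance is set up, so confirm the choice with your watsonx.data administrator.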
watsonx.data as a Service
The SSL certificate is optional.
Engine connection details
The engine connection details vary between watsonx.data software and watsonx.data as a Service.
Only the Presto (Java) engine is supported. (The Presto (C++) engine is not supported.)
watsonx.data software
Prerequisite: The administrator must expose the secure route to the Presto server. See Exposing secure route to Presto server.
Provide these engine connection details. Find this information in the watsonx.data web console. Click Infrastructure manager from the navigation menu, and then click the engine name to view the engine details.
- Engine's hostname or IP address: The hostname or IP address is the value of the Internal host field.
- Engine ID: This value is in the Engine ID field.
- Engine's port: The port number is the value in the Internal host field after the colon (:). The default port number is 8443.
watsonx.data as a Service
Provide these engine connection details. Find this information in the watsonx.data web console. Click Infrastructure manager from the navigation menu, and then click the engine name to view the engine details.
- Engine's hostname or IP address: The hostname or IP address is the value in the Host field before the colon (:).
- Engine ID: This value is in the Engine ID field.
- Engine's port: The port number is the value in the Host field after the colon (:).
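Splitting the Host field at the colon can be sketched in Python as follows. The function name and the sample value are hypothetical. The sketch assumes that the field holds a single <hostname>:<port> pair, and it splits at the last colon.

```python
def split_host_field(host_field):
    """Split a '<hostname>:<port>' value into (hostname, port)."""
    # rpartition splits at the last colon, so the port is always the tail.
    host, sep, port = host_field.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected <hostname>:<port>, got {host_field!r}")
    return host, int(port)


# Hypothetical Host field value, for illustration only.
engine_host, engine_port = split_host_field("engine.example.com:8443")
print(engine_host, engine_port)
```

The same split applies to the Internal host field of watsonx.data software when that field includes a port.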
Choose the method for creating a connection based on where you are in the platform
- In a project: Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
- In a catalog: Click Add to catalog > Connection. See Adding a connection asset to a catalog.
- In the Platform assets catalog: Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use the watsonx.data Presto connection in the following workspaces and tools:
Projects
- Data Refinery (Watson Studio or IBM Knowledge Catalog)
- DataStage (DataStage service). See Connecting to a data source in DataStage.
- Decision Optimization (Watson Studio and Watson Machine Learning)
- Metadata import (IBM Knowledge Catalog)
Catalogs
- Platform assets catalog
- Other catalogs (IBM Knowledge Catalog)
Writing data into watsonx.data
You can ingest data into watsonx.data with DataStage. You must enter the catalog_name, schema_name, and table_name properties. The table_name property is required. You can pass the fully qualified name, catalog_name.schema_name.table_name, into the table_name property.
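As a minimal sketch, the fully qualified name can be assembled from its three parts in Python as follows. The function name and the sample catalog, schema, and table names are hypothetical. The sketch assumes that none of the names contains a dot.

```python
def qualified_table_name(catalog_name, schema_name, table_name):
    """Join the three parts into catalog_name.schema_name.table_name."""
    parts = (catalog_name, schema_name, table_name)
    if any(not part for part in parts):
        raise ValueError("catalog, schema, and table names must be non-empty")
    return ".".join(parts)


# Hypothetical names, for illustration only.
print(qualified_table_name("iceberg_data", "sales", "orders"))
```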