To access your data from a storage service that is compatible with the Amazon S3 API, create a connection asset for it.
Create a Generic S3 connection
To create the connection asset, you need these connection details:
- Endpoint URL: The endpoint URL used to access S3
- Bucket (optional): The name of the bucket that contains the files
- Region (optional): S3 region. Specify a region that matches the regional endpoint.
- Access key: The access key (username) that authorizes access to S3
- Secret key: The password associated with the access key that authorizes access to S3
- SSL certificate: The SSL certificate of the trusted host. The certificate is required when the host certificate is not signed by a known certificate authority.
- Disable chunked encoding: Select if the storage does not support chunked encoding.
- Enable global bucket access: Consult the documentation for your S3 data source for whether to select this property.
- Enable path style access: Consult the documentation for your S3 data source for whether to select this property.
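As an illustration of how these details fit together, the following minimal sketch collects them in a small Python class and shows the two common S3 addressing styles that the path style access setting chooses between. The class `GenericS3Details`, the function `object_url`, and the host `s3.example.com` are hypothetical names used only for this example; they are not part of the platform or of any S3 SDK.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote, urlsplit


@dataclass
class GenericS3Details:
    """Hypothetical container for the connection details listed above."""
    endpoint_url: str
    access_key: str
    secret_key: str
    bucket: Optional[str] = None      # optional, as in the connection form
    region: Optional[str] = None      # optional, should match a regional endpoint
    path_style_access: bool = False   # mirrors "Enable path style access"

    def validate(self) -> None:
        """Check that the required details are present and well formed."""
        parts = urlsplit(self.endpoint_url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError("Endpoint URL must be an http(s) URL with a host")
        if not self.access_key or not self.secret_key:
            raise ValueError("Access key and Secret key are required")


def object_url(details: GenericS3Details, key: str,
               bucket: Optional[str] = None) -> str:
    """Build an object URL in path style or virtual-hosted style.

    Any path component on the endpoint URL is ignored in this sketch.
    """
    details.validate()
    name = bucket or details.bucket
    if not name:
        raise ValueError("A bucket name is required to address an object")
    parts = urlsplit(details.endpoint_url)
    safe_key = quote(key, safe="/")  # percent-encode the key, keep "/" separators
    if details.path_style_access:
        # Path style: the bucket is the first segment of the URL path.
        return f"{parts.scheme}://{parts.netloc}/{name}/{safe_key}"
    # Virtual-hosted style: the bucket is a subdomain of the endpoint host.
    return f"{parts.scheme}://{name}.{parts.netloc}/{safe_key}"
```

With path style access enabled, `object_url` places the bucket in the URL path; otherwise it places the bucket in the host name. Many S3-compatible stores accept only one of the two styles, which is why the documentation of the data source is the place to check before selecting the property.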
Choose the method for creating a connection based on where you are in the platform:
- In a project
- Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
- In a catalog
- Click Add to catalog > Connection. See Adding a connection asset to a catalog.
- In a deployment space
- Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
- In the Platform assets catalog
- Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use the Generic S3 connection in the following workspaces and tools:
Projects
- Data Refinery (watsonx.ai Studio or IBM Knowledge Catalog)
- DataStage (DataStage service). See Connecting to a data source in DataStage.
- Decision Optimization (watsonx.ai Studio and watsonx.ai Runtime)
- Metadata enrichment (IBM Knowledge Catalog)
- Metadata import (IBM Knowledge Catalog)
Catalogs
- Platform assets catalog
- Other catalogs (IBM Knowledge Catalog)
Note: Preview, profile, and masking are not certified for this connection in IBM Knowledge Catalog.
Generic S3 connection setup
For setup information, consult the documentation of the S3-compatible data source that you are connecting to.
Supported file types
The Generic S3 connection supports these file types: Avro, CSV, delimited text, Excel, JSON, ORC, Parquet, SAS, SAV, SHP, and XML.
Table formats
In addition to Flat file, the Generic S3 connection supports these Data Lake table formats: Delta Lake and Iceberg.
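To make the difference between these formats concrete, the sketch below guesses the format of a folder from a listing of its object keys. It relies on the conventional on-disk layouts of the open table formats (a `_delta_log/` directory for Delta Lake, `*.metadata.json` files under `metadata/` for Iceberg). This is a heuristic for illustration only, not a description of how the Generic S3 connection itself detects table formats, and `guess_table_format` is a hypothetical helper name.

```python
from typing import Iterable


def guess_table_format(keys: Iterable[str], table_prefix: str) -> str:
    """Guess the table format of the objects stored under table_prefix."""
    prefix = table_prefix.rstrip("/") + "/"
    has_delta_log = False
    has_iceberg_metadata = False
    for key in keys:
        if not key.startswith(prefix):
            continue  # object belongs to a different folder
        relative = key[len(prefix):]
        if relative.startswith("_delta_log/"):
            # Delta Lake keeps its transaction log in _delta_log/
            has_delta_log = True
        elif relative.startswith("metadata/") and relative.endswith(".metadata.json"):
            # Iceberg conventionally keeps table metadata files in metadata/
            has_iceberg_metadata = True
    if has_delta_log:
        return "Delta Lake"
    if has_iceberg_metadata:
        return "Iceberg"
    return "Flat file"
```

For example, a listing that contains `warehouse/orders/_delta_log/00000000000000000000.json` is classified as Delta Lake for the prefix `warehouse/orders`, while a folder that holds only data files such as CSV or Parquet falls back to Flat file.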
Related connection: Amazon S3 connection