Last updated: Nov 26, 2024
In Cloud Pak for Data as a Service, you can access data stored in AWS S3 buckets from a notebook by using access points.
Run the notebook in an environment in Cloud Pak for Data as a Service. Create an internet-enabled access point to connect to the S3 bucket.
Connecting to AWS S3 data through an internet-enabled access point
You can access data in an AWS S3 bucket through an internet-enabled access point in any AWS region.
To access S3 data through an internet-enabled access point:
- Create an access point for your S3 bucket. See Creating access points. Set the network origin to Internet.
- After the access point is created, make a note of the Amazon resource name (ARN) for the access point. You will need to enter the ARN in your notebook. Example ARN: arn:aws:s3:us-east-1:675068711478:accesspoint/cust-data-bucket-internet-ap
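Before you paste the ARN into a notebook, it can help to confirm that it really is an access point ARN and not a bucket ARN. The following is a minimal sketch of such a check, not part of any IBM or AWS library: the `AccessPointArn` type and the `parse_access_point_arn` function are hypothetical names introduced here for illustration, and the check assumes the `arn:partition:s3:region:account-id:accesspoint/name` layout seen in the example ARN above.

```python
from typing import NamedTuple


class AccessPointArn(NamedTuple):
    """Components of an S3 access point ARN (hypothetical helper type)."""
    partition: str
    region: str
    account_id: str
    name: str


def parse_access_point_arn(arn: str) -> AccessPointArn:
    """Split an S3 access point ARN into its parts, or raise ValueError.

    Assumes the layout arn:<partition>:s3:<region>:<account-id>:accesspoint/<name>.
    """
    parts = arn.strip().split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3":
        raise ValueError(f"Not an S3 ARN: {arn!r}")

    _, partition, _, region, account_id, resource = parts
    resource_type, _, name = resource.partition("/")

    # A bucket ARN has an empty region and account ID and no "accesspoint/" prefix.
    if resource_type != "accesspoint" or not (region and account_id and name):
        raise ValueError(f"Not an S3 access point ARN: {arn!r}")

    return AccessPointArn(partition, region, account_id, name)


example = parse_access_point_arn(
    "arn:aws:s3:us-east-1:675068711478:accesspoint/cust-data-bucket-internet-ap"
)
print(example.region, example.name)
```

A check like this catches the common mistake of copying the bucket ARN instead of the access point ARN before any request is sent to AWS.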
Accessing AWS S3 data from your notebook
The following sample code snippet shows you how to access AWS data from your notebook by using an access point:
import boto3
import pandas as pd

# Use an access key and a secret that have access to the bucket
access_key = "..."
secret = "..."
s3_client = boto3.client('s3', aws_access_key_id=access_key, aws_secret_access_key=secret)

# The Amazon resource name (ARN) of the access point, passed in place of the bucket name
arn = "..."

# The file you want to retrieve
fileName = "customers.csv"
response = s3_client.get_object(Bucket=arn, Key=fileName)
s3FileStream = response["Body"]

# For other file types, change the line below to use the appropriate pandas read_*() function
customerDF = pd.read_csv(s3FileStream)
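The sample's last comment says to swap in a different pandas reader for other file types. One possible way to do that is to choose the reader from the file extension, as in the sketch below. The `PANDAS_READERS` mapping and the `reader_name_for` and `read_s3_object` functions are hypothetical names introduced for illustration, and the extension list is an assumption that you would adapt to your own data. The object is buffered in memory first because some readers need a seekable file, so this approach suits files that fit in notebook memory.

```python
import io
import os

# Assumed mapping from file extension to the name of a pandas reader function.
# read_parquet and read_excel need optional packages (for example pyarrow, openpyxl).
PANDAS_READERS = {
    ".csv": "read_csv",
    ".json": "read_json",
    ".parquet": "read_parquet",
    ".xlsx": "read_excel",
}


def reader_name_for(key: str) -> str:
    """Return the pandas reader name for an object key, based on its extension."""
    extension = os.path.splitext(key)[1].lower()
    if extension not in PANDAS_READERS:
        raise ValueError(f"No pandas reader configured for {key!r}")
    return PANDAS_READERS[extension]


def read_s3_object(s3_client, arn: str, key: str):
    """Fetch an object through the access point ARN and load it into a DataFrame."""
    import pandas as pd  # imported here so the helpers above work without pandas

    response = s3_client.get_object(Bucket=arn, Key=key)
    # Buffer the whole body so that readers which seek (Parquet, Excel) can use it.
    data = io.BytesIO(response["Body"].read())
    return getattr(pd, reader_name_for(key))(data)
```

With the `s3_client` and `arn` variables from the sample above, a call such as `customerDF = read_s3_object(s3_client, arn, "customers.csv")` would take the place of the final two lines of the sample.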