Accessing data in AWS through access points from a notebook

In Cloud Pak for Data as a Service, you can access data that is stored in AWS S3 buckets from a notebook by using access points.

To do this, run the notebook in an environment in Cloud Pak for Data as a Service and create an internet-enabled access point that connects to the S3 bucket.

Connecting to AWS S3 data through an internet-enabled access point

You can access data in an AWS S3 bucket through an internet-enabled access point in any AWS region.

To access S3 data through an internet-enabled access point:

  1. Create an access point for your S3 bucket. See Creating access points.

    Set the network origin to Internet.

  2. After the access point is created, make a note of its Amazon Resource Name (ARN), for example arn:aws:s3:us-east-1:675068711478:accesspoint/cust-data-bucket-internet-ap. You will enter this ARN in your notebook in place of a bucket name.
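
Before pasting the ARN into a notebook, it can help to check that it really is an access point ARN and not a bucket name or a bucket ARN. The following is a minimal sketch, assuming the standard layout arn:partition:s3:region:account-id:accesspoint/name shown in the example above. The names AccessPointArn and parse_access_point_arn are hypothetical helpers for illustration and are not part of boto3 or Cloud Pak for Data.

```python
from typing import NamedTuple


class AccessPointArn(NamedTuple):
    """Fields of an S3 access point ARN (hypothetical helper type)."""
    partition: str
    region: str
    account_id: str
    name: str


def parse_access_point_arn(arn: str) -> AccessPointArn:
    """Split an access point ARN into its fields, or raise ValueError."""
    # Expected layout: arn:<partition>:s3:<region>:<account-id>:accesspoint/<name>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3":
        raise ValueError(f"Not an S3 ARN: {arn!r}")
    _, partition, _, region, account_id, resource = parts
    prefix = "accesspoint/"
    if not resource.startswith(prefix) or len(resource) == len(prefix):
        raise ValueError(f"Not an access point ARN: {arn!r}")
    if not region or not account_id:
        raise ValueError(f"Access point ARN is missing region or account: {arn!r}")
    return AccessPointArn(partition, region, account_id, resource[len(prefix):])


# Example with the ARN from step 2
parsed = parse_access_point_arn(
    "arn:aws:s3:us-east-1:675068711478:accesspoint/cust-data-bucket-internet-ap"
)
print(parsed.region, parsed.name)
```

A check like this only validates the shape of the string. It does not confirm that the access point exists or that your credentials can use it.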

Accessing AWS S3 data from your notebook

The following sample code snippet shows you how to access AWS data from your notebook by using an access point:

import boto3
import pandas as pd

# Use an access key and a secret that have access to the bucket
access_key = "..."
secret = "..."

s3_client = boto3.client("s3", aws_access_key_id=access_key, aws_secret_access_key=secret)

# The Amazon Resource Name (ARN) of the access point
arn = "..."
# The file you want to retrieve
fileName = "customers.csv"

# Pass the access point ARN where a bucket name would normally go
response = s3_client.get_object(Bucket=arn, Key=fileName)
s3FileStream = response["Body"]
# For other file types, use the appropriate pandas read_*() function instead
customerDF = pd.read_csv(s3FileStream)
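
If your notebook reads more than one file type, the choice of pandas reader can be made from the file extension instead of editing the last line each time. The following is a sketch only. READER_BY_EXTENSION, reader_name_for, and load_dataframe are hypothetical names introduced for illustration, the extension mapping is an assumption that you should adapt to your data, and s3_client is assumed to be a boto3 S3 client created as in the snippet above. Note that pd.read_parquet and pd.read_excel rely on optional packages such as pyarrow and openpyxl.

```python
import io
import os

# Hypothetical mapping from file extension to the name of a pandas reader.
READER_BY_EXTENSION = {
    ".csv": "read_csv",
    ".json": "read_json",
    ".parquet": "read_parquet",
    ".xlsx": "read_excel",
}


def reader_name_for(key):
    """Return the pandas reader name for an object key, based on its extension."""
    extension = os.path.splitext(key)[1].lower()
    try:
        return READER_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(f"No pandas reader configured for {key!r}") from None


def load_dataframe(s3_client, access_point_arn, key):
    """Fetch an object through the access point and load it into a DataFrame."""
    import pandas as pd  # imported here so the mapping above works without pandas

    # The access point ARN is passed where a bucket name would normally go.
    response = s3_client.get_object(Bucket=access_point_arn, Key=key)
    # Read the whole object into memory so readers that need to seek can do so.
    buffer = io.BytesIO(response["Body"].read())
    reader = getattr(pd, reader_name_for(key))
    return reader(buffer)
```

With the variables from the snippet above, the call would be customerDF = load_dataframe(s3_client, arn, fileName). Buffering the whole object in memory keeps the sketch simple but might not suit very large files.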

