Apache Cassandra connection
To access your data in Apache Cassandra, create a connection asset for it.
Apache Cassandra is an open source, distributed, NoSQL database.
Supported versions
Apache Cassandra 2.0 or later
Create a connection to Apache Cassandra
To create the connection asset, you need these connection details:
- Hostname or IP address
- Port number
- Keyspace (optional)
- Username and password
- Read consistency (optional): Specifies the number of replicas that must respond to a read request before the data is returned to the client application.
  - all: Data is returned to the application after all replicas have responded. This setting provides the highest consistency and lowest availability.
  - local_one: Data is returned from the closest replica in the local data center.
  - local_quorum: Data is returned after a quorum of replicas in the same data center as the coordinator node has responded. This setting avoids the latency of inter-data center communication.
  - local_serial: Data within a data center is read without proposing a new addition or update. Uncommitted transactions within the data center are committed as part of the read.
  - one: Data is returned from the closest replica. This setting provides the highest availability, but increases the likelihood of stale data being read.
  - quorum (default): Data is returned after a quorum of replicas has responded from any data center.
  - serial: Data is read without proposing a new addition or update. Uncommitted transactions are committed as part of the read.
  - three: Data is returned from three of the closest replicas.
  - two: Data is returned from two of the closest replicas.
- Write consistency (optional): Specifies the number of replicas for which the write request must succeed before an acknowledgment is returned to the client application.
  - all: A write must succeed on all replica nodes in the cluster for that partition key. This setting provides the highest consistency and lowest availability.
  - any: A write must succeed on at least one node. Even if all replica nodes for the given partition key are down, the write can succeed after a hinted handoff has been written. This setting provides the lowest consistency and highest availability.
  - each_quorum: A write must succeed on a quorum of replica nodes in each data center.
  - local_one: A write must succeed on at least one replica node in the local data center.
  - local_quorum: A write must succeed on a quorum of replica nodes in the same data center as the coordinator node. This setting avoids the latency of inter-data center communication.
  - local_serial: The driver prevents unconditional updates to achieve linearizable consistency for lightweight transactions within the data center.
  - one: A write must succeed on at least one replica node.
  - quorum (default): A write must succeed on a quorum of replica nodes.
  - serial: The driver prevents unconditional updates to achieve linearizable consistency for lightweight transactions.
  - three: A write must succeed on at least three replica nodes.
  - two: A write must succeed on at least two replica nodes.
- SSL certificate (if required by the database server)
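To make the trade-offs between the consistency settings more concrete, the following Python sketch computes how many replicas the count-based levels wait for and checks whether a read/write pair is guaranteed to overlap on at least one replica. It assumes a single data center with a known replication factor, and it uses the standard Cassandra rules that a quorum is floor(RF / 2) + 1 and that reads see the latest write when R + W > RF. The function names are illustrative and are not part of the connection or of any driver.

```python
# Sketch only: models count-based consistency levels for ONE data center.
# Function names are hypothetical; they are not a driver or connector API.

def quorum(replication_factor: int) -> int:
    """A quorum is a majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_replicas(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for the given level."""
    fixed = {"one": 1, "local_one": 1, "two": 2, "three": 3}
    if level in fixed:
        return fixed[level]
    if level in ("quorum", "local_quorum"):
        # With a single data center, quorum and local_quorum coincide.
        return quorum(replication_factor)
    if level == "all":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {level}")


def read_sees_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    r = required_replicas(read_level, replication_factor)
    w = required_replicas(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3
    for read, write in [("quorum", "quorum"), ("one", "one"), ("one", "all")]:
        print(read, write, read_sees_latest_write(read, write, rf))
```

With a replication factor of 3, the default pairing of quorum reads and quorum writes overlaps on at least one replica, whereas reading and writing at one does not, which is why one increases the likelihood of stale reads.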
For private connectivity, that is, to connect to a database that is not exposed to the internet (for example, one behind a firewall), you must set up a secure connection.
Choose the method for creating a connection based on where you are in the platform:
- In a project
  - Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
- In a catalog
  - Click Add to catalog > Connection. See Adding a connection asset to a catalog.
- In a deployment space
  - Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
- In the Platform assets catalog
  - Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use Apache Cassandra connections in the following workspaces and tools:
Projects
- Data Refinery (Watson Studio or IBM Knowledge Catalog)
- DataStage (DataStage service). See Connecting to a data source in DataStage. The Apache Cassandra for DataStage connection gives you increased performance and more features such as before and after SQL statements and reject links. However, you cannot use the Apache Cassandra for DataStage connection outside of the DataStage service.
- Decision Optimization (Watson Studio and Watson Machine Learning)
- Metadata enrichment (IBM Knowledge Catalog)
- Metadata import (IBM Knowledge Catalog)
- Notebooks (Watson Studio). Click Read data on the Code snippets pane to get the connection credentials and load the data into a data structure. See Load data from data source connections.
- SPSS Modeler (Watson Studio)
Catalogs
- Platform assets catalog
- Other catalogs (IBM Knowledge Catalog)
Primary keys in SQL statements
If you create a target table with an SQL statement and you do not specify a key column, the first column is designated as the primary key.
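As an illustration of this rule, the following Python sketch builds a CQL CREATE TABLE statement and falls back to the first column when no key column is given. The helper `create_table_cql` and the table and column names are hypothetical; the sketch mirrors the documented default and is not the connector's actual implementation.

```python
# Sketch only: create_table_cql is a hypothetical helper, not a connector API.

def create_table_cql(table, columns, key_columns=None):
    """Build a CQL CREATE TABLE statement.

    columns: ordered list of (name, cql_type) pairs.
    key_columns: column names for the PRIMARY KEY clause. When omitted,
    the first column is used, mirroring the default described above.
    """
    if not columns:
        raise ValueError("at least one column is required")
    names = [name for name, _ in columns]
    keys = list(key_columns) if key_columns else [names[0]]
    unknown = [k for k in keys if k not in names]
    if unknown:
        raise ValueError(f"unknown key column(s): {unknown}")
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns)
    return f"CREATE TABLE {table} ({column_defs}, PRIMARY KEY ({', '.join(keys)}))"


if __name__ == "__main__":
    cols = [("order_id", "uuid"), ("customer", "text")]
    print(create_table_cql("orders", cols))                # key defaults to order_id
    print(create_table_cql("orders", cols, ["customer"]))  # key stated explicitly
```

Naming the key column explicitly in the SQL statement avoids relying on column order, which matters if the first column is not a suitable partition key.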
Apache Cassandra setup
Learn more