To access your data in Apache Hive, create a connection asset for it.
Apache Hive is data warehouse software, built on top of Apache Hadoop, that provides data query and analysis.
Supported versions
Apache Hive 1.0.x, 1.1.x, 1.2.x, 2.0.x, 2.1.x, 3.0.x, 3.1.x.
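If you want to check a server version against this list before you create the connection, a minimal sketch might look like the following. The function name `is_supported` and the set `SUPPORTED_SERIES` are illustrative and are not part of the platform or of Apache Hive. The check compares only the major and minor parts of the version string.

```python
# Release series taken from the "Supported versions" list above.
SUPPORTED_SERIES = {"1.0", "1.1", "1.2", "2.0", "2.1", "3.0", "3.1"}


def is_supported(version: str) -> bool:
    """Return True if the major.minor part of `version` is in the supported list."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        # A bare major version such as "3" is too vague to match a series.
        return False
    return ".".join(parts[:2]) in SUPPORTED_SERIES


print(is_supported("3.1.3"))  # a 3.1.x release
print(is_supported("2.3.9"))  # 2.3.x is not in the list
```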
Create a connection to Apache Hive
To create the connection asset, you need the following connection details:
- Database name (optional): If you do not enter a database name, you must enter the catalog name, schema name, and the table name in the properties for SQL queries.
- Hostname or IP address
- Port number
- HTTP path (optional): The path of the endpoint, such as gateway, default, or hive, if the server is configured for HTTP transport mode.
- SSL certificate (if required by the database server)
For private connectivity, that is, to connect to a database that is not exposed to the internet (for example, one behind a firewall), you must set up a secure connection.
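To show how these details relate to one another, the following sketch assembles them into a URL in the style used by the open-source Hive JDBC driver (`jdbc:hive2://host:port/database;key=value`). This is an illustration only: the platform builds the connection for you from the form fields and may do so differently. The function name `build_hive_jdbc_url`, its parameters, and the example hostname are assumptions made for this sketch.

```python
from typing import Optional


def build_hive_jdbc_url(
    host: str,
    port: int,
    database: Optional[str] = None,
    http_path: Optional[str] = None,
    use_ssl: bool = False,
) -> str:
    """Assemble a HiveServer2-style JDBC URL from the connection details."""
    if not host:
        raise ValueError("host is required")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")

    # The database name is optional; an empty path segment leaves it unset.
    url = f"jdbc:hive2://{host}:{port}/{database or ''}"

    params = []
    if http_path:
        # The HTTP path applies only when the server uses HTTP transport mode.
        params.append("transportMode=http")
        params.append(f"httpPath={http_path}")
    if use_ssl:
        params.append("ssl=true")

    if params:
        url += ";" + ";".join(params)
    return url


# Binary transport with a database name (hypothetical host).
print(build_hive_jdbc_url("hive.example.com", 10000, database="sales"))
# HTTP transport through a "gateway" endpoint with SSL enabled.
print(build_hive_jdbc_url("hive.example.com", 10001, http_path="gateway", use_ssl=True))
```

The optional fields map to optional arguments, so a URL without a database name or HTTP path is still well formed.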
Choose the method for creating a connection based on where you are in the platform
- In a project
  - Click Assets > New asset > Connect to a data source. See Adding a connection to a project.
- In a deployment space
  - Click Import assets > Data access > Connection. See Adding data assets to a deployment space.
- In the Platform assets catalog
  - Click New connection. See Adding platform connections.
Next step: Add data assets from the connection
Where you can use this connection
You can use the Apache Hive connection in the following workspaces and tools:
Projects
- Data Refinery
- Decision Optimization
- SPSS Modeler
- Synthetic Data Generator
Catalogs
- Platform assets catalog
Apache Hive setup
Restriction
Running SQL statements
To ensure that your SQL statements run correctly, refer to the SQL Operations section of the Apache Hive documentation for the correct syntax.
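If the connection has no database name, queries must qualify the table name themselves. The following is a minimal sketch of a helper that builds a backtick-quoted, dot-separated name and places it in a query string. The helper name `qualified_name` and the `sales.orders` table are hypothetical, and the helper deliberately accepts only letters, digits, and underscores rather than trying to escape arbitrary input.

```python
import re

# Accept only letters, digits, and underscores in each name part.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def qualified_name(*parts: str) -> str:
    """Join name parts (for example schema and table) into a backtick-quoted name."""
    if not parts:
        raise ValueError("at least one name part is required")
    for part in parts:
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"unsupported identifier: {part!r}")
    return ".".join(f"`{part}`" for part in parts)


# Hypothetical schema and table names.
query = f"SELECT * FROM {qualified_name('sales', 'orders')} LIMIT 10"
print(query)
```

Rejecting unexpected characters outright keeps the sketch simple and avoids building a statement from unvalidated input.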