Accessing data sources by using remote connectors in Data Virtualization
Last updated: Nov 26, 2024
Use remote connectors in Data Virtualization, along with IBM® Cloud Secure Gateway, to access data sources and files that are located in protected networks.
Access remote data sources or services
Remote connectors provide access to data sources or other data services that are not directly accessible from the Cloud Pak for Data cluster. Additionally, remote connectors facilitate data source discovery with remote port scanning. For more information, see Discovering remote data sources.
Access data stored in files
You can access file data, in formats such as CSV, TSV, and XLS, on remote file systems. Additionally, connectors provide remote browsing and data preview to facilitate virtualization configuration.
Improve query performance
Remote connectors enable distributed aggregations and join filters, and accelerate query processing on multiple worker pods. Connectors also allow a greater number of data source connections and increase parallelism during processing. As the number of connected sources increases, this distribution and parallelism benefit query performance. Moving the connector closer to the data source also moves that processing closer to the data source.
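As a rough illustration of why distributed aggregation helps, the following Python sketch reduces each source's rows to a small partial result near the source and merges only those partials centrally. It is a conceptual sketch, not how Data Virtualization implements it: the names `PartialAvg`, `aggregate_near_source`, and `merge_partials` are hypothetical, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PartialAvg:
    """Partial aggregate a connector could return instead of raw rows (hypothetical)."""
    total: float
    count: int


def aggregate_near_source(values: Iterable[float]) -> PartialAvg:
    # Hypothetical connector-side step: reduce local rows to (sum, count).
    vals = list(values)
    return PartialAvg(total=sum(vals), count=len(vals))


def merge_partials(partials: List[PartialAvg]) -> float:
    # Hypothetical coordinator-side step: combine partials into a global average.
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    if count == 0:
        raise ValueError("no rows to aggregate")
    return total / count


# Made-up sample data standing in for two remote data sources.
source_a = [10.0, 20.0, 30.0]
source_b = [40.0, 50.0]

partials = [aggregate_near_source(source_a), aggregate_near_source(source_b)]
global_avg = merge_partials(partials)
```

Only two small partial results cross the network here instead of five rows. The saving grows with the number of rows held at each source.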
Recommendations:
  • Locate the remote connector as close as possible to the data source. When the connector is on the same machine as the data source, you eliminate network latency between them. When it is in the same data center, a stable high-speed network connects them. Latency increases as the remote connector moves farther from the data source. Latency still exists along the connector communications path, but the connector performs more operations on the result data that the data source returns.
  • Adjust the number of data sources on each remote connector. The maximum recommended number of data sources per remote connector is 10 because of the memory settings that are defined for each connector.
  • Ensure that IBM Java 8 is installed on the data source host where the remote connector will be located.
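
The recommended maximum of 10 data sources per remote connector can be turned into a quick planning calculation. The following Python sketch is a planning aid only, not part of the product: `connectors_needed` and `assign_sources` are hypothetical helper names, and the source names are placeholders.

```python
import math
from typing import List

# Recommended maximum from the guidance above.
MAX_SOURCES_PER_CONNECTOR = 10


def connectors_needed(num_sources: int, limit: int = MAX_SOURCES_PER_CONNECTOR) -> int:
    """Return the minimum number of connectors that keeps each at or below the limit."""
    if num_sources < 0:
        raise ValueError("num_sources must be non-negative")
    return math.ceil(num_sources / limit)


def assign_sources(sources: List[str], limit: int = MAX_SOURCES_PER_CONNECTOR) -> List[List[str]]:
    """Split a list of source names into groups of at most `limit` entries."""
    return [sources[i:i + limit] for i in range(0, len(sources), limit)]


# Placeholder names standing in for 25 data sources.
example_sources = [f"src{i}" for i in range(25)]
groups = assign_sources(example_sources)
```

This sketch groups sources purely by list order. In practice you would first group sources by host or data center so that each connector stays close to its data, and then apply the limit within each group.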

How to access data on remote data sources

Use the following workflow to understand how to access data on remote data sources.

Figure: Process overview to connect Data Virtualization to remote data sources.

To try it out, see Improve performance for your data virtualization data sources with remote connectors.
