Setting the character-encoding scheme in Data Virtualization for IBM Cloud Pak for Data
Last updated: 05 Feb 2025
To ensure that remote connectors correctly decode file data, you must set the character-encoding scheme manually. Setting the scheme configures the remote connector to apply a specific decoding when it reads data files.
About this task
Cloud Pak for Data automatically detects the encoding scheme of flat data files, such as CSV and TSV files. However, to avoid decoding issues, set the encoding scheme manually for these files instead of relying on automatic detection.
These instructions use files with data encoded in Shift-JIS (Japanese) as an example. For a full list of data encodings, see Supported encodings.
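The effect of reading a file with the wrong encoding can be illustrated outside of Data Virtualization. The following is a minimal sketch in plain Python, not the remote connector's implementation: the sample CSV content and the helper name `decode_with` are hypothetical, and only the standard `shift_jis` and `utf-8` codecs are used. It shows that bytes written as Shift-JIS round-trip cleanly when decoded as Shift-JIS, but turn into replacement characters when decoded as UTF-8.

```python
def decode_with(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, substituting U+FFFD for any undecodable byte."""
    return raw.decode(encoding, errors="replace")


# Hypothetical CSV content containing Japanese text.
sample = "名前,都市\n太郎,東京\n"

# Simulate a flat file that was saved in Shift-JIS.
raw = sample.encode("shift_jis")

# Decoding with the matching encoding recovers the original text.
correct = decode_with(raw, "shift_jis")

# Decoding with a mismatched encoding corrupts the Japanese characters.
wrong = decode_with(raw, "utf-8")

print("Matching encoding recovers text:", correct == sample)
print("Mismatched encoding produced replacement characters:", "\ufffd" in wrong)
```

Setting the encoding explicitly, as described in this task, avoids the second outcome when the actual file encoding is known.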
Note:
- You can follow these steps while the remote connector is running. However, to apply new encoding schemes to an existing virtual table, you must delete the virtual table and virtualize it again.
- The properties files are located in a dedicated folder in the remote connector installation directory, separate from your data files. This keeps the Data Virtualization remote connector self-contained and minimizes disruption to your environment, consistent with the containerization principles and benefits of the Docker installation of remote connectors.
Procedure
To ensure that remote connectors correctly decode data in files, choose one of the following methods: