Using non-ASCII characters


To use non-ASCII characters, Python requires explicit encoding and decoding of strings into Unicode. In SPSS Modeler, Python scripts are assumed to be encoded in UTF-8, which is a standard Unicode encoding that supports non-ASCII characters. The following script will compile because the Python compiler has been set to UTF-8 by SPSS Modeler.

Scripting example showing Japanese characters.

However, the resulting node has an incorrect label.

Figure 1. Node label containing non-ASCII characters, displayed incorrectly

The label is incorrect because the string literal itself has been converted to an ASCII string by Python.
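One way to see how such a label can go wrong is to look at the bytes behind the characters. The following is a minimal sketch, not the product's actual code path: it assumes that the unprefixed literal ends up holding the raw UTF-8 bytes of the text and that each byte is then treated as a separate character. The sample text and all variable names are illustrative.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Three katakana characters, written with escapes so the intent is explicit.
label = u"\u30c6\u30b9\u30c8"

# Assumption: a non-Unicode literal stores the UTF-8 bytes of the text.
raw_bytes = label.encode("utf-8")

# Assumption: each byte is then shown as its own character (Latin-1 style),
# which produces garbled text instead of the original three characters.
garbled = raw_bytes.decode("latin-1")

print(len(label))      # 3 characters
print(len(raw_bytes))  # 9 bytes, three per character in UTF-8
print(len(garbled))    # 9 unrelated characters

# Decoding with the correct encoding recovers the original text.
print(raw_bytes.decode("utf-8") == label)  # True
```

The text is only recovered when the bytes are decoded with the encoding that produced them, which is why the string needs to be a Unicode string from the start.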

Python allows Unicode string literals to be specified by adding a u character prefix before the string literal:

Scripting example showing Japanese characters in a string literal with the u prefix.

This creates a Unicode string, and the label appears correctly.

Figure 2. Node label containing non-ASCII characters, displayed correctly
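The effect of the u prefix can also be checked outside SPSS Modeler. This is a small sketch under stated assumptions: the sample label text is illustrative, and the behavior described for the unprefixed literal applies to Python 2 style interpreters (assumed here to include the Jython interpreter used for Modeler scripting). On Python 3, both literals are already Unicode strings.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

plain = "テストノード"      # byte string on Python 2 / Jython, text on Python 3
prefixed = u"テストノード"  # a Unicode (text) string on both

# The Unicode string type: unicode on Python 2, str on Python 3.
text_type = type(u"")

print(isinstance(prefixed, text_type))  # True on both
print(len(prefixed))                    # 6, one per character
print(isinstance(plain, text_type))     # False on Python 2, True on Python 3
```

Keeping the prefix on every literal that contains non-ASCII characters gives the same result regardless of which interpreter runs the script.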

Using Python and Unicode is a large topic that's beyond the scope of this document. Many books and online resources are available that cover this topic in great detail.
