Python for Spark scripts (SPSS Modeler)

Python for Spark scripts

SPSS Modeler supports Python scripts for Apache Spark.

Note:
  • Python nodes depend on the Spark environment.
  • Python scripts must use the Spark API because data is presented in the form of a Spark DataFrame.
  • When installing Python, make sure all users have permission to access the Python installation.
  • If you want to use the Machine Learning Library (MLlib), you must install a version of Python that includes NumPy.
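Before relying on MLlib, it can help to confirm that NumPy is importable from the Python installation in use. The following is a minimal sketch of such a check; the helper name `module_available` is hypothetical and is not part of SPSS Modeler or Spark.

```python
import importlib.util


def module_available(name):
    """Return True if the named top-level module can be located for import."""
    return importlib.util.find_spec(name) is not None


# Report whether NumPy is present in this Python installation.
print("NumPy available:", module_available("numpy"))
```

The check only locates the module without importing it, so it should be inexpensive to run even when NumPy is installed.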

Tips

You can run the following Python scripts from an Extension Output node:

  • To view information about the distribution of Python included with SPSS Modeler:
    import sys
    print(sys.version)
  • To list all installed Python packages:
    import subprocess
    import sys
    subprocess.check_call([sys.executable, '-m', 'pip', 'list'])
  • To install Python packages in an air-gapped environment, use the --index-url option, which lets pip install packages from a specified Python repository (the repository must comply with PEP 503). For more information, including a list of all options, see https://pip.pypa.io/en/stable/cli/pip_install/.
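As an illustration of the --index-url tip, the following sketch builds and prints a pip install command in the same style as the package listing script. The repository URL `https://pypi.example.internal/simple`, the package name, and the helper names `build_pip_install_command` and `install_package` are all assumptions made for the example; substitute the values for your own environment.

```python
import subprocess
import sys


def build_pip_install_command(package, index_url):
    """Build a pip install command that uses index_url as the package index."""
    return [sys.executable, "-m", "pip", "install", "--index-url", index_url, package]


def install_package(package, index_url):
    """Run pip with the current interpreter; raises CalledProcessError on failure."""
    subprocess.check_call(build_pip_install_command(package, index_url))


# Hypothetical internal repository and package name; replace with your own.
INDEX_URL = "https://pypi.example.internal/simple"
print(build_pip_install_command("numpy", INDEX_URL))
# To perform the installation, call: install_package("numpy", INDEX_URL)
```

Invoking pip through `sys.executable` is intended to install the package into the same Python installation that runs the script, rather than into whichever pip happens to be first on the system path.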