The workflow in SPSS Modeler is built around the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. This methodology places your work in SPSS Modeler within a larger project that has several phases. In the phases where you work in SPSS Modeler, you use projects to manage your work and assets.
Phases for data mining
The CRISP-DM methodology has the following phases.
- Business understanding
- During this phase, try to gain as much insight as possible into the business
goals for data mining. Meet with stakeholders and determine how your work
with SPSS Modeler addresses business objectives
or problems.
For more information, see Understanding and preparing data.
- Data understanding
- You need to collect and understand your data before you build flows in
SPSS Modeler. Take the time to understand
the data structure, relationships, and patterns in your data.
For more information, see Understanding and preparing data.
- Data preparation
- You need to prepare your data before you train models in SPSS Modeler. Take the time to process your data so
that it is optimized for use in data mining.
For more information, see Understanding and preparing data.
- Modeling
- Build SPSS Modeler flows to explore your data,
try different models, and investigate relationships to find useful
information.
For more information, see Building flows and models.
- Evaluation
- Evaluate the quality of your models and their predictions. For example, you can add Analysis nodes to your flows to assess how accurate your model's predictions are. You can also use an Evaluation node to compare predictive models and find the best one.
- Deployment
- After you build and train a predictive model, you can promote it to watsonx.ai Runtime if you have the watsonx.ai Runtime service.
For more information, see Promoting SPSS Modeler flows and models.
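The accuracy check and model comparison described for the Evaluation phase can be sketched in plain Python. This is only an illustration of the idea, not the SPSS Modeler scripting API and not how the Analysis or Evaluation nodes are implemented. The function names, model names, and sample values are all hypothetical.

```python
# Illustrative only: plain Python, not the SPSS Modeler scripting API.

def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def best_model(actual, predictions_by_model):
    """Return (name, accuracy) for the most accurate model."""
    scores = {
        name: accuracy(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical holdout labels and predictions from two made-up models.
actual = ["yes", "no", "yes", "yes", "no"]
predictions = {
    "model_a": ["yes", "no", "no", "yes", "no"],    # 4 of 5 correct
    "model_b": ["yes", "yes", "no", "yes", "yes"],  # 2 of 5 correct
}

name, score = best_model(actual, predictions)
print(name, score)  # model_a 0.8
```

Plain accuracy is only one possible measure. Which measure is appropriate depends on the business objectives that you identified in the first phase.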
Working with projects and data assets
All your work with SPSS Modeler is done within a project. A project holds all your data assets and flows.
- For more information about projects, see Working in projects.
- For more information about adding data to your project to use in a flow, see Add data to a project.
You can connect SPSS Modeler to a data source, such as a database, to make your data easier to access in SPSS Modeler.
For more information about connecting data, see Supported data sources.
You can import a stream (.str) that was created in SPSS Modeler Subscription or SPSS Modeler client. If the imported flow contains one or more import or export nodes, you are prompted to convert the nodes when you open the flow.
For more information, see Importing an SPSS Modeler stream.
Scripting
You can use scripting in SPSS Modeler to automate tasks. You can write scripts in R, Python, Python for Spark, or Control Language for Expression Manipulation (CLEM). CLEM is a language for analyzing and manipulating the data that streams through your flows.
For more information, see Scripting and automation.