Text Mining node
Last updated: May 23, 2024
Text Mining node (SPSS Modeler)

You can use the Text Mining node for text mining, which is an iterative process that identifies relevant concepts and patterns in text data. When you run the Text Mining node, the extraction engine reads through the text data, identifies the relevant concepts, and assigns a type to each. You can then review the extraction results in the Text Analytics Workbench and fine-tune the extraction process there. After fine-tuning, rerun the Text Mining node and evaluate the new results.

Figure 1. Text Mining node to analyze comments from hotel guests
  1. Add a Data Asset node that points to hotelSatisfaction.csv.
  2. From the Text Analytics category on the node palette, add a Text Mining node, connect it to the Data Asset node you added in the previous step, and double-click it to open its properties.
  3. Under Fields, select Comments for the Text field and select id for the ID field.
    Note: Only the Text field is required.
    Figure 2. Text Mining node properties
    Text Mining node build properties. It shows some field settings in the window like the Text field and ID field.
  4. Under Copy resources from, select Text analysis package, click Select Resources, and then load Hotel Satisfaction (English).tap (with Current category set(s) = Topic + Opinion).
    A text analysis package (TAP) is a predefined set of libraries and advanced linguistic and nonlinguistic resources, which are bundled with one or more sets of predefined categories. If no text analysis package is relevant for your application, you can instead select Resource template under Copy resources from. A resource template is a predefined set of libraries and advanced linguistic and nonlinguistic resources that were fine-tuned for a particular domain or usage.
    Figure 3. Text Mining node properties
    Text Mining node build properties. It shows the radio section choices for the Copy resources from option. The choices are Resource template or Text analysis package.
  5. Under Build models, check that Build interactively (category model nugget) is selected. Later when you run the node, this option starts Text Analytics Workbench, which is an interactive interface where you can explore and fine-tune the extraction results.
  6. Under Begin session by, select Extracting concepts and text links. The Extracting concepts option extracts only concepts, whereas text link analysis (TLA) extraction outputs both concepts and text links, which are connections between topics (such as service, personnel, and food) and opinions.
  7. Under Expert, select Accommodate spelling for a minimum word character length of. This option applies a fuzzy grouping technique that groups commonly misspelled or closely spelled words under one concept. The fuzzy grouping algorithm temporarily strips double or triple consonants and all vowels (except the first one) from extracted words. It then compares the stripped forms and groups the words whose forms match. For example, location and locattoin both reduce to loctn, so they are grouped.
    Figure 4. Text Mining node properties
    Text Mining node expert properties. It shows property settings for the Text Mining node. The major settings groups include Settings, Build models, and Expert. The Expert group contains check boxes for settings such as Accommodate spelling for a minimum root character limit, Extract uniterms, Extract nonlinguistic entities, Uppercase algorithm, Group partial and full person names together when possible, and Use derivation when grouping compound nouns.
  8. Click Save.
  9. Run the Text Mining node to open the Text Analytics Workbench, and then proceed to the next section of this tutorial.
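The fuzzy grouping technique described in step 7 can be illustrated with a short Python sketch. This is not the extraction engine's actual implementation; it only follows the description given in step 7 (collapse doubled or tripled consonants, drop every vowel after the first, then compare). The function names fuzzy_key and group_by_fuzzy_key, the treatment of y as a consonant, and the way the minimum length is applied (shorter words are simply left unchanged) are assumptions made for illustration.

```python
import re
from collections import defaultdict

VOWELS = set("aeiou")  # assumption: y is treated as a consonant


def fuzzy_key(word: str) -> str:
    """Reduce a word to a comparison key, following the description in step 7."""
    word = word.lower()
    # Collapse runs of the same consonant (tt -> t, lll -> l).
    collapsed = re.sub(r"([b-df-hj-np-tv-z])\1+", r"\1", word)
    kept = []
    seen_vowel = False
    for ch in collapsed:
        if ch in VOWELS:
            if seen_vowel:
                continue  # drop every vowel after the first one
            seen_vowel = True
        kept.append(ch)
    return "".join(kept)


def group_by_fuzzy_key(words, min_length=5):
    """Group words whose keys match.

    Hypothetical handling of the minimum length: words shorter than
    min_length are not reduced and are keyed by their lowercase form.
    """
    groups = defaultdict(list)
    for w in words:
        key = fuzzy_key(w) if len(w) >= min_length else w.lower()
        groups[key].append(w)
    return dict(groups)


groups = group_by_fuzzy_key(["location", "locattoin", "modeling", "modelling", "room"])
print(fuzzy_key("location"), fuzzy_key("locattoin"))  # loctn loctn
print(groups)
```

Because both spellings reduce to the same key, location and locattoin land in one group, which mirrors the example in step 7. A key of this kind can also merge unrelated words that share a consonant skeleton, which may be why the option is tied to a minimum length.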