Text Analytics Workbench (SPSS Modeler) | IBM Cloud Pak for Data as a Service

Text Analytics Workbench

Last updated: Sep 24, 2024

Text Analytics Workbench (SPSS Modeler)

From a Text Mining node, you can choose to start the Text Analytics Workbench session when your flow runs. The Text Analytics Workbench is an interactive session where you can explore the extraction results and fine tune the configuration for the Text Mining node.

Text mining is an iterative process in which extraction results are reviewed according to the context of the text data, fine-tuned to produce new results, and then reevaluated. When you run the Text Mining node, the extraction engine reads through the text data, identifies the relevant concepts, and assigns a type to each.

When the Text Mining node finishes running, the Text Analytics Workbench opens so you can review the extraction results. The Text Analytics Workbench is organized into tabs. On each tab, you can focus on different areas of the text mining process.

Concepts: Concepts are important words and phrases that were identified and extracted from your text data. They are also referred to as extraction results. These concepts are grouped into types. You can use these concepts to explore your data and create your categories. You can manage the concepts on the Concepts tab.
Text links: You can extract patterns from your text data if you have text link analysis (TLA) rules in your linguistic resources. For example, your resource template already has some TLA rules. These patterns can help you uncover interesting relationships between concepts in your data. You can also use these patterns as descriptors in your categories. You can manage these patterns on the Text links tab.
Categories: Using descriptors (such as extraction results, patterns, and rules) as a definition, you can manually or automatically create a set of categories. Documents and records are assigned to these categories based on whether they contain a part of the category definition. You can manage categories on the Categories tab.
Resources: The extraction process relies on a set of parameters and definitions from linguistic resources to govern how text is extracted and handled. You can tune these linguistic resources (such as templates and libraries) on the Resource editor tab.

You can use the workbench to do the following text mining tasks:

Extract key concepts from your text data
Build categories
Explore patterns in text link analysis (TLA)
Generate category model nuggets
Save the resources that you tuned or used during the extraction process as a text analysis package (TAP).