After SPSS Modeler extracts the concepts and types from your text data, you can begin building categories. In the Text Analytics Workbench, you can use the Categories tab to create and explore categories.
On the Categories tab, you can build categories using descriptors. These terms are defined as follows.
- Categories
- A category is a group of closely related ideas and patterns to which documents and records are assigned through a scoring process. Categories organize related concepts and patterns into larger groupings, which are easier to work with. Each category is a combination of concepts, types, rules, and patterns.
- Descriptors
- Descriptors are used to identify whether a record or document belongs in a category. Every category is made up of a set of descriptors, such as concepts, types, and rules. When some or all of the text in a document or record matches a descriptor, the document or record is matched to the category.
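The relationship between descriptors and categories can be illustrated with a minimal sketch. This is not SPSS Modeler's scoring implementation: the `Category` class, the `score_record` function, and the sample data are hypothetical, and plain case-insensitive substring matching stands in for the product's linguistic matching of concepts, types, and rules.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A named category and its descriptors (hypothetical structure)."""
    name: str
    descriptors: set[str] = field(default_factory=set)


def score_record(text: str, categories: list[Category]) -> list[str]:
    """Return the names of the categories that the record is matched to."""
    lowered = text.lower()
    matched = []
    for category in categories:
        # Assumption: a record matches a category when any one of the
        # category's descriptors appears somewhere in the record text.
        if any(d.lower() in lowered for d in category.descriptors):
            matched.append(category.name)
    return matched


# Illustrative data only; these categories are not from any shipped package.
categories = [
    Category("Price", {"expensive", "cheap", "cost"}),
    Category("Service", {"staff", "support"}),
]

print(score_record("The staff were friendly but the room was expensive.", categories))
# prints ['Price', 'Service']
```

A record can be matched to more than one category, as in this example, because each category is checked against its own descriptors independently.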
In the Text Analytics Workbench, you can explore the pattern results and use them as descriptors for categories. You can use the Text Mining node to extract text link analysis (TLA) results as a way to fine-tune templates to your data. You can then use these templates later directly in the Text Link Analysis node.
You can build categories automatically by using the automated techniques that are built into SPSS Modeler, such as semantic networks and concept inclusion. Alternatively, you can create categories manually by using other insights that you might have regarding the data. You can also combine both approaches, or load a set of prebuilt categories from a text analysis package.
You can refine the extraction results by modifying the linguistic resources, which you can do on the Resource editor tab.
Categories pane
You can manage any categories that you build in the Categories pane. You can create new categories if categories don't exist for concepts and patterns that you want to group. Or you can refine a category if you want it to include or exclude specific concepts or patterns. You can select a row in the pane to display information about the corresponding documents or descriptors.
You can search for specific keywords in categories by clicking the Search icon.
To change how categories are built, select Setting options from the toolbar while no categories are selected.
Preview pane
When you select a row, the Preview pane shows the text from the documents or records that contain the concept that you selected. The concept is highlighted to help you easily identify it in the text.
Descriptors pane
The Descriptors pane shows a list of concepts, types, type patterns, and concept patterns. You can also see whether any of these descriptors are part of a category.
Searching the Categories tab
To locate information quickly in a particular section:
- Click the Find icon on the Categories tab to display the search field.
- Type the word string that you want to search for. You can use the up and down arrow buttons to control the direction of your search. If a match is found, the text is highlighted.
- To look for the next match, click the arrow button again.
Custom category sets
You can download a category set as an .xlsx file. You can customize the category set and then reuse it by uploading the .xlsx file while on the Categories tab.