Hierarchical text categorization

The Watson Natural Language Processing Categories block assigns individual nodes within a hierarchical taxonomy to an input document. For example, for the text "IBM announces new advances in quantum computing", examples of extracted categories are technology and computing/hardware/computer and technology and computing/operating systems. These categories are level 3 and level 2 nodes of the taxonomy, respectively.
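
To make the level terminology concrete, the following minimal sketch counts the segments of a category path. It assumes that categories are written as slash-separated paths, as in the example above. The helper name category_level is hypothetical and is not part of the Watson Natural Language Processing library.

```python
def category_level(path: str) -> int:
    """Return the depth of a slash-separated category path (assumed format)."""
    # Drop empty segments so leading or trailing slashes do not inflate the depth.
    segments = [s.strip() for s in path.split("/") if s.strip()]
    return len(segments)


# Category paths taken from the example sentence in this topic.
examples = [
    "technology and computing/hardware/computer",
    "technology and computing/operating systems",
]

levels = {path: category_level(path) for path in examples}
for path, level in levels.items():
    print(f"level {level}: {path}")
```

Under this assumption, the first path resolves to level 3 and the second to level 2.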

This block differs from the Classification block in that training starts from a set of seed phrases associated with each node in the taxonomy, and does not require labeled documents.

Note: The Hierarchical text categorization block can be used only in notebooks that are started in an environment based on Runtime 23.1 that includes the Watson Natural Language Processing library.

Block name

categories_esa_en_stock

Supported languages

The Categories block is available for the following languages. For a list of the language codes and the corresponding languages, see Language codes.

de, en

Capabilities

Use this block to determine the topics of documents on the web by categorizing web pages into a taxonomy of general domain topics, for ad placement and content recommendation. The model was tested on data from news reports and general web pages.

For a list of the categories that can be returned, see Category types.

Dependencies on other blocks

The following block must run before you can run the hierarchical categorization block:

  • syntax_izumo_<language>_stock
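
The `<language>` placeholder stands for one of the supported language codes. As a rough sketch, the syntax block name could be derived from a language code as shown below. The helper syntax_block_name and the SUPPORTED_LANGUAGES tuple are hypothetical names introduced for illustration; only the naming pattern and the two language codes come from this topic.

```python
# Language codes listed under "Supported languages" in this topic.
SUPPORTED_LANGUAGES = ("de", "en")


def syntax_block_name(language: str) -> str:
    """Build the syntax block name for a supported language code."""
    code = language.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {language!r}; expected one of {SUPPORTED_LANGUAGES}"
        )
    # Pattern taken from the dependency list: syntax_izumo_<language>_stock
    return f"syntax_izumo_{code}_stock"


print(syntax_block_name("en"))
```

The resulting string is the kind of name that is passed to watson_nlp.load in the code sample in this topic.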

Code sample

import watson_nlp

# Load Syntax and a Categories model for English
syntax_model = watson_nlp.load('syntax_izumo_en_stock')
categories_model = watson_nlp.load('categories_esa_en_stock')

# Run the syntax model on the input text
syntax_prediction = syntax_model.run('IBM announced new advances in quantum computing')

# Run the categories model on the result of syntax
categories = categories_model.run(syntax_prediction)
print(categories)

Output of the code sample:

{
  "categories": [
    {
      "labels": [
        "technology & computing",
        "computing"
      ],
      "score": 0.992489,
      "explanation": []
    },
    {
      "labels": [
        "science",
        "physics"
      ],
      "score": 0.945449,
      "explanation": []
    }
  ],
  "producer_id": {
    "name": "ESA Hierarchical Categories",
    "version": "1.0.0"
  }
}
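
The output can be post-processed like any other JSON document. The following sketch filters the returned categories by score and prints them with the highest score first. It assumes that the result is available as a JSON string with the structure shown above. The helper top_categories, the 0.95 threshold, and the use of "/" to join the labels are illustrative choices, not part of the library.

```python
import json

# Sample output copied from this topic. In a notebook, this data would come
# from the categories model instead of a literal string.
raw_output = """
{
  "categories": [
    {"labels": ["technology & computing", "computing"],
     "score": 0.992489, "explanation": []},
    {"labels": ["science", "physics"],
     "score": 0.945449, "explanation": []}
  ],
  "producer_id": {"name": "ESA Hierarchical Categories", "version": "1.0.0"}
}
"""


def top_categories(result: dict, min_score: float = 0.95) -> list:
    """Return (path, score) pairs with score >= min_score, highest first."""
    selected = []
    for category in result.get("categories", []):
        if category["score"] >= min_score:
            # Joining the labels with "/" is a display choice for this sketch.
            path = "/".join(category["labels"])
            selected.append((path, category["score"]))
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


result = json.loads(raw_output)
for path, score in top_categories(result, min_score=0.95):
    print(f"{score:.3f}  {path}")
```

Lowering min_score admits more categories. The right threshold depends on the application and is not prescribed by this topic.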
