Watson Natural Language Processing library
The Watson Natural Language Processing library provides basic natural language processing functions for syntax analysis and out-of-the-box pre-trained models for a wide variety of text processing tasks, such as sentiment analysis, keyword extraction and vectorization. The Watson Natural Language Processing library is available for Python only.
With Watson Natural Language Processing, you can turn unstructured data into structured data, making the data easier to understand and transferable, in particular if you are working with a mix of unstructured and structured data. Examples of such data are call center records, customer complaints, social media posts, or problem reports. The unstructured data is often part of a larger data record which includes columns with structured data. Extracting meaning and structure from the unstructured data and combining this information with the data in the columns of structured data, gives you a deeper understanding of the input data and can help you to make better decisions.
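To picture how a structured column can be derived from unstructured text, here is a minimal plain-Python sketch. The trivial keyword lookup below is a hypothetical stand-in for a real NLP model (such as a sentiment block), and all names in it are invented for this illustration:

```python
# Illustrative sketch only: a trivial keyword lookup stands in for a real
# NLP model such as a sentiment classification block.
def classify_sentiment(text):
    """Toy placeholder: label text by matching a few keywords."""
    lowered = text.lower()
    if any(word in lowered for word in ("great", "thanks", "resolved")):
        return "positive"
    if any(word in lowered for word in ("broken", "refund", "complaint")):
        return "negative"
    return "neutral"

# Records that mix structured columns (id, product) with unstructured text.
records = [
    {"id": 1, "product": "credit card", "text": "My card is broken and I want a refund."},
    {"id": 2, "product": "mortgage", "text": "Great service, my issue was resolved."},
]

# Derive a new structured column from the unstructured text, so it can be
# analyzed together with the existing structured columns.
for record in records:
    record["sentiment"] = classify_sentiment(record["text"])

print([(r["id"], r["sentiment"]) for r in records])
# -> [(1, 'negative'), (2, 'positive')]
```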
Watson Natural Language Processing provides pre-trained models in over 20 languages. They are curated by a dedicated team of experts, and evaluated for quality on each specific language. These pre-trained models can be used in production environments without you having to worry about license or intellectual property infringements.
Although you can create your own models, the easiest way to get started with Watson Natural Language Processing is to run the pre-trained models on unstructured text to perform language processing tasks.
Here are some examples of language processing tasks available in Watson Natural Language Processing pre-trained models:
- Syntax: tokenization, lemmatization, part of speech tagging, and dependency parsing
- Entity extraction: find mentions of entities (like person, organization, or date)
- Keywords extraction: extract noun phrases that are relevant in the input text
- Text classification: analyze text and then assign a set of pre-defined tags or categories based on its content
- Sentiment classification: is the input document positive, negative or neutral?
- Tone classification: classify the tone in the input document (like excited, frustrated, or sad)
- Emotion classification: classify the emotion of the input document (like anger or disgust)
- Concepts: find concepts from DBPedia in the input text
- Relations: detect relations between two entities
- Hierarchical categories: assign individual nodes within a hierarchical taxonomy to the input document
- Embeddings: map individual words or larger text snippets into a vector space
Using Watson Natural Language Processing in a notebook
You can run your Python notebooks using the Watson Natural Language Processing library in any of the following provided environments. The GPU environment templates include the Watson Natural Language Processing library.
DO + NLP: Indicates that the environment template includes both the CPLEX and the DOcplex libraries to model and solve decision optimization problems, and the Watson Natural Language Processing library.
~ : Indicates that the environment template requires the Watson Studio Professional plan. See Offering plans.
* : Indicates that the environment is deprecated.
| Name | Hardware configuration | CUH rate per hour |
|------|------------------------|-------------------|
| DO + NLP Runtime 22.2 on Python 3.10 | 2 vCPU and 8 GB RAM | 6 |
| DO + NLP Runtime 22.1 on Python 3.9 | 2 vCPU and 8 GB RAM | 6 |
| GPU V100 Runtime 22.2 on Python 3.10 ~ | 40 vCPU + 172 GB + 1 NVIDIA Tesla V100 (1 GPU) | 68 |
| GPU V100 Runtime 22.1 on Python 3.9 ~ | 40 vCPU + 172 GB + 1 NVIDIA Tesla V100 (1 GPU) | 68 |
| GPU K80 Runtime 22.1 on Python 3.9 * | 4 vCPU + 24 GB + 0.5 NVIDIA Tesla K80 (1 GPU) | 6 |
The DO + NLP Runtime 22.2 on Python 3.10 or DO + NLP Runtime 22.1 on Python 3.9 environments should be large enough to run notebooks that use the pre-trained models. If you need a larger environment, for example to train your own models, you can use a GPU V100 environment.
If you don't use a provided environment template, you can create a custom template that includes the Watson Natural Language Processing library. See Creating your own environment template.
- Create a custom template without GPU by selecting the engine type Default, the hardware configuration size that you need, and DO + NLP Runtime 22.2 on Python 3.10 or DO + NLP Runtime 22.1 on Python 3.9 as the software version.
- Create a custom template with GPU by selecting the engine type GPU, the hardware configuration size that you need, and GPU Runtime 22.1 on Python 3.9 as the software version.
Working with the pre-trained models
Watson Natural Language Processing encapsulates natural language functionality in blocks, where each block supports these functions:
- load(): load a block model
- run(): run the block on input argument(s)
- train(): train the block on your own data (not all blocks support training)
- save(): save the block model that you trained on your own data
There are two types of blocks:
Blocks that operate directly on the input document
An example of a block that operates directly on the input document is the Syntax block, which performs natural language processing operations such as tokenization, lemmatization, part of speech tagging or dependency parsing.
This block can be loaded and run on the input document directly. For example:
```python
import watson_nlp

# Load the syntax model for English
syntax_model = watson_nlp.load(watson_nlp.download('syntax_izumo_en_stock'))

# Run the syntax model and print the result
syntax_prediction = syntax_model.run('Welcome to IBM!')
print(syntax_prediction)
```
Blocks that depend on other blocks
Blocks that depend on other blocks cannot be applied to the input document directly, and must be linked with one or more other blocks in order to process the input document. In general, machine learning models such as classifiers or entity extractors that require preprocessing of the input text fall into this category. For example, the Entity Mention block depends on the Syntax block.
These blocks can be loaded but can only be run in a particular order on the input document. For example:
```python
import watson_nlp

# Load the Syntax and an Entity Mention model for English
syntax_model = watson_nlp.load(watson_nlp.download('syntax_izumo_en_stock'))
entity_model = watson_nlp.load(watson_nlp.download('entity-mentions_bert_multi_stock'))

# Run the syntax model on the input text
syntax_prediction = syntax_model.run('IBM announced new advances in quantum computing')

# Now run the entity mention model on the result of syntax
entity_mentions = entity_model.run(syntax_prediction)
print(entity_mentions)
```
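The dependency between the two block types can be pictured with a small mock. The classes and field names below are invented for this sketch and are not part of the watson_nlp API; the point is that the first block accepts raw text, while the second accepts the first block's analysis object instead of raw text:

```python
# Illustrative mock of the two block types; all names here are invented
# for this sketch and are not part of the watson_nlp API.
class MockSyntaxBlock:
    """Operates directly on the input document."""
    def run(self, text):
        # A real syntax block would tokenize, lemmatize, tag, and parse.
        return {"text": text, "tokens": text.split()}

class MockEntityBlock:
    """Depends on the output of a syntax block, not on raw text."""
    def run(self, syntax_result):
        # A real entity block would use the full syntax analysis; here we
        # just flag capitalized tokens as entity candidates.
        return [t for t in syntax_result["tokens"] if t[:1].isupper()]

syntax_block = MockSyntaxBlock()
entity_block = MockEntityBlock()

# The blocks must run in order: syntax first, then entities on its result.
syntax_result = syntax_block.run("IBM announced advances in quantum computing")
entities = entity_block.run(syntax_result)
print(entities)  # -> ['IBM']
```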
Loading and running a model
Watson Natural Language Processing provides the download() function so that you can quickly load pre-trained models in your notebook. To download a model, you first need to know its name. Model names follow a standard convention that encodes the type of model (like classification or entity extraction), the type of algorithm (like BERT or SVM), the language code, and details of the type system.
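As a rough illustration of this convention, a name such as 'entity-mentions_bert_multi_stock' can be split at underscores into its parts. The field labels below are an informal reading of the pattern, not an official schema:

```python
def parse_model_name(name):
    """Split a model name of the form <task>_<algorithm>_<language>_<variant>.

    The field labels are an informal reading of the naming convention
    described above, not an official schema.
    """
    task, algorithm, language, variant = name.split("_")
    return {"task": task, "algorithm": algorithm,
            "language": language, "variant": variant}

print(parse_model_name("syntax_izumo_en_stock"))
# -> {'task': 'syntax', 'algorithm': 'izumo', 'language': 'en', 'variant': 'stock'}
print(parse_model_name("entity-mentions_bert_multi_stock"))
# -> {'task': 'entity-mentions', 'algorithm': 'bert', 'language': 'multi', 'variant': 'stock'}
```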
To find the right block to use, use the block catalog. See Watson NLP block catalog.
You can find the expected input for a given block class (for example, the Entity Mentions model) by using help() on the block class's run function:

```python
import watson_nlp

help(watson_nlp.blocks.entity_mentions.BERT.run)
```
Sample project and notebooks
To help you get started with the Watson Natural Language Processing library, you can download a sample project and notebooks from the Cloud Pak for Data as a Service Gallery. The notebooks demonstrate how to use the different Watson Natural Language Processing blocks and how to train your own models.
You can access the Gallery by selecting Gallery from the Cloud Pak for Data navigation menu.
This notebook shows you how to analyze financial customer complaints using Watson Natural Language Processing. It uses data from the Consumer Complaint Database published by the Consumer Financial Protection Bureau (CFPB). The notebook teaches you to use the Tone classification and Emotion classification models.
This notebook demonstrates how to analyze car complaints using Watson Natural Language Processing. It uses publicly available complaint records from car owners stored by the National Highway Traffic Safety Administration (NHTSA) of the US Department of Transportation. This notebook shows you how to use syntax analysis to extract the most frequently used nouns, which typically describe the problems that review authors talk about, and how to combine these results with structured data using association rule mining.
Complaint classification with Watson Natural Language processing
This notebook demonstrates how to train different text classifiers using Watson Natural Language Processing. The classifiers predict the product group from the text of a customer complaint. This could be used, for example, to route a complaint to the appropriate staff member. The data that is used in this notebook is taken from the Consumer Complaint Database, which is published by the Consumer Financial Protection Bureau (CFPB), a U.S. government agency, and is publicly available. You will learn how to train a custom CNN model and a VotingEnsemble model, and how to evaluate their quality.
Entity extraction on Financial Complaints with Watson Natural Language Processing
This notebook demonstrates how to extract named entities from financial customer complaints using Watson Natural Language Processing. It uses data from the Consumer Complaint Database published by the Consumer Financial Protection Bureau (CFPB). In the notebook, you will learn how to do dictionary-based term extraction, how to train a custom extraction model based on given dictionaries, and how to extract entities using the BERT model.
If you don't want to download the sample notebooks to your project individually, you can download the entire sample project Text Analysis with Watson Natural Language Processing from the Cloud Pak for Data as a Service Gallery.
The sample project contains the sample notebooks listed in the previous section, as well as:
Analyzing hotel reviews using Watson Natural Language Processing
This notebook shows you how to use syntax analysis to extract the most frequently used nouns from the hotel reviews, classify the sentiment of the reviews and use aspect-oriented sentiment analysis for the most frequently extracted aspects. The data file that is used by this notebook is included in the project as a data asset.
You can run all of the sample notebooks with the DO + NLP Runtime 22.1 on Python 3.9 environment, except for the Complaint classification with Watson Natural Language processing notebook. To run that notebook, you need to create an environment template that is large enough to complete the training of the classification models on the training data.
Parent topic: Libraries and scripts for notebooks