The Watson Natural Language Processing library provides natural language processing functions for syntax analysis and out-of-the-box pre-trained models for a wide variety of text processing tasks, such as sentiment analysis, keyword extraction, and classification. The Watson Natural Language Processing library is available for Python only.
With Watson Natural Language Processing, you can turn unstructured data into structured data, making the data easier to understand and to work with. This is especially useful when you handle a mix of unstructured and structured data. Examples of such data are call center records, customer complaints, social media posts, and problem reports. The unstructured data is often part of a larger data record that also includes columns of structured data. Extracting meaning and structure from the unstructured data, and combining this information with the structured columns, gives you a deeper understanding of the input data and can help you to make better decisions.
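As a minimal illustration of this enrichment pattern, here is a plain-Python sketch that adds a derived column to mixed records. The keyword lookup is a hard-coded stand-in for model output, not a Watson Natural Language Processing call:

```python
# Each record mixes structured columns with an unstructured text field.
records = [
    {"ticket_id": 101, "product": "router",
     "text": "The device keeps rebooting and support was unhelpful."},
    {"ticket_id": 102, "product": "modem",
     "text": "Setup was quick and the connection is stable. Great!"},
]

# Stand-in for a sentiment model: a trivial keyword lookup.
# A real pipeline would run a pre-trained sentiment model here.
NEGATIVE_CUES = {"unhelpful", "rebooting", "broken"}
POSITIVE_CUES = {"great", "stable", "quick"}

def toy_sentiment(text: str) -> str:
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & NEGATIVE_CUES:
        return "negative"
    if words & POSITIVE_CUES:
        return "positive"
    return "neutral"

# Turn the unstructured text into a new structured column.
for record in records:
    record["sentiment"] = toy_sentiment(record["text"])

for record in records:
    print(record["ticket_id"], record["product"], record["sentiment"])
```

Once the derived column exists, the records can be filtered, grouped, or joined like any other structured data.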
Watson Natural Language Processing provides pre-trained models in over 20 languages. They are curated by a dedicated team of experts, and evaluated for quality on each specific language. These pre-trained models can be used in production environments without you having to worry about license or intellectual property infringements.
Although you can create your own models, the easiest way to get started with Watson Natural Language Processing is to run the pre-trained models on unstructured text to perform language processing tasks.
Here are some examples of language processing tasks available in Watson Natural Language Processing pre-trained models:
- Language detection: detect the language of the input text
- Syntax: tokenization, lemmatization, part of speech tagging, and dependency parsing
- Entity extraction: find mentions of entities (like person, organization, or date)
- Noun phrase extraction: extract noun phrases from the input text
- Text classification: analyze text and then assign a set of pre-defined tags or categories based on its content
- Sentiment classification: is the input document positive, negative, or neutral?
- Tone classification: classify the tone in the input document (like excited, frustrated, or sad)
- Emotion classification: classify the emotion of the input document (like anger or disgust)
- Keyword extraction: extract noun phrases that are relevant to the input text
- Concepts: find concepts from DBPedia in the input text
- Relations: detect relations between two entities
- Hierarchical categories: assign individual nodes within a hierarchical taxonomy to the input document
- Embeddings: map individual words or larger text snippets into a vector space
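To illustrate what an embedding makes possible, the sketch below compares toy vectors by cosine similarity. The 3-dimensional vectors are hand-picked stand-ins; real embedding models produce vectors with hundreds of dimensions:

```python
import math

# Toy 3-dimensional "embeddings" chosen for illustration only.
embeddings = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.88, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Words with related meanings should end up closer in vector space.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
print(cosine_similarity(embeddings["king"], embeddings["apple"]))
```

The same comparison works unchanged on vectors produced by a real embedding model.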
Watson Natural Language Processing encapsulates natural language functionality through blocks and workflows. Blocks and workflows support functions to load, run, train, and save a model.
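Conceptually, every block and workflow exposes the same small lifecycle. The following plain-Python stand-in illustrates that shared load/run/train/save interface; the class and its method bodies are illustrative only, not Watson Natural Language Processing code:

```python
import json

class ToyModel:
    """A stand-in illustrating the load/run/train/save lifecycle
    that blocks and workflows share. Not the watson_nlp API."""

    def __init__(self, labels=None):
        self.labels = labels or {}

    @classmethod
    def load(cls, path):
        # Restore a previously saved model from disk.
        with open(path) as f:
            return cls(json.load(f))

    def train(self, examples):
        # "Training" here just memorizes text -> label pairs.
        for text, label in examples:
            self.labels[text] = label
        return self

    def run(self, text):
        # Run the model on a single input.
        return self.labels.get(text, "unknown")

    def save(self, path):
        # Persist the model so load() can restore it.
        with open(path, "w") as f:
            json.dump(self.labels, f)

# Typical lifecycle: train, save, load, run.
model = ToyModel().train([("Welcome to IBM!", "greeting")])
model.save("toy_model.json")
restored = ToyModel.load("toy_model.json")
print(restored.run("Welcome to IBM!"))  # greeting
```

With the pre-trained models, you usually only need the load and run steps, as the examples below show.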
For more information, refer to Working with pre-trained models.
Here are some examples of how you can use the Watson Natural Language Processing library:
Running syntax analysis on a text snippet:

```python
import watson_nlp

# Load the syntax model for English
syntax_model = watson_nlp.load('syntax_izumo_en_stock')

# Run the syntax model and print the result
syntax_prediction = syntax_model.run('Welcome to IBM!')
print(syntax_prediction)
```
Extracting entities from a text snippet:

```python
import watson_nlp

# Load the multilingual entity mentions workflow
entities_workflow = watson_nlp.load('entity-mentions_transformer-workflow_multilingual_slate.153m.distilled')

# Run the workflow on a text snippet and print the extracted entity mentions
entities = entities_workflow.run("IBM's CEO Arvind Krishna is based in the US", language_code="en")
print(entities.get_mention_pairs())
```
For examples of how to use the Watson Natural Language Processing library, refer to Watson Natural Language Processing library usage samples.
Using Watson Natural Language Processing in a notebook
You can run your Python notebooks that use the Watson Natural Language Processing library in any of the environments listed in the following table. The GPU environment templates include the Watson Natural Language Processing library.
DO + NLP: Indicates that the environment template includes both the CPLEX and DOcplex libraries for modeling and solving decision optimization problems, as well as the Watson Natural Language Processing library.
~ : Indicates that the environment template requires the Watson Studio Professional plan. See Offering plans.
Name | Hardware configuration | CUH rate per hour |
---|---|---|
NLP Runtime 23.1 on Python 3.10 XS | 2 vCPU and 8 GB RAM | 6 |
DO + NLP Runtime 22.2 on Python 3.10 | 2 vCPU and 8 GB RAM | 6 |
GPU V100 Runtime 23.1 on Python 3.10 ~ | 40 vCPU + 172 GB + 1 NVIDIA® V100 (1 GPU) | 68 |
GPU 2xV100 Runtime 23.1 on Python 3.10 ~ | 80 vCPU + 344 GB + 2 NVIDIA® V100 (2 GPU) | 136 |
GPU V100 Runtime 22.2 on Python 3.10 ~ | 40 vCPU + 172 GB + 1 NVIDIA® V100 (1 GPU) | 68 |
GPU 2xV100 Runtime 22.2 on Python 3.10 ~ | 80 vCPU + 344 GB + 2 NVIDIA® V100 (2 GPU) | 136 |
Normally these environments are sufficient to run notebooks that use prebuilt models. If you need a larger environment, for example to train your own models, you can create a custom template that includes the Watson Natural Language Processing library. Refer to Creating your own environment template.
- Create a custom template without GPU by selecting the engine type `Default`, the hardware configuration size that you need, and choosing `NLP Runtime 23.1 on Python 3.10` or `DO + NLP Runtime 22.2 on Python 3.10` as the software version.
- Create a custom template with GPU by selecting the engine type `GPU`, the hardware configuration size that you need, and choosing `GPU Runtime 23.1 on Python 3.10` or `GPU Runtime 22.2 on Python 3.10` as the software version.
Parent topic: Notebooks and scripts