You can train your own models for sentiment extraction based on the Slate IBM Foundation model. This pretrained model can be fine-tuned for your use case by training it on your specific input data.
For a list of available Slate models, see this table:
Model | Description |
---|---|
pretrained-model_slate.153m.distilled_many_transformer_multilingual_uncased | Generic, multi-purpose model |
pretrained-model_slate.125m.finance_many_transformer_en_cased | Model pretrained on finance content |
pretrained-model_slate.110m.cybersecurity_many_transformer_en_uncased | Model pretrained on cybersecurity content |
pretrained-model_slate.125m.biomedical_many_transformer_en_cased | Model pretrained on biomedical content |
- Input data format for training
- Loading the pretrained model resources
- Training the model
- Applying the model on new data
Input data format for training
You need to provide a training and a development data set to the training function. The development data is usually around 10% of the training data. Each training or development sample is represented as a JSON object that must have a text field and a labels field. The text field contains the training example text, and the labels field is an array that contains exactly one label: positive, neutral, or negative.
The following is an example of an array with sample training data:
[
  {
    "text": "I am happy",
    "labels": ["positive"]
  },
  {
    "text": "I am sad",
    "labels": ["negative"]
  },
  {
    "text": "The sky is blue",
    "labels": ["neutral"]
  }
]
The training and development data sets are created as data streams from arrays of JSON objects. To create the data streams, you can use the utility method prepare_data_from_json:
import watson_nlp
from watson_nlp.toolkit.sentiment_analysis_utils.training import train_util as utils

# JSON files that contain the training and development samples in the format shown above
training_data_file = "train_data.json"
dev_data_file = "dev_data.json"

# Convert the JSON files into data streams for training
train_stream = utils.prepare_data_from_json(training_data_file)
dev_stream = utils.prepare_data_from_json(dev_data_file)
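If your labeled samples exist only as in-memory Python lists, you first need to write them to JSON files. The following is a minimal sketch that uses the standard json module and assumes the file names train_data.json and dev_data.json from the example above; the sample content is purely illustrative:
import json

# Illustrative in-memory samples in the format described above
train_samples = [
    {"text": "I am happy", "labels": ["positive"]},
    {"text": "I am sad", "labels": ["negative"]}
]
dev_samples = [
    {"text": "The sky is blue", "labels": ["neutral"]}
]

# Write the arrays of JSON objects to the files that prepare_data_from_json reads
with open("train_data.json", "w", encoding="utf-8") as f:
    json.dump(train_samples, f)
with open("dev_data.json", "w", encoding="utf-8") as f:
    json.dump(dev_samples, f)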
Loading the pretrained model resources
The pretrained Slate IBM Foundation model must be loaded before it can be passed to the training algorithm. In addition, you need to load the syntax analysis models for the languages that are used in your input texts.
To load the model:
# Load the pretrained Slate IBM Foundation model
pretrained_model_resource = watson_nlp.load('<pretrained Slate model>')
# Download relevant syntax analysis models
syntax_model_en = watson_nlp.load('syntax_izumo_en_stock')
syntax_model_de = watson_nlp.load('syntax_izumo_de_stock')
# Create a list of all syntax analysis models
syntax_models = [syntax_model_en, syntax_model_de]
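For example, to fine-tune the generic multilingual Slate model from the table at the beginning of this topic, pass its name to watson_nlp.load. This is a minimal sketch and assumes that the model is available in your environment:
# Load the generic, multi-purpose multilingual Slate model listed in the table above
pretrained_model_resource = watson_nlp.load(
    'pretrained-model_slate.153m.distilled_many_transformer_multilingual_uncased'
)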
Training the model
For all options that are available for configuring sentiment transformer training, enter:
help(watson_nlp.workflows.sentiment.AggregatedSentiment.train_transformer)
The train_transformer method creates a workflow model, which automatically runs syntax analysis and the trained sentiment classification. In a subsequent step, enable language detection so that the workflow model can run on input text without any prerequisite information.
The following is a sample call that uses the input data and pretrained model from the previous sections:
from watson_nlp.workflows.sentiment import AggregatedSentiment
sentiment_model = AggregatedSentiment.train_transformer(
    train_data_stream=train_stream,
    dev_data_stream=dev_stream,
    syntax_model=syntax_models,
    pretrained_model_resource=pretrained_model_resource,
    label_list=['negative', 'neutral', 'positive'],
    learning_rate=2e-5,
    num_train_epochs=10,
    combine_approach="NON_NEUTRAL_MEAN",
    keep_model_artifacts=True
)
lang_detect_model = watson_nlp.load('lang-detect_izumo_multi_stock')
sentiment_model.enable_lang_detect(lang_detect_model)
Applying the model on new data
After you train the model on a data set, apply the model to new data by using the run() method, as you would with any of the existing pretrained blocks.
Sample code:
input_text = 'new input text'
sentiment_predictions = sentiment_model.run(input_text)
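The run() call returns the aggregated sentiment prediction for the input text. A quick way to inspect it is shown below; the to_dict() serialization helper is assumed here as the usual way to view Watson NLP prediction objects as plain Python dictionaries:
# Print the raw prediction object
print(sentiment_predictions)
# Assumed helper: serialize the prediction to a Python dictionary for inspection
print(sentiment_predictions.to_dict())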
Parent topic: Creating your own models