This tutorial focuses on a sample use case in the finance industry. Golden Bank needs to perform a stock anomalies analysis to boost productivity and increase the accuracy of a stock analyst's work in investment banking.
- Required services
  - watsonx.ai
  - Watson Machine Learning
Scenario: Stock anomaly analysis process
To accomplish this goal, the typical process might be as follows:
- An investment banker or manager asks the stock analyst to research a company’s stock.
- The stock analyst downloads the company’s stock data.
- They search through the stock data manually to find anomalies in how the stock price performed.
- They explain the anomalies by manually searching the web for relevant news articles around the specific dates.
- The stock analyst summarizes the reasoning behind the anomalies using the news articles.
- They do follow-up research about specific pieces of information and dates.
- They send the report to the investment banker, who does further analysis in order to make an investment decision.
Basic task workflow using watsonx.ai
Watsonx.ai can help accomplish each phase of this process. Your basic workflow includes these tasks:
- Open a project. Projects are where you can collaborate with others to work with data.
- Add your data to the project. You can add CSV files or data from a remote data source through a connection.
- Train a model. You can use a variety of tools, such as AutoAI, SPSS Modeler, or Jupyter notebooks, to train the model.
- Deploy and test your model.
- Transform the data.
- Prompt a foundation model.
- Tune the foundation model.
Read about watsonx.ai
To transform your business processes with AI-driven solutions, your enterprise needs to integrate both machine learning and generative AI into your operational framework. Watsonx.ai provides the processes and technologies to enable your enterprise to develop and deploy machine learning models and generative AI solutions.
Watch a video about watsonx.ai
Watch this video to preview the steps in this tutorial. There might be slight differences in the user interface shown in the video. The video is intended to be a companion to the written tutorial.
Try a tutorial for watsonx.ai
In this tutorial, you will complete these tasks:
- Task 1: Create the sample project
- Task 2: Visualize the data
- Task 3: Train the model
- Task 4: Deploy the model
- Task 5: Gather relevant news articles
- Task 6: Prompt the foundation model
Tips for completing this tutorial
Here are some tips for successfully completing this tutorial.
Use the video picture-in-picture
The following animated image shows how to use the video picture-in-picture and table of contents features:
Get help in the community
If you need help with this tutorial, you can ask a question or find an answer in the watsonx Community discussion forum.
Set up your browser windows
For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.
Task 1: Create the sample project
To preview this task, watch the video beginning at 00:58.
This tutorial uses a sample project that contains the data sets, notebook, and prompt templates to perform the analysis. Follow these steps to create a project based on a sample:
- Access the Stock anomalies analysis project in the Resource hub.
  - Click Create project.
  - Accept the default values for the project name, and click Create.
  - Click View new project when the project is successfully created.
- Associate a Watson Machine Learning service with the project:
  - When the project opens, click the Manage tab, and select the Services and integrations page.
  - On the IBM services tab, click Associate.
  - Select your Watson Machine Learning instance. If you don't have a Watson Machine Learning service instance provisioned yet, follow these steps:
    - Click New service.
    - Select Watson Machine Learning.
    - Click Create.
    - Select the new service instance from the list.
  - Click Associate service.
  - If necessary, click Cancel to return to the Services & Integrations page.
- Click the Assets tab in the project to see the sample assets.
For more information or to watch a video, see Creating a project.
For more information on associated services, see Adding associated services.
Check your progress
The following image shows the project Assets tab. You are now ready to visualize the training data.
Task 2: Visualize the data
To preview this task, watch the video beginning at 01:27.
The three data sets in the sample project contain synthetic data generated using public stock data from the Yahoo! Finance website as a basis. The training data for a time series anomaly prediction model must be structured and sequential. In this case, the synthetic data is structured and sequential. Follow these steps to view the data assets in the sample project:
- Open the historical_data.csv data set. This data set contains historical stock price performance from May 2012 to May 2016.
- Return to the project's Assets tab, and open the test_data.csv data set. This data set contains historical stock price performance in Q1 2023.
- Return to the project's Assets tab, and open the training_data.csv data set. This data set contains historical stock price performance in 2023.
- Click the Visualization tab.
- Select the Date column, and then click Visualize data. The first suggested chart type, a histogram, displays.
- Select the Line chart type.
- For the X-axis, select the Date column.
- For the Y-axis, select the Adj Close column. This shows the adjusted closing price by date. The target column for anomaly analysis is the adjusted closing price.
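As a rough illustration of what "structured and sequential" means for this data, the following sketch parses a few rows shaped like training_data.csv and checks that the dates are strictly increasing. The column names Date and Adj Close come from the steps above; the sample values and the ISO date format are assumptions made for illustration only.

```python
import csv
import io
from datetime import date

# Hypothetical rows in the shape of training_data.csv (values are invented).
SAMPLE_CSV = """Date,Adj Close
2023-01-03,101.20
2023-01-04,102.05
2023-01-05,99.80
2023-01-06,100.40
"""

def load_series(text):
    """Return a list of (date, adjusted close) pairs from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(date.fromisoformat(row["Date"]), float(row["Adj Close"])) for row in reader]

def is_sequential(series):
    """True if every date is strictly later than the one before it."""
    return all(earlier[0] < later[0] for earlier, later in zip(series, series[1:]))

series = load_series(SAMPLE_CSV)
print(len(series), "rows; sequential:", is_sequential(series))
```

A check like this is only a sanity test before training; it does not replace the Visualization tab, which shows the shape of the series directly.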
Check your progress
The following image shows a visualization of the training_data.csv file. Now you are ready to build the model using this training data.
Task 3: Train the model
To preview this task, watch the video beginning at 02:13.
You can use a variety of tools, such as AutoAI, SPSS Modeler, or Jupyter notebooks, to train the model. In this tutorial, you will train the time series analysis anomaly prediction model with AutoAI. Follow these steps to create the AutoAI experiment:
- Return to the project's Assets tab, and then click New asset > Build machine learning models automatically.
- On the Build machine learning models automatically page, type the name:
  Stock anomaly experiment
- Click Create.
- On the Add data source page, add the training data:
  - Click Select data from project.
  - Select Data asset > training_data.csv, and click Select asset.
- Set the time series analysis settings:
  - Select Yes if you are asked to create a time series experiment.
  - Select Anomaly prediction.
- Select Adj Close for the Feature columns.
- Click Run experiment. As the model trains, you see an infographic that shows the process of building the pipelines.
  For a list of algorithms, or estimators, available with each machine learning technique in AutoAI, see: AutoAI implementation detail.
- After the experiment run is complete, you can view and compare the ranked pipelines in a leaderboard.
- You can click Pipeline comparison to see how they differ.
- Click the highest ranked pipeline to see the pipeline details.
- Review the Model evaluation page to see the detailed evaluation metrics about the model performance.
  The AutoAI tool considers a wide range of criteria to spot anomalies. In the table, you can see the evaluation based on different metrics, such as Average precision and Area under ROC, for each of the anomaly types.
  Anomaly types
  | Anomaly type | Description |
  | --- | --- |
  | Trend anomaly | A segment of time series, which has a trend change compared to the time series before the segment. |
  | Variance anomaly | A segment of time series in which the variance of a time series is changed. |
  | Localized extreme anomaly | An unusual data point in a time series, which deviates significantly from the data points around it. |
  | Level shift anomaly | A segment in which the mean value of a time series is changed. |
- Save the model.
  - Click Save as.
  - Select Model.
  - For the model name, type:
    Anomaly Prediction Model
  - Click Create. This saves the pipeline as a model in your project.
- When the model is saved, click the View in project link in the notification to view the model in your project. Alternatively, you can navigate to the Assets tab in the project, and click the model name in the Models section.
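To make the localized extreme anomaly type from this task more concrete, here is a minimal sketch that flags points deviating strongly from the values just before them, using a rolling z-score. This is a simple stand-in chosen for illustration and is not the method AutoAI uses; the function name, window size, threshold, and price values are all hypothetical.

```python
from statistics import mean, stdev

def localized_extremes(values, window=5, threshold=3.0):
    """Return indexes whose value deviates strongly from the preceding window."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        spread = stdev(history)
        if spread == 0:
            continue  # flat history: skip to avoid dividing by zero
        z_score = abs(values[i] - mean(history)) / spread
        if z_score > threshold:
            flagged.append(i)
    return flagged

# Invented adjusted closing prices with one obvious spike at index 6.
prices = [100.0, 100.5, 99.8, 100.2, 100.1, 100.3, 112.0, 100.4, 100.0, 100.2]
print(localized_extremes(prices))
```

The other anomaly types in the table (trend, variance, level shift) describe changes over a segment rather than a single point, so a point-wise check like this one would not detect them.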
Check your progress
The following image shows the model.
Task 4: Deploy the model
To preview this task, watch the video beginning at 03:40.
The next task is to promote the test data and the model to a deployment space, and then create an online deployment.
Task 4a: Promote the test data to the deployment space
The sample project includes test data. You promote that test data to a deployment space, so you can use the test data to test the deployed model. Follow these steps to promote the test data to a deployment space:
- Return to the project's Assets tab.
- Click the Overflow menu for the test_data.csv data asset, and choose Promote to space.
- Choose an existing deployment space. If you don't have a deployment space:
  - Click Create a new deployment space.
  - For the name, type:
    Anomaly Prediction Space
  - Select a storage service.
  - Select a machine learning service.
  - Click Create.
  - Close the notification when the space is ready.
- Select your new deployment space from the list.
- Click Promote.
Check your progress
The following image shows the Promote to space page.
Task 4b: Promote the model to a deployment space
Before you can deploy the model, you need to promote the model to a deployment space. Follow these steps to promote the model to a deployment space:
- From the Assets tab, click the Overflow menu for the Anomaly Prediction Model model, and choose Promote to space.
- Select the same deployment space from the list.
- Select the Go to the model in the space after promoting it option.
- Click Promote.
Check your progress
The following image shows the model in the deployment space.
Task 4c: Create and test a model deployment
Now that the model is in the deployment space, follow these steps to create the model deployment:
- With the model open, click New deployment.
  - Select Online as the Deployment type.
  - For the deployment name, type:
    Anomaly Prediction Model Deployment
  - Click Create.
- When the deployment is complete, click the deployment name to view the deployment details page.
- Review the scoring endpoint, which you can use to access this model programmatically in your applications.
- Test the model.
  - Click the Test tab.
  - To locate the test data, click Search in space.
  - Select Data asset > test_data.csv.
  - Click Confirm.
  - Click Predict, and review the predictions for the 62 records in the test data.
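The scoring endpoint mentioned in the steps above can also be called from code. The following sketch only builds a request and does not send it. It assumes the common online-scoring payload shape with an input_data list of fields and values; the endpoint URL, the token, and the choice of Adj Close as the only field are placeholders, so copy the real values from the deployment details page.

```python
import json
import urllib.request

def build_scoring_payload(fields, rows):
    """Wrap rows in an input_data structure (assumed payload shape)."""
    return {"input_data": [{"fields": list(fields), "values": [list(r) for r in rows]}]}

def build_scoring_request(endpoint_url, bearer_token, payload):
    """Create, but do not send, a POST request for the scoring endpoint."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {bearer_token}",
        },
        method="POST",
    )

# Invented values; the real test data lives in test_data.csv.
payload = build_scoring_payload(["Adj Close"], [[101.2], [102.05], [99.8]])
request = build_scoring_request(
    "https://example.invalid/ml/v4/deployments/DEPLOYMENT_ID/predictions",  # placeholder URL
    "YOUR_ACCESS_TOKEN",  # placeholder token
    payload,
)
print(json.dumps(payload))
# With a real endpoint and token, urllib.request.urlopen(request) would send it.
```

Building the request separately from sending it keeps the payload easy to inspect before any credentials are involved.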
Check your progress
The following image shows the test results from the deployed model.
Task 5: Gather relevant news articles
To preview this task, watch the video beginning at 05:07.
Although the Prompt Lab can work with structured and unstructured text, it is essential to ensure that you input the right data that the model can process. In this use case, you need to process news article text based on the anomaly dates you obtained from the anomaly prediction. You can integrate an external news API to extract the relevant news during those dates to simplify the data-gathering process. You can do this in a Jupyter notebook with Python code.
Since the foundation models have a limit on the number of tokens they can process in a single prompt (known as the context window), data may need to be chunked or summarized to fit within this limit. This step ensures that the input data is in a format that the foundation model can effectively process without losing essential information.
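The news-gathering step described above can be sketched as two small functions: one that builds a query for a news API and one that pulls article URLs out of the response. This is an illustration under assumptions, not the code in the sample notebook. The endpoint URL, the query parameter names, the response shape, and the search term are assumed, so check the API provider's documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base URL for a top-stories endpoint; verify against the provider's docs.
TOP_STORIES_URL = "https://api.thenewsapi.com/v1/news/top"

def build_top_stories_url(api_token, search, published_on, limit=3):
    """Build a query URL for stories about `search` on one date (YYYY-MM-DD)."""
    params = {
        "api_token": api_token,
        "search": search,
        "published_on": published_on,
        "language": "en",
        "limit": limit,
    }
    return f"{TOP_STORIES_URL}?{urlencode(params)}"

def extract_urls(response_json):
    """Collect article URLs from a response shaped like {"data": [{"url": ...}]}."""
    return [item["url"] for item in response_json.get("data", []) if item.get("url")]

# Hypothetical response, used only to exercise extract_urls without a network call.
fake_response = {
    "data": [
        {"url": "https://example.com/a"},
        {"title": "entry without a url"},
        {"url": "https://example.com/b"},
    ]
}
query_url = build_top_stories_url("YOUR_KEY", "Example Corp", "2023-02-14")
print(query_url)
print(extract_urls(fake_response))
```

In practice you would call the first function once per anomaly date returned by the deployed model, then pass each response to the second function.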
Follow these steps to run the notebook:
- From the Navigation Menu, choose Projects > View all projects.
- Open the Stock anomalies analysis project.
- Click the Assets tab.
- Click the Overflow menu for the Extract and Chunk Text from News Articles notebook, and choose Edit.
- Complete the Setup section.
  - Click the Run icon for the first cell to import the libraries.
  - Obtain the necessary API keys:
    - Follow the link to create an account and API key at TheNewsAPI.
    - Paste the API key in the thenewsapi_key variable.
    - Follow the link to create an account and API key at ArticlExtractor.
    - Paste the API key in the extract_key variable.
  - Run the cell to set the two API key variables.
- Run the cells in the Define the function to get news article URLs section.
  - The first cell defines a function that gets data from TheNewsAPI's Top Stories and sets up parameters to ensure that you get relevant news.
  - The second cell defines a function that gets only a list of URLs from the response.
- Run the cells in the Define the function to extract news text section.
  - The first cell defines a function that extracts news text from a specific news URL by using the ArticlExtractor API.
  - The second cell defines a function that combines the news text from all of the article URLs obtained from TheNewsAPI.
- Run the cell in the Define the function to chunk news text section. For the foundation model to take in the information from the text, the number of tokens must not exceed the model's context window limit. In this example, you define a function that uses LangChain to split the text while taking into account the context of the news text.
- Run the cell in the Execute the functions section. In the response, you can see that the final output is ready to be fed into the Prompt Lab. LangChain's text splitter splits the long text up into semantically meaningful chunks, or sentences, and combines them again as a whole text to be processed. You can adjust the maximum size of the chunks.
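The chunking idea in the last two steps can be illustrated without LangChain. The sketch below is a simplified stand-in, not the notebook's function: it splits text at sentence boundaries and groups whole sentences into chunks under a character budget. It counts characters rather than tokens, and the function name, the max_chars parameter, and the sample article are hypothetical.

```python
import re

def chunk_text(text, max_chars=200):
    """Group whole sentences into chunks of roughly max_chars characters or fewer."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate  # a single oversized sentence is kept whole
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

# Invented article text for illustration.
article = (
    "Shares fell sharply after the earnings call. Analysts cited weak guidance. "
    "The company announced a buyback the next day. Trading volume doubled."
)
chunks = chunk_text(article, max_chars=80)
for number, chunk in enumerate(chunks, start=1):
    print(number, chunk)
```

Splitting at sentence boundaries keeps each chunk readable on its own, which matters when the chunk is pasted into a prompt as an input example.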
Check your progress
The following image shows the completed notebook. You now have the chunked text to use to prompt the foundation model.
Task 6: Prompt the foundation model
To preview this task, watch the video beginning at 07:17.
Now that you have the relevant news article appropriately chunked, you can construct your own prompt templates in the Prompt Lab, or you can use the sample prompt templates in the sample project. The sample project includes sample prompt templates for summarization and question answering tasks. Follow these steps to prompt the foundation model in the Prompt Lab.
Summarization task
- Return to the project's Assets tab.
- Click the Summarize News Articles prompt template. This opens the prompt template in the Prompt Lab.
- Click Edit to open the prompt template in edit mode.
  For the summarization task, you use the chunked news article text as the input example, and the notes that the stock analyst would usually write manually to explain anomalies as the output example. This helps ensure that the output is similar to what the stock analyst might write themselves.
- Click Generate to see the summary results.
- Experiment with different input and output text from the chunked news article in the notebook.
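The input and output examples used in the summarization steps above follow a few-shot pattern, which can be sketched as plain string assembly. The "Input:" and "Output:" labels, the function name, and the sample texts are assumptions for illustration; they are not the format the Prompt Lab uses internally.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble an instruction, input/output example pairs, and a new input."""
    parts = [instruction.strip(), ""]
    for example_input, example_output in examples:
        parts += [f"Input: {example_input}", f"Output: {example_output}", ""]
    parts += [f"Input: {new_input}", "Output:"]
    return "\n".join(parts)

# Invented example pair: chunked article text and an analyst-style note.
examples = [
    (
        "Shares fell sharply after the earnings call. Analysts cited weak guidance.",
        "Price drop attributed to weak forward guidance on the earnings call.",
    )
]
prompt = build_few_shot_prompt(
    "Summarize the news text as a short analyst note explaining the stock anomaly.",
    examples,
    "The company announced a buyback the next day. Trading volume doubled.",
)
print(prompt)
```

Ending the prompt with an empty "Output:" label signals where the model's completion should begin.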
Question answering task
- Click Saved prompts to see the saved prompts from your project.
- Click the Question Answer News Articles prompt template from the list of saved prompts.
- Click Edit to open the prompt template in edit mode.
  For the question-answering task, you use questions as the input example, and answers in the required level of detail and preferred format as the output example.
- Click Generate to see the results.
- Experiment with different input and output text.
Adjust the model parameters
In the Prompt Lab, you can adjust the decoding settings to optimize the model's output for the specific task:
- Decoding
  - Greedy: always select words with the highest probability
  - Sampling: customize the variability of word selection
- Repetition Penalty: how much repetition is allowed
- Stopping Criteria: one or more strings that will cause the text generation to stop if produced
This flexibility allows for a high degree of customization, ensuring that the model operates with parameters best suited to the task's requirements and constraints.
In the Prompt Lab, you can set token limitations to ensure that the tasks remain within the operational scope of the model. This setting helps balance the response's comprehensiveness with the technical limitations of the model, ensuring efficient and effective processing of tasks.
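The difference between greedy decoding and sampling can be shown with a toy next-word distribution. This sketch is purely illustrative: the probabilities are invented, and the function and parameter names (pick_token, decoding_method, temperature) are hypothetical rather than Prompt Lab settings.

```python
import random

def pick_token(probabilities, decoding_method="greedy", temperature=1.0, rng=None):
    """Pick a token from a {token: probability} mapping."""
    if decoding_method == "greedy":
        # Greedy: always take the single most probable token.
        return max(probabilities, key=probabilities.get)
    rng = rng or random.Random()
    tokens = list(probabilities)
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    weights = [probabilities[token] ** (1.0 / temperature) for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Invented probabilities for the word that follows "The stock price ...".
next_token = {"rose": 0.6, "fell": 0.3, "stalled": 0.1}
print(pick_token(next_token))
print(pick_token(next_token, "sample", temperature=1.5, rng=random.Random(0)))
```

Greedy decoding returns the same word every time, which suits summaries that must be reproducible, while sampling trades that consistency for more varied wording.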
Check your progress
The following image shows the Prompt Lab.
Next steps
Experiment with prompt notebooks
From the Prompt Lab, you can save your work in notebook format:
- Load a saved prompt template.
- Click Save work > Save as.
- Select Notebook.
- Type a name.
- Click Save, and then explore the prompt notebook.
- Repeat these steps for the other prompt template.
Tune a foundation model
You might want to tune the foundation model to improve its performance beyond what prompt engineering alone achieves, or to reduce costs by deploying a smaller model that performs similarly to a bigger model. See the Tune a foundation model tutorial.
Additional resources
- Try these additional tutorials to get more hands-on experience with watsonx.ai:
- Watch more videos.
- Find sample data sets and notebooks to gain hands-on experience refining data in the Resource hub.