Tutorial: Use the Metrics page to monitor and troubleshoot streams flow

   

Learning objective

This tutorial takes an in-depth look at the Metrics page.

In the first part of the tutorial, you learn how to find real-time metrics, runtime views, and valuable information about the general health of your streams flow.

In the second part of the tutorial, you learn how to work with embedded logging and error notification for troubleshooting your streams flow.

The entire tutorial takes approximately 25 minutes to finish.

Overview

We use the Data Historian example streams flow and sample data from Tutorial: Create and run a Data Historian example streams flow.

The sample data is taken from five weather stations. The data includes weather station ID, time zone, date in Coordinated Universal Time (UTC) format, latitude, longitude, temperature, barometric pressure, humidity, indoor temperature, and rainfall today.

The Data Historian example streams flow has two Aggregation operators and a Cloud Object Storage (COS) operator:

  • The first Aggregation operator partitions the incoming data by weather station ID. Each weather station has its own partition. Within each partition, the data is grouped by weather station. As a result, every partition has one group.

    Every 60 seconds, the data “tumbles out” and a designated function is applied to data in each group. For example, the Average function is applied to rainfall data, but the Min function is applied to the barometric pressure data.

  • The second Aggregation operator ingests the output of the first Aggregation operator. It partitions and groups the data just like the first Aggregation operator, but the data “tumbles out” every 180 seconds.

  • Output data from the second Aggregation operator is stored in COS for further analysis later.
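The tumbling-window behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code that the Aggregation operator runs. The field names (`id`, `ts`, `rainfall`, `pressure`) and the sample values are hypothetical, and the sketch assigns events to windows by a timestamp field, which is an assumption about how the windows are formed.

```python
from collections import defaultdict
from statistics import mean


def tumbling_aggregate(events, window_seconds=60):
    """Group events into fixed, non-overlapping windows per station."""
    # Key: (station id, window index) -> events that fall in that window.
    windows = defaultdict(list)
    for event in events:
        window_index = int(event["ts"] // window_seconds)
        windows[(event["id"], window_index)].append(event)

    results = []
    for station_id, window_index in sorted(windows):
        group = windows[(station_id, window_index)]
        results.append({
            "id": station_id,
            "window_start": window_index * window_seconds,
            # Average for rainfall, Min for barometric pressure, as in the text.
            "rainfall_avg": mean(e["rainfall"] for e in group),
            "pressure_min": min(e["pressure"] for e in group),
        })
    return results


# Hypothetical sample events; "ts" is seconds since the start of the run.
events = [
    {"id": "A", "ts": 5, "rainfall": 1.0, "pressure": 1010.0},
    {"id": "A", "ts": 30, "rainfall": 3.0, "pressure": 1008.0},
    {"id": "B", "ts": 10, "rainfall": 0.0, "pressure": 1015.0},
    {"id": "A", "ts": 65, "rainfall": 2.0, "pressure": 1009.0},
]
results = tumbling_aggregate(events)
```

Calling the same function with `window_seconds=180` on the first stage's output would roughly mirror the second Aggregation operator, under the same assumptions.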

Preview

Watch this video to see how to monitor a running streams flow based on the Data Historian example.

Figure 1. Video: Metrics page for the Data Historian example streams flow and sample data

   

Table of contents

Prerequisites

Part 1. Explore the Metrics page

Part 2. Troubleshooting your streams flow

Learn more


Now it’s your turn - try out the following tutorial steps in your own environment.

Prerequisites

You must know the project that has the streams flow that you created in Tutorial: Create and run a Data Historian example streams flow.

   

Part 1. Explore the Metrics page

You open the Metrics page of the existing streams flow and explore the information found there. The information can help you to tweak the streams flow to suit your analytic needs.

Part 1 takes approximately 15 minutes to finish.

Do the following steps:

  1. From the Projects menu, click View All Projects.

  2. Click the name of the project where you put the streams flow that you created in Tutorial: Create and run a Data Historian example streams flow.

  3. On the Project page, click the Assets tab. In the Streams flow area, click the streams flow that is called Data Historian to open its Metrics page.

  4. The Metrics page opens, but the streams flow is in Stopped state. Click Run to start the streams flow. Data begins to flow between operators.

   

Let’s look at the various areas of the page: Taskbar, Streams Flow pane, Flow of Events table, Ingest Rate graph, and Throughput graph.

   

Figure 2. Metrics page of the Data Historian example streams flow

   

Taskbar

At the top of the Metrics page, you see a taskbar with the Status indicator and various icons.

  • Hover your mouse pointer over any of the icons to see what each one does. An icon is disabled if you cannot use it. In our example, the Stop icon is disabled because the streams flow is in Stopped state.
  • Click the Copy icon. In the Duplicate Streams Flow window, type the name Copy_DH_example_flow, and then click Continue. The new streams flow opens in the canvas. Click Save, and then click Close. The new streams flow is listed in the current project.

    Tip: You can use the Copy functionality to make versions of a streams flow. Copy the streams flow and put a version number or date in its name. Then, change the copy.

    Return to the streams flow for this tutorial by clicking Data Historian in the Streams flows area of the Project page.

  • Click the Export icon. A new file, data_historian.stp, is downloaded to your local disk.

    Tip: You can use the Export functionality to import this file into a different project or Watson Studio instance.

  • Click the Delete icon. A confirmation message opens. Click Cancel.
  • Click the Edit icon. The streams flow opens in the canvas where you can redesign your flow to get the information that you need. All changes that you make are validated before you leave the canvas. You can learn about the canvas in Tutorial: Use the canvas to create your own streams flow.

    Return to the Metrics page by clicking Close.

  • The Notifications icon is used for error notification. We talk more about errors and troubleshooting in Part 2. Troubleshooting your streams flow.
  • Click the View All Flows icon to get a list of all streams flows in the current project. You can search for a specific streams flow, list all streams flows of a selected status, or list all flows created by a specific person. You can see the streams flows in tile view or in table view.

    When you're finished checking out the page, click Data Historian in the NAME column to return to the Metrics page.

   

Streams Flow pane

This pane shows a dynamic, bird's-eye view of the data as it flows between operators.

  • Note the two Aggregation operators and the flow of data between them. Data is stored in Cloud Object Storage. When the streams flow is in Running state, each data flow has a distinct color.

  • Hover your mouse pointer over the data flow coming from the Sample Data operator. You can see its current throughput, in events per second, and its running total. In Figure 2, the throughput is 13.8 events per second and the total is shown as 9.5 KB.

Flow of Events table

This table is not open by default. To open the table, go to the Streams Flow pane, and then click anywhere in the data flow between the Sample Data operator and the first Aggregation operator. The Flow of Events table opens and shows the events in both table and JSON formats.

Metrics Flow of Events table

Use the Flow of Events table to see what events are going in to and out of an operator. This information can be useful when you need to debug your streams flow.

In this example, we are checking the events that go from the Sample Data operator to the first Aggregation operator.

Ingest Rate graph

The Ingest Rate graph shows you the number of events per second that are submitted to the streams flow for each streams flow source. If there is more than one source, each streams flow source has a distinct color. This graph shows that the streams flow is ingesting data.

From Figure 2, we can see a single source of incoming data. We also see that approximately 10 - 30 events come in to the streams flow every second.
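To make the "events per second" measure concrete, here is a minimal Python sketch that counts events in each one-second bucket. It is not how the Metrics page computes the graph. The function name and the arrival times are hypothetical, and the times are assumed to be seconds since the flow started.

```python
from collections import Counter


def events_per_second(arrival_times):
    """Count how many events arrived in each whole second."""
    return Counter(int(t) for t in arrival_times)


# Hypothetical arrival times, in seconds since the flow started.
arrival_times = [0.1, 0.4, 0.9, 1.2, 1.5, 3.7]
rate = events_per_second(arrival_times)

# A Counter returns 0 for seconds in which no event arrived.
for second in range(4):
    print(second, rate[second])
```

Plotting these per-second counts over time would give a curve of the same kind as the Ingest Rate graph, with one curve per source.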

Throughput graph

In the Streams Flow pane, click the second Aggregation operator. The number of events that flow in to and out of the second Aggregation operator is displayed in the Aggregation Throughput graph. Move your mouse pointer over the graph to see the number of events and errors at any specific point in time.

Metrics Throughput

Errors include events that are dropped from the network or are not valid for any reason. In our case, no errors are found.

   


   

Part 2. Troubleshooting your streams flow

You learn how to work with embedded logging and error notification on the Metrics page to troubleshoot any problems that you might encounter in your streams flow.

No errors exist in the streams flow that you are using in this tutorial, so you cannot duplicate the steps here in your own environment. Nevertheless, use the following steps as a guide to troubleshooting.

Part 2 takes approximately 10 minutes to finish.

Do the following steps:

  1. Click the Notifications icon to open the Notification pane.

    If errors exist in the streams flow, the Notifications icon in the upper-right corner of the Metrics page is displayed with a red dot. This icon can indicate several types of errors: validation, compilation, or runtime.

    Details about any errors are in the Notification pane.

    Click the error message to open the streams flow canvas. The canvas shows more detailed messages that help you to locate and correct the problem.

    We talk about how to correct streams flow problems in the canvas in Tutorial: Use the canvas to create your own streams flow.

  2. When you click the Notifications icon, a taskbar opens.

    Metrics notification bar

    Click any of the following icons:

    • Streaming Analytics instance icon to check the instance in IBM Cloud. In the Manage page of IBM Cloud, you can start or stop the instance.

    • Download user log icon to download the user log file. The user log contains logging messages that you put into the Code operator and the Python Machine Learning operator.

    • Download logs icon to download system log files. The log files are saved to your local disk in a compressed format. These logs might be needed if you contact Support.

    • Download code archive icon to download the code that generates the streams flow. The code can help you to identify the cause of runtime and compilation errors.
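The user log is only as useful as the messages that you write. The following Python sketch shows the kind of logging statements that you might add to your own code by using the standard `logging` module. It does not use the Code operator API, and it is an assumption that messages written this way end up in the user log. The logger name `weather_flow` and the function `check_event` are hypothetical.

```python
import logging

# Hypothetical logger name; use whatever naming suits your flow.
logger = logging.getLogger("weather_flow")


def check_event(event):
    """Log a warning for an unusable event and report whether it can be used."""
    if not event.get("id"):
        # The warning includes the event so that the log shows what was wrong.
        logger.warning("Event has no station id: %r", event)
        return False
    logger.debug("Event accepted for station %s", event["id"])
    return True
```

Logging the offending event at warning level, rather than only a generic message, makes it easier to match a log entry to what you see in the Flow of Events table.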

If the Metrics page indicates that your streams flow has problems, use the Troubleshooting streams flow guide to get it up and running again. The guide addresses common questions and problems.

 

Example

You notice that the Current Throughput shows that data is flowing from the Sample Data operator to the first Aggregation operator. You also notice that no data is flowing out of the first Aggregation operator. You want to check the data that flows out of the Sample Data operator.

Metrics Flow of Events table with error

You click the data flow coming out of the Sample Data operator and note that the ID attribute is empty.

Metrics Flow of Events table with error

The Aggregation operator partitions and groups by ID. As a result, no data flows out of the first Aggregation operator. Following the instructions in the Troubleshooting streams flow guide, you would open the streams flow in the canvas. There, you would correct the problem in the source schema that is used by the Sample Data operator.
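The check that you do by eye in the Flow of Events table can also be expressed as code. The following Python sketch scans a batch of events for a missing or blank partition key. The function name, the `id` field name, and the sample events are hypothetical and are not taken from the Data Historian sample data.

```python
def find_events_missing_key(events, key="id"):
    """Return the events whose partition key is absent, None, or blank."""
    missing = []
    for event in events:
        value = event.get(key)
        if value is None or str(value).strip() == "":
            missing.append(event)
    return missing


# Hypothetical events copied from the JSON view of a data flow.
sample = [
    {"id": "", "temperature": 21.5},
    {"id": "station_1", "temperature": 19.0},
    {"temperature": 18.2},
]
bad = find_events_missing_key(sample)
print(f"{len(bad)} of {len(sample)} events have no usable id")
```

If every event in a sample has an empty key, as in this example scenario, the source schema is the likely place to look.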


   

Summary

Congratulations! You started an example streams flow in the Metrics page and saw weather station data that flows between operators. You learned how to use the different areas of the page to get valuable information about your streams flow. You were shown a basic troubleshooting technique to resolve questions and problems.

   

Learn more

Become familiar with the Troubleshooting streams flow guide.

Learn more about the Data Historian example streams flow scope, operators, and output.

Check out our other tutorials about streams flow.