Tutorial: Design and create a streams flow in the canvas

Learning objective

You learn how to design and create a simple streams flow in the canvas. We supply the Data Historian sample data.

This tutorial introduces the canvas and shows how you can use it to customize a streams flow to suit your analytic needs. Other tutorials take an in-depth look at specific operators that are available in the canvas.

It takes approximately 15 minutes to finish.

Overview

We start with an empty canvas and introduce you to the various features.

In the canvas, you select operators - Sample Data and Cloud Object Storage - and connect them to design a simple flow of data.

The Sample Data operator uses sample data that is taken from five weather stations. The data includes weather station ID, time zone, date in Coordinated Universal Time (UTC) format, latitude, longitude, temperature, barometric pressure, humidity, indoor temperature, and rainfall today. The data is in JSON format.
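As a rough illustration of what one event in this data might look like, the following Python sketch builds a single record and serializes it to JSON. The key names and values are hypothetical placeholders that mirror the list above; they are not the actual attribute names or readings in the Data Historian sample data.

```python
import json

# One hypothetical weather-station event. Every key name and value here is an
# illustrative assumption, not taken from the real sample data.
event = {
    "station_id": "station-1",
    "time_zone": "America/New_York",
    "date_utc": "2018-01-01 12:00:00",
    "latitude": 40.7,
    "longitude": -74.0,
    "temperature": 41.5,
    "barometric_pressure": 29.9,
    "humidity": 62,
    "indoor_temperature": 70.2,
    "rainfall_today": 0.0,
}

# Serialize the event to a JSON string, the format the tutorial says the data uses.
payload = json.dumps(event)
print(payload)
```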

The Cloud Object Storage (COS) operator stores the sample data. We use COS because it provides cloud storage for massive amounts of unstructured data.

Preview

Watch this video to see how to create and run a simple streams flow with the canvas by using sample data.

Figure 1. Create a streams flow with the canvas that uses sample data.
This video shows you how to create a streams flow with the canvas by using sample data.

   


Now it’s your turn - try out the following tutorial steps in your own environment.

Prerequisites

You must have a COS instance and a Streaming Analytics service instance that are associated with the project where the streams flow runs. To check, open the project, and then click the Settings tab.

Settings tab of a project

  • In the Storage section of the page, check that a COS instance is listed.
  • In the Associated Services section of the page, check that a Streaming Analytics service is listed.

To provision either instance, go to your account in IBM Cloud Dashboard. Click Create resource, and then follow the prompts.

Watch this video to see how to provision the services necessary to create, edit, and run a streams flow.

Figure 2. Provision the prerequisite services to create, edit, and run a streams flow
This video demonstrates how to provision the prerequisite IBM Cloud services.

   

Create an empty streams flow

To get acquainted with the canvas, do the following steps:

  1. From the Projects menu, click View All Projects.

    View all projects

  2. Click the name of the project where you want to put the new streams flow.

  3. In the Project page, click the Assets tab. In the Streams flow area, click New streams flow.

  4. In the New Streams Flow window, click the Blank tab.

    Blank tab

  5. In the Blank tab of the New Streams Flow window, do the following steps:

    • In the Name field, type in a name for the streams flow. Use Simple Stream Flow in Canvas.
    • In the Description field, type in some text to describe the new streams flow. Type in This is a simple streams flow to get acquainted with the canvas. It uses Data Historian sample data and sends to COS for storage.
    • In the Streaming Analytics service list, the Streaming Analytics service that is associated with the project is already selected.
    • In the Select Example area, click the Manually box, and then click Create. The canvas opens.

   

Explore the canvas

The canvas has several distinctive features to help you to design a streams flow that better suits your analytic needs.

The following screen capture shows the canvas of a new streams flow.

Entire screen

   

Taskbar

At the top of the canvas, note the taskbar with various icons.

Taskbar

Hover your mouse pointer over any of the icons to see what each one does. An icon that cannot be used is disabled.

We’ll try out the icons after we create a streams flow.

   

Canvas palette

A palette with operators is on the left side of the canvas. The operators are grouped by type.

Canvas palette

The groups are Sources, Targets, Processing and Analytics, and Alerts. To list the operators in each group, click the arrow next to the group name.

To collapse or open the canvas palette, click the Palette icon in the taskbar.

   

Canvas area

The center of the canvas area is where you design the streams flow and make any changes or corrections to it.

In a new canvas, we list some initial tips to get you started. You can close that graphic if you want by clicking X in the upper-right corner of the graphic.

Design a streams flow

  1. In the canvas palette, click Sources to open the Sources group. Drag the Sample Data operator from the pane and drop it in the canvas area.

  2. In the canvas palette, click Targets to open the Targets group. Drag the Cloud Object Storage operator from the pane and drop it in the canvas area.

  3. Link the two operators together by dragging your mouse cursor from the output port of the Sample Data operator to the input port of the Cloud Object Storage operator.

    Linking operators

Did you notice the Notifications icon in the upper-right corner of the canvas? All actions are verified in the canvas. If an error occurs, you are notified immediately so that you can correct the problem.
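To make the idea of immediate verification concrete, here is a minimal conceptual sketch in Python. It is not how the canvas is implemented. The `Operator` class, the `REQUIRED` table, and the property names `topic` and `connection` are assumptions that simply mirror the two errors this tutorial goes on to fix.

```python
from dataclasses import dataclass, field

# Hypothetical required properties per operator type, based only on the
# errors that this tutorial corrects; the real operators may require more.
REQUIRED = {
    "Sample Data": ["topic"],
    "Cloud Object Storage": ["connection"],
}

@dataclass
class Operator:
    name: str
    kind: str
    properties: dict = field(default_factory=dict)

def validate(operators):
    """Return one (operator name, message) pair per missing required property."""
    errors = []
    for op in operators:
        for prop in REQUIRED.get(op.kind, []):
            if not op.properties.get(prop):
                errors.append((op.name, f"{prop} is not set"))
    return errors

source = Operator("Sample Data", "Sample Data")
target = Operator("Cloud Object Storage", "Cloud Object Storage")
flow = [source, target]

errors_initial = validate(flow)           # both operators report an error
source.properties["topic"] = "Data Historian"
errors_after_topic = validate(flow)       # only the target still reports an error
target.properties["connection"] = "my-cos-instance"  # placeholder value
errors_final = validate(flow)             # no errors remain

print(errors_initial, errors_after_topic, errors_final)
```

Re-running the check after every change is what lets the error list shrink as each property is filled in.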

Looks like we have some errors, so let’s fix them now.

Correct the streams flow

  1. Click the Notifications icon to see the specific error message and the operator type. Notice that the operators also have a red error icon. This icon is helpful because you might have several operators of the same type, but only one of them is problematic. Which one must we fix?

    Error messages in canvas

  2. Click the first error message to open the Properties pane of the Sample Data operator.

    First error messages in canvas

    An error icon is next to the field that is problematic. In our case, we need to select a topic.

  3. Click Select topic, and then select Data Historian. We selected the Data Historian sample data for the streams flow.

  4. Click Edit Schema to customize the sample data itself.

    For example, suppose that we’re not interested in the indoor temperature and today’s rainfall. We remove the attributes “temperature” and “rainin”, and then save the new schema. Those two attributes are not sent on to Cloud Object Storage (COS) for storage.

  5. In the Edit Schema window, click Close.

    Notice that the Notification messages were immediately updated. One error remains in the COS operator.

  6. Click the error message to open the Properties pane of the COS operator.

    An error icon is next to the field that is problematic. In our case, we need to select a connection.

  7. In the Connection area, click Select, and then select your COS instance.

  8. In the File path field, click Data Assets. In the Select Data Asset window, select the bucket for your data. Click Select. Add a file name. Let’s use DH_%TIME.csv.

Tip: COS does not create file versions. The system variable %TIME appends the system time to the file name to create a unique file name for each new file. Otherwise, each new file overwrites the existing file.

  9. In the File Creation policy field, let’s select Number of events. In the Number of events field, let’s type in 100.

    The error icons are removed as each error is corrected. The OK icon in the upper-right corner of the canvas indicates that the streams flow is now properly configured.

  10. Click the Save icon to save all changes.
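The effect of the schema edit in the steps above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: `drop_attributes` is a hypothetical helper, and apart from the attribute names "temperature" and "rainin" used in this tutorial, the event contents are made up.

```python
def drop_attributes(event, removed):
    """Return a copy of event without the attributes named in removed."""
    return {key: value for key, value in event.items() if key not in removed}

# Hypothetical event; only "temperature" and "rainin" are names from this tutorial.
event = {"id": "station-1", "humidity": 62, "temperature": 70.2, "rainin": 0.0}

# Attributes removed from the schema are not passed on to the next operator.
trimmed = drop_attributes(event, {"temperature", "rainin"})
print(trimmed)
```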

   

Try out other icons in the taskbar

Before we run the streams flow, let’s look at some helpful features in the taskbar.

  • Click the Settings icon. The Settings pane opens to the right of the canvas. You can change the name of the streams flow, the text in its Description field, and the Streaming Analytics service instance that is associated with it. If needed, you can install Python libraries.

    Let’s change the name to be Simple_Stream_Flow_in_Canvas, and then click the Save icon. Note that the streams flow name is changed in the project breadcrumbs.

    Breadcrumbs

  • Check out the following icons in the taskbar.
    • Undo
    • Redo
    • Cut
    • Copy
    • Paste
    • Delete
  • Play with the canvas size by clicking the Zoom In, Zoom Out, and Fit to Screen icons.

  • Display and change properties of an operator by clicking that operator in the canvas. The Properties pane for the operator opens. Any changes that you make are automatically saved. Changes are immediately verified.

  • Change the name of the operator in the canvas by double-clicking the operator name, and then typing in the new name.

    Let’s change the name of the Sample Data operator to Data Historian Sample Data. Note that the displayed name is truncated to fit the operator. Hover your mouse pointer over the operator to see the full name.

    Changed operator name

  • Display schema fields of an operator by hovering the mouse pointer over the link. For example, hover your mouse pointer over the Data Historian Sample Data operator.

    Schema fields of operator

    You can see which attributes, and their data types, flow to the next operator. This feature can be a significant help when you use some of the analytics operators or when you need to change the data schema.

    Notice that the attributes “temperature” and “rainin” are not in the schema output.

We created a simple streams flow that uses Data Historian sample data. We corrected all errors, checked the schema of the sample data to be sure it can stream the information that we need, and we set up our COS storage to create a new file with a unique name after every 100 streamed events.
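The storage behavior that we configured (a new, uniquely named file after every 100 events) can be sketched in Python as follows. This is a conceptual model, not the COS operator's implementation. The functions `file_name` and `roll_files`, the timestamp format that replaces %TIME, and the handling of an incomplete final batch are all assumptions. A fake clock keeps the example deterministic.

```python
from datetime import datetime, timedelta, timezone

def file_name(pattern, moment):
    """Replace %TIME in pattern with a timestamp (the format is an assumption)."""
    return pattern.replace("%TIME", moment.strftime("%Y%m%d_%H%M%S"))

def roll_files(events, pattern="DH_%TIME.csv", events_per_file=100, clock=None):
    """Yield (file name, batch) each time events_per_file events have arrived."""
    clock = clock or (lambda: datetime.now(timezone.utc))
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == events_per_file:
            yield file_name(pattern, clock()), batch
            batch = []
    # Assumption: an incomplete final batch stays buffered and is not written.

# Fake clock that advances one minute per file, so names differ predictably.
start = datetime(2018, 1, 1, tzinfo=timezone.utc)
ticks = iter(start + timedelta(minutes=n) for n in range(10))

# 250 integers stand in for streamed events.
files = list(roll_files(range(250), clock=lambda: next(ticks)))
for name, batch in files:
    print(name, len(batch))

# Without %TIME the name never changes, so each new file would replace the last.
fixed_name = file_name("DH.csv", start)
```

In this model, 250 events produce two complete files, and the remaining 50 events wait for the batch to fill.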

   

Run the new streams flow

We’re ready to run our streams flow.

In the taskbar of the canvas, click the Metrics icon. Your new streams flow is automatically shown in the Metrics page. The Status indicator shows that it is in a “Stopped” state.

Click the Run icon to start the streams flow. The Status indicator changes to “Starting” and then to “Running”.

Notice that until the status is “Running”, the streams flow uses arrows to connect operators.

When the status is “Running”, you can see the data as it flows between operators. Put your mouse pointer over a data flow to get real-time metrics. Click the data flow to see the events flow to the next operator.

   

Summary

You just created a streams flow and used the Data Historian sample data. You started the streams flow in the Metrics page and saw the data flow between operators.

   

Learn more

Check out our other tutorials for streams flow.