Tutorial: Design and create a streams flow in the canvas
Learning objective
You learn how to design and create a simple streams flow in the canvas. We supply the Data Historian sample data.
This tutorial is an introduction to the canvas and how you can use it to customize a streams flow to suit your analytic needs. Other tutorials take an in-depth look at specific operators that are available in the canvas.
It takes approximately 15 minutes to finish.
Overview
We start with an empty canvas and introduce you to the various features.
In the canvas, you select two operators, Sample Data and Cloud Object Storage, and connect them to design a simple flow of data.
The Sample Data operator uses sample data that is taken from five weather stations. The data includes weather station ID, time zone, date in Coordinated Universal Time (UTC) format, latitude, longitude, temperature, barometric pressure, humidity, indoor temperature, and rainfall today.
The Cloud Object Storage (COS) operator stores the sample data. We use COS because it provides cloud storage for massive amounts of unstructured data.
Preview
Watch this video to see how to create and run a simple streams flow with the canvas by using sample data.
To complete this tutorial, follow these steps:
- Prerequisites
- Create an empty streams flow
- Explore the canvas
- Try out other icons in the taskbar
- Run the new streams flow
- Learn more
Now it’s your turn - try out the following tutorial steps in your own environment.
Prerequisites
You must have a COS instance and a Streaming Analytics service instance that are associated with the project where the streams flow runs. To check, open the project, and then click the Settings tab.
- In the Storage section of the page, check that a COS instance is listed.
- In the Associated Services section of the page, check that a Streaming Analytics service is listed.
To provision either instance, go to your account in IBM Cloud Dashboard. Click Create resource, and then follow the prompts.
Create an empty streams flow
To get acquainted with the canvas, perform the following steps:
-
From the Projects menu, click View All Projects.
-
Click the name of the project where you want to put the new streams flow.
-
In the Project page, click the Assets tab. In the Streams flow area, click New streams flow.
-
In the New Streams Flow window, click the Blank tab.
-
In the Blank tab of the New Streams Flow window, perform the following steps:
- In the Name field, type a name for the streams flow. Use “Simple Stream Flow in Canvas”.
- In the Description field, type some text to describe the new streams flow. Type “This is a simple streams flow to get acquainted with the canvas. It uses Data Historian sample data and sends to COS for storage.”
- In the Streaming Analytics service list, the Streaming Analytics service that is associated with the project is already selected.
- In the Select Example area, click the Manually box, and then click Create. The canvas opens.
Explore the canvas
The canvas has several distinctive features to help you to design a streams flow that better suits your analytic needs.
The following screen capture shows the canvas of a new streams flow.
Taskbar
At the top of the canvas, note the taskbar with various icons.
Hover your mouse pointer over any of the icons to see what each one does. An icon that cannot be used is disabled.
We’ll try out the icons after we create a streams flow.
Canvas palette
A palette with operators is on the left side of the canvas. The operators are grouped by type.
The groups are Sources, Targets, Processing and Analytics, and Alerts. To list the operators in each group, click the arrow next to the group name.
To collapse or open the canvas palette, click the Palette icon in the taskbar.
Canvas area
The center of the canvas area is where you design the streams flow and make any changes or corrections to it.
A new canvas shows some initial tips to get you started. To close that graphic, click X in its upper-right corner.
Design a streams flow
-
In the canvas palette, click Sources to open the Sources group. Drag the Sample Data operator from the pane and drop it in the canvas area.
-
In the canvas palette, click Targets to open the Targets group. Drag the Cloud Object Storage operator from the pane and drop it in the canvas area.
-
Link the two operators together by dragging your mouse cursor from the output port of the Sample Data operator to the input port of the Cloud Object Storage operator.
Did you notice the Notification icon in the upper-right corner of the canvas? All actions are verified in the canvas. If an error occurs, you are notified immediately so that you can correct the problem.
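The verification idea can be pictured with a small model. The following Python sketch is only a conceptual illustration, not how the canvas is implemented. The `Operator` class, the `REQUIRED` table, and the `validate` function are hypothetical names, and the required properties (`topic` for Sample Data, `connection` for Cloud Object Storage) are assumptions based on the two errors that this tutorial corrects.

```python
from dataclasses import dataclass, field

# Hypothetical table: properties that each operator type must have set.
REQUIRED = {
    "Sample Data": ["topic"],
    "Cloud Object Storage": ["connection"],
}

@dataclass
class Operator:
    kind: str
    properties: dict = field(default_factory=dict)

def validate(operators):
    """Return one error message per missing required property."""
    errors = []
    for op in operators:
        for name in REQUIRED.get(op.kind, []):
            if not op.properties.get(name):
                errors.append(f"{op.kind}: '{name}' is not set")
    return errors

# Two freshly dropped operators: nothing is configured yet.
flow = [Operator("Sample Data"), Operator("Cloud Object Storage")]
errors_before = validate(flow)

# Setting one property clears only the matching error.
flow[0].properties["topic"] = "Data Historian"
errors_after = validate(flow)
```

In this model, each change to an operator is followed by a fresh call to `validate`, which mirrors how the notifications are updated as soon as a property is set.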
Looks like we have some errors, so let’s fix them now.
Correct the streams flow
-
Click the Notification icon to see the specific error message and the operator type. Notice that the operators also have a red error icon. This icon is helpful because you might have several operators of the same type, but only one of them is problematic. Which one must we fix?
-
Click the first error message to open the Properties pane of the Sample Data operator.
An error icon is next to the field that is problematic. In our case, we need to select a topic.
-
Click Select topic, and then select Data Historian. We selected the Data Historian sample data for the streams flow.
-
Click Edit Schema to customize the sample data itself.
For example, suppose that we’re not interested in the indoor temperature and today’s rainfall. We remove the attributes “temperature” and “rainin”, and then save the new schema. Those two attributes are not sent on to Cloud Object Storage (COS) for storage.
-
In the Edit Schema window, click Close.
Notice that the Notification messages were immediately updated. One error remains in the COS operator.
-
Click the error message to open the Properties pane of the COS operator.
An error icon is next to the field that is problematic. In our case, we need to select a connection.
-
In the Connection area, click Select, and then select your COS instance.
-
In the File path field, click the icon. In the Select Data Asset window, select the bucket for your data. Click Select. Add a file name. Let’s use “DH_%TIME.csv”.
Tip: COS does not create file versions. The system variable %TIME appends the system time to the file name to create a unique file name for each new file. Otherwise, each new file overwrites the existing file.
-
In the File Creation policy field, let’s select Number of events. In the Number of events field, let’s type in “100”.
The error icons are removed as each error is corrected. The icon in the upper-right corner of the canvas indicates that the streams flow is now properly configured.
-
Click the Save icon to save all changes.
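As a rough mental model of the file name and file creation settings, the following Python sketch groups events into batches of 100 and expands `%TIME` into a timestamp for each file name. It is an illustration under stated assumptions, not the operator's implementation: the function names and the timestamp format are hypothetical, and the value that the COS operator actually substitutes for `%TIME` may look different.

```python
from datetime import datetime, timezone

def expand_file_name(pattern, now):
    """Replace the %TIME placeholder with a timestamp (format is assumed)."""
    return pattern.replace("%TIME", now.strftime("%Y%m%d_%H%M%S_%f"))

def batch_events(events, events_per_file):
    """Yield lists of at most events_per_file events, in arrival order."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == events_per_file:
            yield batch
            batch = []
    if batch:
        # Assumption: leftover events are emitted when the input ends.
        yield batch

# 250 made-up events split into files of 100 events each.
events = [{"id": i} for i in range(250)]
batches = list(batch_events(events, 100))

fixed_time = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
name = expand_file_name("DH_%TIME.csv", fixed_time)
```

Because every batch gets a name built from a different time, no file overwrites an earlier one, which is the point of the tip about `%TIME`.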
Try out other icons in the taskbar
Before we run the streams flow, let’s look at some helpful features in the taskbar.
-
Click the Settings icon. The Settings pane opens to the right of the canvas. You can change the name of the streams flow, the text in its Description field, and the Streaming Analytics service instance that is associated with it. If needed, you can install Python libraries.
Let’s change the name to “Simple_Stream_Flow_in_Canvas”, and then apply the change. Note that the streams flow name is changed in the project breadcrumbs.
- Check out the following icons in the taskbar.
- Undo
- Redo
- Cut
- Copy
- Paste
- Delete
-
Play with the canvas size by clicking the Zoom In, Zoom Out, and Fit to Screen icons.
-
Display and change properties of an operator by clicking that operator in the canvas. The Properties pane for the operator opens. Any changes that you make are automatically saved. Changes are immediately verified.
-
Change the name of the operator in the canvas by double-clicking the operator name, and then typing in the new name.
Let’s change the name of the Sample Data operator to “Data Historian Sample Data”. Note that the displayed name is truncated to fit the operator. Hover your mouse pointer over the operator to see the full name.
-
Display schema fields of an operator by hovering the mouse pointer over the link. For example, hover your mouse pointer over the Data Historian Sample Data operator.
You can see which attributes, along with their data types, flow to the next operator. This feature can be a significant help when you use some of the analytics operators or when you need to change the data schema.
Notice that the attributes “temperature” and “rainin” are not in the schema output.
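The effect of the edited schema can be pictured as a projection over each event. The following Python sketch is illustrative only: apart from `temperature` and `rainin`, which this tutorial names, the attribute names and values in the sample event are made up, and `project` is a hypothetical helper rather than part of the product.

```python
# Attributes that were removed from the schema in this tutorial.
REMOVED = {"temperature", "rainin"}

def project(event, removed=REMOVED):
    """Return a copy of event without the removed attributes."""
    return {key: value for key, value in event.items() if key not in removed}

# Made-up event; only "temperature" and "rainin" are names from the tutorial.
event = {
    "id": "station-1",
    "humidity": 40.0,
    "temperature": 21.5,
    "rainin": 0.0,
}
trimmed = project(event)
```

Only the attributes that remain in `trimmed` would reach the next operator. The original event is left unchanged.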
We created a simple streams flow that uses Data Historian sample data. We corrected all errors, checked the schema of the sample data to be sure it can stream the information that we need, and we set up our COS storage to create a new file with a unique name after every 100 streamed events.
Run the new streams flow
We’re ready to run our streams flow.
In the taskbar of the canvas, click the icon that opens the Metrics page. Your new streams flow is automatically shown in the Metrics page. The Status indicator shows that it is in a “Stopped” state.
Click the Run icon to start the streams flow. The Status indicator changes to “Starting” and then to “Running”.
Notice that until the status is “Running”, the streams flow uses arrows to connect operators.
When the status is “Running”, you can see the data as it flows between operators. Put your mouse pointer over a data flow to get real-time metrics. Click the data flow to see the events flow to the next operator.
Summary
You just created a streams flow and used the Data Historian sample data. You started the streams flow in the Metrics page and saw the data flow between operators.
Learn more
Check out our other tutorials for streams flow.