Data types

The following sections describe the various data types and their valid values:

Value formats

This section describes how data is parsed and formatted, depending on its data type.

Parsing values from JSON

The following table gives examples of valid values for the supported data types. Messages with invalid values are skipped and reported in the Notifications pane of the Metrics page. This data type handling applies to all Source operators that output data in JSON format, for example, HTTP, Event Streams, Kafka, MQTT, and Watson IoT.

For valid output values in Code operators, see section Expected Code operator output values.

Flow column data type | Valid value examples | Default when the value is missing
Text | "Hello World" | "" (empty string)
Number | 12, -7, 3.14, "12", "-7", "3.14" | 0 (zero)
Date | "2018-03-26T11:38:47", "2018 03 26 11:38:47", "2018-03-26T11:38:47.123456". For more information, see section Dates. | No default value, because a missing Date value is not valid. Messages with missing or empty Date values are skipped and reported in the Notifications pane of the Metrics page.
Boolean | true, false, "true", "false". Note: These values are the only valid Boolean values, and they are case-sensitive. | false
Binary | Not supported in JSON parsing. You can ingest binary data as raw data by selecting "None" as the parsing option, where available. | N/A
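
The parsing rules in the table can be illustrated with a short Python sketch. It is not the parser that the streams flow runtime uses. The function name parse_value and the type labels are hypothetical, treating a JSON null like a missing value is an assumption, and Date values are only checked for being missing or empty.

```python
import json

# Defaults used when a value is missing, taken from the table above.
# Types without an entry here (such as Date) have no default value.
DEFAULTS = {"Text": "", "Number": 0, "Boolean": False}


def parse_value(column_type, value):
    """Return the parsed value, or raise ValueError if the message must be skipped."""
    missing = value is None or (column_type == "Date" and value == "")
    if missing:
        if column_type not in DEFAULTS:
            raise ValueError(f"missing {column_type} value is not valid")
        return DEFAULTS[column_type]
    if column_type == "Number":
        if isinstance(value, bool):
            raise ValueError(f"not a valid Number: {value!r}")
        try:
            # Accepts 12, -7, 3.14 and their quoted forms "12", "-7", "3.14".
            return float(value)
        except (TypeError, ValueError):
            raise ValueError(f"not a valid Number: {value!r}") from None
    if column_type == "Boolean":
        if isinstance(value, bool):
            return value
        # Only the exact, case-sensitive strings are accepted.
        if value == "true":
            return True
        if value == "false":
            return False
        raise ValueError(f"not a valid Boolean: {value!r}")
    return value


message = json.loads('{"name": "Hello World", "count": "12", "active": "true"}')
parsed = {
    "name": parse_value("Text", message.get("name")),
    "count": parse_value("Number", message.get("count")),
    "active": parse_value("Boolean", message.get("active")),
    "ratio": parse_value("Number", message.get("ratio")),  # missing, so the default is used
}
```

In this sketch, a ValueError stands in for a message that is skipped and reported in the Notifications pane.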

Serializing values to JSON and CSV

Formatted values are in quotation marks (“) only for data types Text and Date. The following table shows examples:

Flow column data type | Output value examples
Text | "Hello World"
Number | 12, -7, 3.14
Date (*) | "2018-03-26T15:38:47" (with no split seconds), "2018-03-26T15:38:47.123456"
Boolean | true, false
Binary | Not supported in serializing. You can write binary data by selecting "None" as the format option, where available. When selecting "None", there must only be one event attribute entering the target operator.

(*) For more information about inserting Date values into schema-less or schema-based Target operators, see Formatting Date output for Target operators.
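
The quoting rule for JSON output can be reproduced with the Python standard library. The following is a minimal sketch, not the serializer of the runtime. The helper name to_json_value and the attribute names of the sample event are hypothetical.

```python
import datetime
import json


def to_json_value(value):
    """Convert datetime values to ISO 8601 text; pass other values through."""
    if isinstance(value, datetime.datetime):
        # isoformat() omits the fractional part when microsecond is 0.
        return value.isoformat()
    return value


event = {
    "text": "Hello World",
    "number": 3.14,
    "date": datetime.datetime(2018, 3, 26, 15, 38, 47),
    "flag": True,
}
line = json.dumps({name: to_json_value(value) for name, value in event.items()})
print(line)
# {"text": "Hello World", "number": 3.14, "date": "2018-03-26T15:38:47", "flag": true}
```

Only the Text and Date values end up in quotation marks; the Number and Boolean values are written as bare JSON literals.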

Input and output values in Code operators

Data values at Code operator input

The following table describes the data type mapping between flow columns and Python dictionary values at Code operator inputs. It applies to the Code operator in Sources and Targets, and to the Python Model operator in Processing and Analytics.

Flow column data type | Mapped to Python dictionary value data type | Python value examples
Text | str | "Hello World"
Number | float | -2.5, 3.14
Number | int | -5, 451
Date | datetime.datetime | datetime.datetime object, for example datetime.datetime(2018, 3, 27, 10, 42, 0, 892960)
Boolean | bool | True, False
Binary | memoryview | <memory at 0x103efc708>
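
To make the mapping concrete, the following sketch builds a dictionary that resembles what a Code operator might receive and lists the Python type of each value. The dictionary is constructed by hand for illustration; the attribute names and the helper python_types are hypothetical.

```python
import datetime


def python_types(event):
    """Return the Python type name of every value in an input dictionary."""
    return {name: type(value).__name__ for name, value in event.items()}


# Hand-built stand-in for an input dictionary, one attribute per data type.
event = {
    "message": "Hello World",                                          # Text
    "temperature": -2.5,                                               # Number (float)
    "count": 451,                                                      # Number (int)
    "time_stamp": datetime.datetime(2018, 3, 27, 10, 42, 0, 892960),   # Date
    "active": True,                                                    # Boolean
    "payload": memoryview(b"HI"),                                      # Binary
}
types = python_types(event)

# A memoryview can be copied into a bytes object when needed.
payload_bytes = bytes(event["payload"])
```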

Expected Code operator output values

This section applies to the Code operator in Sources, and to the Code operator and the Python Model operator in Processing and Analytics.

The following table gives examples of valid output values, by data type. Outputs with invalid values are skipped and reported in the Notifications pane of the Metrics page.

Flow column data type | Output value data type | Valid output value examples | Default value
Text | str. Any non-string values are auto-converted to strings. | "Hello World", "78". Also valid: 78 (converts to "78"), True (converts to "True") | "" (empty string)
Number | int | 0, -4, 451 | 0
Number | float | -3.14, 0.0, 17.9 | 0.0
Date | datetime.datetime | datetime.datetime object, for example: datetime.datetime.now(); datetime.datetime(2018, 3, 27, 10, 42, 0, 892960); datetime.datetime.strptime("2018-03-27T07:58:35", '%Y-%m-%dT%H:%M:%S') | datetime.datetime(1970, 1, 1, 0, 0, 0)
Boolean | bool. Any non-bool value except 0, None, empty strings, and empty objects is auto-converted to True. | True, False. The following values are also accepted: "true" (converts to True), "False" (converts to True) (*), "go" (converts to True), 451 (converts to True), "" (converts to False), "0" (converts to False) | False
Binary | bytes, memoryview | bytes.fromhex('48 49'), memoryview(b'HI') | Empty binary with no elements

(*) False is a Boolean literal, but "False" is a string literal. Any non-empty string is interpreted as True, whether it’s “True”, “False”, or “yes sir”.
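
Because a non-empty string such as "False" is interpreted as True, text that is meant as a Boolean is best converted explicitly before it is written to a Boolean output column. The following is a minimal sketch of such a conversion; the helper name text_to_bool and the choice to ignore case and surrounding whitespace are assumptions, not part of the product.

```python
def text_to_bool(text):
    """Convert the words 'true' and 'false' to bool values; reject anything else."""
    normalized = text.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f"not a Boolean text value: {text!r}")


flag_from_text = text_to_bool("False")   # explicit conversion gives False
flag_from_truthiness = bool("False")     # Python truthiness of a non-empty string gives True
```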

When your output dictionary has no entry for an output column, or when its value is None, then the streaming runtime acts in the following way:

  • If a same-name input column with the same column data type exists, its value is used (the value is passed through).

  • Otherwise, the default value for the output column’s data type is used.
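
The two rules above can be sketched as a small lookup function. This is an illustration of the documented behavior only, not runtime code. The function resolve_output, the type labels, and the input_types dictionary that describes the input schema are hypothetical, and the sketch uses 0 and an empty bytes object as the Number and Binary defaults.

```python
import datetime

# Default values per data type, as listed in the table of output values.
TYPE_DEFAULTS = {
    "Text": "",
    "Number": 0,
    "Date": datetime.datetime(1970, 1, 1, 0, 0, 0),
    "Boolean": False,
    "Binary": b"",
}


def resolve_output(column, column_type, output, input_event, input_types):
    """Pick the value for one output column from the output dictionary."""
    value = output.get(column)
    if value is not None:
        return value
    # No entry, or the entry is None: pass through a same-name input
    # column, but only if it has the same data type.
    if column in input_event and input_types.get(column) == column_type:
        return input_event[column]
    # Otherwise fall back to the default of the output column's data type.
    return TYPE_DEFAULTS[column_type]


input_event = {"id": "abc", "count": 3}
input_types = {"id": "Text", "count": "Number"}
output = {"count": None}

passed_through = resolve_output("id", "Text", output, input_event, input_types)
none_replaced = resolve_output("count", "Number", output, input_event, input_types)
defaulted = resolve_output("label", "Text", output, input_event, input_types)
```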

Numbers

  • When Number column values are parsed from JSON messages of Source operators such as Event Streams or Watson IoT, the values are always mapped to float values in input dictionaries.

  • When Number column values come from other operators, they can be mapped to Python float or int values, depending on the operator logic. For example, the count function in the Aggregation operator produces values of type int.

  • In the Code operator, you can output either int or float values for output columns of type Number.
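
Because a Number can arrive as either int or float, code that needs a whole number (for example, to use it as a counter) should not assume one of the two types. The following sketch shows one possible way to handle both; the helper name to_whole_number is hypothetical.

```python
def to_whole_number(value):
    """Return a Number value as int, or raise ValueError if it has a fractional part."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"not a Number value: {value!r}")
    if isinstance(value, float) and not value.is_integer():
        raise ValueError(f"not a whole number: {value!r}")
    return int(value)


from_json = to_whole_number(12.0)        # parsed from JSON, so it arrives as float
from_aggregation = to_whole_number(451)  # for example a count, which arrives as int
```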

Dates

The following traits apply to Date values:

  • Date values express a point in time, and as such, include date and time information similar to a UNIX timestamp.

  • Split-second precision of Date values is up to microseconds (up to 6 digits after the decimal point).

  • Date values do not carry time zone information.

Parsing dates from Source operator messages

  • When parsing data that comes from Source operators such as IBM Event Streams or IBM Watson IoT, a streams flow assumes that all Date values are given in ISO 8601 format. The streams flow normalizes the date values internally to Coordinated Universal Time (UTC).

  • Separator characters other than -, T, and : are also accepted. For example, the date might use a space as a separator:
    2018-03-26 15:38:47.123456

  • This parsing applies to all Source operators except Code (in Sources).

The following table gives examples of supported formats:

Format example | Comments | Internal time representation
2018-03-26T11:38:47 | No time zone. Up to full seconds. | 11:38:47 (*)
2018-03-26T11:38:47.123456 | No time zone. With microseconds. | 11:38:47.123456 (*)
2018-03-26T11:38:47.123 | No time zone. With split seconds other than microseconds. | 11:38:47.123 (*)
2018-03-26T11:38:47+01:00 | With time zone. The time zone is UTC+1, so it is normalized to UTC. | 10:38:47 (the time in UTC when it’s 11:38:47 in a time zone that’s UTC+01:00)
2018-03-26T11:38:47-01:00 | With time zone. The time zone is UTC-1, so it is normalized to UTC. | 12:38:47 (the time in UTC when it’s 11:38:47 in a time zone that’s UTC-01:00)
2018-03-26T11:38:47.123456-05:00 | With time zone and microseconds. The time zone is UTC-5, so it is normalized to UTC. | 16:38:47.123456 (the time in UTC when it’s 11:38:47 in a time zone that’s UTC-05:00)

(*) When no time zone is indicated, the time is assumed to be given as UTC.
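
The normalization shown in the table can be reproduced with the Python datetime module. The following is a sketch of the arithmetic only, not the parser of the streams flow; the function name to_internal_utc is hypothetical, and datetime.datetime.fromisoformat requires Python 3.7 or later.

```python
import datetime


def to_internal_utc(text):
    """Parse an ISO 8601 date and return it as a naive datetime in UTC."""
    parsed = datetime.datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # No time zone given: the value is assumed to be UTC already.
        return parsed
    # Shift to UTC, then drop the time zone information.
    return parsed.astimezone(datetime.timezone.utc).replace(tzinfo=None)


plain = to_internal_utc("2018-03-26T11:38:47")
plus_one = to_internal_utc("2018-03-26T11:38:47+01:00")
minus_five = to_internal_utc("2018-03-26T11:38:47.123456-05:00")
print(plain.time(), plus_one.time(), minus_five.time())
# 11:38:47 10:38:47 16:38:47.123456
```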

Tip

When you process dates from multiple time zones, their textual representation in source messages should ideally include the respective time zone offset. If it does not, you can use a Code operator to enhance the Date values with time zone information that is not explicit in the source. You can also use a Code operator to parse formats other than ISO 8601.
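
As an illustration of this tip, the following sketch parses a date text that is not in ISO 8601 format and applies a UTC offset that is known from elsewhere, for example from the location of the device. The function name local_text_to_utc, the sample format, and the offset are hypothetical.

```python
import datetime


def local_text_to_utc(text, date_format, utc_offset_hours):
    """Parse local time text and return it as a naive datetime in UTC."""
    local = datetime.datetime.strptime(text, date_format)
    zone = datetime.timezone(datetime.timedelta(hours=utc_offset_hours))
    # Attach the known offset, shift to UTC, then drop the time zone again,
    # because Date values do not carry time zone information.
    return local.replace(tzinfo=zone).astimezone(datetime.timezone.utc).replace(tzinfo=None)


# A day/month/year text from a source that is known to be at UTC-05:00.
utc_value = local_text_to_utc("26/03/2018 11:38:47", "%d/%m/%Y %H:%M:%S", -5)
```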

Formatting Date output for Target operators

  • Outputting Date values to schema-less Target operators, such as Event Streams, Redis, or Cloud Object Storage

    The values are formatted according to ISO 8601:

    • 2018-03-26T11:38:47Z (value has no split seconds)

    • 2018-03-26T11:38:47.123400Z (value contains split seconds)
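
The two output forms can be reproduced in Python as follows. This is a sketch that assumes a naive datetime value that is already in UTC; the function name format_date_utc is hypothetical and is not part of the product.

```python
import datetime


def format_date_utc(value):
    """Format a naive UTC datetime as ISO 8601 text with a trailing Z."""
    # isoformat() writes six fractional digits only when microsecond is not 0.
    return value.isoformat() + "Z"


without_fraction = format_date_utc(datetime.datetime(2018, 3, 26, 11, 38, 47))
with_fraction = format_date_utc(datetime.datetime(2018, 3, 26, 11, 38, 47, 123400))
print(without_fraction)  # 2018-03-26T11:38:47Z
print(with_fraction)     # 2018-03-26T11:38:47.123400Z
```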

Binary data

  • To ingest Binary data from source operators that output a single data attribute, select the parsing option “None”. The data is then streamed as raw data, without the built-in parsing. The following table gives examples of such operators:

    Source operator | Data attribute name
    Event Streams, Kafka | event_message
    HTTP | http_body
  • To write Binary data in target operators that usually need a format setting for serializing the incoming data, set the format option to “None” and make sure that the operator’s output schema has exactly one attribute. The data is then passed as raw data, without the built-in serialization. An example of such an operator is the Event Streams target operator.

  • In operators that receive or output structured data as typed attributes, the attributes can be of type Binary. Examples of such operators are Code and Streams.
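
In Python code, Binary values arrive as memoryview objects and can be output as bytes or memoryview objects. The following sketch shows the conversions with standard Python only; the helper name binary_to_bytes and the sample payload are hypothetical.

```python
def binary_to_bytes(value):
    """Copy a Binary value (memoryview or bytes) into a bytes object."""
    return bytes(value)


incoming = memoryview(b"HI")           # stand-in for a Binary input value
raw = binary_to_bytes(incoming)        # b"HI"
text = raw.decode("ascii")             # "HI"

outgoing = bytes.fromhex("48 49")      # the same two bytes, built from hex digits
```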