Time series functions are aggregate functions that operate on sequences of data values measured at points in time.
The following sections describe some of the time series functions available in different time series packages.
Transforms
Transforms are functions that are applied to a time series and result in another time series. The time series library supports various types of transforms, including provided transforms (imported using from tspy.functions import transformers) as well as user-defined transforms.
The following sample shows some provided transforms:
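Independently of the library API, the core idea of a transform (one time series in, another time series out) can be sketched in plain Python. This is a minimal illustration, not tspy code: the Observation alias and the helpers difference and map_values are hypothetical names introduced only for this sketch.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a time series: a list of (timestamp, value) pairs.
Observation = Tuple[int, float]


def difference(series: List[Observation]) -> List[Observation]:
    """Transform: each value minus the previous value (the first observation is dropped)."""
    return [(t, v - prev_v) for (_, prev_v), (t, v) in zip(series, series[1:])]


def map_values(series: List[Observation], fn: Callable[[float], float]) -> List[Observation]:
    """User-defined transform: apply fn to every value and keep the timestamps."""
    return [(t, fn(v)) for t, v in series]


series = [(1, 1.0), (2, 2.0), (6, 6.0)]
diffed = difference(series)
doubled = map_values(series, lambda v: v * 2)
print(diffed)
print(doubled)
```

In both cases the result is itself a list of (timestamp, value) pairs, so transforms of this kind can be chained.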
Segmentation
Segmentation or windowing is the process of splitting a time series into multiple segments. The time series library supports various forms of segmentation and allows creating user-defined segments as well.
Window based segmentation
This type of segmentation splits a time series based on user-specified segment sizes. The segments can be record based or time based. Options allow for creating tumbling as well as sliding window based segments.
>>> import tspy
>>> # Build a small time series from three (timestamp, value) observations
>>> ts_orig = (tspy.builder()
...     .add(tspy.observation(1,1.0))
...     .add(tspy.observation(2,2.0))
...     .add(tspy.observation(6,6.0))
...     .result().to_time_series())
>>> ts_orig
timestamp: 1 Value: 1.0
timestamp: 2 Value: 2.0
timestamp: 6 Value: 6.0
>>> # Sliding time windows: each segment spans 3 time ticks and starts 1 tick after the previous one
>>> ts = ts_orig.segment_by_time(3,1)
>>> ts
timestamp: 1 Value: original bounds: (1,3) actual bounds: (1,2) observations: [(1,1.0),(2,2.0)]
timestamp: 2 Value: original bounds: (2,4) actual bounds: (2,2) observations: [(2,2.0)]
timestamp: 3 Value: this segment is empty
timestamp: 4 Value: original bounds: (4,6) actual bounds: (6,6) observations: [(6,6.0)]
Anchor based segmentation
Anchor based segmentation creates a segment by anchoring on a specific lambda, which can be a simple value. Examples include looking at the events that preceded a 500 error or examining the values that follow an observed anomaly. Variants of anchor based segmentation include providing a range with multiple markers.
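The "events that preceded a 500 error" case can be sketched in plain Python. This is an illustration of the idea only, not the library's implementation: the helper anchor_segments and its before and after parameters are hypothetical names chosen for this sketch.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a time series of HTTP status codes: (timestamp, code) pairs.
Observation = Tuple[int, int]


def anchor_segments(
    series: List[Observation],
    is_anchor: Callable[[int], bool],
    before: int,
    after: int,
) -> List[List[Observation]]:
    """For every observation whose value satisfies is_anchor, collect the
    observations whose timestamps lie in [anchor - before, anchor + after]."""
    segments = []
    for t_anchor, value in series:
        if is_anchor(value):
            segments.append(
                [(t, v) for t, v in series if t_anchor - before <= t <= t_anchor + after]
            )
    return segments


status_codes = [(1, 200), (2, 200), (3, 404), (4, 500), (5, 200), (8, 500)]
# Anchor on 500 errors and keep the two time ticks leading up to each one.
segments = anchor_segments(status_codes, lambda code: code == 500, before=2, after=0)
for segment in segments:
    print(segment)
```

Setting before to 0 and after to a positive number gives the other case mentioned above, examining values after an anchor.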
Several specialized segmenters are provided out of the box in the segmenters package (imported using from tspy.functions import segmenters). An example segmenter is one that uses regression to segment a time series:
Reducers
A reducer is a function that is applied to the values across a set of time series to produce a single value. The time series reducer functions are similar to the reducer concept used by Hadoop/Spark. The result can be a collection but is, more generally, a single object. For example, averaging the values in a time series is a reducer.
Several reducer functions are supported, including:
Distance reducers
Distance reducers are a class of reducers that compute the distance between two time series. The library supports numeric as well as categorical distance functions on sequences. These include time warping distance measurements such as Itakura Parallelogram, Sakoe-Chiba Band, DTW non-constrained, and DTW non-time warped constraints. Distribution distances such as Hungarian distance and Earth Mover's distance are also available.
For categorical time series distance measurements, you can use Damerau-Levenshtein and Jaro-Winkler distance measures.
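To make the time warping distances concrete, the following is a minimal plain-Python sketch of dynamic time warping (DTW) with an optional Sakoe-Chiba band. It is the textbook dynamic program, not the library's implementation; the function name dtw_distance and its band parameter are assumptions made for this sketch.

```python
import math
from typing import Optional, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float], band: Optional[int] = None) -> float:
    """DTW distance with absolute difference as the local cost.

    band=None means unconstrained DTW. Otherwise only cells with |i - j| <= band
    are explored (Sakoe-Chiba band); the band is widened to at least
    |len(a) - len(b)| so that the end of both sequences stays reachable.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    width = max(n, m) if band is None else max(band, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - width), min(m, i + width) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


x = [1.0, 1.0, 1.0, 1.0, 5.0]
y = [1.0, 5.0, 5.0, 5.0, 5.0]
unconstrained = dtw_distance(x, y)
banded = dtw_distance(x, y, band=1)
print(unconstrained, banded)
```

The band limits how far the alignment may stray from the diagonal, so the banded distance can never be smaller than the unconstrained one.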
Math reducers
Several convenient math reducers for numeric time series are provided. These include basic ones such as average, sum, standard deviation, and moments. Entropy, kurtosis, FFT and its variants, various correlations, and histogram are also included.
Another basic reducer, useful for getting a first-order understanding of a time series, is the describe reducer, which provides basic summary information about the time series. The following illustrates this reducer:
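As a library-independent illustration of what a summarization reducer computes, the sketch below reduces a list of (timestamp, value) pairs to a single dictionary using only the Python standard library. The function name summarize and the fields it reports are assumptions for this sketch and do not reproduce the output of the library's describe reducer.

```python
import statistics
from typing import Dict, List, Tuple

Observation = Tuple[int, float]  # hypothetical (timestamp, value) pair


def summarize(series: List[Observation]) -> Dict[str, float]:
    """Reduce a whole time series to one summary object."""
    if not series:
        raise ValueError("series must contain at least one observation")
    timestamps = [t for t, _ in series]
    values = [v for _, v in series]
    # Spacing between consecutive timestamps, to show how regular the sampling is.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return {
        "count": len(values),
        "first_timestamp": timestamps[0],
        "last_timestamp": timestamps[-1],
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "mean_gap": statistics.fmean(gaps) if gaps else 0.0,
    }


summary = summarize([(1, 1.0), (2, 2.0), (6, 6.0)])
print(summary)
```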
Temporal joins
The library includes functions for temporal joins, that is, joining time series based on their timestamps. The join functions are similar to those in a database and include left, right, outer, inner, left outer, and right outer joins, among others. The following sample code shows some of these join functions:
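The database analogy can be sketched in plain Python by treating the timestamp as the join key. This is an illustration under stated assumptions, not the library's join API: the helper temporal_join and its how parameter are hypothetical, and a missing side is represented as None.

```python
from typing import List, Optional, Tuple

Observation = Tuple[int, float]  # hypothetical (timestamp, value) pair
Joined = Tuple[int, Tuple[Optional[float], Optional[float]]]


def temporal_join(left: List[Observation], right: List[Observation], how: str = "inner") -> List[Joined]:
    """Join two time series on exactly matching timestamps."""
    left_map, right_map = dict(left), dict(right)
    if how == "inner":
        keys = left_map.keys() & right_map.keys()
    elif how == "left":
        keys = set(left_map)
    elif how == "right":
        keys = set(right_map)
    elif how == "outer":
        keys = left_map.keys() | right_map.keys()
    else:
        raise ValueError(f"unsupported join type: {how}")
    # A timestamp that is absent from one side yields None for that side.
    return [(t, (left_map.get(t), right_map.get(t))) for t in sorted(keys)]


left = [(1, 1.0), (2, 2.0), (6, 6.0)]
right = [(2, 20.0), (3, 30.0), (6, 60.0)]
inner = temporal_join(left, right, "inner")
left_joined = temporal_join(left, right, "left")
outer = temporal_join(left, right, "outer")
print(inner)
print(left_joined)
print(outer)
```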
Forecasting
Forecasting is a key capability of the time series library. The library includes functions for simple as well as complex forecasting models, including ARIMA, Exponential, Holt-Winters, and BATS. The following example shows the function to create a Holt-Winters model:
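Independently of the library API, the idea behind this family of models can be sketched in plain Python. The example below is simple exponential smoothing, the level-only method that Holt-Winters extends with trend and seasonal components; it is not a Holt-Winters implementation and not the library's function. The name exponential_smoothing_forecast and the alpha parameter are assumptions for this sketch.

```python
from typing import Sequence


def exponential_smoothing_forecast(values: Sequence[float], alpha: float = 0.5) -> float:
    """One-step-ahead forecast from simple exponential smoothing.

    alpha in (0, 1] controls how strongly recent observations dominate:
    alpha = 1 reproduces the last observed value.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]
    for value in values[1:]:
        # Blend the newest observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level


history = [10.0, 12.0, 14.0]
forecast = exponential_smoothing_forecast(history, alpha=0.5)
print(forecast)
```

Because the method tracks only a level, its forecast lags behind a steadily rising series like this one, which is the gap that the trend component of Holt-Winters is meant to close.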
Time series SQL
The time series library is tightly integrated with Apache Spark. By using new data types in Spark Catalyst, you can perform time series SQL operations that scale out horizontally using Apache Spark. This lets you use time series extensions in IBM Analytics Engine or in solutions that include IBM Analytics Engine functionality, such as the watsonx.ai Studio Spark environments.
SQL extensions cover most aspects of the time series functions, including segmentation, transformations, reducers, forecasting, and I/O. See Analyzing time series data.