Detecting spikes and troughs in time series data

Spikes and troughs in time series data can be detected using a class of algorithms called change point detection algorithms [1], [2].

A Python package that implements several change point detection algorithms is ruptures.

If you want to start with a simple and interpretable algorithm before moving on to more advanced ones, the rolling z-score heuristic, also known as a Shewhart individuals control chart, could be a good start.

The rolling z-score heuristic works as follows:

  1. For each observation in the time series, compute the mean and standard deviation of the previous N observations, where N is the window size.
  2. If the observation is more than M (typically 3) standard deviations away from the mean, then report the observation as a change point.
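
The two steps above can be sketched in plain Python using only the standard library. This is a minimal illustration rather than a reference implementation: the function name rolling_zscore_flags and its parameters window and threshold are made up for this example, and it assumes the sample standard deviation is the one wanted.

```python
from statistics import mean, stdev

def rolling_zscore_flags(values, window=4, threshold=3.0):
    """Return (index, z-score) pairs for observations whose z-score,
    computed against the previous `window` observations, exceeds
    `threshold` in absolute value. Hypothetical helper for illustration."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]  # previous N observations only
        mu = mean(history)
        sigma = stdev(history)          # sample standard deviation
        if sigma == 0:
            # Constant history: the z-score is undefined, so this sketch skips it.
            continue
        z = (values[i] - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, z))
    return flagged

data = [0.8, 0.7, 0.9, 0.6, 0.4, 32.2, 31.9, 32.7]
print(rolling_zscore_flags(data))
```

On this sample data only index 5 (the jump to 32.2) is flagged. The values that follow are not, because the jump itself inflates the standard deviation of their history windows.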

A Python implementation of the rolling z-score heuristic for detecting spikes and troughs is given below.

import polars as pl

df = pl.DataFrame({'value': [0.8, 0.7, 0.9, 0.6, 0.4, 32.2, 31.9, 32.7]})

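# Rolling mean and std over a trailing window that includes the current observation.
# Note: this differs slightly from step 1 above. Instead of excluding the current
# observation, it is given a small weight (0.1) so that a spike has little
# influence on its own mean and std.
window = 4
weights = (window-1) * [1] + [0.1]
df = df.with_columns([
    pl.col('value').rolling_mean(window_size=window, weights=weights).alias("mean"),
    pl.col('value').rolling_std(window_size=window, weights=weights).alias("std")
])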

# Compute z-score: (x - mean) / std
df = df.with_columns(
    ((pl.col('value') - pl.col('mean')) / pl.col('std')).alias("zscore")
)

# Filter only spikes or troughs
z_thresh = 3
print(df.filter(pl.col('zscore').abs() > z_thresh))
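
The filter above uses the absolute z-score, so spikes and troughs are returned together. If they need to be told apart, the sign of the z-score can be used. A small sketch, where the helper name label_outlier, the labels, and the sample z-scores are all illustrative choices:

```python
def label_outlier(zscore, threshold=3.0):
    """Label a z-score as 'spike', 'trough', or None (hypothetical helper)."""
    if zscore is None:
        return None      # e.g. rows where the rolling window is incomplete
    if zscore > threshold:
        return "spike"   # far above the rolling mean
    if zscore < -threshold:
        return "trough"  # far below the rolling mean
    return None          # within the threshold: not an outlier

# Made-up z-scores, for illustration only
print([label_outlier(z) for z in (151.6, -4.2, 0.9, None)])
```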