Detecting point outliers in univariate time series data

Let $X = [x_0, x_1, \ldots, x_n]$ denote the univariate time series.

Approaches using thresholds

Z-score

\[outlier(x_i) = |x_i - \mu| > k\sigma\]

where $\mu$ is the mean of $X$, $\sigma$ is its standard deviation, and the constant $k$ is typically 2 or 3.
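
The rule above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `zscore_outliers`, the sample `series`, and the choice of the population standard deviation (`pstdev`) are assumptions made for the example, not part of the definition.

```python
from statistics import mean, pstdev


def zscore_outliers(xs: list[float], k: float = 3.0) -> list[bool]:
    """Return one flag per point: True where |x_i - mu| > k * sigma."""
    mu = mean(xs)
    # Assumption: population standard deviation; a sample estimate
    # (statistics.stdev) would be an equally reasonable choice.
    sigma = pstdev(xs)
    return [abs(x - mu) > k * sigma for x in xs]


# Hypothetical series with one obviously extreme value at the end.
series = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 50.0]
flags = zscore_outliers(series, k=2.0)
print(flags)
```

With `k=2.0` only the last point is flagged. In this small example `k=3.0` flags nothing, because the extreme value itself inflates $\sigma$; the mean and standard deviation are not robust to the outliers they are meant to detect.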

Modified z-score using median absolute deviation (MAD)

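\[outlier(x_i) = \left| 0.6745 \cdot \frac{x_i - median(X)}{MAD(X)} \right| > 3.5 \\ MAD(X) = median(|x_i - median(X)|)\]

where the factor 0.6745 is approximately the 0.75 quantile of the standard normal distribution, so that for normally distributed data the modified z-score is on a scale comparable to the ordinary z-score.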
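
A possible Python sketch of this rule follows, again using only the standard library. The function name `modified_zscore_outliers` and the sample `series` are hypothetical. The comparison is written without the division so that a MAD of zero does not raise an error; treating every point that differs from the median as an outlier in that case is an assumption of this sketch, not something the formula prescribes.

```python
from statistics import median


def modified_zscore_outliers(xs: list[float], threshold: float = 3.5) -> list[bool]:
    """Return one flag per point using the MAD-based modified z-score."""
    med = median(xs)
    mad = median([abs(x - med) for x in xs])
    # Equivalent to |0.6745 * (x - med) / mad| > threshold when mad > 0.
    # Assumption: when mad == 0, any point not equal to the median is flagged.
    return [0.6745 * abs(x - med) > threshold * mad for x in xs]


# Hypothetical series with one obviously extreme value at the end.
series = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 50.0]
flags = modified_zscore_outliers(series)
print(flags)
```

For this series the median is 10 and the MAD is 1, so the last point has a modified z-score of about 27 and is the only one flagged. Neither statistic is pulled towards the extreme value, which is the main reason to prefer this variant on small or contaminated samples.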

Interquartile range (IQR)

\[outlier(x_i) = x_i \notin [quartile_1(X) - k \cdot IQR(X), quartile_3(X) + k \cdot IQR(X)] \\ IQR(X) = quartile_3(X) - quartile_1(X)\]

where the constant $k$ is typically 1.5.
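
The IQR rule can be sketched as follows. Quartiles can be computed under several conventions that give slightly different values on small samples; this example assumes `statistics.quantiles` with `method="inclusive"` (Python 3.8+), and the names `iqr_outliers` and `series` are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(xs: list[float], k: float = 1.5) -> list[bool]:
    """Return one flag per point: True where x_i lies outside the IQR fences."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    # Assumption: the "inclusive" method; other quartile conventions
    # may move the fences slightly.
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [x < lower or x > upper for x in xs]


# Hypothetical series with one obviously extreme value at the end.
series = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 50.0]
flags = iqr_outliers(series)
print(flags)
```

Here the quartiles are 10 and 11, giving fences at 8.5 and 12.5, so only the last point falls outside the interval.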

Further reading

For a comprehensive review of point outlier detection in univariate time series data, see this paper.
