Enhancing the move-split-merge metric: efficient distance computation and median algorithms for time series
Loading...
Files
Date
Authors
Publisher
Supervisors
Abstract
The analysis of large time series data requires efficient algorithms. Time series data mining methods, such as classification and clustering, rely on the underlying distance measure to determine time series similarity. The accuracy and performance of these methods highly depend on the selected measure.
Lock-step measures, such as Euclidean distance, compare only matching indices, whereas elastic measures align points more flexibly.
Among elastic measures, the non-metric dynamic time warping distance (DTW) is the most widely used.
The elastic move-split-merge (MSM) distance, which transforms one time series into another using move, split, and merge operations with specific costs, is an alternative to DTW.
Unlike DTW, MSM is a metric that can be used for applications such as metric indexing or clustering within metric spaces.
Recent studies have shown that the MSM distance outperforms DTW in terms of the accuracy of the 1-NN classifier and k-means clustering.
Although DTW and MSM both have quadratic complexity, there exist fast versions of DTW.
The MSM algorithm computes the distance between two time series in quadratic time, even in best case instances.
Thus, the original algorithm is too slow to compete with other state-of-the-art measures, such as these faster DTW versions.
This thesis introduces methods to reduce the cost of computing MSM distances, including bounds that prune paths early in the dynamic programming table.
For constant time series, we provide a linear-time algorithm.
We also propose new linear-time heuristics, some of which rely on metric properties, and adapt DTW-based heuristics for MSM.
In particular, the runtime of our fastest MSM version is faster than that of a state-of-the-art DTW distance computation across the majority of widely used UCR classification datasets.
The second part of this thesis examines the MSM-Median problem, that is, computing a consensus time series under the MSM metric.
An accurate consensus time series is crucial for clustering, as it can serve as the cluster centroid.
We show that the MSM median is always composed of values from the input series, which is key to our MSM-Median algorithm.
Our algorithm runs faster than DTW-based competitors.
We further analyze the complexity of MSM-Median. MSM-Median is NP-hard, W[1]-hard for parameter k, and unsolvable in subexponential time unless exponential time hypothesis fails - even with just three distinct input values.
However, we show that MSM-Median is fixed-parameter tractable with respect to the total distance bound.
Moreover, we offer faster MSM-Median heuristics with minimal loss in accuracy, since exact computation is costly.
Review
Metadata
Contributors
Supervisor:
Dates
Issued: 2026-04-01
Faculty
FB12:Mathematik und Informatik
Language
en
Keywords
Time SeriesSimilarityMetricDistance ComputationMedian AlgorithmsMove-Split-Merge MetricAlgorithms
DFG-subjects
4.43-01 - Theoretische Informatik
show more
Drönner, Jana Lea: Enhancing the move-split-merge metric: efficient distance computation and median algorithms for time series. : 2026-04-01. DOI: https://doi.org/10.17192/openumr/659.
License
Except where otherwised noted, this item's license is described as Attribution-NonCommercial 4.0 International
