This repository contains the implementation of SMATE, presented in the following paper: SMATE: Semi-Supervised Spatio-Temporal Representation Learning on Multivariate Time Series (ICDM 2021). Check the full version here.
Learning from Multivariate Time Series (MTS) has attracted widespread attention in recent years. In particular, label shortage is a practical challenge for MTS classification, given the complex dimensional and sequential structure of the data. Unlike self-training and positive unlabeled learning, which rely on distance-based classifiers, we propose SMATE, a novel semi-supervised model that learns an interpretable spatio-temporal representation from weakly labeled MTS. We empirically validate the learned representation on 30 public datasets from the UEA MTS archive, comparing it with 13 state-of-the-art baselines for fully supervised tasks and four baselines for semi-supervised tasks. The results demonstrate the reliability and efficiency of the proposed method.
Keywords: Machine Learning, Multivariate Time Series, Semi-supervised Learning, Representation Learning
Figure 1: The architecture of SMATE
- graphviz=2.40.1
- keras=2.2.4
- Matplotlib=3.2.1
- numpy=1.16.4
- pandas=0.24.2
- pydot=1.4.1
- scikit-learn=0.21.2
- tensorflow=1.14.0 with CUDA 10.2
Dependencies can be installed using the following command:
```shell
pip install -r requirements.txt
```
Due to space constraints, we include only part of the UEA-MTS datasets in this repo. The full datasets are available at www.timeseriesclassification.com. We provide preprocessing code for the Weka-formatted ARFF files.
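The repo's own preprocessing handles the relational (multivariate) UEA layout; purely as an illustration of the ARFF format itself, here is a minimal parser sketch for a simple, non-relational ARFF file. The function name `parse_arff` and the flat numeric-columns-plus-label layout are assumptions for this sketch, not the repo's actual code.

```python
def parse_arff(text):
    """Parse a simple (non-relational) Weka ARFF file into
    (attribute_names, rows). Each row is a list of floats with the
    class label (last column) kept as a string."""
    attributes, rows = [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('%'):   # skip blanks and comments
            continue
        low = line.lower()
        if low.startswith('@attribute'):
            attributes.append(line.split()[1])  # second token is the name
        elif low.startswith('@data'):
            in_data = True                      # everything below is data
        elif in_data:
            values = line.split(',')
            rows.append([float(v) for v in values[:-1]] + [values[-1]])
    return attributes, rows


sample = """@relation demo
@attribute t1 numeric
@attribute t2 numeric
@attribute class {a,b}
@data
0.1,0.2,a
0.3,0.4,b
"""
names, rows = parse_arff(sample)
```

For the actual UEA files, each instance's channels are packed into a relational attribute, so the provided preprocessing script should be used instead of a flat parser like this one.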
```shell
python SMATE_classifier.py --ds_name DATASET_NAME
```
Fully supervised results on UEA-MTS archive (30 datasets)
Figure 2: Fully supervised results on UEA-MTS archive
Semi-supervised results on datasets from four different domains
Figure 3: Semi-supervised results on datasets from four different domains
Interpretability of the semi-supervised regularisation process & classification results
Figure 4: The t-SNE visualization of the representation space for the Epilepsy dataset, with 10% supervision.
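A visualization like Figure 4 can be reproduced with scikit-learn's t-SNE, which is already among the dependencies. The sketch below uses random stand-ins for the learned representations: the array `H` (shape `(n_samples, embedding_dim)`) and the label vector `y` are assumptions, not outputs of SMATE.

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib
matplotlib.use('Agg')  # headless-safe backend for saving the figure
import matplotlib.pyplot as plt

# Stand-ins for SMATE's learned representations and class labels.
rng = np.random.default_rng(0)
H = rng.normal(size=(60, 16))          # 60 samples, 16-d embeddings
y = rng.integers(0, 3, size=60)        # 3 hypothetical classes

# Project the representation space to 2-D for visual inspection.
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(H)

plt.scatter(emb[:, 0], emb[:, 1], c=y, cmap='viridis', s=15)
plt.title('t-SNE of the representation space')
plt.savefig('tsne_repr.png')
```

With the real model, `H` would be the encoder output for the test set and `y` the ground-truth labels, so well-separated clusters indicate a discriminative representation.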
Model efficiency
Figure 5: Training time with respect to: (a) training epochs; (b) TS length; (c) number of instances; (d) number of variables
If you find this repository useful in your research, please consider citing the following paper:
@inproceedings{zuo2021smate,
title={SMATE: Semi-Supervised Spatio-Temporal Representation Learning on Multivariate Time Series},
author={Zuo, Jingwei and Zeitouni, Karine and Taher, Yehia},
booktitle={2021 IEEE International Conference on Data Mining (ICDM)},
pages={1565--1570},
year={2021},
organization={IEEE}
}
The authors would like to thank Anthony Bagnall and his team for providing the community with valuable datasets and source code in the UEA & UCR Time Series Classification Repository.