DWT/DFT Feature Selection

When dealing with time series, i.e. measurements taken over time, large amounts of data accumulate quickly. Consecutive points in time are often not independent, because the measured values change rather slowly. The measurements are also usually noisy to some degree. These problems motivate the use of compression techniques that describe long time series with a few features capturing the basic shape of the time series.

Based on two popular signal processing methods, namely the Discrete Fourier Transform (DFT) and the Discrete Wavelet Transform (DWT), we developed a new feature extraction method. It describes time series well with only a few numbers, drastically reducing the complexity of further processing. Many existing methods were aimed at fast database access, not necessarily at a good representation of the time series. The method that is optimal with respect to reconstruction error, on the other hand, produces representations of time series that are hard to compare. Our method is a compromise between the two. It minimizes the reconstruction error subject to some global constraints, so the resulting features can be compared intuitively and traced back to shapes present in the original time series.
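The trade-off described above can be illustrated with a minimal sketch. It is not the downloadable Matlab code, and it rests on several assumptions: the "global constraints" are taken to mean that the same k coefficient positions are kept for every series, ranked by their energy summed over the whole data set; the traditional approach is taken to be keeping the first k coefficients; and an orthonormal Haar wavelet stands in for the DWT. All function names and the toy data are hypothetical.

```python
import math

def haar_dwt(signal):
    """Orthonormal Haar DWT; the length must be a power of two.
    Output order: [approximation, coarsest detail, ..., finest details]."""
    approx = [float(x) for x in signal]
    n = len(approx)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    root2 = math.sqrt(2.0)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        detail = [(a - b) / root2 for a, b in pairs]
        approx = [(a + b) / root2 for a, b in pairs]
        details = detail + details  # coarser levels are placed in front
    return approx + details

def first_k(coeff_rows, k):
    """Assumed traditional choice: the first k coefficients of every series."""
    return [list(range(k)) for _ in coeff_rows]

def best_k_per_series(coeff_rows, k):
    """Optimal per series: each series keeps its own k largest coefficients."""
    picks = []
    for row in coeff_rows:
        ranked = sorted(range(len(row)), key=lambda j: row[j] ** 2, reverse=True)
        picks.append(sorted(ranked[:k]))
    return picks

def best_k_shared(coeff_rows, k):
    """Assumed compromise: one set of k positions for all series,
    chosen by energy summed over the whole data set."""
    n = len(coeff_rows[0])
    energy = [sum(row[j] ** 2 for row in coeff_rows) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: energy[j], reverse=True)
    shared = sorted(ranked[:k])
    return [shared for _ in coeff_rows]

def total_squared_error(coeff_rows, picks):
    """For an orthonormal transform, the squared reconstruction error
    equals the energy of the dropped coefficients."""
    error = 0.0
    for row, kept in zip(coeff_rows, picks):
        kept = set(kept)
        error += sum(c * c for j, c in enumerate(row) if j not in kept)
    return error

# Toy data (hypothetical): three series with a bump in the same place.
series = [
    [1, 1, 1, 1, 5, 5, 1, 1],
    [2, 2, 2, 2, 6, 7, 2, 2],
    [0, 0, 0, 0, 3, 4, 0, 0],
]
k = 2
coeffs = [haar_dwt(s) for s in series]

err_first = total_squared_error(coeffs, first_k(coeffs, k))
err_shared = total_squared_error(coeffs, best_k_shared(coeffs, k))
err_optimal = total_squared_error(coeffs, best_k_per_series(coeffs, k))

# Shared positions give feature vectors that can be compared column by column.
shared_positions = best_k_shared(coeffs, k)[0]
features = [[row[j] for j in shared_positions] for row in coeffs]

print("first k:", err_first, "shared:", err_shared, "optimal:", err_optimal)
```

Under these assumptions the shared selection can never have a larger total error than keeping the first k coefficients, and never a smaller one than the per-series optimum. Unlike the per-series optimum, every series is described by the same coefficient positions, which is what makes the features directly comparable.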

You can download the code for Matlab and try it on your own data.

This research was performed under the supervision of Prof. Dr. Alfred Ultsch in the Databionics Research Group, Philipps-University of Marburg, Germany.

Mörchen, F.: Time series feature extraction for data mining using DWT and DFT, Technical Report No. 33, Dept. of Mathematics and Computer Science, University of Marburg, Germany (2003)

The picture compares, for one example, the reconstruction error of the new method (green) with that of the traditional method (red dots) and of the optimal method (black and blue for two variants), showing how close the new method comes to the optimum. Note that the results differ tremendously depending on the data used.