Full metadata

Title

Modeling time series data for supervised learning

Description

Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provides a high-dimensional data vector that challenges the learning of the relevant patterns. This dissertation proposes TS representations and methods for supervised TS analysis. The approaches combine new representations that handle translations and dilations of patterns with bag-of-features strategies and tree-based ensemble learning. This provides flexibility in handling time-warped patterns in a computationally efficient way. The ensemble learners provide a classification framework that can handle high-dimensional feature spaces, multiple classes and interactions between features. The proposed representations are useful for classification and interpretation of TS data of varying complexity. The first contribution handles the problem of time warping with a feature-based approach. An interval selection and local feature extraction strategy is proposed to learn a bag-of-features representation. This is distinctly different from common similarity-based time warping, and it allows additional features (such as pattern location) to be easily integrated into the models. The learners can account for the temporal information through the recursive partitioning method. The second contribution focuses on the comprehensibility of the models. A new representation is integrated with local feature importance measures from tree-based ensembles to diagnose and interpret time intervals that are important to the model. Multivariate time series (MTS) are especially challenging because the input consists of a collection of TS, and both features within a TS and interactions between TS can be important to models.
Another contribution uses a different representation to produce computationally efficient strategies that learn a symbolic representation for MTS. Relationships between the multiple TS, as well as nominal and missing values, are handled with tree-based learners. Applications such as speech recognition, medical diagnosis and gesture recognition are used to illustrate the methods. Experimental results show that the TS representations and methods provide better results than competing methods on a comprehensive collection of benchmark datasets. Moreover, the proposed approaches naturally provide solutions to similarity analysis, predictive pattern discovery and feature selection.

Date Created

2012

Contributors

Baydogan, Mustafa Gokce (Author)
Runger, George C. (Thesis advisor)
Atkinson, Robert (Committee member)
Gel, Esma (Committee member)
Pan, Rong (Committee member)
Arizona State University (Publisher)

Topical Subject

Resource Type

Text

Genre

Doctoral Dissertation

Academic theses

Extent

xiv, 175 p. : ill. (some col.)

Language

eng

Copyright Statement

In Copyright

Reuse Permissions

Primary Member of

ASU Electronic Theses and Dissertations

Peer-reviewed

No

Open Access

No

Handle

https://hdl.handle.net/2286/R.I.15792

Statement of Responsibility

by Mustafa Gokce Baydogan

Description Source

Viewed on July 17, 2013

Level of coding

full

Note

thesis

Partial requirement for: Ph.D., Arizona State University, 2012

bibliography

Includes bibliographical references (p. 164-175)

Field of study: Industrial engineering

System Created

2013-01-17 06:33:25

System Modified

2021-08-30 01:44:51
