SSTS: A syntactic tool for pattern search on time series |
| |
Authors: | João Rodrigues Duarte Folgado David Belo Hugo Gamboa |
| |
Institution: | 1. Laboratório de Instrumentação, Engenharia Biomédica e Física da Radiação (LIBPhys-UNL), Departamento de Física, Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa, Monte da Caparica, Caparica 2892-516, Portugal;2. Associação Fraunhofer Portugal Research, Rua Alfredo Allen 455/461, Porto 4200-135, Portugal |
| |
Abstract: | Nowadays, data scientists are capable of manipulating and extracting complex information from time series data, given the current diversity of tools at their disposal. However, the plethora of tools that target data exploration and pattern search may require an extensive amount of time to develop methods that correspond to the data scientist's reasoning, in order to solve their queries. The development of new methods, tightly related with the reasoning and visual analysis of time series data, is of great relevance to improving complexity and productivity of pattern and query search tasks. In this work, we propose a novel tool, capable of exploring time series data for pattern and query search tasks in a set of 3 symbolic steps: Pre-Processing, Symbolic Connotation and Search. The framework is called SSTS (Symbolic Search in Time Series) and uses regular expression queries to search the desired patterns in a symbolic representation of the signal. By adopting a set of symbolic methods, this approach has the purpose of increasing the expressiveness in solving standard pattern and query tasks, enabling the creation of queries more closely related to the reasoning and visual analysis of the signal. We demonstrate the tool's effectiveness by presenting 9 examples with several types of queries on time series. The SSTS queries were compared with standard code developed in Python, in terms of cognitive effort, vocabulary required, code length, volume, interpretation and difficulty metrics based on the Halstead complexity measures. The results demonstrate that this methodology is a valid approach and delivers a new abstraction layer on data analysis of time series. |
| |
Keywords: | Meta symbolic language Time series Query search Regular expression Signal processing Grammar |
本文献已被 ScienceDirect 等数据库收录! |
|