Master Thesis Sachin Vittal | Database Research Group

Master Thesis
January 1st, 2014 until June 3rd, 2015

Abstract

Time series forecasting has long been a challenge for research in various fields. Researchers have developed forecasting models ranging from simple statistical regression models to complex hybrid models involving Artificial Neural Networks and Support Vector Machines. A common step in any forecasting process is preprocessing the datasets: cleaning and filtering unwanted data, removing biases and seasonality, or reducing the time needed to process a huge input dataset through imputation or aggregation. Some studies have suggested that aggregating time series with similar patterns leads to lower forecasting errors and reduced computation time. Clustering can be used to find time series with similar patterns in a large dataset, and the quality of the resulting clusters depends on the similarity measure used.

In this thesis, the main focus is to analyse the impact of aggregating similar time series on forecasting errors. We perform this analysis by applying different standard similarity measures proposed in the research community to a defined subset of renewable energy data. In the second part of the research, we apply hierarchical clustering to find the most similar time series clusters in a subset of the data and analyse the impact of aggregating the time series in these clusters on forecast errors. Our experiments cover various aspects of renewable energy forecasting, such as external weather influences, history length and geographic location, and the tests were conducted on different kinds of forecast models proposed in the research community.

We analyse the use of aggregation as a data preprocessing strategy in a forecasting process, with the aim of achieving better forecasting results and reduced computation time. Our experiments show that aggregating time series with similar patterns yields better forecast results than simply aggregating all time series, and lower computation time than forecasting each time series individually. Agglomerative Hierarchical Clustering can be used to find the time series with similar patterns in a dataset, and applying this method to a large dataset results in better forecast results and reduced computation time.
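
The clustering-then-aggregation idea can be illustrated with a minimal sketch. It is not the thesis implementation: the Euclidean distance, the average linkage, the point-wise sum used for aggregation, the toy data, and the helper names `euclidean`, `agglomerative_clusters` and `aggregate` are all assumptions made for illustration. The abstract does not state which similarity measures or linkage were used.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally long series (assumed measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_clusters(series, n_clusters, dist=euclidean):
    """Average-linkage agglomerative clustering; returns lists of series indices."""
    # Start with every series in its own cluster.
    clusters = [[i] for i in range(len(series))]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist(series[i], series[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the two closest clusters (a < b) and merge b into a.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def aggregate(series, members):
    """Aggregate a cluster by summing its member series point by point."""
    return [sum(vals) for vals in zip(*(series[i] for i in members))]


# Toy data, invented for illustration only.
series = [
    [1.0, 2.0, 3.0, 2.0],
    [1.1, 2.1, 2.9, 2.0],
    [5.0, 1.0, 5.0, 1.0],
    [5.2, 0.9, 4.8, 1.1],
]
clusters = agglomerative_clusters(series, n_clusters=2)
aggregated = [aggregate(series, c) for c in clusters]
```

On this toy data the first two series end up in one cluster and the last two in the other. Each aggregated series would then be passed to a forecast model in place of its individual members, which is where the reduction in computation time would come from.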

Student Theses

Systematic Analysis of Impact of Aggregation on Time Series Forecasting

by Sachin Vittal
