Imputation and Prediction of Multivariate Travel Time Data

Viktoria Öllerer

Publication: Thesis (Master's thesis)

Abstract

This master's thesis was written in cooperation with the Austrian Institute of Technology as part of the HealthLog project. The aim of the project was to build a reliable dispatching system for Samariterbund Wien that provides an assignment to the dispatcher, focusing on short response times and patients' convenience. This thesis deals only with the static dispatching problem, modeling the demanded route travel times from observed link travel times. A central part is devoted to the replacement of missing values in the link travel times. The reference data set of taxi travel times was collected on Vienna's ring road Gürtel, dividing the route from Westbahnhof to AKH into 31 links. Data is available from July 1st, 2008, until June 30th, 2010. After grouping the data into the four categories 'holidays weekday', 'holiday weekend', 'school day weekday' and 'school day weekend', different imputation methods are applied, namely principal component analysis using singular value decomposition and the NIPALS algorithm, as well as a nearest neighbour approach. Of the three methods, the nearest neighbour approach performs best, especially for varying missing value rates. Accurate estimates are produced for up to 30% of missing values. To develop methods for the prediction of total travel times, another data set was collected from January 1st, 2009, until December 31st, 2009, consisting of trips that start on link 1 or 2 and end on link 30 or 31. Multiple linear regression is applied to these data, and a stepwise regression method using Akaike's information criterion is applied to select the most appropriate predictor variables. To fulfil the assumptions of the regression model, the response variable is log-transformed, as are those predictor variables that denote similar characteristics, to preserve a good interpretation of the model.
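The nearest neighbour idea mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: the distance measure (root mean squared difference over links observed in both rows), the choice of k, the function name `knn_impute`, and the toy travel times are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries using the mean of the k nearest rows that observe that link.

    Each row is one observation period; each column is one link travel time.
    Distance and averaging rule are illustrative assumptions.
    """

    def distance(a, b):
        # Root mean squared difference over links observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which link j was observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            # Drop donors that share no observed link, then keep the k closest.
            donors = [(d, v) for d, v in donors if math.isfinite(d)][:k]
            if donors:
                filled[j] = sum(v for _, v in donors) / len(donors)
        completed.append(filled)
    return completed


# Hypothetical link travel times in seconds (3 links, 4 periods); None = missing.
example = [
    [10.0, 12.0, 8.0],
    [11.0, None, 9.0],
    [10.0, 14.0, 8.0],
    [30.0, 40.0, 25.0],
]
completed = knn_impute(example, k=2)
print(completed[1])
```

In this toy input the missing value in the second period is filled from the two periods with the most similar observed links, while the congested fourth period is ignored. A value stays missing if no other row observes that link.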
The resulting model is then compared to a smaller model that omits average link travel times and deviations from the average speed, as well as to a deviation model that, instead of the total travel time, estimates the difference between the total travel time and the computed average travel time for the corresponding category (school/holiday, weekday/weekend) and hour. The original model and the small model perform similarly, whereas the deviation model gives rather poor results. At an average total travel time of about 4 to 5 minutes, nearly 50% of the data can be estimated to within half a minute. Furthermore, observations up to the previous period are also included in the model, which naturally improves its quality. Using average link speeds as predictor variables, instead of estimated travel times of the whole trip obtained by grouping data into the four categories 'holidays weekday', 'holiday weekend', 'school day weekday' and 'school day weekend', in order to examine whether some important links already give enough information about the route, impairs the results.
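The "within half a minute" figure can be read as a hit rate, sketched below for a model with a log-transformed response. The function name `share_within`, the plain exponential back-transformation (which ignores retransformation bias), and the sample numbers are assumptions for illustration and are not taken from the thesis.

```python
import math


def share_within(actual_seconds, predicted_log_seconds, tolerance=30.0):
    """Share of trips whose back-transformed prediction lies within the tolerance.

    predicted_log_seconds are predictions on the log scale, as produced by a
    regression with a log-transformed response (assumed setup).
    """
    # Naive back-transformation from the log scale to seconds.
    predicted = [math.exp(p) for p in predicted_log_seconds]
    hits = sum(
        1 for a, p in zip(actual_seconds, predicted) if abs(a - p) <= tolerance
    )
    return hits / len(actual_seconds)


# Hypothetical total travel times in seconds and log-scale predictions.
actual = [240.0, 300.0, 270.0, 330.0]
predicted_log = [math.log(250.0), math.log(360.0), math.log(280.0), math.log(290.0)]
hit_rate = share_within(actual, predicted_log, tolerance=30.0)
print(hit_rate)
```

The same function could be applied to each candidate model (original, small, deviation) on held-out trips to compare them on one interpretable scale.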
Original language: English
Degree-granting institution
  • TU Wien
Supervisor / Advisor
  • Filzmoser, Peter, Supervisor, External person
  • Rudloff, Christian, Supervisor
Date of approval: 19 Apr. 2012
Publication status: Published - 2012

Research Field

  • Former Research Field - Mobility Systems
