Report from a best paper award winner: “Stochastic prediction of train delays in real-time using Bayesian networks” by Pavle Kecman, Francesco Corman, Anders Peterson and Martin Joborn (TU Delft and Linköping University, Sweden), presented at the Conference on Advanced Systems for Public Transport (CASPT 2015) in Rotterdam.
Pavle Kecman with Hong Lo, Convenor, and Leo Kroon, Chair, of CASPT 2015.
Another title for this post might be: should I still trust forecasts? We focus on the problem of predicting train traffic. Normally this is done offline by taking a fixed minimum travel time between stations (that’s what the NS do in their plan) and adding it to the actual delay. A step better is to use historical data and assume that the system will perform as it did in similar situations in the past. This is fairly well established in the scientific literature, and somewhat less so in practice: NS started trying out this idea only recently. What nobody has managed to do yet is to combine the actual delay information with the uncertainty relations that can be harvested from past recorded data. That’s what we do in this paper.
In fact, as we move (in time) towards an event in the future, its uncertainty decreases: the expected arrival time becomes sharper, in probability terms. Once the event has happened, its uncertainty is zero. A future event thus has an uncertainty dynamic associated with it. Taking this dynamic into account delivers much better predictions, together with probability ranges for those predictions, which are quite useful for the users.
OK, but how to do that? We model the uncertainty of future operations with Bayesian networks. Railway traffic is represented as a probabilistic graphical model, which is compact: conditional dependencies between events allow joint distributions to be computed efficiently. The method allows current information, or evidence, about a certain event to be propagated through the network. In fact, evidence about the realisation of one event reduces the uncertainty of other events. In practical terms, the probability distribution of, e.g., an arrival delay at a station changes over time in discrete steps as more information becomes available.
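To make the idea of evidence propagation a bit more tangible, here is a minimal sketch in Python. It is not the model from the paper: the four events, the three delay classes, and every probability below are invented for illustration, and the network is assumed to be a simple chain (departure at A, arrival at B, departure at B, arrival at C) in which each event depends only on the previous one. Inference is done by brute-force enumeration, which only works for a toy network of this size.

```python
from itertools import product
from math import log2

# Hypothetical delay classes and event chain (illustrative assumptions only).
STATES = ("on_time", "minor", "major")
EVENTS = ("dep_A", "arr_B", "dep_B", "arr_C")

# Assumed prior over the delay class of the first event.
PRIOR = [0.7, 0.2, 0.1]

# Assumed conditional table P(child state | parent state), reused for every
# link in the chain. Row = parent state, column = child state.
CPT = [
    [0.80, 0.15, 0.05],
    [0.30, 0.50, 0.20],
    [0.10, 0.30, 0.60],
]


def joint(assignment):
    """Joint probability of one full assignment of states to all events."""
    p = PRIOR[assignment[0]]
    for parent, child in zip(assignment, assignment[1:]):
        p *= CPT[parent][child]
    return p


def posterior(target, evidence):
    """P(target | evidence), computed by enumerating every joint assignment."""
    t = EVENTS.index(target)
    observed = {EVENTS.index(e): STATES.index(s) for e, s in evidence.items()}
    weights = [0.0] * len(STATES)
    for assignment in product(range(len(STATES)), repeat=len(EVENTS)):
        if all(assignment[i] == s for i, s in observed.items()):
            weights[assignment[t]] += joint(assignment)
    total = sum(weights)
    return [w / total for w in weights]


def entropy(dist):
    """Shannon entropy in bits; 0 means no uncertainty is left."""
    return -sum(p * log2(p) for p in dist if p > 0)


# The train progresses: each step adds one observed event as evidence.
steps = [
    {},
    {"dep_A": "minor"},
    {"dep_A": "minor", "arr_B": "minor"},
    {"dep_A": "minor", "arr_B": "minor", "dep_B": "major"},
]
for evidence in steps:
    dist = posterior("arr_C", evidence)
    print(evidence, [round(p, 3) for p in dist], round(entropy(dist), 3))
```

Each printed line is the distribution of the arrival delay class at C given what has been observed so far, so it changes in discrete steps as events are realised. A real network would have many more events and dependencies (for example between different trains), with distributions estimated from recorded data rather than typed in by hand, and would use a proper inference algorithm instead of enumeration.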
This simple but very effective and innovative idea matters from a passenger’s point of view, as it improves their perception of travel time and delay (knowing your delay in advance feels much better than being delayed unexpectedly). An even more important outcome is that the model can be exploited for monitoring and controlling traffic in real time (retiming, reordering, rerouting, dropping connections…).