Forecast Transit Delays with Big Data

transit, forecasting, delay

I used to work in the transportation field, especially with railroads. Recovering from delays and forecasting them was always a tough task because of the “ripple effects” that this article mentions, but mathematician Wilhelm Landerholm has figured out an algorithm to forecast delays two hours in advance!

Enter big data. Cars on the highway suffer from two problems: there is no monitoring system for tracking their movements and they are operated independently. Commuter train systems, however, do not have these defects. In fact, modern networks have traffic control centers with computer systems keeping track of each train’s location at all times. Ten years ago, this mountain of data would have been unassailable, but with today’s faster machines and this new algorithm it is possible to make accurate predictions about the future state of the train network in a longer time window. It’s a bit like weather forecasting but for your commute. (ed. emphasis mine)

While forecasting these delays is similar to forecasting the weather (and we all know how inaccurate that can be at times), it’s definitely a step in the right direction. Big Data has a lot of promise, but it always comes down to the quality of the data you have. Before you jam every datapoint into your Hadoop cluster, think about how you will use it later. Garbage in always equals garbage out.
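To make the "garbage in, garbage out" point concrete, here is a minimal sketch of screening records before they are loaded anywhere. Everything in it is an assumption made for illustration: the `TrainReport` layout, the field names, and the plausibility thresholds are hypothetical and do not come from the article or from any real traffic control system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class TrainReport:
    """Hypothetical record layout; a real control-center feed will differ."""
    train_id: str
    station: str
    scheduled: Optional[datetime]
    actual: Optional[datetime]


def is_usable(report: TrainReport, max_delay_minutes: float = 24 * 60) -> bool:
    """Return True if the record is complete and its delay is plausible."""
    if not report.train_id or not report.station:
        return False
    if report.scheduled is None or report.actual is None:
        return False
    delay = (report.actual - report.scheduled).total_seconds() / 60
    # Assumed thresholds: up to an hour early is allowed; anything outside
    # the window is treated as a likely clock or data-entry error.
    return -60 <= delay <= max_delay_minutes


def split_reports(
    reports: List[TrainReport],
) -> Tuple[List[TrainReport], List[TrainReport]]:
    """Separate usable records from rejects so rejects can be inspected."""
    good, bad = [], []
    for report in reports:
        (good if is_usable(report) else bad).append(report)
    return good, bad


if __name__ == "__main__":
    t = datetime(2013, 1, 1, 8, 0)
    sample = [
        TrainReport("A1", "Central", t, t + timedelta(minutes=5)),
        TrainReport("A2", "North", t, None),  # missing arrival time
    ]
    good, bad = split_reports(sample)
    print(len(good), "usable,", len(bad), "rejected")
```

Keeping the rejected records rather than silently dropping them is a deliberate choice here: a spike in rejects is often the first sign that the data source itself has a problem.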

TL;DR: Big Data can be used to forecast transit delays.