Train delays are among the events the public complains about most in urban areas. Train delay prediction is critical for advanced traveler information systems (ATIS), which provide valuable information for enhancing the efficiency and effectiveness of intelligent transportation systems (ITS). However, the train delay prediction problem cannot be easily solved by modeling historical or static data from a single data source. In the big data era, large amounts of data are collected from sensor devices across cyber-physical networks. Multimodal transport management systems offer greater availability of various open data sources, such as General Transit Feed Specification (GTFS) static and real-time feeds. With the development of advanced machine learning techniques, a growing number of open data sources play increasingly critical roles in the planning and operation of transportation services. To date, very few existing ‘big data’ methods meet the specific needs in rail