Big Data Risk Analysis (BDRA) is a new approach to risk analysis and safety management for the railway industry. Led by the Institute of Railway Research and RSSB, it is based on the intensified use of vast amounts of safety-relevant data, analytic software, non-relational databases and powerful computer systems.

A recent conference on the BDRA programme, held in Birmingham, was attended by representatives from Japan, France, Sweden, the USA and Korea. It had academic representation from the universities of Huddersfield, Birmingham, Cranfield and Lumera (USA) and from Imperial College London, and it also attracted delegates from the air transport industry.

The safety performance of Britain’s railway has improved dramatically over the last 50 years. In the 1960s, up to 100 workers a year lost their lives, while now, in some years, no fatal incidents occur at all. This much-reduced number of incidents makes further improvements more challenging, so new methods of identifying risks and control measures are needed, which is where BDRA comes in.

Reactive to proactive

In very simple terms, any safety management system consists of three elements – plan, act and review. This can be further broken down into the need to define the objectives, risks, and control measures, then to monitor the improvements and feed them back into a modified plan. Investigations following incidents identify new control measures but, because there are now fewer incidents, new ways of identifying control measures are required. Put another way, safety improvements need to become proactive rather than reactive.

The basic tools of BDRA have been developed by the University of Huddersfield and RSSB, with the objective of an integrated approach to safety and risk assessment, based on data-analytics. It is still early days in the programme but, looking to the future, state-of-the-art risk monitoring technology could pinpoint faster, targeted improvements to safety and reliability on Britain’s railways at the push of a button. The vision is that it will provide the right insight, to the right person, to help them make the right decision.

This technology is already applied in the oil and nuclear sectors and could supply tomorrow’s rail safety manager with a real-time ‘intelligence console’ about incidents, infrastructure and rolling stock faults, providing rapid tactical analysis and automating parts of the existing paper trail. It will also give better information for efficient and robust boardroom decisions.

Challenges

There are many challenges to overcome, such as sharing information between companies, privacy and security, capturing data in a consistent format, and making sure the analytic process allows appropriate human cognitive review. What BDRA must avoid is overloading engineers and managers with data so that they can’t see the wood for the trees. What it must do is extract intelligence from multiple data sets – ideally, in real time.

The Rail Industry’s Data and Risk Strategy, published by RSSB and steered by a cross-industry group, sets out how the railways can make better use of data to improve safety performance, prevent delays and disruption, retain high productivity and reliability, and prevent train accidents.

The first step of the strategy is already in place, with the new Safety Management Intelligence System (SMIS+) up and running, and actively in use by Network Rail and train operating companies. SMIS+ is the programme to modernise safety-reporting capabilities, making it easier for people to collect information and extract intelligence. In some cases, this could cut the time between first being alerted to incidents and close calls and making the ultimate remedial decision or investment to manage the risk from years to weeks. It will make it easier for companies to report and track safety incidents and investigations, and provide the right risk information in the right format to the right people at the right time.

SMIS+ is a completely new, cloud-based on-line system exploiting commercial off-the-shelf, state-of-the-art safety management software which has replaced the old SMIS. So, while the name is similar, this is a completely new system, denoting a transformation in system capability.

Phase 1 was introduced on 6 March 2017, replacing the old SMIS system, with phase 2 being rolled out later in the year and replacing the existing close call system. This will deliver the ability to record and track ‘close calls’, as well as the ability to use mobile devices.

SPAD management

Improvements by the industry mean that the risk from signals passed at danger (SPADs) is low, and it is over 17 years since the last fatal train accident was caused by a SPAD. To make the next step in risk reduction, though, it is necessary to look deeper into the circumstances that cause a SPAD, such as how frequently a signal is approached while showing a red aspect.

Rail companies will be able to identify the signals which are most frequently approached at red thanks to a new on-line tool developed by RSSB and the University of Huddersfield. The tool can help to focus attention on signals where SPADs may be more likely. It has been proven successful in trials and it is hoped that it will be used to generate new safety and performance insights for rail companies.

The Red Aspect Approaches to Signals (RAATS) tool uses 420 days of train movements provided by Network Rail through its open data initiative and applies complex algorithms to identify where red signal approaches are happening. The results can be broken down by train type, day of the week or time of day and analysis can be carried out on signal groups. Users can interrogate data within the tool or export it into Excel.
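The kind of breakdown described above can be illustrated with a minimal sketch. It is not the RAATS algorithm, which the article does not detail; the record layout, signal identifiers and sample values are hypothetical, and the real Network Rail open data feed will look different. The sketch simply counts approaches made at a red aspect, grouped either by signal or by signal and day of the week.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical movement records: (signal_id, aspect_on_approach, ISO timestamp, train_type).
# Field names and values are illustrative assumptions, not the real feed format.
MOVEMENTS = [
    ("S101", "red", "2017-01-02T08:15:00", "passenger"),
    ("S101", "green", "2017-01-02T09:40:00", "freight"),
    ("S101", "red", "2017-01-03T08:20:00", "passenger"),
    ("S205", "red", "2017-01-02T17:05:00", "freight"),
    ("S205", "yellow", "2017-01-04T11:30:00", "passenger"),
]

def signal_id(rec):
    """Group by signal only."""
    return rec[0]

def signal_and_day(rec):
    """Group by signal and day of the week."""
    day = DAYS[datetime.fromisoformat(rec[2]).weekday()]
    return (rec[0], day)

def red_approaches(movements, key=signal_id):
    """Count red-aspect approaches, grouped by whatever `key` extracts from a record."""
    return Counter(key(rec) for rec in movements if rec[1] == "red")

by_signal = red_approaches(MOVEMENTS)
by_signal_and_day = red_approaches(MOVEMENTS, key=signal_and_day)

# Signals ranked by how often they were approached at red.
print(by_signal.most_common())
```

Swapping in a different `key` function, for example one returning the train type or the hour of the day, gives the other breakdowns mentioned above without changing the counting logic.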

The RAATS tool was released as a prototype in January, and work is underway to refine it, including linking it to live data feeds, before formally launching it later in the year. Looking to the future, it should be possible, with the right collaborative industry approach, to integrate data from on-train monitoring recorders with signal asset condition and maintenance databases, using a BDRA approach to provide a complete proactive SPAD risk management system.

SNCF SPAD experience

In France, SNCF has also been working on a similar analytical risk system for its SPAD management. It has managed to integrate a year’s worth of on-board data, but identified that the data was overwritten in all the recording systems. This is one learning point for any BDRA system.

Having data scattered across the French network was not an issue. Data quality was a bigger problem, as accurate geographical and time data is vital for robust analytics. Good results have been achieved using text analysis, and those from supervised machine learning classification algorithms are encouraging.

SNCF admitted that the project’s access to the company’s data could have been better and that its IT systems can’t access all the required data across the network. There are also plans to merge event reporting from other sources, such as signal asset data, performance and maintenance records.

Text analysis

Peter Hughes of Huddersfield University explained a process for text analysis of close calls, which is one of the tools of the BDRA system. Analysing and identifying risk from free-text close call reports requires a very carefully designed set of algorithms to make sure nothing is missed and to provide the intelligence that enables improvement plans to be implemented.

A NoSQL database, a next-generation type of database, sits at the heart of the text-analysis system. Such databases were originally called ‘non-SQL’ or ‘non-relational’, reflecting the fact that they store and retrieve data modelled by means other than the tabular relations used in relational databases. NoSQL databases are used by the likes of Facebook, Google and Amazon, and are increasingly being used in big data and real-time applications.

The text analysis is designed to pick out and highlight key terms in free-text messages. The requirement is to identify common hazardous events, regardless of the wording used. It must take into account that the free text may have been entered by a user wearing gloves and standing lineside in poor light and freezing rain! So, for example, “access” could be entered as “acces” and “possession” as “posession”. The NoSQL database takes this into account.
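As a rough illustration of how misspelt key terms can still be recognised, the sketch below uses approximate string matching from Python’s standard `difflib` module. This is a stand-in technique chosen for illustration, not the method used in the BDRA system; the vocabulary, the similarity cutoff and the sample report are all assumptions.

```python
import difflib
import re

# Hypothetical vocabulary of key terms; a real system would use a far larger, curated list.
KEY_TERMS = ["access", "possession", "signal", "track", "worker"]

def find_key_terms(text, vocabulary=KEY_TERMS, cutoff=0.8):
    """Return the key terms found in free text, tolerating small misspellings."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Best vocabulary match for this word, if it is similar enough (assumed cutoff).
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        if match and match[0] not in found:
            found.append(match[0])
    return found

report = "Gate left open at acces point during posession"
print(find_key_terms(report))  # ['access', 'possession']
```

A lower cutoff catches more badly mangled words but raises the chance of false matches, so in practice the threshold would need tuning against real reports.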


This article was written by Paul Darlington