The Office of Rail and Road has set very challenging targets for the CP5 control period (2014-2019), with disruption to passengers to be reduced by eight per cent and to freight customers by 17 per cent. While the railway’s infrastructure is the most reliable it has ever been, it’s also the busiest, so any incident has a much bigger impact. Therefore, asset reliability is of even greater importance in delivering a reliable railway and achieving the targets.

So, what are the issues and techniques that need to be used to improve asset reliability?

Understanding failure

The first step in improving reliability is identifying the root cause of any failure. It is vital to understand how and why assets fail, so that maintenance practices and techniques can be developed to make the infrastructure better and more resilient.

Various levels of peer review and analysis are used to determine the immediate and root cause of failures, and what lessons have to be learned to prevent similar and repeat failures.

Specialists should guide maintenance managers and engineers in carrying out root-cause analysis and gathering data for the investigation. This includes collecting data through interviews and analysis, and applying techniques that distinguish symptoms from root causes. The objective is always to learn how to avoid future incidents by developing appropriate recommendations to address causal factors and root causes and, where appropriate, by developing processes to identify systemic problem areas.

Very often, an equipment failure is quickly rectified by replacing a faulty component with a good spare. It is then vital to understand why the component failed. Original equipment manufacturers (OEMs) or repair companies should be encouraged to investigate and identify the root cause of an equipment failure. Sometimes, OEMs or repair companies may not have the incentive or skills to carry out the required forensic engineering and non-destructive/destructive testing process, and specialised investigation companies should be considered.

Good OEMs and repair companies should welcome an independent third-party investigation. As with any design or development activity, independence is key in making sure nothing is overlooked or taken for granted. Experienced test organisations will also have access to calibrated and specialised test equipment, together with knowledge and experience of the harsh railway environment: vibration, electrical noise, electromagnetic interference, high-voltage transients, temperature variations and safety requirements.

Improvements in the performance of signals are largely attributed to the progressive introduction of LED signal heads, although LED technology has caused some reliability problems of its own, especially in the proving interface with systems designed around incandescent lamps. For points, the introduction of master-class and supplementary drive set-up training, and the implementation of improvements to address emerging issues following the Grayrigg accident, have both had a positive impact on reliability.

Track circuit performance has improved with the introduction of moulded tail cables, the development and upgrade of TI21 audio frequency equipment, upgrading older installations with duplicated tail cables, and master-class initiatives to share best practice and improve competency in maintaining insulated rail joints.

Predict and prevent

A predict and prevent, rather than find and fix, maintenance strategy is the objective of many infrastructure organisations, including Crossrail and Network Rail. Key to this is remote condition monitoring (RCM), which is an umbrella term for a number of remote monitoring strategies including points and track condition monitoring using analogue sensors, as with the Network Rail Intelligent Infrastructure (II) programme, or event monitoring of signalling control logic using systems such as Balfour Beatty AssetView. These systems are used to monitor and report condition and defects so that action can be taken before failures occur.

Providing communication links and power for monitoring systems in remote areas of a rail route can be difficult. However, the oil industry faces even greater challenges of remoteness, often with high-latency satellite as the only communication link, and has implemented these systems successfully, so it can be done. One rail example is the remote monitoring of unpowered equipment such as gates at farm crossings, where the power requirement could be met using wind and solar technologies.

One of the business benefits of RCM and II in the oil industry was a reduced need for maintenance staff to enter hazardous areas, exactly the same requirement as in rail. The oil industry’s experience was that these tools quickly become an essential maintenance and faulting aid, so monitoring systems with good resilience to failure and high availability need to be provided at an affordable cost. In Great Britain, the cost is normally justifiable financially against the savings in train delay attribution penalties, but there are also reputational and efficiency benefits.

The key to intelligent infrastructure data is to harvest, transport and extract information, then transform it into a service that delivers business benefit. Organisations that have successfully delivered II strategies include train operators, which have reduced maintenance man-hours by 50 per cent and failures by 70 per cent.

Good use of both RCM and II is being made in rail, with systems in place to monitor a variety of assets including cable insulation, point motor and track circuit voltages and currents, power supplies, radio and transmission systems, strain gauges on structures and earthworks, and sensors on trains.

One of the first uses of RCM in rail was to monitor point machine power consumption, drive load and switch movement. For cost-benefit simplicity, this is usually confined to operating current, which can be monitored from the control point. However, analytical systems have now been developed which can not only identify defects with point operating equipment, but also those with the track formation that supports the switch and crossing. There is mounting evidence that this is a significant factor in point mechanism stress and leads to excess wear and failure, but some quantifiable data is needed to support this.
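As a rough illustration of how operating current alone can flag a deteriorating point machine, the sketch below summarises one swing and compares it with a baseline for that machine. The feature names, the 20 per cent tolerance and the sample values are all assumptions made for this example; they are not taken from Network Rail's Intelligent Infrastructure programme or from any supplier's system.

```python
from statistics import mean


def swing_features(current_amps, sample_period_s):
    """Summarise one point-machine swing from its operating-current samples."""
    return {
        "peak_a": max(current_amps),
        "mean_a": mean(current_amps),
        "duration_s": len(current_amps) * sample_period_s,
    }


def flag_swing(features, baseline, tolerance=0.2):
    """Return the names of features more than `tolerance` (fractional) above baseline."""
    return [
        name
        for name, value in features.items()
        if value > baseline[name] * (1 + tolerance)
    ]


# Illustrative values only, not measured data.
baseline = {"peak_a": 6.0, "mean_a": 3.0, "duration_s": 3.0}
healthy = [5.8, 3.1, 2.9, 3.0, 2.8, 2.9]          # sampled every 0.5 s
stiff = [6.1, 3.9, 4.1, 4.0, 4.2, 4.1, 4.0, 3.9]  # higher load, slower swing

print(flag_swing(swing_features(healthy, 0.5), baseline))  # []
print(flag_swing(swing_features(stiff, 0.5), baseline))    # ['mean_a', 'duration_s']
```

A real system would learn the baseline per machine and per direction of throw, and would look at the shape of the trace as well as these summary figures.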

Reliability-centred maintenance

Until a few years ago, the common practice in engineering maintenance was for most equipment to be subject to a fixed, planned, preventative maintenance cycle designed to maintain the asset in its optimum condition, or to manage the rate of degradation to a level that was acceptable. These maintenance cycles were mandated in standards and normally fixed, no matter where the equipment was located or how often it was operated.

Reliability-centred maintenance, on the other hand, links maintenance with usage and performance. It identifies historic maintenance tasks that cannot be demonstrated to be beneficial to asset performance, so they can be eliminated or, at least, performed less frequently. It also considers possible additional maintenance tasks and frequencies for assets that are used intensively or are of strategic importance.

Some people may think, wrongly, that this process is just about decreasing maintenance and saving money. However, it is really to ensure that the maintenance resource is utilised more efficiently. Rather than being based on a set time or mileage, the frequency of servicing is now adjusted to match the criticality of the asset, which in some cases could result in additional maintenance.

If one asset with mechanical moving parts is used hundreds of times a day, and an identical asset is used, in another location, only occasionally, do they require the same inspection and maintenance frequency? Logically, the answer is no – the more intensively operated asset may require additional inspection and maintenance interventions.

Of course, the caveat to this is that assets that are used very infrequently must be maintained sufficiently to ensure they do not seize up and fail on the rare occasions that they are needed. (Think here of snowploughs, “Thunderbird” rescue locomotives, rail-mounted cranes and points controlling a lightly used route – all are only rarely called upon but must function perfectly when required.)

In very simple terms, using a risk-based approach focuses maintenance resources onto those areas where they are needed most, which in itself will bring financial benefits by saving the excesses of over-maintenance and avoiding the penalties of under-maintenance.
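One way to picture the risk-based approach is as a base servicing interval that is shortened for heavy use or high criticality, with an upper cap so that rarely used assets are still visited often enough not to seize. The function below is a minimal sketch under those assumptions. The scaling rule, the criticality factor and the 7 and 182 day bounds are invented for illustration and do not come from any standard.

```python
def service_interval_days(base_interval_days, operations_per_day,
                          reference_operations_per_day, criticality,
                          min_days=7, max_days=182):
    """Scale a base interval by usage and criticality, within fixed bounds.

    criticality: 1.0 for an ordinary asset, larger for strategic assets.
    """
    usage_ratio = operations_per_day / reference_operations_per_day
    # Heavier use or higher criticality shortens the interval.
    scaled = base_interval_days / (max(usage_ratio, 0.01) * criticality)
    # The cap stops rarely used assets going unvisited until they seize.
    return round(min(max(scaled, min_days), max_days))


# Hypothetical assets against a reference of 50 operations a day.
print(service_interval_days(84, 300, 50, 1.5))  # busy, strategic junction: 9
print(service_interval_days(84, 2, 50, 1.0))    # lightly used route: 182 (capped)
```

Any change of this kind would still need the formal assessment, approval and follow-up monitoring that the process demands.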

Whenever the maintenance tasks or frequency are modified, it is essential that the change is formally assessed, documented and approved, and that the asset is monitored to check that the change has not adversely affected reliability. RCM systems can help to provide the necessary data, acting as decision support tools.

Train monitoring

Automating inspection, testing and reporting has very important benefits in terms of both safety and reliability. The safety of track workers, and avoiding the need to have them out and about when trains are running, is a no-brainer, and using technology appropriately and correctly allows defects to be detected sooner than by visual inspection, allowing actions to be planned sooner and resulting in better reliability.

Trains that monitor various aspects of track condition have been used for many years in one form or another, and the New Measurement Train (NMT), a converted 125 mph HST with five coaches including testing and analysis vehicles, has been a great success. Whilst measuring track condition was the primary task, the train is also capable of limited contact wire checking. GPS and tachometers give positional information to a general accuracy of two metres and guaranteed accuracy of 16 metres.

On the West Coast main line, particular care has to be taken to ensure that clearances are maintained for the class 390 tilting trains. Lasers and thermal imaging cameras are mounted on the train, and the high-speed cameras are synchronised and capable of taking 70,000 pictures per second, so at 125 mph this gives a picture of the rail every 0.8 mm. The downward pointing cameras look at the inner, outer and top sections of the rail for aberrations including rail burn and other heat-related defects.
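The 0.8 mm figure follows directly from the train speed and the frame rate, as this short check shows. The function name is chosen for this example only.

```python
MPH_TO_M_PER_S = 1609.344 / 3600  # one international mile is exactly 1609.344 m


def rail_spacing_mm(speed_mph, frames_per_second):
    """Distance travelled along the rail between consecutive frames, in millimetres."""
    speed_m_per_s = speed_mph * MPH_TO_M_PER_S
    return speed_m_per_s / frames_per_second * 1000


print(f"{rail_spacing_mm(125, 70_000):.2f} mm")  # prints "0.80 mm"
```

At 125 mph the train covers about 55.9 metres each second, so 70,000 frames in that second are spaced just under 0.8 mm apart.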

Overhead contact wire and pantographs

Overhead electrification lines are still monitored using test coach Mentor (Mobile Electrical Network Testing, Observation and Recording), which was introduced in 1973 and is limited to a maximum of 100 mph line speed. Equipment to monitor overhead contact wires has also been fitted as required to dedicated service trains, but what is really needed is for monitoring equipment, both for overhead contact wires and other infrastructure assets, to be fitted to in-service trains. While some train operators are keen to be involved with such innovation, the fragmented nature of the GB rail industry does not help, and it very often comes down to how such innovations to improve reliability are funded.

Pantographs, and the thin carbon strips they carry to draw current from the overhead contact wire, are usually subjected to manual inspections during scheduled maintenance. However, with pantographs in constant use and operating under all weather conditions, defects can quickly accumulate. Remote monitoring technology enables the identification of vehicles that are at greater risk of inflicting damage to the network’s wires due to general wear and tear. This can instigate early preventative action and, ultimately, extend the life of both the wires and the pantograph equipment carried by the trains.

PanMon, developed by Ricardo Rail, is a lineside-located system that provides high-definition images of each passing pantograph through a combination of radar, laser, video and photo technology, together with a contactless optical uplift monitoring system.

Using specialist pattern-recognition analysis software, the system automatically interprets the data to provide ongoing condition reports of each passing pantograph. This includes identifying the remaining thickness of carbon strips or any damage to the pantograph’s head, aerofoils or end horns, which can affect a vehicle’s ability to maintain good contact with overhead wires.

Reliability by design – diversity and redundancy

A significant number of failures that delay trains are due to signalling and telecommunications assets. The fail-safe requirement of such assets doesn’t help reliability, however. On an extremely busy network, having numerous trains sitting stationary when failures occur is, in itself, a safety hazard, as are the resulting overcrowded platforms.

New control and communication systems should be designed with better reliability standards than older systems, with diversity and redundancy built in. Processor-based systems with hot standby and double or triple redundancy are now available and in service, and they are also able to have any failed critical components replaced while the system is still operational.
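The benefit of redundancy can be estimated with the standard parallel-availability formula, which holds only if the units fail independently of each other. The sketch below assumes a simple arrangement in which the system works as long as any one unit works. Voting arrangements such as two-out-of-three need a different formula, and the 99 per cent unit availability is an illustrative figure rather than data from any real system.

```python
def parallel_availability(unit_availability, units):
    """Availability of `units` identical units in parallel, assuming independent failures."""
    return 1 - (1 - unit_availability) ** units


for n in (1, 2, 3):
    print(n, f"{parallel_availability(0.99, n):.6f}")
```

Each extra unit multiplies the unavailability by the same factor, but only while the independence assumption holds. A shared power feed or cable route removes most of the gain.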

Telecommunication networks, which are now based on packet-routing internet protocol, are able to provide connections for radio, control and electrification systems even if cables are cut or equipment fails. Care has to be taken with the design of such systems to make sure any common elements, such as power and diverse cable routes, are properly designed. There have been occasions where a network designer has allocated two diverse fibres, but these have ended up in the same cable which has then been cut. Similarly, duplicated transmission systems have been fed from the same (failed) power system.
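A design check for this kind of hidden common element can be very simple if each route is recorded as the list of physical assets it depends on. The sketch below assumes such records exist. The asset identifiers are hypothetical and the function is an illustration, not part of any network management product.

```python
def shared_elements(route_a, route_b):
    """Return the physical elements (cables, ducts, power feeds) common to both routes."""
    return sorted(set(route_a) & set(route_b))


# Hypothetical asset identifiers, for illustration only.
main_path = ["fibre-12", "cable-A", "duct-north", "power-feed-1"]
standby_path = ["fibre-37", "cable-A", "duct-north", "power-feed-2"]

print(shared_elements(main_path, standby_path))  # ['cable-A', 'duct-north']
```

Here the two fibres are different but run in the same cable and duct, so a single cut would take out both paths. An empty result is what a genuinely diverse pair should return.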

Such networks have, for some time, been provided with extensive centralised monitoring, reporting and management capabilities, enabling faulting interventions to be accurately planned and executed. Similar capabilities are now being provided in new signalling systems.

One remaining, and contentious, issue is how far remote interventions should be permitted to go. They already take place in some telecoms systems, for example to configure and allocate transmission paths within switches and routers, and similar remote interventions may be possible in signalling control systems, once the security and independent testing requirements have been addressed. Properly executed, such interventions could contribute to reliability, safety and cost savings.

Redundant power, in the form of duplicated supplies and/or uninterruptible, battery-backed supplies, is also required for all essential control and communications assets.

Sometimes, an interesting problem with diverse and redundant systems is getting access to replace ‘failed’ components. Technicians can be called out to equipment ‘failures’ which are not service affecting but need either track or equipment possessions. However, to the operator, there is no failure: trains are running normally and no functionality has been lost, so the operator may be happy to carry on running as normal and not allow access. The risk is that, in the event of a second breakdown, the system could fail totally, stopping trains from running. This has already occurred on at least one occasion.

Keeping staff competent

Another difficulty with reliable and complex systems is that, when they do eventually fail, the maintenance teams may be ‘rusty’, having not worked on the equipment for some time. This is when remote diagnostic and intelligent self-reporting systems come into their own. They need to be designed with intuitive, easy-to-use interfaces so the staff with the correct competency can be quickly deployed and guided.

Training and demonstration reference systems on which staff can train and maintain their familiarisation and competencies can help, and such systems can also be used to soak test any upgrades or modifications before they are installed on the live railway. This will also contribute to reliability.

Planning for when things may fail is also important. Comprehensive action plans need to be in place that deal with escalation, communications, and the use of diversionary routes. They also need to cover access to spares and experts, whether within rail, in other industries or at OEMs.

What next to improve reliability?

The Industrial Internet of Things (IIoT) and Industry 4.0 are the next generation of technologies for automation and improved reliability, and are likely to be adopted in the rail industry.

IIoT is about the worldwide proliferation of embedded sensors, data analytics and networks such as Ethernet in manufacturing, while Industry 4.0 is a little more specific. The IIoT may be an industrial response to a consumer-facing trend (the generic Internet of Things), while Industry 4.0 is more particular to manufacturing industry. However, the two terms refer to similar concepts.

Industry 4.0 originates from the German government, which used it to denote a potential fourth industrial revolution, following the previous three that centred on the introduction of water/steam power, electricity and IT. Germany established an Industry 4.0 working group in 2012 to focus on initiatives such as the refinement of embedded systems (used successfully by car manufacturers) and industrial production. While it is focused on manufacturing, there are elements of Industry 4.0 which rail can adopt.

The vision for both Industry 4.0 and the IIoT emphasises real-time communications and automation. Implementing the IIoT can greatly improve connectivity, efficiency and scalability, and save time and cost for industry, while interoperability and security are the two biggest challenges. Businesses will require their data to be secure, as the proliferation of sensors and other smart devices could, if not implemented correctly, result in security vulnerabilities.

Companies that have embraced the IIoT have seen significant improvements to safety, efficiency, reliability and profitability, and it is expected that this trend will continue as IIoT technologies are more widely adopted in all industries.


This article was written by Trevor Bradbeer, specialist signal engineer at Balfour Beatty.