The UK signalling profession took a long, hard look at itself at a recent seminar in London organised by the Institution of Railway Signal Engineers (IRSE). This was a chance to reflect on the recommendations and lessons following the enquiry into the Clapham Junction accident of 12 December 1988 and to see whether these have now been fully adopted. Also considered were the changes in organisation and technology since 1988 and any associated impact on the design and implementation of signalling projects.
A keynote speech by Ken Burrage, the director of signal and telecommunications engineering at the time of the enquiry, gave a hard-hitting message: people do make mistakes, but accidents are preventable. A disaster occurs because a series of mistakes comes together, and a typical accident chain comes about because of inferior standards leading to a lack of training, poor installation and inadequate testing. Human error at every level is the normal cause.
At Clapham, the signalling work was being carried out in a series of multiple stages, each of which depended on the integrity of the people involved. The ‘stageworks’ involved modifying 50-year-old signalling equipment to accommodate changes to track layouts and signal positioning in advance of the eventual changeover to power box operation.
The evidence showed that poor installation practice, lack of wire counting, poor supervision and defective testing, coupled with poor communication, lack of staff competence and inadequate monitoring, all added up to a management failure. There were many reasons for this: shortage of staff, difficulty of recruitment, too many re-organisations, lack of audit and pressure on timescales. The lessons were hard ones for everyone involved but the hardest of all has been that management commitment from the top down must be there to prevent anything like the Clapham accident happening again.
An RAIB View
The Rail Accident Investigation Branch did not exist at the time of Clapham, only coming into being following further significant rail disasters, but its subsequent investigations often refer to the recommendations from the Clapham enquiry when assessing what went wrong and what should happen. Mark Turner from the RAIB gave a realistic assessment of the current position, which was not entirely comfortable listening. He acknowledged that the industry structure has changed in 25 years, which may make it harder for the modern railway to relate back to the Clapham lessons. Typical examples of these changes are:
- Privatisation leading to significant commercial pressures and penalties;
- Human factor issues focussing on the different types of human error;
- IRSE Licensing being fine for technical knowledge competence but doing nothing for attitude and approach;
- Prescription being superseded by risk assessment now largely under the banner of the ROGS regulations;
- The whole safety culture being under scrutiny by health and safety management.
In all of this, and perhaps not surprisingly, accidents involving ‘signalling’ are continuing to occur. Four examples were given: at Milton Keynes a data error led to a green aspect leading into an occupied section; at Stockley Bridge a wiring error led to an unsolicited movement of points in a locked route; at Greenhill Upper Junction a wiring error caused points to be incorrectly detected during installation of a new point machine; at Lindridge Farm User Worked Crossing an incorrect scheme plan caused a signaller to think a train was in a different location.
All of these reflect back into signalling design, installation and testing practices. Computer based signalling does make data errors difficult to spot but there are still far too many installation errors and wrong images on signallers’ displays. Above all, the old problem of interfaces remains the biggest risk.
Signalling Statistics and Incident Causes
David Marriott, a former senior signal engineer at Network Rail and, by chance, a passenger on one of the trains involved in the accident, put accident statistics into context by comparing rail with road. In 2012, there were 1754 people killed on UK roads (far fewer than in earlier years) as against 5 fatalities on rail in 2011/12, excluding trespass and suicide.
The means of categorising rail signalling accidents has improved considerably, so that trends can be analysed and measured accurately. A hazard index has been constructed that categorises the most serious incidents. Currently there are about 1000 higher risk failures a year, but only 10 are classified as significant and none as really dangerous. Seven classes of failure/incident have evolved:
- Equipment and Installation Shortcomings – Not obeying manufacturer’s instructions, missing out a test, taking short cuts, equipment degradation through age, omitting ‘out of use’ safeguards;
- Staff Communication Error – ‘Thought you had done it’, did more than the tester thought, incorrect comms protocol, technology feature not advised to appropriate people;
- Money & Time – Job larger than available resources, abbreviated testing, rushed design/installation/testing;
- Technology Knowledge Shortfall – ‘I thought I knew how it worked’, unpredicted failure modes, project manager/TOC pressures to change things, new trains with new failure impacts on signalling;
- Signal Engineer Mentality – Signal engineers and train drivers not always having the same understanding, problem of layering testing, signal engineer attitude to some systems such as the train protection and warning system (TPWS);
- Incidents That Never Were – Predicted failure modes eliminated at design/installation/test, engineers’ ‘play time’ with new systems/facilities;
- Ones That Shouldn’t Have Happened – Potters Bar, Grayrigg, Ladbroke Grove.
Many incidents do not result in accidents but nonetheless do occur: points moving in a locked route, two trains in a section, false clearing of signals, points out of gauge, incorrect overlaps, incorrect route indication, impact of sunlight on aspects, signals dim or out, signallers failing to understand system, public failure to interpret instructions at crossings, poor signal sighting, inadequate driver aids.
The likely causes of future incidents are seen to be: more pressure on timescales, difficulty in understanding new technology, poor communications between the many parties now involved, not sorting out root causes, reluctance to air problems in public, and inadequate awareness of other disciplines. This is not meant to be alarmist but more an indication of where the effort should be put into accident prevention.
Rolling Stock and Train Borne Systems
Whilst the Clapham accident was due to a signalling deficiency, many of the fatalities and injuries were caused by the poor crashworthiness of Mark 1 rolling stock. Kevin Crofts from Interfleet explained how the accident has influenced carriage design with changes to ensure vehicle boundaries are not breached in a collision and that train ‘furniture’ does not move.
Modifications suggested for Mk 1 stock were not implemented as the carriages had only a limited life. For new stock, research focused on five collision scenarios: end-on crashes, side swipes, road vehicle impacts at level crossings, buffer stop impacts and derailments. A new crashworthiness strategy has evolved based on energy management and absorption, backed up by computer modelling and dynamic tests. The Vehicle Acceptance Body (VAB) process has emerged with associated Group Standards and the European standard EN 15227.
The emergence of in-cab signalling has created new challenges for managing signalling responsibility with much of the equipment being on board the train. Not only are there external interfaces to balises and radio messaging but also internal ones to traction motors, brakes and odometry coupled with the complications of performance data and train preparation. The challenge is determining who owns ‘the system’ and how integrity across the wheel rail divide will be assured.
The Testing Dilemma
Four contributions focussed on signalling testing and whether the Clapham recommendations were still appropriate. Helen Sumbler from Atkins viewed both the positive and the negative: testers now know their limits and are independent, but they are no longer ‘rounded’ signal engineers, the level of knowledge is declining and the time for mentorship is lacking. Testers are often called upon to get involved in project management at the same time that workload is increasing. Much reliance is placed on time-served engineers who learned their testing expertise through hands-on experience, but these people will soon be retired.
It is reckoned that general compliance with the recommendations has been achieved but only around 56% are fully implemented. Functional testing remains aligned to interlockings and no national formal training programme exists. The distinction between signalling and software engineers in the testing process and the mentality of both is a future concern.
The need for Principles Testing was questioned by John Arnell from Atkins, especially in the field of data testing. Testing has become more compartmentalised and there is a danger of micro-engineering and missing the bigger picture. Perhaps testing is over-documented, with the GRIP process causing projects to be finalised too early. Twenty-five years of data testing in solid state interlockings should mean that requirements-driven software is understood.
Whilst computer based signalling allows more functionality, this comes at the risk of introducing errors. It is possible that SSI and its successors Westlock and Smartlock have been pushed too far; errors in source code are resolved by ‘cut and paste’ fixes but there is difficulty in checking data printouts and no real comparator between test and command. The relationship between designer and tester is a risk as it is all too easy for the former to rely on the latter to find mistakes.
There is little doubt that testing computer-based systems cannot be treated in the same way as relay interlockings. The danger is that testing will focus only on what the system is supposed to do and not what it might do. Rod Muttram explained how this problem has been approached in Bombardier by using a process of Formal Methods and Automated Testing. These need planning from the outset.
A complex methodology has been devised for interlockings including development of a formalised programming language plus associated tools, and has resulted in the STERNOL code. This will produce many hundreds of command combinations that can exist in a computer-based system and, if done properly, will lead to automated testing that can be run continuously for long periods of time. It should halve the number of hours required for testing but must be accompanied by an appropriate means of interpreting the results.
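The principle of exercising every command combination, rather than only the expected ones, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the two-route `ToyInterlocking`, its command names and the `find_unsafe_sequence` helper are invented for illustration and bear no relation to STERNOL or to any supplier's tooling. The deliberately faulty variant omits point locking, loosely echoing the kind of fault seen in points moving under a locked route.

```python
from itertools import product

# Hypothetical command set for a toy interlocking (an assumption, not a real system).
COMMANDS = ("set_A", "set_B", "cancel_A", "cancel_B", "swing_points")


class ToyInterlocking:
    """Two conflicting routes, A and B, sharing a single set of points."""

    def __init__(self, lock_points=True):
        self.routes = {"A": False, "B": False}
        self.points = "normal"  # route A needs normal, route B needs reverse
        self.lock_points = lock_points  # False models a missing locking rule
        self.points_moved_while_locked = False

    def apply(self, command):
        if command.startswith("set_"):
            route = command[-1]
            other = "B" if route == "A" else "A"
            needed = "normal" if route == "A" else "reverse"
            # A route sets only if the conflicting route is clear and points lie correctly.
            if not self.routes[other] and self.points == needed:
                self.routes[route] = True
        elif command.startswith("cancel_"):
            self.routes[command[-1]] = False
        elif command == "swing_points":
            locked = any(self.routes.values())
            if locked and self.lock_points:
                return  # request rejected: points are locked by a set route
            if locked:
                self.points_moved_while_locked = True
            self.points = "reverse" if self.points == "normal" else "normal"


def is_safe(ixl):
    """Safety invariant: no conflicting routes, no point movement under a set route."""
    no_conflict = not (ixl.routes["A"] and ixl.routes["B"])
    return no_conflict and not ixl.points_moved_while_locked


def find_unsafe_sequence(make_interlocking, max_length=4):
    """Try every command sequence up to max_length; return the first unsafe one."""
    for length in range(1, max_length + 1):
        for sequence in product(COMMANDS, repeat=length):
            ixl = make_interlocking()
            for command in sequence:
                ixl.apply(command)
                if not is_safe(ixl):
                    return sequence
    return None


if __name__ == "__main__":
    print(find_unsafe_sequence(ToyInterlocking))
    print(find_unsafe_sequence(lambda: ToyInterlocking(lock_points=False)))
```

With five commands and sequences of up to four steps, the search covers 780 combinations, including ones nobody would think to write into a manual test plan. The point of the sketch is only that the check is made against a safety rule (what the system must never do) rather than against a list of intended behaviours.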
Is Britain the only country to have these challenges? Clearly not, and Francois Fleuret from Systra/SNCF gave an account of several high profile French rail accidents with subsequent investigations. As well as leading to improved ATP systems, RFF (the French rail infrastructure organisation), in conjunction with SNCF staff, has devised a process that details who does what in the task of implementing new signalling. Although contractor staff are employed, they are much less dominant than in the UK and do not undertake some key activities such as possession management. RFF witnesses all stages of implementation including factory acceptance tests / software validation, site pre tests, integration testing to adjacent systems and equipment and the actual commissioning process. It is interesting that testing and works on an operational line are performed during normal day time shifts. There are similarities with British practice but there is much more dependency on in-house staff.
Design and Project Management
Investigations into design methodology following Clapham revealed that regional practices prevailed. Drawing offices had their own styles and concentrated mainly on scheme plans and control tables. Written specifications were seldom produced and stage works were often not being formally designed.
Graeme Christmas from Network Rail recounted how this has changed and it is clear that giant strides forward have been made. The Design Handbook, the production of standard drawings and a regularised distribution of instructions have all been part of the process. IRSE licences have helped ensure safe designs but perhaps with a resultant loss of efficiency. Recruitment and training of staff have improved but there is still much to do.
Other factors have impacted on the design process: new regulations such as Electricity at Work, CDM, Safety Cases and ROGS all demand compliance with national standards, with decisions of principle being made at the centre.
Under ROGS (the Railways and Other Guided Transport Systems (Safety) Regulations 2006), safety has become a shared responsibility including contractors, and embraces both understanding and behaviour.
The new CSM (Common Safety Method) system for risk evaluation and assessment offers a more pragmatic way of achieving compliance. Whilst safety must continue to be a cornerstone of design, recent changes to improve efficiency have resulted in a more proportionate design philosophy using modules with less detail. The need to get to grips with system engineering may lead to a Standards moratorium so that all interested parties can collectively look at ways of taking this forward.
A pragmatic review of past and present management of signalling projects was offered by Bruce MacDougall, a seasoned signal engineer, with experience on both British and overseas railways. The focus on testing is right but oddly, no amount of testing (except for a wire count) would have found the Clapham error. It was a supervision problem.
More important these days is how to stage and resource a programme. Some issues are:
- When to do work – weekday, weekend or blockade? The debate continues.
- Do staffing levels match the plan?
- Is there a staging strategy risk balance?
- Is it allowable to plan resources on a global basis?
- Is it right to ‘wave’ the safety card when things go wrong?
- Should a project be slowed down if resources are lacking?
- How critical are hours of work: should weekends off be enforced, should weekends be part of the scheduled work load, how critical is this to the eventual changeover?
- Recruitment, training and competence: how effective are ‘fast track’ conversions, should courses be linked to competence, how to ensure competence on new systems?
Project and engineering management need to be more closely aligned so as to understand better the use of temporary or displaced staff, the pressures on headcount numbers, the call for arbitrary financial reductions, the leaving of vacant posts unfilled, the investment in training plus other typical business pressures. The question was asked: “do designers spend too long demonstrating competence and not long enough designing?” Perhaps the answer is to move to template design standards that interface with test plans and emanate from manufacturing industry. But would such standards promote best practice or stifle innovation?
The signalling profession finds it hard to examine itself critically, so this seminar was refreshing in that respect. Assurances were given that safe working practices are now in place and that Clapham could not happen today. However, the processes now practised may not be entirely suitable, and the understanding of new technology gave rise to some worrying questions. Hopefully the concerns raised will be noticed by those in power and acted upon.
Report by Clive Kessell