Twenty years of Darwin engineering

Rail Engineer’s sister magazine RailStaff recently reported on 20 years of Darwin, the heart of all customer train information and one of the most critical systems serving the industry. Darwin is maintained and continually improved by Rail Delivery Group (RDG) in partnership with Hitachi Rail, and in this article Rail Engineer takes a more in-depth look at the engineering behind Darwin.

Originally launched as the Real Time Train Information (RTTI) system, Darwin is the rail industry’s official train running information engine, providing consistent real-time arrival and departure information, platform numbers, delay estimates, schedule changes and cancellations. The system obtains train movements from several sources, including signalling track berths.

Darwin ‘works’ by taking the base ‘planned’ timetable and supplementing it with real-time updates from several industry systems. Key among these are real-time movement feeds, which track train progress, and control interfaces such as Tyrell, CIS, and Darwin’s own Workstation client, which allow operators to post additional information as it becomes known. Examples of such updates are cancellations, rerouting, expected delays and their reasons, platform changes, train loading, and more. Darwin’s forecasting algorithm, which predicts future arrival times, is self-learning and dynamically adjustable, so it can automatically process data based on what actually happens, in both the long term and the short term, rather than following a fixed set of rules.
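The overlay idea described above can be illustrated with a minimal Python sketch. It is not Darwin’s actual data model or code: the `Call` and `Update` structures, their field names, and the station names are all hypothetical, and the self-learning forecasting step is left out entirely. It simply shows a planned timetable being supplemented by updates as they arrive, with the plan itself left unchanged.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Call:
    """One planned stop from the base timetable (hypothetical structure)."""
    station: str
    planned_arrival: datetime
    platform: Optional[str] = None
    expected_arrival: Optional[datetime] = None
    cancelled: bool = False
    reason: Optional[str] = None


@dataclass(frozen=True)
class Update:
    """A real-time update for one stop; unset fields leave the call as it was."""
    station: str
    delay_minutes: Optional[int] = None
    platform: Optional[str] = None
    cancelled: bool = False
    reason: Optional[str] = None


def apply_updates(calls: List[Call], updates: List[Update]) -> List[Call]:
    """Return a new list of calls with each update overlaid, in arrival order."""
    result = list(calls)
    for update in updates:
        for i, call in enumerate(result):
            if call.station != update.station:
                continue
            changes = {}
            if update.delay_minutes is not None:
                changes["expected_arrival"] = call.planned_arrival + timedelta(
                    minutes=update.delay_minutes
                )
            if update.platform is not None:
                changes["platform"] = update.platform
            if update.cancelled:
                changes["cancelled"] = True
            if update.reason is not None:
                changes["reason"] = update.reason
            result[i] = replace(call, **changes)
    return result


# Illustrative data only: two planned calls and two later updates.
planned = [
    Call("Station A", datetime(2024, 1, 1, 9, 0), platform="1"),
    Call("Station B", datetime(2024, 1, 1, 9, 30), platform="2"),
]
updates = [
    Update("Station B", delay_minutes=7, reason="signalling fault"),
    Update("Station B", platform="4"),
]
live = apply_updates(planned, updates)
print(live[1])
```

Keeping the planned records immutable and building a separate ‘live’ view is one design choice among several; it means the base timetable is always available for comparison with what actually happened.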

Darwin is key to giving the travelling public confidence when choosing rail as their means of transport. The other inputs to Darwin include a daily timetable revision, workstations in train operators’ controls that provide additional information (for example, the reasons for train delays, and information from station customer information systems such as platform changes), and TRUST. TRUST stands for Train Running System TOPS (Total Operations Processing System) and is the operational system that records train running data and compares it with the schedule, to support the logging of delays and the attribution process. Train describer feeds are the primary source of train movement information, with TRUST filling in the gaps, along with GPS train information where available.
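As a rough illustration of the ‘primary source with gap filling’ arrangement described above, the following Python sketch picks the best available movement report for a location. The source names, the `best_report` function, and the time strings are hypothetical. The article states only that train describer feeds come first; the relative ordering of TRUST and GPS used here is an assumption.

```python
from typing import Dict, Optional, Tuple

# Assumed priority order. Only "train describer first" comes from the article;
# placing TRUST ahead of GPS is a guess made for the sake of the example.
SOURCE_PRIORITY = ("train_describer", "trust", "gps")


def best_report(reports: Dict[str, Optional[str]]) -> Optional[Tuple[str, str]]:
    """Return (source, observed_time) from the highest-priority source that has data."""
    for source in SOURCE_PRIORITY:
        observed = reports.get(source)
        if observed is not None:
            return source, observed
    return None  # no source has reported a movement at this location


# Train describer data is present, so it is preferred.
print(best_report({"train_describer": "10:02", "trust": "10:03"}))

# Train describer gap: the next available source fills in.
print(best_report({"train_describer": None, "trust": "10:03", "gps": "10:02"}))
```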

Darwin powers the majority of the real-time train information systems, such as station customer information screens at all 2,500 stations, journey planners, the National Rail Enquiries app, and the hundreds of independent and commercial digital systems made by vendors such as Google Maps, Citymapper, and The Cloud. Quite simply, Darwin is the bedrock of all train information, ensuring consistency so that one version of the truth is provided to the travelling public. Darwin now also provides consistent information to railway staff as well as passengers.

Consistent information

Before Darwin there was no universal source of accurate train information. Customers remote from a station had to ring a train enquiry bureau to obtain information and the staff at the bureau very often didn’t have access to accurate train information. At stations, train delays were entered manually into customer information systems from various real-time systems, and many of these used their own algorithms and worked in isolation. Some station customer information systems had very poor, inaccurate free-running processor clocks. This resulted in inconsistency for the customer, and it was possible for a customer information screen at one station to show completely different live information to a customer information screen at another station.

Numerous surveys have identified that accurate, real-time information is a major factor in customer satisfaction. People want live, accurate updates on what is happening with their train service, so they can confidently manage and plan their time. When provided with good information, customers are better able to cope with delays or disruption to their journeys. Darwin was the answer.

Real Time Train Information was started in 2002 as a pilot covering 200 stations. Following the pilot, a contract was awarded to Thales GTS (now Hitachi Rail). The system was developed to cover all 2,500 stations and was known as Real Time Train Information (RTTI). In 2004, it won Innovation of the Year at the Rail Awards and, in 2008, RTTI 2 (Darwin) was launched with a history facility and train describer feed to improve forecasting.

Darwin had several direct CIS connections from as early as 2004-2005. Each was a ‘bespoke’ interface, such as the one for Connex/Southeastern. This was followed in 2010 by filtered push ports to clients such as Real Time Journey Planner and the Virgin West Coast customer information systems. In 2011, all of the bespoke interfaces were replaced by a new standardised one, which now powers the majority of CIS as well as other industry systems.

Up to 2014, Darwin had been hosted on dedicated servers, but as use of Darwin increased a solution was sought that could handle the volatility of demand without needing dedicated hardware. Consequently, the decision was taken to migrate Darwin to virtual servers on the Amazon Web Services (AWS) cloud.

Cloud servers

We hear a lot about the ‘digital railway’ and how it will be part of the future of rail engineering. Cloud computing is now also being discussed as a possible future architecture for rail signalling. Rail engineering can also have a reputation for being too risk averse and behind the times. However, Darwin is very much a tool for the digital age, providing customers with live information at their fingertips, on a platform that has been continuously enhanced since its creation 20 years ago.

Cloud computing can be defined as the ‘on-demand’ availability of third-party computer system resources and data storage, without direct active management by the user. The cloud functions are distributed over data centres in multiple locations, which share powerful, expensive computer resources. The use of multiple redundant sites improves availability and assists business continuity and disaster recovery. Dynamic scalability is another benefit of cloud computing: computer resources can be increased almost instantly when demand for information rises, and reduced again when they are not being used. This is very relevant for an application such as Darwin, with its unpredictable demands.

In the middle of the night, in good weather, the need for train information will be low, but at peak travel times and in poor weather the information demands will be an order of magnitude higher. Darwin now typically receives over three million enquiries a day. In February 2017, Storm Doris moved across the UK, bringing gusts of up to 94mph accompanied by heavy snowfall across Scotland. Overnight and into the morning of 23 February the storm moved through Northern Ireland, across northern England, and out into the North Sea by the early afternoon. During this time Darwin requests peaked at 1,150 requests per second, with the previous highest being 860, and there were a total of 39 million requests.
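To put those figures side by side, the short Python calculation below compares the storm peak with an average day. It uses only the numbers quoted above, but it rests on assumptions: it treats ‘enquiries’ and ‘requests’ as the same unit, and it compares today’s typical daily figure with a 2017 peak, so the ratios are indicative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the article.
typical_enquiries_per_day = 3_000_000
storm_peak_per_second = 1_150
previous_peak_per_second = 860
storm_total_requests = 39_000_000

# Typical load spread evenly over a day (real traffic is far from even).
average_per_second = typical_enquiries_per_day / SECONDS_PER_DAY

# How far the storm peak sits above that flat average.
peak_to_average = storm_peak_per_second / average_per_second

# Increase over the previous record peak, as a fraction.
peak_increase = storm_peak_per_second / previous_peak_per_second - 1

# Storm total expressed in "typical days" of enquiries.
storm_vs_typical_day = storm_total_requests / typical_enquiries_per_day

print(f"Average rate on a typical day: {average_per_second:.1f} per second")
print(f"Storm peak vs average rate:    {peak_to_average:.0f}x")
print(f"Increase on previous peak:     {peak_increase:.0%}")
print(f"Storm total vs typical day:    {storm_vs_typical_day:.0f}x")
```

On these assumptions, the storm peak is roughly 33 times the flat average rate of about 35 requests per second, about a third higher than the previous record, and the storm total is equivalent to 13 typical days of enquiries.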

This ‘peaky’ requirement means that significant additional processing capability can be required in a very unpredictable and short time frame. If RDG were to size the Darwin server architecture to cover the peak demands it would be an expensive investment which, at other times, would be far too large. However, the AWS servers are able to ‘spin up’ very quickly to cover any peak Darwin demands.

AWS uses 125 very powerful and secure physical data centres in various locations around the world. Customers of AWS include Adobe, Airbnb, Alcatel-Lucent, British Gas, Hitachi, Netflix, Spotify, UK Ministry of Justice, and WeTransfer.
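The trade-off between sizing for the peak and ‘spinning up’ on demand can be sketched in a few lines of Python. This is not how Darwin or AWS decides on capacity: the `instances_needed` function, the figure of 100 requests per second per server, and the minimum of two servers are all hypothetical, chosen only to show the shape of the calculation.

```python
import math


def instances_needed(
    requests_per_second: float,
    capacity_per_instance: float,
    minimum: int = 2,
) -> int:
    """Servers required for a given load, never dropping below a resilience minimum."""
    return max(minimum, math.ceil(requests_per_second / capacity_per_instance))


# Hypothetical capacity of one virtual server, in requests per second.
CAPACITY = 100

quiet = instances_needed(35, CAPACITY)       # roughly an average-day rate
storm = instances_needed(1_150, CAPACITY)    # the Storm Doris peak rate

print(f"Quiet period: {quiet} servers")
print(f"Storm peak:   {storm} servers")
```

With these made-up numbers, a fixed installation sized for the storm would run six times as many servers as a quiet period needs, whereas an elastic one pays for the extra capacity only while the peak lasts.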

Darwin has never stood still and has been subject to tweaks, upgrades, and improvements over the years to ensure it takes advantage of the best available data, so it can provide the most accurate information for passengers. The changes have been numerous and have included, for example, the addition of GPS train tracking data to improve forecasting in 2017, and better scalability in 2018. The current upgrade is known as the Darwin Evolution project.

Evolution

When Darwin moved to AWS, it was essentially a ‘lift and shift’ of the application. The Darwin Evolution upgrade will make best use of the AWS architecture, enabling a future-proof, more flexible solution that adopts the latest cloud-based technological advances.

Darwin is quite a complex system and is not easy to work on. Any new software engineer has to learn the whole system, and certain changes may require the complete Darwin system to be taken down. However, using software ‘containers’ to improve Darwin’s scalability, and eliminating the hard-to-support legacy technology, will make future changes to Darwin sustainable.

Using software ‘containers’ and the native cloud solution, together with the reduction or elimination of the legacy software, will make changes easier and widen the pool of engineers able to support Darwin. Using containers will also better facilitate DevOps, a set of IT practices and tools that integrates and automates the work of software development (Dev) and operations (Ops) to improve and shorten the development life cycle. Using Linux containers where possible will also significantly reduce the current Windows licence costs. Other Darwin Evolution changes include modernising and improving the storage of the internal timetable, replacing custom-made code with cloud-native solutions, and retiring some of the legacy services.

The Darwin Evolution project will allow future changes to the system to be undertaken more quickly and will avoid the need to take the whole system down for software enhancements. Making the system easier to understand will also help to maintain the competencies required to work on Darwin.

We often learn of large data network systems that do not work as intended and cause harm to industry and society. The railway industry can also have a reputation for being behind the times and slow to implement new ways of working. However, Darwin is a great example of a data network system that works and delivers great service to various organisations which use the data for their own added-value applications and, of course, the travelling public. It is also worth considering that it was created by the rail industry years before the introduction of smartphones and the digital on-line systems we all take for granted.

Darwin has not stood still since its inception 20 years ago and has been subject to many changes and improvements. Its scalability and ability to change are a credit to the original designers, and the Darwin Evolution project will continue to make sure the system is ahead of the game. Everyone involved in Darwin over the years is to be congratulated.

Image credit: iStockphoto.com