Live train tracking sounds like magic, but it is the orchestrated output of three distinct data streams meeting in a single backend. The first is the static timetable, almost universally published in GTFS (General Transit Feed Specification), a format pioneered by Google and TriMet in 2005 and now adopted by hundreds of public transport agencies. A GTFS bundle is a zip archive of plain-text CSV files that describes every scheduled service: every stop, every departure time, every calendar exception, every fare zone. We re-fetch these feeds on a rolling schedule (typically every 24 hours for stable networks, every six hours for fast-changing ones such as Deutsche Bahn in Germany and National Rail in Britain) and replay them into our normalised database.
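To make the static-timetable step concrete, here is a minimal sketch of reading departures out of a GTFS bundle. It assumes a standard `stop_times.txt` with the spec's `trip_id`, `stop_id` and `departure_time` columns; the function names are illustrative and this is not our production loader. One detail worth showing is that GTFS times can run past `24:00:00` for services that cross midnight, so they are kept as seconds since the start of the service day.

```python
import csv
import io
import zipfile


def parse_gtfs_time(value: str) -> int:
    """Convert a GTFS HH:MM:SS string to seconds after the service day starts.

    GTFS allows hours of 24 or more for trips that run past midnight,
    so datetime.time cannot represent these values.
    """
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


def read_table(bundle: zipfile.ZipFile, name: str) -> list[dict[str, str]]:
    """Read one plain-text table (for example stop_times.txt) from a GTFS zip."""
    with bundle.open(name) as raw:
        # utf-8-sig drops the byte-order mark that some feeds include.
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def departures_by_stop(bundle: zipfile.ZipFile) -> dict[str, list[tuple[int, str]]]:
    """Map each stop_id to a sorted list of (departure_seconds, trip_id)."""
    board: dict[str, list[tuple[int, str]]] = {}
    for row in read_table(bundle, "stop_times.txt"):
        if not row["departure_time"]:
            continue  # the spec permits untimed intermediate stops
        board.setdefault(row["stop_id"], []).append(
            (parse_gtfs_time(row["departure_time"]), row["trip_id"])
        )
    for entries in board.values():
        entries.sort()
    return board
```

Calling `departures_by_stop(zipfile.ZipFile("feed.zip"))` on a downloaded bundle (the file name is a placeholder) yields a per-stop timetable that later stages can compare against live data.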
The second stream is real-time positional data, served either through GTFS-Realtime protocol buffers (used by SBB in Switzerland, NS in the Netherlands and many North American agencies) or through bespoke SOAP and REST endpoints exposed by national operators. In Britain, for example, Network Rail publishes its Train Movements feed over a STOMP message queue, derived from the TRUST system, alongside a Train Describer (TD) feed of signalling-level berth updates, so train movements across the network are observable in near real time. We consume that firehose, deduplicate it, reconcile it against the static schedule, and produce a single authoritative status ("On time", "+3 minutes", "Cancelled", "Boarding at platform 4") that you see on the departure board.
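The deduplicate-and-reconcile step can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Movement` record, its field names, the rule that the most recent message for a trip and stop wins, and the choice to treat delays under one minute (and trips with no report at all) as "On time". The real pipeline is written in Rust and handles far more cases, such as platform changes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Movement:
    """One real-time event (hypothetical shape, not a real feed schema)."""
    trip_id: str
    stop_id: str
    actual_seconds: Optional[int]  # None means the call was cancelled


def status_label(scheduled_seconds: int, event: Optional[Movement]) -> str:
    """Turn a scheduled time plus the latest event into a display status."""
    if event is None:
        return "On time"  # assumption: no report means running to schedule
    if event.actual_seconds is None:
        return "Cancelled"
    delay_minutes = (event.actual_seconds - scheduled_seconds) // 60
    if delay_minutes <= 0:
        return "On time"
    unit = "minute" if delay_minutes == 1 else "minutes"
    return f"+{delay_minutes} {unit}"


def reconcile(
    schedule: dict[tuple[str, str], int],
    events: Iterable[Movement],
) -> dict[tuple[str, str], str]:
    """Deduplicate events, then label every scheduled (trip_id, stop_id) call."""
    # Later messages overwrite earlier ones for the same trip and stop.
    latest = {(e.trip_id, e.stop_id): e for e in events}
    return {
        key: status_label(scheduled, latest.get(key))
        for key, scheduled in schedule.items()
    }
```

Keying both the schedule and the live events by `(trip_id, stop_id)` keeps the join trivial; a production system would also need to match trips whose identifiers differ between the static and real-time feeds.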
The third stream is contextual reference data: station coordinates, line geometries, rolling stock specifications, operator branding, accessibility flags and multilingual station names. This data comes from a blend of OpenStreetMap, Wikidata, official operator station handbooks and our own editorial team. It is what lets us draw a map of every station in Pakistan with proper Urdu names, or show the correct first-class coach diagram for a Class 800 Azuma versus a Mark 4 carriage on the East Coast Main Line.
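One way to blend several reference sources while keeping track of where each value came from is a priority merge. The sketch below is an assumption about how such a merge could work, not a description of our actual rules: the precedence order, the source keys and the field names (`name_en`, `name_ur`) are all hypothetical.

```python
from typing import Any

# Assumed precedence, highest first. The real ordering may differ.
SOURCE_PRIORITY = ["editorial", "operator_handbook", "wikidata", "openstreetmap"]


def merge_station(records: dict[str, dict[str, Any]]) -> dict[str, tuple[Any, str]]:
    """Merge per-source station records into field -> (value, source).

    The first source in SOURCE_PRIORITY that supplies a non-empty value
    for a field wins, and the winning source is stored beside the value
    so the displayed data can be attributed.
    """
    merged: dict[str, tuple[Any, str]] = {}
    for source in SOURCE_PRIORITY:
        for key, value in records.get(source, {}).items():
            if value is not None and key not in merged:
                merged[key] = (value, source)
    return merged


# Illustrative input only; the values are examples, not authoritative data.
station = merge_station({
    "openstreetmap": {"name_en": "Lahore Junction", "name_ur": "لاہور جنکشن"},
    "wikidata": {"name_en": "Lahore Junction railway station", "name_ur": None},
})
```

Here `name_en` is taken from the Wikidata record because it ranks higher in the assumed order, while `name_ur` falls through to OpenStreetMap because the Wikidata value is missing.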
All three streams converge into a denormalised, read-optimised store served by an edge cache in 19 regions, which is why a query for "next trains from London King's Cross to Edinburgh" resolves in under 200 milliseconds anywhere on Earth. Our infrastructure runs on Vercel, Cloudflare and Supabase; we use Next.js 16 for rendering, Prisma for the schema, and a custom Rust ingestion pipeline for the GTFS-RT firehose. Every piece of data we display can be traced back to a source we cite, and every source has an attribution page in our docs. Transparency is not a feature; it is the product.