Automatic generation and validation of train schedules

Passenger railroads throughout the world have mechanisms to generate timetables for both service and capital planning purposes. The way I’ve done this in the past is with the Mk. 1 eyeball: you come up with a schedule, maybe draw some stringline diagrams to determine minimum separations, and then shift around the run times to ensure that there are enough resources at each crossing to allow for the desired schedule. Sometimes, of course, this doesn’t work, and it’s painstaking and laborious, and nearly impossible to answer questions like “What’s the minimum investment (in tracks and switches, or in increasing speeds) to allow for the schedule we want?” Actual railroads don’t do it this way, of course — their networks are much more complex, and they have more constraints than are necessarily obvious from aerial photography. They use software to validate their timetables, and in many cases will use linear-programming libraries to find the schedule that maximizes equipment utilization, minimizes capital investment, or optimizes some other criterion.

Last week, the MBTA Fitchburg Line schedules were announced for the spring rating. The Fitchburg Line has been under construction for the entire month of April, so with the majority of the line being bustituted, the MBTA and its contractors chose not to publish the spring schedule at the same time as the other lines. A co-worker of mine lives in West Concord; before COVID-19 regularly took the Fitchburg for her daily commute, and we had an email exchange about how service could be improved over the one-train-per-hour schedule that has been introduced. This line is mostly double-tracked, except for short single-track segments in Fitchburg and Waltham which constrain the schedules that can be operated. I noted to my colleague that I didn’t know much about SAT solvers; I thought it obvious (at least to a computer scientist) that this question could be represented as a satisfiability problem, for which there are lots of known techniques and libraries to perform the computation. (The general boolean satisfiability problem for three or more variables, called “3SAT”, is known to be computationally intractable; someone who found an way to solve it efficiently would instantly win all of the major prizes in computer science and operations research. In the mean time, there’s been a lot of research into making solvers faster even given though there is no categorically efficient algorithm.)

This question was bugging me over the course of the week, so I did a Google search to see if I could come up with some plausible code that might work. I first got distracted into looking at a constraint solver called “kiwi” (which is a reimplementation of an academic solver called “Cassowary” designed for use in responsive user-interface implementations), but found that it was too limited to even figure cycle times, which was my starting point (not even looking at single-track constraints like the Fitchburg or the Old Colony). I went back to Google and added some keywords to suggest that I was interested in how railroads use solvers, and got some different results.

For whatever reason, I landed on Y. Isobe, et al., “Automatic Generation of Train Timetables from Mesoscopic Railway Models by SMT-Solver“, published in IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences in 2019, for which the first author maintains a web page (probably what I first saw in the Google results, rather than the paper itself). The research was a collaboration between a government institute, AIST, and the East Japan Railway Company (better known as “JR-East”). “SMT” is an acronym for “Satisfiability Modulo Theories“, a generalization of the boolean satisfiability problem (plain old “SAT”) to include richer data types, such as sets and arrays, along with integers and real numbers, and relations such as set-inclusion and numerical inequalities. This sounded like not only the exact sort of thing to be looking at, but quite possibly someone had already done the (not inconsiderable) work of proving that it worked on a model of an actual railroad.

After skimming the paper the first time, I realized that I did not understand their railroad infrastructure model, or perhaps I do not understand how JR-East designs railroad infrastructure. I figured I’d work through the examples first, but immediately ran into snags (after first installing the prerequisites). It took an entire evening to figure out that the code was written for an older release of the Z3 SMT solver, and the current version of Z3 has a slightly different output format that needed adjustments in the parser. The code for the paper is written in OCaml, a language I do not know, so I had to figure out enough of the debugger to figure out where the parse was failing, and then learn enough OCaml syntax to figure out what the failing code was looking for and how to make it look for something different.

The examples, as it turns out, didn’t help. Once I fixed the parser, I could run the examples through the solver and it would generate the expected output, looking very much like the solutions shown in the paper. (I should maybe look more closely at the code, because it does a nice job of generating stringlines directly to PDF or SVG using OCaml bindings for the cairo graphics library, and maybe I could steal that even if I can’t make the solver work.) The difficulty was not with the solver itself, but trying to wrap my head around the way it models railroad infrastructure — what the authors call a “mesoscopic model”. (The term isn’t theirs, it’s cited to an earlier paper that I haven’t read.)

In this paper’s version of a “mesoscopic model”, there are two kinds of fixed objects: “links”, which are just given arbitrary unique numbers, and “structures”, which can be stations or interstations. There’s the first loose cobble to trip you up: you’d think a “link” would correspond to the track between stations — i.e., an interstation — but no, that’s a “structure”. Rereading section 2 of the paper doesn’t really clarify matters, but staring at the examples some more makes it clear that a “link” is more like the interlocking at a station throat. The structure model assumes that every track can connect to every other track at such a junction. My first attempt to model the Worcester Line — chosen because it’s the most familiar to me and has capital construction projects that I could model the effects of — foundered on this issue: while there are many universal crossovers on the newer part of the line, not every station has one, no station except Framingham has one at both ends, and when I looked at how train routings were specified, it’s a sequence of “links”, not stations, so I would have to explicitly specify which track each service pattern would use, which is one of the things I had expected the solver to figure out for me.

After looking again at the major worked example in the paper, which deals with sequencing local and Shinkansen trains on the single-track Tazawako Line, I figured that I should start with a single-track MBTA line, and perhaps things would become clearer. Even there I had trouble, because it seemed like, in the example, the trains could only pass each other at stations, and JR had built nearly every station with two or three platform tracks. (In fact, it was only just now, as I am writing this, that I looked at the Wikipedia article and saw that the paper’s authors had modeled two passing sidings on the line as stations! The solver doesn’t distinguish between the two types of “structure”, but the presentation in the paper and in the solver’s graphical output does. And oh, by the way, all of the comments in the example data are written in Japanese, which I can neither read nor display.) I figured the shortest and least complex of those would be the Greenbush Line, so I spent an evening panning through Google Maps using the “measure distance” tool to figure out how long the interstations were and how much of each was single- vs. double-tracked.

That got me into a more fundamental issue, which I freely admit to just fudging as a means of playing with the tool to see if I could get anything interesting out of it. All of the intervals given in the structure model are in units of time, not distance. This makes sense for stations, but when you want to apply it to tracks, you run into the issue that the time is going to depend on the ultimate routing: a train that approaches an interlocking with a green signal is going pass through it faster than a train that has a restrictive signal, and there’s nothing in the model that allows you to tell the solver that. In fact, unless forced to look for a better schedule, the solver would often come up with bizarre delays for no obvious reason, because Z3 isn’t an optimizer — it finds a solution that satisfies the constraints, but not necessarily the best one. The higher-level timetable solver offers some manual knobs that allow you to force this, in particular a max_time parameter, but unfortunately it’s global. The basic assumption is that not only do you already know the route, but you have already modeled the rolling stock’s performance on the route and you know exactly how much time it is expected to take for all phases of the trip. (And probably also that you’re using equipment which has decent acceleration and braking performance in the first place, i.e., EMUs.)

So rather than try to simulate an HSP46 hauling four bilevel coaches, without specific knowledge of track speed limits and minimum separations, I just reverse-engineered times from the published schedules. There is an object in the model which represents a single run of a train, and you can override the times specified in the structure part of the model if you have trains with different performance characteristics or stopping patterns, but I did not explore that aspect — if I did anything more with this solver, it would probably involve creating a higher-level language to describe lines and trains which could then output the enormously redundant input language of the timetable solver. (And at that point, I’d probably be close to ripping apart all of the OCaml bits and interfacing Z3 to my train acceleration model directly. I don’t think I’ll get there, because that’s a full-time job.)

It was then that I learned, rather to my surprise, that the published schedules don’t work. Clearly they must, in reality, since the schedule is what Keolis actually operates, but the solver couldn’t satisfy the schedule as I had reverse-engineered it. Unfortunately, the solver can’t tell you what’s wrong: it just outputs “unsatisfied” and you have to hack away at the model description until you get something that works enough to generate a schedule, and then look at the graphical output to see where delays are being inserted to account for errors in the model. I did finally get my Greenbush Line model to work, although it didn’t end up looking entirely like the MBTA schedule it was based on.

Sometimes it’s not an error in the model; there are some significant limitations in the solver as well. The most important of these is that it can’t handle turning trains, whether on the platform (for a mid-route short turn) or at a stub-end terminal. I was able to make a model of the Fitchburg Line sort of work, by including Westminster Layover as a “station” and making all trains run through to Westminster, where it doesn’t matter if it’s an hour to the next run, rather than turning on the platform at Wachusett, but when I tried to model short turns at Littleton, there was simply no way to tell the solver “no, this specific train has ten minutes to turn around and must immediately head back whence it came, it can’t just sit there blocking the platform while three other trains go by”. I am certain that this requirement (and round trips in general) can be implemented in a solver like this, by adding additional constraints, but again, modifying the solver logic is a job for a professional (and someone who actually knows both OCaml and SMT solvers). The problem is equally significant at the city terminals: the model wants to have all trains arrive at platform 1 and depart from platform 2, and that’s a physical impossibility — just not one that you can encode in the model without explicitly assigning tracks to trains and manually generating separate “inbound 1”, “inbound 2”, etc. — which again is something I wasn’t interested in doing by hand.

One thing that the solver can do is work with “periodic” schedules; i.e., those that are repeated the exact same way throughout the day at a specified interval. It can’t figure the period for you, but if you give it a period, it will add in the modular arithmetic in the SMT problem definition to ensure that multiple trips of the same set of trains don’t conflict with each other. This makes it easy to figure out how “clockface-compatible” a particular service model is: if it is still satisfiable with period = 3600, you can run each train hourly. If it works with period = 1800, you can run two per hour, and so on. (If it only works with period = 4500 then unfortunately you’ve got a service that repeats every 75 minutes, ugh.) This inspired me to look more closely at the Dorchester bottleneck on the Old Colony lines. All trains pass through Quincy Center, and nearly all trains stop at both Quincy Center and JFK/UMass (which was never intended to have a station, but it just barely turned out to be possible to squeeze one in east of the Red Line tracks). I could take my Greenbush model, delete everything south of Quincy Center, and just see directly what the capacity of that bottleneck was.

Stringline diagram showing maximal Old Colony service

This stringline shows a solution for service every 12 minutes between Quincy Center and South Station. Note that the actual trip takes 20 minutes (1200 seconds) in one direction but only 18 minutes the other way.

By fiddling with the period parameter, I was able to get five trains per hour (12-minute headways, which is to say, a 720-second repetition period) to work with the existing model based on slow diesel trains. I then took Alon Levy’s simulation of trip times with a modern EMU instead of antiquated diesel trains, and made an interesting discovery: although modern equipment cuts the travel time in half (to 10 minutes from 20), it doesn’t help with the bottleneck: a repetition period of 600 seconds (10-minute headways, 6 trains per hour) doesn’t work. But, if you could somehow double the Dorchester single-track, then look what’s possible:

Stringline showing 10 trains per hour with Dorchester double-tracking

This stringline shows what happens if you increase frequency to 10 trains per hour by twinning the single-track south of JFK/UMass station.

That (extremely expensive) intervention doubles the capacity of the line, opening the prospect for frequent service as far as Brockton and South Weymouth. At ten trains per hour, you could have service every 15 minutes to Brockton and every half hour to Kingston and Greenbush (assuming other bottlenecks along those lines were relieved — I haven’t checked that the schedule would work because I haven’t actually encoded the other Old Colony branches). This does get into another issue with the railway solver: the period parameter should really be an attribute of the train, not global, because you’d like to jointly solve multiple services with different frequencies that all share a common bottleneck without laboriously open-coding the whole pattern of repetition. (Ideally, you’d like to be able to solve the entire network as a single model!)

Anyway, for those who are interested, you can see my version of the original “RW-Solver” code on my GitHub fork, which includes some of the kluged model files I’ve been playing with.

This entry was posted in Computing, Transportation and tagged , , , . Bookmark the permalink.