In our last installment, I talked about how I wasted an entire weekend developing a not-very-good model of a way to make the commuter-rail line that serves my city suck less. Well, it’s been another nasty rainy weekend here in Framingham, so I spent even more time (also some nights) trying to make my model sufficiently less not-very-good to be able to simulate more interesting variations than the uniform-headway, 100%-local service that I initially implemented, to see whether they were any better (or any cheaper to operate). This was not especially easy, because I don’t actually know R, but a lot more of the work was done entirely by hand. I’m sure actual railroads have good equipment planning and scheduling software, that they probably either developed in-house or gave huge piles of money to a consulting firm for. I on the other hand have the Mk. 1 human brain, which for a problem of this size is more adept at implementing backtracking search internally than it is at writing that down as code. I’m going to describe my process, and while all of the code and data, and most of the results, are available in my GitHub repo, I’m taking the liberty of simplifying out a number of wrong turns and dead ends, and reordering some things for a more understandable presentation.
Code restructuring
As I presented it last week, the simulator consists of two pieces: a demand model, which predicts how many passengers will board a particular train (given the arrival time of that train and the next train at the final destination), and a supply model, which was just a fixed bit of one-off code that generates the particular schedule I was interested in looking at. This is clearly not a good way to do it, because it means that constants scattered throughout the code, many of which are arithmetically related, all have to be changed in unison in order to change either model. I started out by moving the demand model into a separate function, so that the supply model did not need to have any knowledge of the parameters to the demand model — it becomes a higher-order function that accepts the demand model as a parameter. I ended up completely restructuring the demand model anyway, in order to implement some of the features described in the next section, and also because there were a bunch of bugs that I found when I started simulating trips that don’t stop at every station.
The way the old supply model worked was that it took a fixed schedule, and for each train, it iterated over the stations, invoking the demand model for a random sample of passengers’ desired arrival times, and just counted up the total passenger loading — the actual supply part was done by me visually inspecting the loading across all trains to identify impossible loads. This was totally inside out: the correct way is to compute the entire demand vector at each station, and then divide it up among the trains on the schedule, and the way I had done it was exceedingly slow, even by R standards. The way I had done it before was also statistically bogus, because fetching a new demand vector for every arrival at each station means that some passengers (due to random chance) can be either counted multiple times, or missed entirely. In the new code, all of this is parameterized using higher-order functions, and every run writes to disk not only the model output, but also the predicted number of trainsets required (given a seat count) and the actual station-by-station, train-by-train schedule that was used in the computation (which can be inspected if the model seems to have done something strange).
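As a rough illustration of the "compute the whole demand vector, then divide it among the trains" idea (this is not the code from the repo; the function name and the toy numbers are invented for the example), each desired arrival time can be matched to the latest train that reaches the terminal at or before it. Sampling the demand vector once and then splitting it means no passenger is counted twice or dropped by chance.

```r
# Hypothetical sketch: split one station's demand vector among trains.
# All times are minutes after midnight at the terminal.
assign.to.trains <- function(desired, arrivals) {
    arrivals <- sort(arrivals)
    # findInterval() returns, for each desired time, how many trains have
    # arrived at or before it, i.e. the index of the latest usable train
    # (0 means no train arrives early enough).
    idx <- findInterval(desired, arrivals)
    list(loads = tabulate(idx[idx > 0], nbins = length(arrivals)),
         unserved = sum(idx == 0))
}

arrivals <- c(480, 495, 510)                 # 8:00, 8:15, 8:30
desired  <- c(481, 494, 495, 509, 520, 470)  # toy demand vector
result   <- assign.to.trains(desired, arrivals)
print(result$loads)      # passengers assigned to each train
print(result$unserved)   # passengers who wanted to arrive before the first train
```

Because `findInterval` is vectorized, this avoids looping over passengers, which is the sort of thing that matters for speed in R.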
All of this is wrapped up in a nest of new functions which make actual simulator invocations much simpler — for a uniform, all-local-train service like the ones I looked at last weekend, a single line of R code does all the work and tells the simulator where to save its results:
doit("4tph-local.csv", function () make.local.service(360, 720, 4))
(Yes, I am a bit unimaginative with my function names, so sue me.)
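For readers who want a feel for what a uniform schedule generator has to do, here is a minimal stand-in. It is not the real make.local.service; the name uniform.arrivals is made up, and it assumes the three arguments mean first minute, last minute, and trains per hour, with times in minutes after midnight (so 360 is 6:00 AM and 720 is noon). That reading of the arguments is a guess.

```r
# Hypothetical stand-in for a uniform-headway schedule generator.
# ASSUMPTION: arguments are (first minute, last minute, trains per hour).
uniform.arrivals <- function(start, end, tph) {
    headway <- 60 / tph                      # 4 tph -> 15-minute headway
    seq(from = start, to = end, by = headway)
}

arrivals <- uniform.arrivals(360, 720, 4)
print(head(arrivals))     # first few times, in minutes after midnight
print(length(arrivals))   # number of trains in the period
```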
Adding a bit more statistical sophistication
If you read last week’s post, you’ll recall that the demand model I used was a very simple one: assume that the demand for travel from a station to the terminal reflects a set of desired arrival times that are uniformly distributed over the period between “this” train and the next. This is obviously unsophisticated, but not obviously stupid. I wanted to do a little better, without putting a lot of work into learning statistical demand models for transportation, so I added two bits of “fuzz” to the demand model. First, I assume that some people on a train would actually rather be arriving a little bit earlier than the scheduled arrival, and that anyone who really wanted to arrive just a little before the next train would suck it up and take the later train — “a little bit” I defined arbitrarily as 5 minutes, which is much smaller than the interval between any of the arrivals in the 2012 CTPS study. (The minimum headway in the 2012 schedule was 12 minutes.) I also figured that there would be some day-to-day variability in ridership, and again entirely arbitrarily I chose to multiply the CTPS passenger count by a normally distributed random variable (μ=1, σ=0.05) — I suspect this effect is overwhelmed by the uniform distribution of arrivals but I left it in without doing any serious examination.
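The two bits of fuzz can be sketched in a few lines of R. This is one reading of the description above, not the repo code: the function name is invented, and shifting the whole window five minutes earlier is an assumption about how "a little bit" is applied.

```r
# Hypothetical sketch of the fuzzed demand model for one counted train.
# count:        CTPS passenger count for the 2012 train
# this.arrival: that train's terminal arrival (minutes after midnight)
# next.arrival: the following train's terminal arrival
sample.desired.arrivals <- function(count, this.arrival, next.arrival,
                                    fuzz = 5, sd = 0.05) {
    # Day-to-day variability: scale the count by a N(1, 0.05) draw.
    n <- max(0, round(count * rnorm(1, mean = 1, sd = sd)))
    # Desired arrival times are uniform over the gap between trains,
    # with the window moved `fuzz` minutes earlier (ASSUMED interpretation).
    runif(n, min = this.arrival - fuzz, max = next.arrival - fuzz)
}

set.seed(1)
x <- sample.desired.arrivals(200, this.arrival = 480, next.arrival = 495)
print(length(x))   # close to 200, varying from run to run
print(range(x))    # falls within 7:55 to 8:10 (475 to 490)
```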
The other big problem with my original model is that it just took a single run of the model as gospel, rather than looking at the behavior of the system across multiple samples. I fixed that by the good old-fashioned Monte Carlo method: the simulator now runs the model 250 times, and computes the 90th percentile over all of the simulated loadings to come up with a more confident guess of how many seats are needed. (It can also compute the median, or indeed any quantile you happen to like, although this isn’t parameterized — I chose 0.9 because that corresponds to my intuition that an operator probably wants enough seats on the train to consistently seat the maximum load at least 90% of the time.) Given those 90% loadings it’s easy enough to divide by the number of seats in an EMU trainset and round up to get the required number of trainsets. As before, I used the JKOY Class Sm5 trainset from Helsinki, which seats at least 232, and which is part of a family of EMUs (the Stadler FLIRT) that is available for sale in the US.
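The Monte Carlo step boils down to "run it a lot, take a quantile per train, divide by seats, round up." Here is a minimal sketch under stated assumptions: trainsets.needed and toy.model are invented names, and the Poisson toy model merely stands in for the real simulator so the example runs by itself.

```r
seats.per.trainset <- 232   # Sm5 seating figure quoted in the text

# Hypothetical sketch: simulate.loads is any function returning one load
# per train for a single simulated day.
trainsets.needed <- function(simulate.loads, runs = 250, prob = 0.9,
                             seats = seats.per.trainset) {
    # Build a trains x runs matrix, one column per simulated day.
    loads <- do.call(cbind, lapply(seq_len(runs), function(i) simulate.loads()))
    # 90th-percentile load for each train across all runs.
    q <- apply(loads, 1, quantile, probs = prob, names = FALSE)
    ceiling(q / seats)
}

set.seed(42)
toy.model <- function() rpois(3, lambda = c(200, 450, 700))  # stand-in only
print(trainsets.needed(toy.model))
```

Making `prob` an argument is a small departure from the description above, where the 0.9 is not parameterized.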
Results of running the updated model
Having made the model at least look more sound, on Friday night I reran the simple schedules I first simulated last weekend, starting with the simplest, four-trains-per-hour all-local service from Worcester. I got a bit of a nasty surprise: although my initial runs on the old model suggested that you could run 4 tph service with nothing longer than a three-trainset consist, the new model predicts one train (the one that would arrive at 8:45) to require four trainsets, and even three-trainset consists are a problem, for reasons I’ll explain shortly. But four-trainset consists are just impossible on the Framingham/Worcester Line, because the MBTA standard platform length is 800 feet and four trainsets are just under 1000 feet long. That said, the 90% loading is only 45 passengers over the 696-seat capacity of a triple, so maybe the cost savings (if there are any) might justify tolerating more crush loads on that train. It’s possible that such a situation would naturally sort itself out, if enough passengers chose to switch to an earlier or later train, but I’m uncomfortable starting out (without accounting for increases in ridership since 2012) on the basis of a predictably over-capacity train. So I went on to look at other service patterns, first the uniform 5-tph and 6-tph ones that I had examined before, and then some other more complicated service patterns once I had implemented the ability to do that (and fixed the bugs in the model that doing so exposed).
Disappointingly, both the 5-tph (12-minute headways) and the 6-tph (10-minute headways) all-local services require three-trainset consists. The 5-tph service has three triples, and a manual inspection of the loadings made me think it really needs five triples, which is a lot; more, in fact, than the 4-tph service. The problem with these three-trainset consists is that there isn’t room to store them at Worcester overnight, where they’re actually needed to provide the service, so you end up wasting a lot of equipment-hours on what are essentially deadheads (there’s just not that much demand for pre-rush-hour seats to Worcester), and once they get in to Boston you either have to send them back out or you have to store them somewhere. Also, one trainset in a triple can’t platform at Yawkey, because trainsets are 250 feet long and the platform there is only 650 feet, and rush hour is specifically a time when all-door deboarding is necessary to maintain short dwell times. The 6-tph service requires only one triple, which is still more than my first whack at it last weekend suggested. There is just a huge amount of travel demand to get into Back Bay or South Station before 9 AM.
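The platform arithmetic is easy to check mechanically. The sketch below uses only the figures quoted above (232 seats and a rounded 250 feet per trainset, 800-foot standard platforms, 650 feet at Yawkey); the function name is invented for the example.

```r
trainset.length    <- 250   # feet, rounded figure from the text
seats.per.trainset <- 232

# Hypothetical helper: seats and platform fit for consists of various sizes.
consist.check <- function(n.trainsets, platform.length) {
    consist.length <- n.trainsets * trainset.length
    data.frame(trainsets = n.trainsets,
               seats     = n.trainsets * seats.per.trainset,
               length.ft = consist.length,
               fits      = consist.length <= platform.length)
}

print(consist.check(1:4, 800))   # standard 800-foot platform
print(consist.check(1:4, 650))   # the shorter platform at Yawkey
```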
Next, I started looking at short-turn service patterns, where every other inbound train starts at Framingham rather than Worcester (and likewise every other outbound train stops there). This sort of service pattern uses less equipment, in theory, than the all-local service pattern does, but it has the disadvantage that the outer stations receive half as much service during peak periods — which then doubles the ridership on the outer trains. When I simulated the 6-tph service with this model, I found that it was even worse than the all-local service, because the Worcester trains were leaving Framingham already full, and still had to pick up more passengers on the way. (Why not run expresses, like the current service? That would even out the load, but the Regional Rail service is sufficiently faster that a train that runs express from West Natick will catch up with and get stuck behind the previous local train — and of course the inner stations then lose the benefit of the investment in equipment and infrastructure that supports 6-tph operation, because half the trains don’t stop in Wellesley or Newton. That’s likely to be a big loser on Beacon Hill. Hypothetical expresses might as well stop at Boston Landing because the train they’re stuck behind is going to anyway!)
Finally, I investigated other service patterns that would break up the rush-hour demand, and also included reduced frequencies at other times (such as midday) when there is less need for seats. The one I came up with that I like the most is what I call “rush-hour push”: it starts out with 6 trains per hour, then increases frequency to 7.5 trains per hour (eight-minute headways) right around the peak of morning rush, from 8:40 to 9:20, and then drops down to 3 tph after 10 AM. (I chose 8:40 to 9:20 intentionally since those endpoints are divisible by 10.) This knocks the predicted trainset requirement down to two at the peak (in fact, it moves the heaviest-load point earlier, to 8:10). Remarkably, this schedule requires only one more trainset than the 26 I came up with under the old model for 10-minute headways. The lower frequencies during the late morning and early afternoon can be easily sustained without a lot of extra crew expense, but of course it comes at a cost of having to store those trainsets somewhere, which is one of the most limited resources in the current commuter rail system.
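To show how a variable-headway pattern like this can be written down, here is a minimal sketch that builds the list of terminal arrival times piece by piece. Several things here are assumptions rather than facts from the worked schedule: the 6:00 AM start and noon end, the return to ten-minute headways between 9:20 and 10:00, and the exact times at which one pattern hands off to the next. The helper hm is invented for readability.

```r
# Hypothetical sketch of a "rush-hour push" arrival pattern.
hm <- function(h, m = 0) 60 * h + m   # clock time -> minutes after midnight

push.arrivals <- c(
    seq(hm(6),      hm(8, 30), by = 10),  # 6 tph before the push (ASSUMED 6:00 start)
    seq(hm(8, 40),  hm(9, 20), by = 8),   # 7.5 tph from 8:40 to 9:20
    seq(hm(9, 30),  hm(10),    by = 10),  # ASSUMED: back to 6 tph until 10:00
    seq(hm(10, 20), hm(12),    by = 20)   # 3 tph after 10 AM (ASSUMED noon end)
)

print(length(push.arrivals))        # total trains in the period
print(table(diff(push.arrivals)))   # how many gaps of each headway
```

A vector like this could be handed to whatever divides the demand among trains, in place of a uniform schedule.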
Storage
So let’s talk about train storage. There are several places currently used to store trains for the Framingham/Worcester Line (or that could plausibly be so used): Worcester has space for four consists of the current equipment (which I’ve arbitrarily called 3200 linear feet, which should be enough for 12 trainsets of modern EMUs). The rest of the space (in the current service, something like eight consists are required) comes from Amtrak’s Southampton St. yard, which is in South Bay not far from South Station, or in the MBTA’s Readville yard, which is in Hyde Park, 30 minutes away. Last weekend, I looked only at existing facilities, but as I was developing the equipment requirements for this “rush-hour push” service, I found myself thinking, “What about Framingham?” There are three railyards in Framingham: one off the Agricultural Branch west of Franklin Street, one off the Main Line south of Fountain St., and one at the end of what’s left of the old Milford Branch, where it once served GM’s Framingham Assembly plant. Surely some space could be found in one of those — Fountain Street by preference, because it’s right on the Main Line and west of the Framingham station so no reversing is required. Furthermore, having early-morning revenue service from Framingham to South Station has actual transit value, unlike deadheads from Readville or Southampton St., because there are a lot of people who have reason to want to get to South Station between 5 and 6 AM. (Business travelers looking to catch an early Amtrak or a 6:30 flight out of Logan, airport employees, service workers in convenience stores and fast-food places that need to be ready for service by 7 AM, the list goes on….)
So having considered that maybe it might be possible for the MBTA to either rent or construct space at CSX’s Fountain Street yard sufficient for both overnight and midday storage, I reworked the equipment plan some more, and I ended up with a service that adds five early-morning trips from Framingham and requires only two trainsets to be stored at Southampton Street. It does require 1000 feet of space near Worcester Union Station, which may not be available or constructible, in which case those trains would have to deadhead somewhere else (perhaps Grafton or Westborough if not all the way back to Framingham), but all of the remaining midday storage ends up at Framingham rather than in Boston — and because headways are reduced during those same hours, there are plenty of gaps in the schedule for freight moves west of Framingham. Although I haven’t modeled the PM peak at all, those trainsets stored at Framingham would be in the right place to resume 6-tph service in the afternoon, since the reverse-commute demand from Worcester wouldn’t justify ten-minute headways at the time when the equipment would be needed at South Station. Hopefully, this also reduces the number of split shifts for train crews, although I haven’t attempted to model that. (Doing the equipment was hard enough, and of course when you’re running a train every ten minutes, crews can just hop on the next train at whichever station.)
An equipment plan and the fully worked schedule for the AM inbound direction can be found in the spreadsheet in my repository (PDF of the latest version). I also did an equipment plan for a version of the 4-tph service that accepts a crush load on one train inbound, which requires no “storage” at Worcester but does require some place a couple of three-trainset consists can go to clear the platform for the better part of an hour and a quarter; I don’t like it. Someone else could probably come up with a better schedule that doesn’t have that defect; the only advantage of running 4 tph is that it reduces the equipment requirement from 27 trainsets down to 24 — and that’s about $24 million so it’s not peanuts.
So what about that Agricultural Branch, eh?
If we’re going to be storing trains in Framingham at midday, maybe Fountain St. isn’t actually the best place to put them. There are actually three branch lines from Framingham that might serve potentially useful destinations: the Agricultural Branch, part of CSX’s Fitchburg Subdivision (although it doesn’t go all the way to Fitchburg any more), which runs north from the wye at Framingham station; the Milford Branch, which runs south, parallel to Route 126, towards South Ashland and Holliston; and the Framingham Secondary, which runs south through Sherborn and eventually connects to the Franklin Line. The Milford Branch is abandoned south of the yard west of ADESA (the old GM plant, now an auto-auction facility), and in Ashland and Holliston it has now been converted to a trail. The Framingham Secondary is now CSX’s primary freight route to Southeastern Massachusetts, and while the section through Foxborough has a commuter rail station, there is no great travel demand between Framingham and the other towns along the route that would justify even building a station, never mind electrification and double-tracking. So that then leaves the Ag Branch, and that one is very promising indeed. The Ag Branch runs from the Franklin Street yard in downtown Framingham north past the old town incinerator to Framingham State University (three FSU parking lots directly abut the line), and then runs north of Route 9, crosses Baiting Brook, and follows the Turnpike to Framingham Technology Park.
The office park is home to the Sheraton Framingham Hotel and Conference Center, the corporate headquarters of Bose, and a large Sanofi-Genzyme facility, among other important destinations; the corporate headquarters of Staples is not far, on the other side of Route 9, and the park is served by MWRTA buses from Natick and Marlborough. There are oceans of parking, suggesting that most employees at these companies currently drive to work. There are abandoned (and mostly lifted) sidings parallel to the line in the park, suggesting that there is enough right-of-way for two 2000-foot tail tracks (west of the California Ave. grade crossing) and an island platform station with a stub-end track for a shuttle. There are numerous low-rise buildings nearby, including Bose and Nestle Waters distribution facilities, which could be relocated if train service made office development in the park more economically attractive. Finally, the whole complex is just off the Massachusetts Turnpike, an ideal location to intercept commuters who might be coming from the towns north of Route 9 such as Southborough (the Southborough station is at the very southern end of the town and not freeway accessible), Marlborough, and Hudson. I would even suggest not building public parking at a station here: let the landowners decide for themselves whether that’s the best use of their land. Ultimately, from the perspective of this exercise, you could just build the layover facility and not bother with an actual station. But once you’ve built a station, upgraded the tracks, installed switches and signals, and extended electrification (oh by the way, there’s already a substation just south of the tracks), you might as well spring the extra $8 million for a single EMU trainset that can operate a shuttle service during non-pull-in/pull-out periods. 
The four-mile ride from downtown Framingham to Tech Park would take about eight minutes, a significant improvement on current MWRTA bus service and competitive with driving if the shuttle is timed appropriately to meet through trains at Framingham station.
So what are the issues for implementation? Obviously, the track is in very bad shape, and would need to be improved to at least 39-mph service. (The one freight train a day appears to run at 5 mph; I used to watch it from my old dentist’s office abutting the tracks at the Route 9 grade crossing.) There are remnants of dual-tracking between Mount Wayte and Maple St. that may need to be reinstated, and bridges to replace or renovate. Electrification is obviously required in order to get the EMUs from Framingham station to Tech Park, and you’ll want a side-platform station at either Maple Street (most constructible) or Salem End Road (best location) to serve Framingham State University. A new side platform will also have to be built on the Framingham yard wye, which is complicated by the fact that the wye infield is now used for station parking; probably the CSX westbound leg of the wye will need to be relocated in order to make enough room for the platform. The total cost of this project might be $25-$35 million. Worth it? I’m not sure. Fountain St. would definitely be cheaper, even with takings and construction costs. But it’s worth studying — and it might be worth it to the employers that would be served, if it helps them attract talent.
Thanks for putting up with my transportation rants. We’ll be back to your normally scheduled baking some day soon.