Other people’s recipes: Torrone morbido

This gallery contains 13 photos.

Yes! It’s back! Doing my part to move more of my quality content off of Twitter and onto a platform I control (well, pay for, anyway). Torrone is an Italian confection, popular for the winter holidays, which is widely available …


Question 1 passed, so now what?

Given the ongoing issues with Twitter, as detailed in my previous post, I’m trying to move more of my mid-length writing back onto the blog so I’m not generating as much free content for the South African emerald-mine heir. This is an experiment; we’ll see how it goes.

Massachusetts voters have approved the so-called “Fair Share Amendment”, which provides for a 4% surtax on incomes over $1 million. Before this vote, the Massachusetts constitution required a flat income tax, without regard to the marginal utility of money. The legislature had previously put amendments before voters permitting (but not requiring) a progressive income tax, including at least once since I moved here in 1994, but the previous attempts (made at a time when Barbara Anderson and the Reaganite “tax revolt” were still alive) went down to defeat. This was actually the second try at the “millionaires’ tax”; the first one was struck from the ballot by the Supreme Judicial Court, which ruled that it was improperly drafted and violated the constitution’s prohibition on combining multiple subjects in a single referendum. The proponents went back and were able to get a revised amendment through the legislature and onto the ballot, and that’s what we approved on November 8.

The constitutional infirmity with the original amendment (which should have been on the ballot in 2018) was that it directed how the new revenue was supposed to be spent. The new text leaves it up to the legislature to spend the money, although it is supposed to be spent exclusively on transportation and education. The state will begin collecting this money starting in January 2023, so the legislature could appropriate as much as half of the revenue in the current fiscal year.

The Tufts Center for State Policy Analysis estimates that the state will collect between $1.3 and $2 billion in calendar year 2023 (although this will be split across two state fiscal years). As soon as the legislature meets, they will start working on the FY24 state budget, but the new governor, once she’s inaugurated, can submit a supplemental state budget for FY23 that would appropriate the new revenue.

Note that the very open-ended way the amendment was worded does not require that the revenue actually be used for new spending; indeed, this was one of the criticisms of the amendment raised by opponents. The legislature is perfectly free to simply alter which revenue accounts existing transportation and education appropriations come from, and then reallocate the original funds to some other purpose (including tax cuts if they so desire; our state legislature is still full of conservative anti-tax types, and not just those with an (R) after their names).

I don’t really have many strong opinions about how education is funded in Massachusetts, other than that property taxes are obviously inequitable and the state really ought to equalize funding per pupil. (Again, the amendment’s text is quite general, and the legislature could spend all of the new revenue on subsidies for UMass and nothing on transportation or local schools if they wanted to.) I’m instead going to concentrate on transportation.

Currently, the MBTA is funded by several sources: aside from fares and other “own source” revenue, the original funding mechanism was the local assessment paid by all cities and towns in the MBTA district. (Fall River voted this election to join the MBTA district, a prerequisite for the start of rail service next year.) The 80s “tax revolt” put paid to the local assessment as a major source of revenue, although the T does bond against that revenue. (The municipalities don’t directly pay the assessment: it’s subtracted from transfer payments they would otherwise receive as state aid.) As a result, the primary source of the MBTA’s non-operating revenue is direct grants from the state, which come in two forms: “contract assistance”, which is an annual appropriation by the legislature, and one cent of the state sales tax, which flows automatically to the T and is subject to a guaranteed minimum level, allowing the authority to effectively bond against it. In FY23, the contract assistance was $187 million (of which $60 million is bond authorization dedicated to capital programs), and the state sales tax is projected to be around $1.2 billion. (The sales tax being much better than projections for the last three years has been a significant positive for the MBTA’s budget; many other transit agencies around the country were not so lucky and received substantial subsidy cutbacks because their revenues were not guaranteed.) The state’s numerous Regional Transit Agencies get no sales-tax money and depend entirely on fares and annual state appropriations (over which there is considerable fighting in the legislature every budget cycle), which currently run around $90 million.

So now that there is this new revenue source, how much of it should go to transportation vis-à-vis education, and how should that be structured? I would argue that the first priority should be to create a structure similar to the MBTA’s sales tax dedication: pick a fraction of the surtax revenue and automatically transfer it without further appropriation to specific transportation agencies, with a guaranteed minimum that allows for multi-year planning across budget cycles and administrations. Specifically, I would propose the following allocations:

  • 25% or $250 million to the MBTA
  • 10% or $100 million to the RTAs, using a formula that rewards ridership
  • 15% or $150 million to a new east-west passenger rail authority

In the longer term, this would substitute for the existing (lower) appropriations; in FY23, a special appropriations bill could provide “top-up” funds from the first half-year of revenues. As a condition, the MBTA and the RTAs should be required to implement means-tested fares, using some program that the state already administers for eligibility, and EOHHS should be required to administer it for all of the agencies on a cost-recovery basis. The MBTA estimates (as of October) that implementing a means-tested fare program would cost between $46 and $58 million annually, which is easily covered by the increased subsidy. (At the low end of the Tufts CSPA projection, the MBTA would get $325 million, or $138 million above its current state assistance.)

What else should the MBTA be expected to do with the money? Obviously, close the structural budget deficit first and foremost, including all the new safety hires and more realistic salaries for rail and bus operators and maintenance personnel than were included in the recent 5-year pro-forma. On the capital side, the legislature should finally require electrification of the regional rail network by a date certain; I propose 2035 as a reasonable compromise between rolling-stock lifetimes and when I’m expecting to retire. I would also like to see the MBTA reduce its pass multiplier, reducing the costs of fare collection and inspection by making monthly or even annual passes the default for more riders. (Note that the MBTA currently projects a $208m budget deficit in FY24, so even the entire $138m wouldn’t be enough to solve it, but the high end of Tufts CSPA projections would. It’s possible that a fare increase will be necessary, which could be paired with the means-tested fare program to reduce the impact on lower-income riders.)
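The arithmetic behind these figures is easy to check with a short sketch. It assumes, as the allocation list suggests but does not spell out, that each agency would receive the larger of its percentage share or its dollar floor. The function and variable names are purely illustrative; the dollar inputs are the ones quoted above (the Tufts CSPA range, the FY23 contract assistance, and the projected FY24 deficit).

```python
# All amounts are in millions of dollars.
# Each entry is (percentage of surtax revenue, guaranteed minimum).
SHARES = {
    "MBTA": (25, 250),
    "RTAs": (10, 100),
    "East-west rail": (15, 150),
}

MBTA_CONTRACT_ASSISTANCE_FY23 = 187  # existing annual appropriation
MBTA_PROJECTED_DEFICIT_FY24 = 208    # MBTA's own projection


def allocation(surtax_revenue, percent, floor):
    """Assumed rule: the percentage share, but never less than the floor."""
    return max(percent * surtax_revenue / 100, floor)


def mbta_net_gain(surtax_revenue):
    """New MBTA money once the dedication replaces contract assistance."""
    percent, floor = SHARES["MBTA"]
    return allocation(surtax_revenue, percent, floor) - MBTA_CONTRACT_ASSISTANCE_FY23


for label, revenue in (("low", 1300), ("high", 2000)):
    gain = mbta_net_gain(revenue)
    covered = gain >= MBTA_PROJECTED_DEFICIT_FY24
    print(f"{label} estimate: MBTA nets ${gain:.0f}m; covers FY24 deficit: {covered}")
```

At the low estimate the net gain is $138 million, short of the $208 million deficit; at the high estimate it is $313 million.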

I know the Healey administration’s transition team has put transportation in some good hands (there are few people I’d trust more than Monica Tibbits-Nutt after watching her for four years on the MBTA’s former Fiscal and Management Control Board), but the state legislature is chock full of suburbanites with windshield brain and actually getting this program passed will require some lobbying, even if it does free up nearly $300 million for them to spend on their own pet projects.

Posted in Law & Society, Transportation

The Twitter That Was

Attention conservation notice: 6,000 words about the decline of a social-media platform, none of which are particularly original or well-informed.

Since Elon Musk took control of Twitter at the end of October, like many people I’ve had to ponder a number of questions: Why am I here? Am I still getting what I wanted out of this experience? Would I be better off just logging off? What alternative is there should Musk execute a Controlled Flight Into Terrain? These are hard questions to answer, and even waiting 24 hours to download my account archive hasn’t really made them any easier. I’m not a facile writer, and it’s taken me quite a bit of effort (and powering through a lot of distractions) to get even this much in one place, and it’s probably going to be pretty disjointed. (Generating words has never been a problem for me; it’s getting them in the right order that so often eludes me.)

According to my Twitter data download, I joined Twitter on March 24, 2012. I had entirely refused to be involved with any “social media” until that time; I thought it somewhere between harmful and pointless (perhaps both). But it was a time when I was losing a number of work colleagues I liked, and Twitter had made a few design choices that at least made me willing to consider it (unlike Facebook).

Structurally, Twitter’s model of “following” was much more to my taste than the faux “friend”ship of Facebook and its work-alikes (now largely forgotten). You could “follow” someone on Twitter and see what they had to say, with no expectation of reciprocity. A “favorite” in the original Twitter was more like a bookmark; it wasn’t a shadow-retweet like “likes” are today. And even things we take for granted on Twitter now weren’t part of the data model in 2012: it was all just plain text. Retweets were just posts that started with “RT” and a handle — they were often edited to add comments or just to fit into 140 characters.

That, in combination with an open API, made it possible for there to be multiple, even open-source, user interfaces to Twitter, something that wasn’t possible with Facebook. That allowed me to exclusively use a terminal-mode Twitter client for some time, keeping their tracking cookies out of my browser and avoiding the excessive use of addictive smartphone apps. This didn’t last long: within a few years, Twitter started making life difficult for third-party user interfaces and restricted full use of the API to only “official” clients. However, I still keep the terminal-mode client running 24×7 under script, so I have a running log (as rendered text, rather than JSON objects) of my entire timeline. (With the API restrictions, it can no longer show inbound notifications and frequently hits rate limits.)

Frustratingly, although the account archive does include all of my tweets and retweets, and all of my “likes” (even from back when they were still “favorites”), the following, follower, block, and mute lists are all uninformative: they contain no dates, are not sorted in any obvious order, and do not identify the user being referred to. I had hoped, when I requested the archive, to be able to look back at my follows in chronological order and that just isn’t data they seem to keep. (I find this particularly surprising given the amount of advertising-related data they do keep.) The archive does tell me that I have blocked 13,178 accounts (!), and muted 553, but doesn’t give me a count of either followers or following. (grep works, though: 1194 and 1008.) So the archive seems on the one hand excessive for normal uses and on the other annoyingly incomplete.
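For anyone who wants the same counts without grep, here is a minimal sketch. It assumes that each list in the archive is a file consisting of a JavaScript assignment whose right-hand side is a plain JSON array (something like `window.YTD.follower.part0 = [...]`). That layout, and file names such as `data/follower.js` and `data/following.js`, are assumptions rather than anything documented here, and the helper names are made up for illustration.

```python
import json
from pathlib import Path


def parse_archive_js(text):
    """Return the JSON array assigned in an archive .js file.

    Assumes the file looks like 'window.YTD.<name>.part0 = [ ... ]',
    so everything before the first '[' is the JavaScript prefix.
    """
    start = text.index("[")
    body = text[start:].rstrip().rstrip(";")  # tolerate a trailing semicolon
    return json.loads(body)


def count_entries(path):
    """Count the records in one archive list file."""
    return len(parse_archive_js(Path(path).read_text(encoding="utf-8")))
```

Calling `count_entries("data/follower.js")` and `count_entries("data/following.js")` from the unpacked archive directory should then give the two totals, if the layout assumption holds.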

The first thing I ever tweeted was a link to a blog post by Mike Konczal (@rortybomb), just a month before he closed up shop at his old free-tier WordPress.com blog. While I was never particularly tuned in to the “blogosphere”, as the big (largely political) bloggers called their community, it’s clear that I was actually reading quite a lot of them, using Opera’s (RIP) built-in RSS reader. (I kept using Opera long past its sell-by date just to have that reader in my home browser.) Many of my first month’s tweets were links to, or inspired by, blogs I was reading at that time: Language Log, Three-Toed Sloth (Cosma Shalizi), Antick Musings (Andrew Wheeler, the former SFBC editor), SCOTUSblog, Jack Balkin’s “Balkinization”, and the now-deleted airline pilot blog Flight Level 390.

My parents were still living in Massachusetts at the time, so I also tweeted a bit about my father’s dog Mocha, and our Sunday dinners, which I really miss since they moved (five times since 2012). Mocha was a two-year-old rescue mutt when my father got him, and in 2012 he was hit by a car; thankfully my parents could afford the veterinary care to save him. He’s still alive today but quite old for a dog and spends most of his time sleeping, except when barking at the neighbors’ golf carts. That spring, my cousin the airline pilot got married, and I tweeted about my trip to San Francisco and the North Bay (her Scottish husband worked in tech and they lived in SF at the time). Around the same time, I was apparently reading Alon Levy, although I didn’t leave any trace of how or why, because it was another five years before I got involved with “Transit Twitter” (after a revelatory trip to Helsinki). That summer, I drove my parents up to Carol Noonan’s Stone Mountain Arts Center in Brownfield, Maine, to see a Knots & Crosses reunion show, and tweeted about it. In the early days, I often went days without tweeting anything; of course there were no Threads back then.

As I mentioned, the impetus for signing up with Twitter in the first place was a number of my colleagues all leaving at the same time. Because of a bad decision I made in 2001, I am stuck living out in the suburbs and as a consequence have no social life; I had hoped to be able to keep up with what my former coworkers were doing through Twitter. That very quickly failed to work out: most of those colleagues used Twitter seldom if at all, and even by setting up the phone app to notify me whenever they tweeted something, genuine interaction was infrequent and overwhelmed by the “firehose” of news and politics that most regular users end up with. Journalists were the first large-scale adopters of Twitter as a medium — both to advertise their work and also to cultivate sources — and the new-user suggestions at the time were very heavily weighted towards high-volume news, “entertainment”, and tech-industry accounts.

I did make an effort to follow a number of my then-current colleagues, but very few of them were or are active even daily; like my ex-coworkers, they were largely not using Twitter for interaction. Many of them were students, and at least my corner of Comp Sci Twitter largely uses it as one vehicle among many to promote their research (and for faculty, their students). There’s nothing wrong with this, but I really appreciate those people who actually use Twitter for something other than professional advancement — whether it’s Rod Brooks flaming autonomous-vehicle boosters, Mark Handley and Dave Andersen posting COVID stats for places I don’t live, or Hari Balakrishnan flaming about The Cricket. Other people I wish would tweet more, if for no other reason than to reassure me that things are still going OK for them.

Coming into Twitter in 2012, I already had a substantial variety of interests, as you can deduce from some of the blogs I mentioned above: I was nearing the end of my particular interest in broadcasting facilities, but SF, language, science, and international sports had been abiding interests of mine since the 1980s, and I gravitated towards many of those communities on Twitter. I ended up following far more economists than I would ever have expected, as well as more wildlife biologists and even more fantasy authors. (With I think one exception, the authors I follow are not the ones I read much if at all: I have a thing about knowing too much about authors.)

I created a WordPress blog in 2013, and immediately started using my own Twitter account, which probably had a hundred followers, to promote it. It wasn’t my first blog, but I had given up on trying to maintain blog software locally: there were too many diseconomies of scale compared to simply paying WordPress.com $100 a year to deal with PHP security holes and spam. (In fact, my very first blog post was about that precise decision.) Many of the things I posted in the early years on the blog would today probably be Twitter threads, but that was before Twitter actually implemented threads in their data model, let alone the mobile clients, and the absence of threads made for a much clearer distinction between “where researched, long-form writing goes” and “where offhand, slice-of-life observations go”.

One of the first things I started blogging was recipe walk-throughs, originally as photo essays, and in conjunction with this I started following a bunch of cookbook authors, so that I’d see when they had something interesting coming out (and also to make it easier to tag them when I wanted to ask questions about one of their recipes). I followed a bunch of musicians early on, too, especially the ones whose new projects I probably wouldn’t find out about through traditional radio. I quickly realized that most of these accounts were run by publicists and not actually connected to the artists, which makes them read quite oddly — but there were once some substantial exceptions. I think Rosanne Cash may be the only one left; most of the others have simply left the platform. (In 2013 I was still regularly commenting on what my music player was playing when the mood struck, something I almost never do now.)

Looking back at my very earliest follows: of course the first people I followed were my few friends and those colleagues. I followed the official accounts of a bunch of radio shows I listened to, some defunct like Studio 360, and others still on the air like the incredible Ideas from CBC Radio. The first authors I followed were Diane Duane and Tom Limoncelli, and the first non-trade-publication journalist was Lyse Doucet, the BBC’s Canadian-born chief international correspondent (and a fellow descendant of Acadiens). Since I was still in the radio hobby, I followed a bunch of people who either were radio engineers or who wrote about it. The first musical artist I followed was Catie Curtis, followed very soon after by others from the folk scene like Patty Larkin, David Wilcox, Jonatha Brooke, and Lucy Kaplansky. I continued to add many other people in linguistics, economics, radio, and science; at one point I tried to follow all of the BBC announcers presenting on the World Service (many of whom have since left).

At some point, I made an effort to find and follow as many faculty and grad students (or former students) from our lab as I could — and I very quickly found that the Pareto principle applied to my academic colleagues as well as my co-workers: the vast majority of them said almost nothing, and certainly not nearly enough to break through the chatter in a busy timeline. Somehow, I think probably through #MarchMammalMadness, I ended up latching on to the wonderful world of “bio twitter”.

Then came November 8, 2016.

Like many people around the world, I recoiled in horror when the revolting Donald Trump won the presidency. I found many like-minded people on Twitter, but more importantly, it gave me a way to follow the day’s news without having to actually listen to the news, which had become intolerable as every A-block for three straight years started with a Trump sound bite or “hey look what racist garbage Trump tweeted today”. Thanks to blocks and the setting to disable auto image loading on mobile, I was able to avoid seeing much of Trump’s vomit on the platform itself, while still keeping abreast of what harm the administration was doing day to day. I didn’t bother following the big-name political reporters: I could be sure that the other reporters I followed would retweet if they said anything notable.

The Trump election also made for a meaner and less pleasant Twitter experience for a lot of people, and looking at accounts I followed before that election, I see that a fair number of them simply dropped off — either became entirely lurkers, or just stopped using the site — around 2017-19. But for me, 2017 was a year when I significantly broadened my interests.

This actually goes back to February of 2016, when I was driving to work one morning and saw a billboard advertising the World Figure Skating Championships. I had been a big fan of the sport as a kid (solely in a spectating role; I never learned to skate) but had dropped off after I moved to Boston and couldn’t watch Canadian television any more. I started paying attention to the Winter Olympics again in 2010, but it wasn’t until the Worlds came to Boston that I thought I could actually go see a major international competition in person. I was able to buy tickets for some events at the 2016 Worlds, and I bought the program book, which included an advertisement for the 2017 Worlds in Helsinki.

I’m not going to recapitulate my history as an exchange student in Finland, but suffice it to say that I thought enough time had passed (28 years) that I would be comfortable traveling there again and not have to feel embarrassed at my lack of facility with the language. Having not visited any European country in this millennium, I was bowled over, especially by the transportation infrastructure, and made a series of blog posts about it after I got back — in addition to taking thousands of pictures of the skating and running out of space on my laptop’s tiny SSD. My experience made me feel comfortable enough going back to Finland that August for the fabled World Science Fiction Convention: something that I had heard about, but assumed was only for really hard-core fans and not for someone as ill-read as me. (It was not a coincidence that both events were in Helsinki that year: it was the centenary of Finnish independence and substantial grants were available to bring events of international importance to the country.)

My blogging about both trips to Finland got me into Transit Twitter, which has a pretty big overlap with Urban Planning Twitter and Housing Twitter. I also started to get into Information Security Twitter, although I’m not sure what the impetus was — it may well have been something that a FreeBSD developer retweeted. I started following a lot of athletes, I think in the lead-up to the 2018 Olympics, although it’s hard to be sure. (I followed a bunch of SF-related accounts around the same time, so Worldcon 75 seems likely.)

I started getting more politically engaged in 2018 thanks to Maciej Ceglowski (@Pinboard), who ran independent fundraising campaigns in both 2018 and 2020 to try to get Democrats elected in “winnable” seats that the national apparatchiki had deemed not worth pursuing — but the only real success was Jared Golden in ME-02.

Ah, yes, 2020. The year of the pandemic, when every journalist (and economist) was doing double duty as an epidemiologist. I found a very small number of people who actually were virus experts, like Trevor Bedford and Emma Hodcroft, but mostly I was relieved to be able to watch the news while I was stuck at home, alone, because the virus had pushed Trump out of the A-block. I started watching BBC World News over lunch, since I couldn’t go into the office even if I had a need or desire to do so, and probably blocked more dishonest bloviators than any year before or since. I finally gave up on The Atlantic, which had succumbed to terminal Washington brain after its unfortunate relocation from Boston some years previously. (Sorry, Ed Yong!) Of course, the pandemic canceled all of my 2020 travel plans and much of 2021’s as well, starting with the World Figure Skating Championships that were supposed to start in Montreal the week the travel restrictions started. (The 2020 Worlds were rescheduled to 2024.) Similarly, the 2021 Bobsled & Skeleton World Championships were to have been in Lake Placid (at a newly renovated facility!) and were also postponed. The practical upshot of all this was that I had a lot of time stuck at home, alone, unable to travel and with nothing much else to do for leisure except constantly scroll Twitter.

It was at this point that I realized I was in real danger of an overuse injury if I didn’t start to limit my “phone time”. I used the “Digital well-being and parental controls” feature on my phone to limit my use of the Twitter app to just two hours a day. I later tried to crank this down to one hour, but found myself constantly bypassing the restriction because I wanted to tweet something before I forgot what it was. I also started using Web Twitter much more — I had previously only used it to do things that are difficult on the app, like posting long threads summarizing MBTA board meetings back when those were in-person. Since Web Twitter can be scrolled with a mouse or the keyboard, it’s less stressful for the forearms than holding a phone up in front of my face and tapping on the touchscreen.

There aren’t that many pleasant surprises in my experience on Twitter. The communities I found myself connected with were the biggest, of course, along with the willingness of some Big Names (but not New York Times reporters) to actually engage with their readers. Beyond that would be the wide variety of friendly and helpful bots. Thanks to some really really bad reporting in 2016, “bot” somehow got attached to the idea of an underpaid Moldovan kid posting election disinformation under a thousand different identities, but there are actually a lot of honest bots that automatically post legitimately interesting content. Among the best was @_everybird_, which has now sadly retired, but still going (for now) are Joe Sondow’s @EmojiTetra; @pomological, which posts early-20th-century fruit pictures; the various “every lot”, “every tract”, and “every USPS” bots; @sansculottides, which tweets the current date in the French Republican calendar; @tinycarebot, which reminds you to take care of yourself; and especially @hourlykitten, which at the top of every hour posts a freely-licensed photo of a kitten from Flickr.

On the negative side, there were a few surprises. I’ve already touched on how much time I ended up spending on the app, which came to be rather concerning, and has certainly taken away from the time I could be spending doing anything productive (from cycling to reading to writing). It’s definitely made it much more difficult to get out of bed in the morning: way easier to just grab the phone and scroll for an hour than to actually brush my teeth, let alone get my bike kit on and go for a chilly morning ride. I think it’s also made my vision worse, although it’s hard to prove that it’s anything other than natural presbyopia setting in as I approach 50, combined with a really crappy fit on my most recent pair of glasses.

Another surprise was the number of “bad bots” that exist solely to steal other people’s content — often scraped from Reddit, complete with erroneous captions, but with the original creator’s identity filed off. There’s a whole ecosystem of these, such that accounts have popped up to warn people about them, provide accurate descriptions of the scraped artwork, and maintain records of how frequently these accounts get banned and then recreated under a slightly different name. (@PicPedant is one re-identify-er that I follow; there is also @HoaxEye.)

A different genre of “bot” is structured as a “honey trap”: an account with a random female-coded name tweets out three stolen pictures of an attractive woman, then follows a thousand or more randomly selected users in the hope they will follow back. The account then goes silent for months before launching a spam campaign (which eventually gets them banned). Similarly annoying are the “Kibo” bots: these search constantly for any mention of keywords they are programmed to trigger on, and then interact with those tweets in some unhelpful way, like retweeting to an unexpected audience or causing another account to reply with spam or abuse. (I have named these after James “Kibo” Parry, who in the days of yore would search the Usenet feed on world.std.com for any mention of his name, and join the conversation. He was less annoying.)

This sort of behavior is something that Twitter really ought to have had the capacity to detect and block, but never seemed to manage. Some of it seems to have accelerated since Musk’s takeover, as if the bad actors are testing the limits of the (now greatly diminished) trust & safety team.

A surprise that’s hard to characterize as either good or bad is the level of data about advertising that’s included in a Twitter account archive. There’s a file, ad-impressions.js, that lists every ad Twitter has ever presented to your account, along with the exact targeting information specified by the advertiser — even including the advertiser’s names for the prospect lists they uploaded for the campaign.

[It was at this point, a week ago, that I had to stop writing, and when I picked it back up I was not sure what direction I wanted to take.]

As I said, it was my original hope when I signed up for a Twitter account that I would be able to use it to keep up with former colleagues and coworkers who had moved on to other jobs and institutions. That largely didn’t happen; while there are a few of these people who are (or were) regularly active on Twitter, their use of the platform has been quite different from mine: with maybe one or two exceptions I can think of, largely lurker-ish and to a much greater degree, “professional”. There are a few people I’ve enabled mobile notifications for, so I actually see every time they tweet (even when they immediately delete the tweet afterward), and I think it’s fair to conclude that they’re much quieter and share far, far less of their personal lives than I do — and I don’t have much of a personal life to share. I’d feel better if I saw more of these people complaining about the MBTA, or bad business travel experiences, or even posting cute cat pictures, but they don’t.

As you might conclude from that, Twitter certainly hasn’t done anything for the loneliness, either. I have met a grand total of one person thanks to Twitter, which is far fewer even than Usenet; everyone else I follow who I have met is someone I had some prior connection with. Twitter isn’t a substitute for companionship, however much I might like it to be, and I’m still stuck off here in my own isolated corner of the world. (It certainly does not help at all that the only people I ever do meet these days are in their mid-20s. That may not have been so much of a problem when I moved out to the suburbs at the age of 28, totally oblivious to the social isolation, but it is a very big issue now.)

Some of the communities that have given me value from Twitter have been moving to Mastodon. There are some real issues with this platform that lead me to think it’s not going to be an effective long-term solution to the collapse of Twitter under the overbearing weight of its new billionaire owner. I’m going to use “Mastodon” as a shorthand, because it’s by far the most popular, but technically Mastodon is a specific implementation (software package) of an open protocol called “ActivityPub”, and the whole intercommunicating network of servers using this protocol is referred to as “the Fediverse” (because it’s “federated”, or decentralized and operated by many cooperating but independent server owners).

Twitter itself has never been more than barely profitable. The company has been able to raise capital to keep going in the hope that something will come along that makes it more profitable, but at least that has come largely in the form of equity rather than the huge debt that Musk’s leveraged buyout has saddled the company with (and which is likely to be its, and Musk’s, downfall). Mastodon, however, has an enormous missing money problem: without any meaningful way to sell contextual advertising, Mastodon server operators have few options to raise revenue — donations, subscription fees, or just volunteerism are three common options today. Some very large servers may be able to sell sufficient advertising to support their operations, but as Twitter demonstrates, even with a very large audience, the costs grow faster than the revenue does. Many people have laughed at Musk’s insinuation that he can fund Twitter through subscription fees after scaring away all the blue-chip advertisers, and with good reason.

You might ask why this matters: if people are willing to volunteer their time or their money to run a Mastodon server, who’s to say that there’s any money “missing”? I think there are a number of reasons why relying on decentralized, volunteer labor and donated resources is a problem for large-scale adoption of Mastodon.

The first and most significant is that moderation is a difficult, time-consuming task. There are communities where volunteer moderation works, but they have a few features in common: the number of participants is small, the participants share a common purpose, and there is usually a formal body or person that is empowered to make the final decisions if the moderators get it wrong. Any community even a tenth the size of Twitter, if it is to be effectively moderated, is going to require a significant amount of full-time community management, and someone has to pay for that somehow.

Mastodon’s design makes this much harder, because moderation decisions have to be made by every server operator with respect not only to their own users but with respect to every other server, from the smallest single-user server (which can be spun up by anyone at any time) to big servers with thousands of users. Because moderation decisions are made by individual server operators, policies are guaranteed to be inconsistent. Because federation decisions are also made at the server level, users who find a server with a satisfactory moderation policy may find their ability to communicate with users on other servers substantially limited — indeed, users on any server may find themselves “islanded” based on something they have no control over (the behavior of other users), or may be unable to find a server that federates with all of the servers for the users they want to communicate with.

Black astrophysicist and bestselling author Chanda Prescod-Weinstein posted a thread on why this is a problem, especially for minoritized communities. The kicker.

At the level of an individual server, Mastodon’s design is not scalable. That’s probably not inherent to the ActivityPub protocol, but actually available implementations require significant investment (either engineering effort to rework the design, or simply throwing compute resources at the existing code until it works or completely breaks). Mastodon has been around, under the radar, for several years without attracting a significant user base; the Twitter exodus (exoMusk?) has highlighted the difference between a design for a few thousand users with low fan-out and the design required to serve a few million users with high fan-out. In a reply to Dave Guarino, Dan Hon writes about how the low-cost tiers of his hosted Mastodon server were inadequate, requiring him to upgrade to $19 a month — for one user with a few thousand followers. That does not bode well for a government or a news organization that needs to host hundreds of official accounts broadcasting information to potentially millions of followers on thousands of federated servers.

Twitter is able to handle this sort of load (now, unlike ten years ago) because it has both made significant capital investments (building distributed data centers, completely rewriting its core code base including both the user interface and the message routing system underneath it) and has (or had, until Musk fired them all) a significant operations staff who were knowledgeable about the implementation and could respond to issues before they caused major outages. (When I signed up in 2012, the “fail whale” was still a thing, although on its way out — that was Twitter’s custom version of the “500 Server Error” response that every Ruby on Rails app generates if the application code raises an unhandled exception or fails to start in a reasonable time.)

Twitter has been able to make significant architectural changes — like making tweets longer, threadable, and deletable — because of its centralized but globally distributed infrastructure. Mastodon and the Fediverse could conceivably evolve in similar ways, making the software more scalable, but it’s a substantial lift without multiple large engineering teams. The likely best case for this involves most Twitter refugees landing on one or a small number of Mastodon servers, which gets us back to the issue of the missing money. If a million ex-Twitter users land on masto.social (or pick your favorite other instance), are they going to bring in sufficient revenue (either direct payments or advertising) to even be able to pay for server operations, let alone engineering effort to scale their server to that level? Many Mastodon servers and hosting providers being closed to new users/customers at the moment suggests that they’re having trouble doing that now, with only the cognoscenti trying to make the jump away from Twitter.

It is part of the nature of the Fediverse that there is no central list of servers or directory of users: servers can and do come and go at arbitrary times, and the only notification to anyone is that a user on that server starts subscribing to the feed of a user on a different server. As a result, there is no way to directly search for anything or anyone globally: you can search the users on the server you’re using, and their contacts on other servers, but that’s it unless you know the correct remote server to search. There’s no global identity; at best, we could end up with Big Name users on an instance that their employer, or talent agency, or publicist runs. The first of those options at least works the same as email, and would allow employers to enforce a stronger separation of “work” and “play” than they currently are able to do with centralized identity on Twitter. (Sometimes the first public notice of a journalist’s new employer has come when their Twitter handle suddenly changed!)

This creates a Sybil problem, however: not only can anyone create an account impersonating a famous person or brand (what Twitter’s @verified program was created to handle, in response to a lawsuit and subsequent consent decree with the Federal Trade Commission), but for almost no money (the cost of a domain registration and hosting) anyone can create a whole Mastodon instance to impersonate a brand — and every other individual Mastodon server operator will have to decide whether to federate with it or not. Different server operators will have different policies and will undoubtedly be subject to different legal regimes regarding parody, defamation, and trademark laws. (Almost all of The Discourse about the collapse of Twitter has completely ignored the millions of users in other countries where monetization is difficult — these are effectively subsidized by Twitter revenue in the US but in a federated system may be left out in the cold.) Supposing Merck & Co., the giant pharmaceutical company, wants to start its own instance; will Mastodon instances in Europe be required to refuse federation with it on the grounds that Merck KGaA is “the real Merck” in Europe?

Nicholas Weaver points out the underlying issue with identity. Twitter got along perfectly well with lots of pseudonymous users; indeed, many Twitter users follow a good number of them. “James Medlock” is a real person but that’s not their actual name; they’re not impersonating anyone so there’s no reason that it should matter what the government calls them. But for public officials, institutions, media outlets, firms, and brands, it matters greatly that the public is not confused about who speaks for them. For any entity trying to do customer support over social media (which if they’re smart, they stopped doing last week), it is absolutely essential that customers be able to tell legitimate support accounts from scammers. Scams have real-world effects: Someone spent $8 this past week to create a fake “verified” Twitter account claiming to be Eli Lilly (using Musk’s new “pay for Twitter Blue and get this free hat blue badge” policy) for a hoax post about insulin prices that tanked the company’s stock. (As someone who has received more than his share of pharma ads on Twitter, I think this must have sent shock waves through the industry; it’s not just Musk setting his own money on fire here.)

So it’s clear that under Musk, Twitter doesn’t necessarily have any better answer to the impersonation problem than the Fediverse does — but because most Twitter users have a reliable pre-Musk history of accounts they follow and are followed by, new hoaxes are at least plausibly discernible. (Historically, the way this sort of hoax was run was to compromise an existing verified user’s account and change the display name, because in the old system, verification was tied to the account’s @-handle and not to its “name”, allowing once-verified users to change their names without losing the badge.) That account history also means that the discovery problem is temporarily papered over by the continued existence of Twitter itself: people leaving Twitter are using Twitter-based apps to find the Mastodon handles of their (presumed reliable) Twitter contacts, and so they’re not currently falling victim to the Sybils — but it will be a much larger problem if Twitter completely collapses and there’s no other widely agreed source of online identity.

People can put their Fediverse identities on their web sites, for sure, but the @foobar@baz.quux format is not very friendly for many of the other channels through which people distribute contact information — spoken on the radio, written on a sign or a note card, in six-inch type on a transit bus — advertising really needs a flat, easy to type “AOL Keyword”, not a structured hierarchical naming system. (And yes, that’s something of a surprise coming from me of all people; my thinking has evolved on that subject, as I’ve learned to consider social and user-interface concerns and not just the “purely technical” aspects of a design. I don’t think Musk has.)

OK, so you’re 6,000 words into this essay and probably wondering when I’m going to get around to what I, personally, am going to do about this.

Attention is a finite resource. Even in my ADHD-addled brain, attention is limited (and I guess I have ADHD Twitter to thank for the understanding that I display classic symptoms of adult ADHD, because no medical professional ever has). I already spend too much time scrolling Twitter; I do not use any other social network (most of which are even more evil than Twitter — that’s how I made that choice) and cannot have additional apps demanding even more of my time and attention. Nor do I particularly desire to bridge between two networks — although before Twitter closed their API you could have built an application that seamlessly integrated Twitter and ActivityPub and RSS and Jabber and Slack and all manner of other text-oriented publish/subscribe mechanisms.

For the moment, that means sticking with the devil I know, Twitter. There may come a time — probably long after it’s clear to the rest of my network — that the social center of the word people has decisively moved to another platform, and assuming it’s not run by Facebook Meta I’ll probably switch over completely to that, whatever it may turn out to be. But my follow network extends in a lot of different directions, and it seems possible that not everyone will end up in the same place, and then I’ll have to choose who to abandon and which connections to maintain. I don’t relish the prospect. Quoting @inthefade:

today’s timeline feels like we’re all standing around a hospital bed waiting for grandma to die. unfortunately, grandma held the family together and when she finally dies, we scatter like leaves in the wind

(link) (hat tip: Doug Newman)

Posted in Administrivia, Broadcasting & Media, Computing, States of mind | Comments Off on The Twitter That Was

More comments on the MBTA’s capital plan

Since my last post, the state legislature has gotten down to work in earnest on the FY23 budget, but unfortunately I have not had time to do a dive into the Senate version of the budget before the logrolling started due to work commitments. I did, however, have time to review the Boston Region MPO’s five-year Transportation Improvement Plan, part of a federally-mandated public process for transportation agencies that receive federal subsidies. My comments were published as part of the MPO board materials for the May 26 meeting. I also watched the MBTA board’s Audit and Finance committee meeting, at which the final CIP was previewed, and needless to say, the T gave no sign of having responded to public comment in any meaningful or constructive way. They do say that they will publish the public comments in the summer some time.

Because of those work commitments I was too late to leave a voice message for the board (which would have forced the board to ignore me in near-real time rather than not even bothering to read my comments): there is now a day-before cutoff for voicemail. There might well be a similar cutoff for email, given the early hour that chair Betsy Taylor likes to have these meetings. I sent email anyway, but since the T refuses to publish the written comments the board receives, I am publishing the text of my comment below. You’ll note (as it was intended for a voicemail) that it’s more focused on commuter rail than on the laundry list of projects I commented on in the official public engagement for the CIP.

[salutation and introductory material deleted]

Unpowered, locomotive-hauled coaches have been functionally obsolete in passenger service since the 1980s. There is no situation in which it makes sense for the MBTA to be purchasing new coaches at any time in the future: all future rolling-stock procurements MUST be modern self-propelled equipment, not unpowered coaches. If, due to management’s foot-dragging on electrification, additional diesel-powered rolling stock is required, domestic manufacturers are ready and able to supply EPA Tier IV compliant diesel multiple-unit vehicles which can operate on the Old Colony lines, including South Coast Rail, which obviates the need for any additional coach purchases beyond the procurements the Authority has already awarded.

This management team has had two and a half years to make progress on electrification. As far as this CIP indicates, they have absolutely NOTHING to show for it — and this board has certainly done nothing to hold them to your predecessors’ commitment. In that time, the MBTA has announced ONLY ONE of seven required high-platform projects on the Providence Line, none (of two) on the Fairmount Line, none (of two) on the Stoughton branch, and one (of fifteen) on the Eastern Route. Meanwhile, plans have moved forward for high platforms at seven stations on the Worcester Line (although not the most important one, Back Bay), which benefits me and my neighbors by speeding up our trains, but is a complete strategic mismatch with the Authority’s ostensible priorities for electrification.

The previous Secretary of Transportation was under the mistaken impression that high platforms were solely an accessibility issue. They are not. A standard platform height (whether low or high) is a “can we buy modern rolling stock in a competitive procurement” issue. I urge the board to direct management to adjust its priorities accordingly. If that means we get electrification on the Worcester Line by the end of this decade, I won’t complain — but the course the Authority has set in this CIP is to not electrify anything before the end of the decade, and that does not serve the interests of riders or the Commonwealth.

Posted in Transportation | Comments Off on More comments on the MBTA’s capital plan

Comments on the MBTA’s FY23-27 Capital Investment Plan

At its last board meeting in March, the MBTA released its first five-year Capital Investment Plan (CIP) since the COVID-19 pandemic sent the agency scrambling. This marks a significant change from prior CIP cycles, in which the MBTA’s and MassDOT’s projects have been combined together in a single state CIP for all transportation investments. The old CIPs, done by MassDOT on MassDOT’s schedule, made public comment pointless, since the plan had to be approved by the end of June, only a few weeks after it was published in late May, and staff responses to comments would not be published until late summer, when it was too late to have any effect.

What follows is my submission, lightly edited for formatting, to save you the effort of using the public records law to find out what I said (since the MBTA will only publish comments in aggregated form).


I will begin my comments with some general remarks. The timing, structure, and presentation of this CIP are a significant improvement over recent years, and the introduction of information about project phasing in particular is a welcome addition. On the content, however, it is still quite disappointing that the staff evidently decided to “wait the old board out” and make no progress whatever on the electrification of the commuter rail system that was directed by the FMCB. We should, had the Authority not decided to drag its feet, now be seeing significant capital programs for new vehicles, platform upgrades on the Providence and Fairmount lines, and design contracts for electrification on the Fairmount Line and the Eastern Route. None of these are anywhere to be found in this document.

In particular, the Authority was directed to proceed without delay to a pilot of electric multiple unit service on the Providence Line — a line which has six low-platform stations that all must be upgraded to high platforms in order to support modern rolling stock. In one of the FMCB’s last actions before its termination, the board also published a document describing ways to improve worker productivity and safety, which highlighted, among other issues, the problem of “traps” on the Authority’s existing, obsolete commuter-rail equipment. Nonetheless, senior management has allowed the staff to continue to treat low-level platforms as solely an accessibility issue and not a sine qua non for modernization of the service (and a necessity for purchasing modern off-the-shelf multiple-unit rolling stock).

It is still somewhat difficult to figure out exactly what some of these projects actually are. I would suggest that every project over some threshold dollar value (say, $25 million?) should have a project page on mbta.com and the CIP document should link to project pages whenever they exist so that the public may provide more informed comment.

My comments on individual projects and groups of projects follow, indexed by project ID.

P1108
erroneously categorized as commuter rail. What is the division of responsibility between the MBTA and municipalities for these street improvements? How does this differ from P1113?
P1005a, P1005b
Strongly support these bus priority projects. Center-running bus lanes with dedicated stations are among the most effective investments the MBTA can make for bus passengers.
P0940
Please clarify the locations and time scale involved. The budget seems quite low, based on recent MBTA commuter rail construction projects, so assuming this is a design-only project, there should be other projects within this CIP that would fund the construction — otherwise it’s anything but “early action”.
P0906
You told us in February that you were going to destroy the trolleybus infrastructure. I support the modernization of the North Cambridge trolleybus network as this project proposes, and the replacement of the existing fleet (and expansion of emissions-free service) using extended-range battery trolleybuses similar to those deployed in Dayton, San Francisco, and Seattle. But there’s nothing else in this CIP to suggest that you have even considered that (unsurprising given the lies told by staff at the last public meeting).
P0889
Obviously this project is in progress, but “South Station Expansion” is unnecessary and a waste of money given the significant inefficiencies of the current South Station terminal operation. Fix the operational problems (such as slow clearance of platforms, unreliable diesel locomotives, and low-speed turnouts) before pouring more concrete. (And obviously the North-South Rail Link tunnel should be built and would completely obviate any need for more surface platform capacity at South Station.)
P0869
This was approved by the board more than a year ago. What is holding up the final conveyance?
P0752
Blandin Ave. does not cross the Worcester Line. Is this project actually on the Framingham Secondary, a freight line connecting Framingham and Mansfield?
P0705c
If you’re going to destroy the trolleybus infrastructure, why do you still need a duct bank in front of Mt. Auburn Cemetery? What purpose would it serve?
P0692
One hopes that “capital improvements” includes upgrades to Sharon substation to support electrification of Providence and Fairmount commuter rail.
P0261
Strongly support reconstruction of stations on this line to reduce dwell times and improve passenger safety and accessibility. Unclear to me that restoring a third track for the full length offers significant benefits, because no operational model is specified and it’s not clear how it relates to rail transformation. (A three-track line is not especially useful for frequent, all-day bidirectional service as called for by the previous board — you would need to restore all four tracks for that; Amtrak or FRA grants should pay for it if the only benefit is to infrequent intercity trains.) Support proceeding with the design process so that some of these constraints can be fleshed out in public.
P0214
Support completing this project. The existing project page on mbta.com has not been updated since 2020 and needs to explain where the project is at and what the revised timeline is.
P0206
The “Foxboro Pilot” seems to be dead and probably not coming back, but I guess you’ve already spent most of the money….
P0164
Needs complete description.
P1101, P1010, P1011, P0920, P0921
Strong support.
P1002
Arborway is being replaced by 2027. Why build permanent buildings at the old location now?
P0952
Layover facilities should be located at the ends of the lines; rolling stock should not be stored in Boston.
P0863
Planning for maintenance and layover facilities should ensure that they are capable of handling articulated (non-separable) multiple-unit sets of at least 330 feet in length, to unconstrain choices of future rolling stock. Construction at Readville must accommodate electrification.
P0671, P0671a, P0671b
While recognizing the need for swift action to replace Quincy garage, the cost of $3.35 million per bus is unacceptable, and designs for subsequent bus facility replacements like Arborway must be constrained to a more reasonable amount.
P0671c
Do not support destruction of trolleybus infrastructure — at a time when all major builders can supply extended-range battery trolleybuses — merely to satisfy Vehicle Engineering’s unjustified desire for all buses to be identical. Safety and accessibility in the Harvard bus tunnel require left-hand doors.
P0609
Sorry, what does GLX have to do with something 20 miles away in Billerica?
P0515
Support. This is the only item in the CIP that demonstrates recognition of the requirement to electrify commuter rail service.
P1150
Where? For only $10 million, at MBTA costs that’s like one station’s worth of high platform.
P1025
Support. When I visited in 2021, the garage was both nearly empty and visibly in very poor condition. Lynn would do much better with transit-oriented development on this site to take advantage of more frequent regional rail service. In the meantime, the garage needs either to be properly maintained or to be taken down.
P1009
Where?
P0970
Attleboro station requires high-level platforms. $1.2m really ought to be enough, but on the basis of recent projects that’s low by about an order of magnitude.
P0890
Support.
P0761, P0689c
I recall numerous presentations to the previous board about bus stop amenities, and discussions of a new contract for modern shelters. Is this all that’s become of that extensive discussion? Another priority that management has just dropped the ball on?
P0395
Strongly support the completion of this project, which will relieve a significant bottleneck on the Worcester Line.
P0179, P0178, P0174
Support.
P0173
You’re just going to have to go back and put in full-length platforms, which FTA should never have allowed you to leave off this project.
P0170
Strongly support the revised scope of this project with platforms serving both tracks.
P0168, P0129, P0117
Support.
P1152, P0652
No no no no no. Unpowered coaches are generations-obsolete technology and the MBTA should not be planning to still be operating them in 2050, regardless of motive power. To the extent new diesel-hauled equipment is necessary as a result of management’s foot-dragging on electrification, the Authority should be purchasing dual-mode (diesel/electric), single-level, articulated multiple-unit vehicles with a passenger capacity between 250 and 400, with a manufacturer option to remove the diesel prime mover at mid-life overhaul. Such vehicles could be deployed immediately on the Old Colony lines (including South Coast Rail), allowing the 67 obsolete coaches (81 including the SCR order) to be moved to other lines until high-platform and electrification construction has progressed sufficiently. This is sufficient to retire all currently active BTC-1C, BTC-1A, and BTC-3 coaches.

This style and passenger capacity “right-sizes” vehicles for the future all-day bidirectional service, optimizing the use of equipment and personnel by eliminating separate compartments on most trains, improving acceleration, and reducing dwell times, thereby allowing equipment to cycle faster. All major domestic builders offer families of multiple-unit equipment with a variety of power sources, allowing for parts and training commonality and reducing maintenance expense.

I cannot emphasize this enough: there is no “leapfrog” move available here. The MBTA must electrify its commuter rail network, it must do so using standard 25 kV overhead catenary, and it must purchase modern rolling stock and construct uniform high platforms. Additional investment in the current operating model, inherited from the freight railroads 50 years ago, is unacceptable.

The IIJA includes funding programs to support mainline rail electrification, grade crossing elimination, and station accessibility; the MBTA must aggressively pursue these opportunities as they are opened for applications.

P0893
I understand that this contract has already been executed; otherwise the same comments apply as for P1152.
P0653
See my comments above with respect to buses and bus facilities.
P0369
Strong support for the type 10 program and related capital improvements to bring the Green Line closer to industry standard for light rail facilities. This will improve service reliability and reduce operating costs over the lifetime of these vehicles.
P0362
The public deserves more frequent status updates regarding this procurement, quarterly at a minimum.
P0866
Strong support for Red-Blue Connector; continued design should be funded to support applications for discretionary FTA grants.
Posted in Transportation | Comments Off on Comments on the MBTA’s FY23-27 Capital Investment Plan

A big PostgreSQL upgrade

I don’t write here very often about stuff I do at work, but I just finished a project that was a hassle all out of scale with the actual utility, and I wanted to document it for posterity, in the hope that the next time I need to do this, maybe I’ll have some better notes.

Introduction

A bit of backstory. I have three personal PostgreSQL database “clusters” (what PostgreSQL calls a single server, for some reason unknown to me), plus I manage two-and-a-half more at work. All of them run FreeBSD, and all of them are installed from a custom package repository (so that, among other things, we can configure them to use the right Kerberos implementation). These systems were all binary-upgraded from PostgreSQL 9.4 to 9.6 in the summer of 2020, but all are much older than that — one of the databases started out 20 years ago running PostgreSQL 7.0, and another one was at some point ported over from a 32-bit Debian server (although not by binary upgrade). We have a shared server for folks who want an easy but capable backend for web sites that’s managed by someone else, and that has about 50 GB of user data in it (including, apparently, an archive of 4chan from a decade or so ago that was used for a research project). For my personal servers, I have applications that depend on two of the database “clusters” (one on my home workstation and another on my all-purpose server), and the third “cluster” is used for development of work projects. At work, in addition to the big shared server, we have a couple of core infrastructure applications (account management and DNS/DHCP) that depend on the other “cluster” — these are separate to avoid dependency loops — and the “half” was originally supposed to be an off-site replication target for that infrastructure server, but since I never managed that, I could use it for testing the upgrade path.

Now, as I said, we were running PostgreSQL 9.6. As of December 1, that version was still supported in the FreeBSD ports collection, and so we could build packages for it, but it had just gone end-of-life as far as the PostgreSQL Global Development Group was concerned. The FreeBSD ports maintainers have a history and practice of not keeping old versions of packages that are no longer supported upstream — unlike Linux there’s no big corporate entity selling support contracts and funding continued maintenance of obsolete software packages and therefore no “megafreeze”. So it was clear that we needed to get off of 9.6 and onto something modern — preferably 14.1, the most recent release, so we don’t have to do this again for another few years. But if I got stuck anywhere in the upgrade process, I wanted it to be as recent a release train as I could possibly get onto. For that reason, I decided to step through each major release using the binary pg_upgrade process, identifying and resolving issues at a point where it would still be relatively easy to roll back if I needed to do some manual tweaking of the database contents (which as it happened did turn out to be necessary).
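To make the "step through each major release" plan concrete, here is a minimal sketch in Python. It only computes the ordered list of one-release hops from 9.6 to 14; it does not run pg_upgrade or touch a database. The function name and the hard-coded release list are my own illustration, not anything from the PostgreSQL or FreeBSD tooling.

```python
# Major release trains from the starting version to the target, in order.
# (Illustrative list only; 14.1 is a minor release within the 14 train.)
RELEASES = ["9.6", "10", "11", "12", "13", "14"]


def upgrade_steps(current, target, releases=RELEASES):
    """Return (old, new) pairs, one per binary upgrade, one major release at a time."""
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("target release is older than the current one")
    # Pair each release with its immediate successor.
    return list(zip(releases[start:end], releases[start + 1:end + 1]))


if __name__ == "__main__":
    for old, new in upgrade_steps("9.6", "14"):
        print(f"pg_upgrade step: {old} -> {new}")
```

Each pair is a point where the upgrade can stop, be checked, and if necessary be rolled back before moving on to the next release.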

All but one of these databases are small enough that it would be practical to upgrade them by using a full dump and restore procedure, but of course the 50GB shared database is too big for that. I wanted to maximize my chances of finding any pitfalls before having to upgrade that database, which meant the same pg_upgrade in-place binary upgrade process for all of them. Running pg_upgrade on FreeBSD is a bit involved, because different PostgreSQL versions cannot be installed together in the same filesystem, but this part of the procedure is fairly well documented in other sources online. I have two separate package build systems, one for work and one for personal, because the work one doesn’t need to bother with time-consuming stuff like web browsers and X, whereas the personal one is what’s used by all my workstations so it has all of that. In both cases, though, package repositories are just served directly from the repository poudriere creates after every package build.

Building packages

Because poudriere builds packages in a clean environment, there is no difficulty in building a package set that includes multiple PostgreSQL releases. Where the challenge comes in, however, is those packages for which PostgreSQL (or more specifically, postgresql-client) is an upward dependency — they can only be built against one PostgreSQL version, either the default one defined in the FreeBSD ports framework, or (most relevant for my case) the one set in /usr/local/etc/poudriere.d/make.conf in the DEFAULT_VERSIONS variable. poudriere has a concept of “package sets”, packages built with different build parameters but otherwise from the same sources and build environment, which makes it easy to build the six different package repositories required for this project: we can just create a pgsql10-make.conf, pgsql11-make.conf, and so on, and then use the -z setname option to poudriere bulk to build each repository.

Now, one of the things poudriere is good at, by design, is figuring out which packages in a repository need to be rebuilt based on current configuration and package sources — so we don’t need to actually rebuild all of the packages six times. First, I added all the newer PostgreSQL packages (postgresql{10,11,12,13,14}-{client,server,contrib,docs}) to my package list, and built my regular package set with my regular make.conf and all of those versions included. (I lie: actually I’ve been doing this for quite some time.) Then, I made six copies of my repository (I could have used hard links to avoid copying, but I had the disk space available) using cp -pRP, after first checking the poudriere manual page to verify where the setname goes in the path. (For the record, it’s jail-portstree-setname.) Then I could step through each setname with poudriere bulk and only rebuild those packages which depended on postgresql-client. All I needed to do was make these additional sets available through my web server and I would be ready to go.
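A minimal sketch of the package-set setup described above. The jail name (13amd64), ports tree name (default), package list path, and both function names are placeholders rather than the real ones, and the second function only prints the cp and poudriere bulk commands so they can be reviewed before anything is run:

```shell
#!/bin/sh
# Sketch only: JAIL, TREE and PKGLIST are placeholder names.
JAIL=13amd64
TREE=default
PKGLIST=/usr/local/etc/poudriere.d/pkglist
VERSIONS="10 11 12 13 14"

# Write one <setname>-make.conf per release into directory $1.
# poudriere reads the set-specific file in addition to the plain make.conf.
write_set_confs() {
    for v in $VERSIONS; do
        echo "DEFAULT_VERSIONS+=pgsql=$v" > "$1/pgsql$v-make.conf"
    done
}

# Print (do not run) the commands that seed each set's repository from the
# regular one and then rebuild whatever depends on postgresql-client.
# $1 is the poudriere package directory.
print_set_builds() {
    for v in $VERSIONS; do
        echo "cp -pRP $1/$JAIL-$TREE $1/$JAIL-$TREE-pgsql$v"
        echo "poudriere bulk -j $JAIL -p $TREE -z pgsql$v -f $PKGLIST"
    done
}
```

Calling write_set_confs with the poudriere.d directory creates the five files; the output of print_set_builds is meant to be read over and then run by hand, after the regular package set has finished building.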

Because I knew this was going to be a time-consuming project, I chose to freeze my ports tree after the Thanksgiving holiday: I could either manage a regular package update cycle (which I usually do once a month) or I could do the database upgrade, not both.

The “easy” systems

For obvious reasons, I started out working on my personal machines; they all have slightly different configurations, with their own motley collections of databases, loaded extensions, and client applications. They were set up at different times and were not under any sort of configuration management, so they were all subject to some amount of configuration drift, but none support remote access, so I didn’t have to worry about synchronizing pg_hba.conf or TLS certificates — which was a concern for the work servers, which had exclusively remote clients. And since I was the only direct user of these machines, it was easy for me to make the upgrades in lockstep on all three servers, so I wouldn’t get stranded with different machines requiring different PostgreSQL releases. (That wouldn’t have been a crisis on any of those machines, but would have been a bigger issue at work where the package repository configuration is under configuration management.)

The whole process actually was pretty easy, at least at first, and worked pretty much as the various how-to articles suggest: create a directory tree where you can install the old packages, stop the old server, initdb, run pg_upgrade, start the new server, and do whatever pg_upgrade told you to do. This is not automatable! You have to actually pay attention to what pg_upgrade says, which will vary depending on database configuration, what extensions are loaded, and on the specific contents of the database cluster, in addition to which “old” and “new” PostgreSQL releases are targeted. (You must always use the pg_upgrade supplied with the new server release.) I’ll give a full rundown of the process at the end of this post.

The first showstopper issue I ran into is that PostgreSQL 12 dropped support for WITH OIDS. If you’re not familiar, in early versions of PostgreSQL (indeed, even before it was called that), every row in every table would automatically get a column called oid, which was a database-wide unique numerical identifier. There are a whole bunch of reasons why this turned out to scale poorly, but the most important of these was that the original implementation stored these identifiers in a 32-bit integer on disk, so if you had more than four billion tuples in your database, they would no longer be unique (and nothing in the implementation would enforce uniqueness, because that was too expensive). The oid type served a useful function in the internals of the database, but by PostgreSQL 9.x, actually using the oid column was deprecated, and the default had been changed to create new tables WITHOUT OIDS.

It should not surprise you, if you’ve read this far, that some of these databases dated back to PostgreSQL 8, and therefore were created WITH OIDS, even if they didn’t actually use the implicit oid column for anything. (I had to carefully check, because some of my applications actually did use them in the past, but I was able to convince myself that all of those mistakes had been fixed years ago.) None of this was an issue until I got to the step of upgrading from PostgreSQL 11 to 12, because PostgreSQL 12 entirely dropped support for WITH OIDS tables: you can’t upgrade from 11 to 12 without first either dropping the old tables or using ALTER TABLE ... SET WITHOUT OIDS while running the old release of the server. pg_upgrade can’t patch this over for you. On more than one occasion, I got the 12.x packages installed only to have pg_upgrade fail and have to roll back to 11.x.

The first time this happened, I was almost ready to give up, but I was able to find a howto on the web with the following extremely helpful bit of SQL to find all of the WITH OIDS tables in a database:

SELECT 'ALTER TABLE "' || n.nspname || '"."' || c.relname || '" SET WITHOUT OIDS;'
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE 1=1
  AND c.relkind = 'r'
  AND c.relhasoids = true
  AND n.nspname <> 'pg_catalog'
ORDER BY n.nspname, c.relname;

(Hint: replacing the semicolon at the end with \gexec will directly execute the DDL statements returned by the query, so you don’t have to cut and paste.) Note that this procedure must be run on every database in the cluster, using the database superuser account, to get rid of any remaining OIDs.
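One way to apply this to every database in the cluster is a small shell loop around psql. This is a minimal sketch, assuming it runs against the old (11.x or earlier) server, that psql -U pgsql can connect as the superuser without a password, and that no database name contains whitespace. The function names are made up for illustration, and it uses format() with %I for identifier quoting rather than string concatenation:

```shell
#!/bin/sh
# Emit the WITH OIDS query, ending in \gexec instead of a semicolon so that
# psql executes each generated ALTER TABLE statement.
oid_fix_sql() {
    cat <<'EOF'
SELECT format('ALTER TABLE %I.%I SET WITHOUT OIDS', n.nspname, c.relname)
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND c.relhasoids
  AND n.nspname <> 'pg_catalog'
ORDER BY n.nspname, c.relname
\gexec
EOF
}

# List every database that accepts connections (this skips template0).
list_databases() {
    psql -U pgsql -At -d template1 \
        -c "SELECT datname FROM pg_database WHERE datallowconn"
}

# Run the fix once per database, printing a header line for each.
fix_all_oids() {
    for db in $(list_databases); do
        echo "== $db"
        oid_fix_sql | psql -U pgsql -d "$db" -f -
    done
}
```

It is probably worth running the query with a plain semicolon first to see what would be altered, and only then calling fix_all_oids.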

Another important thing I ran into is that some PostgreSQL server-side extensions define new data types, and the code in those extensions must be available to the old server implementation for pg_upgrade. The easiest way to ensure this is to make sure that the old versions of all of the extension packages are installed in the same temporary filesystem as the old server packages are. In my case this was easy because I was only using extensions included in the postgresql-contrib package, which in the process above was built for every version I was stepping through.
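To see which extensions the old install tree actually has to provide, it can help to list what is installed in each database before starting. A minimal sketch, assuming psql -U pgsql can connect as the superuser without a password and that database names contain no whitespace; the function name is hypothetical:

```shell
#!/bin/sh
# Print "database<TAB>extension<TAB>version" for every installed extension
# in every database that accepts connections.
list_extensions() {
    for db in $(psql -U pgsql -At -d template1 \
            -c "SELECT datname FROM pg_database WHERE datallowconn"); do
        psql -U pgsql -At -F "$(printf '\t')" -d "$db" \
            -c "SELECT current_database(), extname, extversion
                FROM pg_extension ORDER BY extname"
    done
}
```

plpgsql will show up everywhere since it ships with the server; anything else in the list needs its old-version package installed alongside the old server packages.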

Once I fixed the WITH OIDS issue, I completed the upgrade to 12.x and let it burn in for a day before continuing on with 13 and 14, so the whole process took about a week, but I was confident that I could do it in under four hours for the work servers if I could just deal with the OID issue.

The hard systems

I used the query above to check all of my work databases for OIDful tables, and there were quite a bunch. I was able to confirm that the ones for our internal applications were just ancient cruft. Frankly, most of the data in our shared server is also ancient cruft, so I largely did the same there, but several of the tables belonged to someone who was still an active user, and so I asked first. (He told me I could just drop the whole schema, which was convenient.) Finally, I was ready, and sent out an announcement that our internal applications would be shut down and the shared database server would be unavailable for some unknown number of hours this week. The process ended up taking about 2½ hours, most of which was spent copying 50 GB of probably-dead data.

pg_upgrade has two modes of operation: normally, it copies all of the binary row-data files from the old version to the new version, which allows you to restart the old database with the old data if necessary. There is another mode, wherein pg_upgrade hard-links the row-data files between the two versions; this is much faster, and obviously uses much less space, but at the cost of not being able to easily roll back to the old server version if something goes wrong. All of our servers use ZFS, so a rollback is less painful than a full restore from backups would be, but it’s still much better if I don’t have to exercise that option. On the big shared server, it would simply take too long (and too much space) to copy all of the data for every upgrade, but it made sense to copy the data for the upgrade from 9.6 (old-production) to 10.x, and then link for each successive upgrade, guaranteeing that I could restart the old production database at any point in the process but not worrying so much about the intermediate steps that would be overtaken by events in short order.
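The copy-first, link-afterwards plan can be expressed as a small helper that builds the pg_upgrade command line for each step. This is a sketch: the function name is made up, the paths follow the data96/data10 layout used in the procedure later in this post, and the script only prints the commands (each would still be run through su pgsql -c):

```shell
#!/bin/sh
# Print the pg_upgrade command for one step of the upgrade path.
# $1 = old release, $2 = new release, $3 = "copy" or "link".
pg_upgrade_cmd() {
    old=$(echo "$1" | tr -d .)   # 9.6 -> 96
    new=$(echo "$2" | tr -d .)
    mode=""
    [ "$3" = "link" ] && mode=" -k"
    echo "pg_upgrade -b /tmp/pg_upgrade/usr/local/bin -B /usr/local/bin" \
         "-d /usr/local/pgsql/data$old -D /usr/local/pgsql/data$new$mode"
}

# Copy when leaving 9.6 so old production can always be restarted;
# hard-link (-k) for every later step.
prev=9.6
for next in 10 11 12 13 14; do
    if [ "$prev" = "9.6" ]; then m=copy; else m=link; fi
    pg_upgrade_cmd "$prev" "$next" "$m"
    prev=$next
done
```

Only the first printed command lacks -k, which matches the idea that the 9.6 data directory stays untouched while the intermediate ones are disposable.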

Those other servers are included in our configuration management, which I had to stop during the upgrade process (otherwise it would keep on trying to revert the package repository to the production one and then reinstall the old PostgreSQL packages). This also required paying more attention to the server configuration files, since those were managed and I didn’t want to start the database server without the correct configuration or certificates (having had some painful and confusing recent experiences with this). I had to stop various network services and cron jobs on half a dozen internal servers, and call out to our postmaster to keep the mail system from trying to talk to the account database while it was down (all of these applications are held together with chewing gum and sticky tape, so if someone tried to create a mailing-list while the account database was down, the result would likely be an inconsistent state rather than a clean error). I started by copying the existing network and accounts database to the off-site server, so that I could run through the complete upgrade process on real data but on a server nobody was relying on. (I initially tried to use pg_basebackup for this, but it didn’t work, and I fell back to good old tar.) It was in running through this process that I discovered I had neglected to account for a few pieces of our managed configuration. That dealt with, I then proceeded to the production account and network database, and finally the big shared database full of junk.

The actual upgrade procedure

Note that as a consequence of having previously upgraded from PostgreSQL 9.4 to 9.6, our package builds override PG_USER and PG_UID to their old values; thus, the database superuser is called pgsql and not postgres as in the current stock FreeBSD packages. The procedure assumes that you are typing commands (and reading their output!) at a root shell.

Preparatory steps on each server

Before doing anything, check /etc/rc.conf and /usr/local/pgsql to verify that there is nothing special about the database configuration. Most importantly, check for a postgresql_data setting in rc.conf: if this is set, it will need to be changed for every step in the upgrade. Also, check for any custom configuration files in the data directory itself; these will need to be copied or merged into the new server configuration. (Because of the way our configuration management works, I could just copy these, except for two lines that needed to be appended to postgresql.conf.)

zfs create rootvg/tmp/pg_upgrade
cd /usr/src
make -s installworld DESTDIR=/tmp/pg_upgrade
mount -t devfs none /tmp/pg_upgrade/dev

Note that these steps will be different for servers running in a jail — you will probably have to do all of this from the outside. The mount of a devfs inside the destination tree is probably unnecessary; it’s not a complete chroot environment and the package setup scripts aren’t needed.

At this point, I would stop configuration management and set downtime in monitoring so the on-call person doesn’t get paged.

conffiles="cacert.pem keytab pg_hba.conf pg_ident.conf postgresql_puppet_extras.conf server.crt server.key"

This just sets a shell variable for convenience to restore the managed configuration before starting the server after each upgrade. Depending on your environment it might not be necessary; you will almost certainly have to customize this for your environment.

The following is the main loop of the upgrade procedure. You’ll repeat this for each release you need to step through, substituting the appropriate version numbers in each command.

# install the _old_ database release in the temporary install tree
pkg -r /tmp/pg_upgrade/ install -y postgresql96-server postgresql96-contrib
# Update the package repository configuration to point to the repo for the _new_ release
vi /usr/local/etc/pkg/repos/production.conf 
cd /usr/local/pgsql
service postgresql stop
# If and only if you set postgresql_data in rc.conf, update it for the new release
sysrc postgresql_data="/path/to/data/$newvers"
# Try to install the new release. This is known to fail going from 9.6 to 10.
pkg install postgresql10-server postgresql10-contrib
# If the above fails, run "pkg upgrade" and then repeat the "pkg install" command

# Create the new database control files. Will fail if the data directory already exists.
service postgresql initdb
# Check whether the upgrade can work
su pgsql -c 'pg_upgrade -b /tmp/pg_upgrade/usr/local/bin -B /usr/local/bin -d /usr/local/pgsql/data96 -D /usr/local/pgsql/data10 -c'
# The most common reason for this to fail is if the locale is misconfigured.
# You may need to set postgresql_initdb_flags in rc.conf to fix this, but you
# will have to delete the data10 directory and redo starting from the initdb.

# Same as the previous command without "-c". Use "-k" instead for link mode.
su pgsql -c 'pg_upgrade -b /tmp/pg_upgrade/usr/local/bin -B /usr/local/bin -d /usr/local/pgsql/data96 -D /usr/local/pgsql/data10'

# pg_upgrade will likely have output some instructions, but we need to start
# the server first, which means fixing the configuration.
for a in ${conffiles}; do cp -pRP data96/$a data10/; done
tail -2 data96/postgresql.conf  >> data10/postgresql.conf
service postgresql start

# If pg_upgrade told you to update extensions, do that now:
su pgsql -c 'psql -f update_extensions.sql template1'
# If pg_upgrade told you to rebuild indexes, do that now:
su pgsql -c 'psql -f reindex_hash.sql template1'

# For a big database, this can be interrupted once it gets to "medium"
# (make sure to let it complete once you have gotten to the final version).
su pgsql -c ./analyze_new_cluster.sh 
# If the new version is 14.x or above, run the following command instead:
su pgsql -c 'vacuumdb --all --analyze-in-stages'

Before moving on to the next version in your upgrade path, you should probably check that the server is running properly and authenticating connections in accordance with whatever policy you’ve defined.
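A quick way to confirm that the server that came up is the release you think it is, before moving on. This is a minimal sketch, assuming psql -U pgsql can connect locally as the superuser; the function name is hypothetical, and it does not test remote authentication, which depends entirely on your own pg_hba.conf policy:

```shell
#!/bin/sh
# Succeed only if the running server reports the expected major release.
# $1 = expected major version, e.g. 10 (or 9.6 for the old numbering scheme).
check_server_version() {
    got=$(psql -U pgsql -At -d template1 -c "SHOW server_version") || return 1
    case "$got" in
        "$1"|"$1".*) echo "ok: server is $got" ;;
        *) echo "unexpected server version: $got" >&2; return 1 ;;
    esac
}
```

For example, check_server_version 10 right after the first step; a connection attempt from one of the real client machines is still the only meaningful test of the authentication setup.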

I followed this procedure for all of my servers, and ran into only one serious issue (of course, it was on the big 50GB shared server). pg_upgrade -c fails to diagnose when the database contains a custom aggregate as described in this mailing-list thread, and the upgrade process errors out when loading the schema into 14.x. The only fix is to drop the aggregates in question (after first reinstalling and starting the 13.x server) and then recreate them the “new way” after completing the upgrade. Thankfully this was easy for me, but it might not have been.

After all that, once you’ve verified that your applications are all functioning, you are almost done: rebuild the production package set, restart configuration management, and remove the temporary install tree (there’s nothing in it that is needed any more).

Posted in FreeBSD

The Turnpike Extension is too wide

For two decades, my homeward commute (when I’m driving, and these days when I’m also not working from home) has been the same: head south on Mass. Ave. to Newbury St., take the on-ramp formerly known as “exit 21”, and then the Mass. Pike west to Natick. This on-ramp has always been dangerous, with very limited visibility and no merge zone; I’ve narrowly avoided crashes innumerable times by either slamming on the brakes or flooring the accelerator. I didn’t avoid a crash once, many years ago, and got rear-ended by someone roaring down the ramp behind me as I stopped for heavy traffic. This two-mile stretch of highway is four lanes in each direction, with no shoulder; you can see the area I’m talking about on this map (from Google Maps):

This design made a modicum of sense in the original configuration of the Turnpike Extension, from 1963 until the introduction of physically separated E-ZPass lanes in the 2000s: with a mainline barrier toll at the Allston interchange as well as exit and entrance tolls there, westbound traffic was frequently slowed or stopped as far back as Comm. Ave. if not all the way to Copley Square. But when MassDOT implemented all-electronic tolling on the Turnpike in the 2010s, the Allston interchange was reconfigured to have three full-speed through lanes in both directions — which eliminated the bottleneck at Allston. To be clear, this change didn’t increase the capacity of the Turnpike; it simply meant that traffic was not stopped in Allston in addition to being stopped at the Copley (eastbound) and Newton Corner (westbound) lane drops at times of congestion. MassDOT is now engaged in a process that will hopefully replace the outmoded and wasteful Allston interchange, which was designed the way it was to support a connection to the never-built I-695 Inner Belt freeway.

The Prudential Center, one of the first major freeway air-rights projects in the US, was built in conjunction with the Turnpike in 1961–63. The John Hancock Tower and Copley Place followed in the 1970s and 1980s, and the state has long been looking to generate even more revenue from its valuable Back Bay real estate. Since late 2020, two air-rights projects have been under construction between Mass. Ave. and Beacon Street — with a third one under contract but yet to begin — and as a result there has been a lane restriction in both directions between Copley and Beacon St. to allow the construction crews to safely install the buildings’ foundations. After much hue and cry over how much of a traffic impact this would have, the end result has been … nothing. (Granted that traffic has been reduced somewhat as a result of the pandemic, but not so much as that.) With the bottlenecks through Allston and Copley already being limited to three lanes, the traffic capacity simply isn’t limited by the work zone — at least not any more than it is limited by the unsafe merges that were always there. Eastbound traffic still backs up at the Copley lane drop, and westbound traffic doesn’t back up until Market St. or even closer to Newton Corner. On the westbound side, there simply isn’t that much traffic entering at Copley Square or Mass. Ave. and exiting at Allston — that’s simply not a route that makes sense for most of the trips that could conceivably use it — so the traffic that takes exit 127 westbound is traffic that came from I-93 or points east, and without a toll barrier on the exit ramp, that traffic can queue along the whole length of the ramp without backing onto the mainline Turnpike.

This aerial photo shows the air-rights parcels (under-construction parcels shaded in purple, future parcel in green):

(I may be misremembering the parcel numbers, and didn’t bother to look them up, so they’re not labeled on the map.)

One of the most controversial aspects of the project to replace the Allston interchange (which is also supposed to include a bus/train station and significant new construction on the Harvard University-owned former railyard) has been the area called “the throat”, where MassDOT is trying to thread a two-track railway and twelve lanes of freeway (eight lanes of the Turnpike, four of Soldiers Field Road) in a very narrow stretch of land between the Boston University campus and the Charles River, without impacting traffic on the MBTA Worcester commuter rail line or on the Turnpike or on Soldiers Field Road during construction, and without any permanent structures in the Charles, and without unduly limiting use of the Dr. Paul Dudley White path, which parallels Soldiers Field Road. That section is highlighted in the map below:

Community advocates have largely recognized this as a fool’s errand, and contrary to the city’s and state’s climate goals to boot. Fixing the interchange and removing the viaduct through the throat, and of course building the new train station, are all recognized as positives, but even with a Federal Highway Administration waiver for limited shoulder widths, it has proved impossible to squeeze all of these roadways into the space allowed without either elevating one over the other or building into the Charles. Of course, the “impossibility” is a sham: it’s only “impossible” because both MassDOT (through the effort of former MassDOT secretary Stephanie Pollack) and the Department of Conservation and Recreation (or “Cars and Roads”, as the case may be) have refused to countenance any reduction in freeway capacity. It is quite clear from the results of the current work zone that the Turnpike is two lanes too wide between Allston and Copley, and it would be a tremendous boon for safety and the environment if it were reduced to six lanes with full shoulders and safe merges at the entrance ramps. Likewise, although I do not have as much direct knowledge, the bottlenecks on Soldiers Field Road are all downstream, on Storrow Drive, especially at the Bowker Interchange and at Leverett Circle: Soldiers Field Road could stand to be just two lanes wide in this stretch.

These width reductions (24 feet on the Turnpike, 20 feet on Soldiers Field Road) would have a very limited impact on traffic, if DOT and DCR stopped their obstructionism, and the result would be a much cheaper, more constructable Allston project with bigger buffers between the freeway traffic and park users.

Posted in Transportation

A busy week for transportation legislation

It’s Memorial Day weekend, and in Massachusetts that means two things: the state legislature is debating the budget and the state’s draft Capital Investment Plan has been published for comment. In Washington, the Biden Administration has published its official budget request for Congress (which will go right in the trash; Congress writes its own budget), but the administration’s so-called “American Jobs Plan” is being debated (and torn to shreds) by Congress, and in addition, the five-year surface transportation authorization — which was given a one-year extension last year because of the pandemic — is also being debated.

As a result, the transportation funding situation for the coming year is even less clear than it usually is. The US Department of Transportation released a glossy brochure attempting to explain the President’s budget request, which called out some worthwhile initiatives but largely failed to clarify which programs were associated with which funding requests. Amtrak released its own glossy brochure, explaining to congressional delegations what sort of service enhancements it was thinking about (if only Congress would appropriate more money) — largely devoid of HSR or any particularly compelling program for capital investment or frequent service.

I pointed out on Twitter a few weeks ago that Joe Biden is both the first and last president of his generation — the Silent Generation. He took the oath of office as a United States Senator just a few days after I was born; his political formation is quite different from both the G.I. Generation who preceded him in birth and the Baby Boom Generation who preceded him to the presidency. This is reflected in his speechmaking, in his non-confrontational approach to governing, and especially in his approach to policy: he is going to defer to Congress, and while he will privately jawbone Manchin and Sinema, ultimately he is going to sign whatever Congress passes, and he isn’t going to make a big deal of it if some of his major legislative initiatives founder in the Senate.

While I do not doubt the sincerity of Biden’s support for the proposals his administration has put forward, I believe he values the appearance of bipartisanship more than he values livable cities or indeed an inhabitable planet, and fundamentally, when the autoists in Congress finish chewing up and spitting out his infrastructure proposals, turning them into more of the same polluter-subsidizing climate arson, he will acquiesce without protest. His only “red line”, so far as I can perceive, is his refusal to increase taxes on polluters — who by and large make less than $150,000 a year and drive a light-duty truck. The rich are simply not numerous enough for any taxes aimed at changing their behavior to have any meaningful effect on the existential crisis of our time. We need more and heavier sticks, like a carbon tax, like a national VAT, that would actually be paid by the vast majority of people (myself included), in order to incentivize a meaningful amount of behavioral change.

OK, enough about the situation in Washington — what’s going on on Beacon Hill?

On Thursday of this past week, the Senate unanimously passed its version of the state budget. I believe there is one more procedural vote to come, this Tuesday, and then it will go back to the House, which will take another procedural vote, and then both houses will appoint members of a conference committee to hammer out the numerous small differences between the chambers. (Of course, the conferees know who they are already, even though they haven’t been formally appointed, and through this whole process, the staff of both chambers has been tracking the points of disagreement so they know what they need to hammer out.) This whole process might take a week or two, then both chambers will vote again to accept the conference report (by a veto-proof majority) and then deliver the final engrossed text to the governor’s desk, at which point he will veto a bunch of items, and then both chambers will have a day-long vote-a-thon to override most or all of the individual vetoes. Hopefully they can get this all done by the end of June, at which point the legislature will have to take up the FY21 close-out supplemental budget (to reconcile the budget that they passed back in January with what the state actually spent).

The budget as passed by the Senate does not include any revenues from the Biden administration’s “American Rescue Plan”, for the simple reason that the Baker administration doesn’t yet have guidance from Washington on how the state is allowed to use the money. This is also an issue for the Capital Investment Plan, which I’ll discuss next. (The MBTA budget assumes the availability of ARP funds to cover operating expenses, but is not itself subject to legislative approval.) The Senate budget does include a significant drawdown of the state’s “Rainy Day Fund”, which will presumably be reversed in a supplemental budget once the federal guidance is received on eligible expenses. (ARP funds are generally available for allocation through the end of calendar 2023, but cannot be used to reduce state tax rates, fund pension obligations, or various other things state legislatures might want to do, so each federal agency charged with disbursing ARP money has to go through a rulemaking or similar procedure to issue official guidance about which expenses are or are not eligible and what the state must certify in order to access the funds. The previous Coronavirus relief bills, “CARES” and “CRRSAA”, had similar administrative complications but subtly different requirements.)

Without debate, the Senate adopted an amendment by Sen. Joe Boncore, chair of the Transportation committee, which restructures and increases the fee charged for TNC (i.e., Uber and Lyft) rides, adds additional reporting measures, and creates a trust fund into which the state portion of the fee revenue is paid. Most significantly, the Boncore amendment creates a “public transit access fee”, an additional twenty-cent charge for trips that both begin and end within “the fourteen cities and towns”, and paid into a segregated account to support a low-income fare program to be established by the MBTA. (Which are “the fourteen cities and towns”? They are the communities in the original, inner-core MBTA district, which still receives the majority of bus and rapid transit service.) The amendment passed 37–2 on a non-recorded standing vote, but it remains to be seen whether this language will survive the conference committee; if it does, expect the governor to veto it. (It’s likely that the legislature will override the veto if it gets that far.) Unlike with the bond bill back in January, the legislature will remain in session after the budget is passed, so there is no possibility of a pocket veto.

I should note here that neither House nor Senate budget includes provisions for an MBTA board of directors, to replace the current five-member, unpaid Fiscal and Management Control Board, which expires at the end of June. The governor’s budget as filed included such language, but it was dropped from the budget by House Ways & Means, and it was also left out of the Senate budget. (Senate Minority Leader Bruce Tarr proposed an amendment to restore the governor’s language, but it was “bundled” with several other amendments and rejected on a voice vote.) There are companion House and Senate bills in the Transportation Committee which would establish a new board, but thus far, with only five weeks left of the FMCB’s legal existence, the committee has not chosen to advance either one. The bills are H.3542 by Rep. Meschino and S.2266 by Sen. Boncore; the language in both looked the same on spot-checking but I did not do a line-by-line comparison. In addition to expanding the board to seven members, S.2266 would provide for some local representation, by giving the existing municipally-appointed Advisory Board the right to appoint one MBTA board member, and would authorize an annual stipend of $12,000 for each board member except the Secretary of Transportation. While this is not the exact structure I would prefer, time is of the essence and I would like to see one of these bills reported out of committee within days to allow for a smooth transition from the old board to the new.

Finally, on to the state’s Capital Investment Plan (CIP), which the MassDOT board and the FMCB voted to release for public comment last Monday, six weeks before the deadline for it to be adopted and generally much too late for any public comments to make a significant impact. As with last year, significant uncertainties related to the aftermath of the pandemic and the availability of federal support have been used to justify a one-year “maintenance of effort” CIP rather than the five-year CIP the law requires. These uncertainties do not get the state out of its federally-mandated five-year State Transportation Improvement Program, through which all federal grant programs flow, so we still have some idea of what might be funded in the out years simply because it has to be programmed in the STIP; the Boston Region MPO will meet on Thursday to endorse its FY22-26 TIP, which is the largest regional contribution to the STIP, and this will then flow through to the state CIP, which both the MBTA and MassDOT boards must formally adopt at their joint meeting at the end of June. The state and the federal government operate on different fiscal years (the state’s is July to June and the feds use October to September), which means the exact alignment of the two plans depends on the exact scheduling: some SFY22 projects are funded with FFY21 obligations.

One thing we do know about the Rescue Plan is that it includes an additional $175 million of funding to recipients of Federal Transit Administration capital improvement grants in fiscal year 2019 which have not yet entered revenue service. Up to 40% of this could go to the Green Line Extension — except that the GLX project is running under budget and may not need any more money. The text of the act makes the distribution of funds non-discretionary, but the agency will have to determine what Congress intended by this provision in the case of funds to be distributed being in excess of obligations. The MBTA and FTA are in discussions to see how the GLX money could be reallocated — the original funding agreement includes a clawback provision for Cambridge and Somerville if their local contributions turn out not to be required, but if the surplus comes in beyond that amount, after all contractor claims are resolved, the MBTA would like to use the money for other priorities. This should be resolved by the time the boards vote on the CIP in four weeks, so it’s likely that there will be some transfers of these funds in the final CIP that aren’t shown in the draft.

Having said all of that, and in the knowledge that my comments will have no meaningful effect on the process, I still chose to email the state with my comments, which will probably get a formal reply from the staff some time in September. Here is what I said, lightly edited for presentation here:


I will begin my comments with some process issues.

  • While the “accessible” PDF version of the draft is definitely more accessible than the “story map” (which has undocumented requirements for computer hardware and is difficult to navigate or resize), it is still lacking in some basic information, such as the actual location (at least city or town) of projects in the MBTA section of the plan. Many of the “project descriptions” are quite cryptic, even for someone who regularly attends/watches the board meetings, and need a more complete explanation. That said, the breakdown of programmed expenditures in the last four columns of Appendix A is an appropriate and helpful way to present the status of a project in the absence of an itemized plan for the out years.
  • To repeat my comments from previous years, the schedule for comment is far too rushed. Anyone who has followed this process over time knows that all of the important decisions have been made by the staff already, sometimes months ago, and as a result there is almost zero chance that public comment will result in any changes to the draft before the boards vote to adopt it in a few weeks. The capital planning process needs to be open and transparent, and that starts with publishing the universe of projects and assigned priorities well before the end of the fiscal year so that members of the public can develop reasoned arguments about which should be advanced or delayed.
  • While I am sympathetic to the desire to do a short-term capital plan given the uncertainties over whether Congress will pass an infrastructure bill, it is unfortunate that the draft CIP only shows one year, and does not show projects in the out years that might have an opportunity to be accelerated if additional funding is made available. This is important information and citizens deserve to have at least some details so that we can make a case to our representatives and before the various boards. Many agency priorities have changed and numerous projects have been accelerated, so it is not possible to refer back to the FY20-24 CIP for information about FY23 and FY24 projects.

I have one comment regarding Highway Division programs: I am disappointed that the Weston Route 30 reconstruction and bike/ped safety improvements project was not programmed by the Boston Region MPO and said so in comments on the draft TIP. Should additional funding become available before the end of FY22 I urge that consideration be given to programming this important project.

My remaining comments are all regarding the MBTA section of the draft.

  • Should the Green Line Extension come in under budget (as suggested at last Monday’s board meeting), and should FTA allow the MBTA to reprogram the FFGA funds, I strongly support funding the Red/Blue Connector and/or advancing the bus facility modernization program by replacing Arborway garage.
  • However, I remain unalterably opposed to the destruction of the North Cambridge trolleybus network as currently proposed by MBTA staff. Trolleybuses are inherently more efficient than battery buses, do not require supplemental diesel heaters, and are already zero-emissions vehicles; North Cambridge has been a trolleybus carhouse since it was converted from streetcars in the 1950s (when most Mattapan carlines were dieselized, contributing to today’s environmental injustice in that neighborhood). The Transportation Bond Bill specifically authorized “transit-supportive infrastructure” program funds to be used for trolleybus infrastructure, including electrification, and the MBTA should be making plans to expand North Cambridge trolleybus service to other nearby bus routes such as the 68, 69, and 77, by extending the trolley wire and/or acquiring battery-trolleybuses with in-motion charging.
  • The continued progress on making commuter rail stations fully accessible is laudable. However, I continue to be concerned that upgrading stations to full high-level platforms is being approached solely as an accessibility issue, and thus being advanced piecemeal, rather than as a significant constraint on operations, staffing rationalization, and competitive rolling stock procurement — as was obliquely pointed out by Alistair Sawers in his presentation before the boards last Monday. While I strongly support completion of the current platform accessibility projects (all of them on the Worcester Line), future investments in platform upgrades need to be done more strategically.
  • In particular, given the response of the FMCB to Mr. Sawers’ presentation — specifically, endorsing the idea of proceeding with a traditional procurement for EMU rolling stock — construction of high-level platforms on the remaining Providence and Fairmount Line stations needs to be prioritized, packaged as a single unit of design to control costs, and put out to bid ASAP, preferably in FY22, to ensure that these lines will be able to use standard rolling stock purchased in a competitive marketplace rather than bespoke trains with nonstandard multi-level boarding. Platform upgrades on other lines should be prioritized on a line-by-line basis, so that remaining diesel lines can be converted to remote door operation and the reprocurement of the operating contract can go to bid without the burden of unnecessary assistant conductors.
  • The placeholder commuter-rail project labeled “future fleet” should obviously be reprogrammed as an explicit EMU procurement. The General Court has made it quite clear that it is the policy of this Commonwealth to electrify the commuter rail network, using overhead catenary electrification and EMU rolling stock, and has authorized nearly a billion dollars in bond issuance over the next decade to put it into practice. It is time for the MBTA and MassDOT to get in line.

Project-specific comments:

  • P0170: station design for full access to both platforms should be advanced.
  • P0261: the description says “3rd track feasibility study” but other MBTA documents and presentations have implied that the third track was actually going to be progressed to design and eventual construction. Please clarify.
  • P0650 and others: since coach availability has been an issue recently, even during the period of pandemic-reduced schedules, I support continued lifetime extension and overhauls of legacy rolling stock to keep this equipment running while electrification is being pursued at the greatest practical speed.
  • P0863: strongly support construction of a south-side maintenance facility, but caution that the design needs to be able to accommodate articulated EMUs which are several times longer than legacy coaches, so as not to constrain rolling stock procurement.
  • P1009: what FTA compliance actions are these? For a $57mn program this needs to be spelled out explicitly.
  • P1011: GLX hasn’t even finished constructing the maintenance facility, and you’re already looking to spend $12m to modify it?

Weekend excursion: Stations of the B&M New Hampshire Main Line/MBTA Lowell Line

As I’ve neared the end of this series of posts, I’ve gotten a bit better at procrastinating, so most of the photos this post is based on (see the associated photo gallery) were taken a month ago now, and I’m drawing a lot on unreliable memory and aerial photos (and a bit of Wikipedia) to bring this together. It’s an interesting time to be writing about this line, for a number of reasons I’ll try to articulate.

As the title suggests, today’s Lowell Line was historically the Boston & Maine’s New Hampshire Main Line, with passenger service north through Nashua, Manchester, and Concord into the White Mountains and through Vermont to Montreal. In Chelmsford, north of Lowell, the line connects with the Stony Brook Railroad and becomes part of Pan Am’s freight main line from Western Massachusetts to Maine. There have been discussions on and off about re-extending commuter rail service to Nashua (where the historic B&M station apparently still stands) and even Manchester, but the discussions have always foundered on New Hampshire’s refusal to subsidize anything other than private automobiles. Recently, Amtrak released a map of possible service extensions which included service as far as Concord — Amtrak, unlike the MBTA, has both the legal right to operate on any railroad and a mandate to provide interstate service, and already operates Downeaster service along the line as far as Woburn.

As I described in more detail in my survey of the Western Route, the B&M planned in the 1950s to run all longer-distance services north of Boston via the NHML, with trains to Haverhill and Portland using the “Wildcat” Branch in Wilmington to access the northern part of the Western Route. This made a good amount of sense (and still does, which is why the Downeaster does so) because the NHML is the highest capacity line on the ex-B&M network, and the second-highest-capacity on the entire MBTA system: it’s the only North Side line with no single-track bottlenecks and no drawbridges other than at North Station; even slow diesel trains can maintain decent speeds because the stations are few in number and fairly widely spaced. All of the stations have platforms for both tracks, allowing bidirectional service without scheduling difficulties, although with the exception of the recently constructed Anderson RTC/Woburn station, they are all low-platform (all except West Medford and closed-for-demolition Winchester Center have mini-highs).

Which then brings me to the saga of Winchester Center, one of the two stations that got me started on this series of travels back in March. Winchester was scheduled for accessibility upgrades, with final design nearly complete and construction supposed to be put to bid in the second half of this year, when regular inspections early this year revealed safety issues with the old station’s platforms. Rather than perform emergency repairs, the MBTA chose to simply demolish the old station early, while commuter-rail ridership was low due to the pandemic, and remove the demolition from the scope of the reconstruction contract, reducing the cost and allowing construction to proceed more quickly. While I did not make it to Winchester Center in time to see the old platforms, I did get pictures of the demolition work in progress. (Not literally in progress, though, because I made my visit on Easter Sunday when no work was taking place.)

So with all that out of the way, let’s go station-by-station. With the historic stops in East Cambridge, Somerville, and Medford Hillside all long gone, the first stop on the modern Lowell Line is at West Medford. The station is located next to the West Medford post office (in fact the inbound shelter looks to be attached to the side of the building) and it is inaccessible, with only low-level platforms on both tracks. A few years ago, the MBTA’s system-wide accessibility program rated West Medford one of the highest priority stations to receive full accessibility upgrades, but I haven’t seen anything to indicate that this has been advanced in the capital program since then, not even as far as a 10% design. In the 2018 passenger counts, about 600 people a day used West Medford — which is pretty good for a commuter rail station but only the fourth-busiest suburban station on this line. Much of the station’s popularity can be explained by its assignment to the inner-core fare zone, zone 1A, so travel to North Station costs only as much as a subway fare and is much faster than taking the bus to Sullivan and then transferring to the Orange Line into town. (It will be interesting to see how the popularity of this stop changes when the Green Line Extension opens, since it will operate much more frequently and offer bus connections closer to West Medford.)

The route runs on a viaduct through much of Winchester; Wedgemere station is located near the south end of that viaduct, where the railroad crosses the Aberjona River at the north end of Upper Mystic Lake. In the middle of a wealthy residential neighborhood and without practical bus connections, Wedgemere gets a surprising amount of traffic compared to its 120-stall town-owned parking lot, about 300 riders in 2018. With the closure of Winchester Center station, only four tenths of a mile to the north, Wedgemere is currently the only station in the town of Winchester, but with much more development within walking distance, Winchester Center had about 50% more traffic.

North of Winchester Center, a long-abandoned branch once led to downtown Woburn, with the main line running through a largely industrial area on the east edge of the town, before crossing under Route 128 into a truck-oriented wasteland of industrial parks. At the Route 128 overpass, Mishawum station was formerly the primary station serving Woburn, located between two toxic-waste cleanup sites, “Wells G & H” and “Industri-plex”. It used to be accessible, and was upgraded with a ramp system on the inbound platform and mini-highs on both platforms before being abandoned in favor of a new station half a mile deeper into auto-dominated industrial-park hell. The former parking lot, shared with the Woburn Logan Express, has turned into a bank office building and a Dave and Buster’s. The station still stands, and still seems to be receiving some maintenance, but at some point in the last decade, the mini-high platforms were partially demolished to reuse the folding steel platform edge at another station. As a result, Mishawum is the only MBTA station to have been accessible, and then made inaccessible. As late as 2018, long after the new station was opened, Mishawum was (apparently illegally?) still being served as a flag stop by a handful of trains a day; the 2018 traffic counts (32 passengers a day) are the most recent mention of any kind I can find of it. The town of Woburn apparently wants to see service maintained at Mishawum, because as we shall see, its replacement is even farther from where any humans can be found without a steel exoskeleton, whereas there is a residential neighborhood not too far southwest of Mishawum. But it’s no longer shown on public schedules, and with the MBTA’s slow diesel trains it’s really too close to the new station to even be a flag stop. Even with electrification, a new station at Montvale Ave. or Salem St. would have a much larger catchment of Woburn residents and result in a more appropriate interstation.

The new station in question is of course Anderson Regional Transportation Center, which is an enormous ocean of parking, nearly 2,000 spaces, accessible only via an unwalkable car sewer with a direct exit off I-93, connected to a combination bus stop and train station, and owned and operated by Massport. Of course, it hardly matters that it’s unwalkable, because in the middle of this toxic waste site (the Woburn Industri-plex Superfund site) there’s nothing you’d want to walk to or from. For train facilities, the station has two overhead pedestrian bridges, one connecting the high-level center platform to the second floor of the station building, and the other, at the far northern end of the platform, connecting to the northwest edge of one of the enormous parking lots. In addition to the MBTA commuter trains, the Downeaster stops here, and presumably if the proposed Amtrak service to Concord ever gets off the ground, it would as well (and probably Lowell, too). When I visited, the parking lots were barren, and Logan Express bus service had been suspended due to the pandemic. Despite the horrible location, the station definitely got plenty of use, with more than 1,200 passengers a day in 2018. (One wonders how many of those passengers are actually driving down I-93 from New Hampshire.)

The next station north, Wilmington, is where the Wildcat Branch diverges to the north as the main line heads north-northwest. The turnout is located just north of the outbound platform, resulting in offset platforms. The single-track Wildcat only connects to the outbound track, but a universal crossover south of the station allows access to both tracks; passenger service using the Wildcat does not currently make a stop at Wilmington, so it matters little that the branch only serves one platform. There is a 200-space MBTA-owned parking lot on the east side of the tracks, but this is far too small to account for the average daily ridership of 575; there is also an apartment complex, “Metro at Wilmington Station”, at the south end of the inbound low-level platform.

There’s a long interstation, about 6 miles, between Wilmington and North Billerica, but the line runs through wooded, low-density areas nearly the entire length. Just south of North Billerica is the B&M’s former maintenance yard, now an industrial park called Iron Horse Park, a 553-acre Superfund toxic waste site, including numerous landfills and former waste lagoons, which are contaminated with a variety of solvents, heavy metals, asbestos, and pesticides. Iron Horse Park is in its 37th year of EPA-supervised cleanup, partially funded by the MBTA, which made the mistake of acquiring 150 acres of the property in the 1970s as it began the process of taking over the B&M’s commuter rail operations. The MBTA’s new backup rail operations center is being constructed in a less restricted part of the park.

I actually went to North Billerica station first, before heading down to Iron Horse Park. It’s another two-track station with low-level platforms and mini-highs, made slightly more interesting by its 19th-century station building (although it’s been extensively renovated, to the point that I had figured it was new-old-style rather than Actually Old when I visited). The station has two surface parking lots, operated by the Lowell RTA, with 540 spaces between them, and is also served by two LRTA bus routes, helping to explain its over 900 daily riders in the 2018 statistics. As the sun was setting, I did not make it all the way to Lowell on my initial trip, but returned a week later as part of a wrap-up trip that also included stops in Worcester, Lawrence, Rowley, and Newburyport.

At Lowell, I found LRTA’s exceedingly expensive and aggressively human-enforced parking garage, located over the rail line and next to LRTA’s central bus hub. Google Maps initially wanted to take me into the west garage entrance, which I found blocked with Jersey barriers, and when I found the entrance that was nominally open, I found that it was (unlike every other RTA garage) not equipped with automatic ticketing and payment systems, and the human who was supposed to sell me a ticket was not in their booth. I moved on, not wanting to spend $8 to park for 15 minutes, stopping to take a few quick pictures of the bus hub and the commuter-rail platform, but was chased away by an LRTA employee in a pickup truck. The platform here is a low-level center platform, between the westernmost pair of tracks, with a half-length high-level platform accessed from the 700-space garage, which is built across the tracks. The line quickly narrows to two tracks north of the station before crossing the Pawtucket Canal, and narrows to a single track at the wye with the Stony Brook. There is no layover facility on the Lowell Line, so trains entering and leaving service must do so at Boston Engine Terminal in Somerville; the lack of such a facility is one of the major constraints on increasing service on the line (because there is little room to store additional trainsets that would be required). In 2018, more than 1,500 people a day used the station.

That concludes the March–April run of MBTA station “weekend excursions”, but the project as a whole is far from complete: in addition to the new stations currently under construction (six stations of South Coast Rail, to open 2024; New Chelsea, opening later this year; New Natick Center and Central Falls–Pawtucket, opening next year) there still remain all of the stations that I avoided because they were in crowded areas and there’s still a pandemic on: Boston Landing, Lansdowne, Back Bay, Ruggles, Forest Hills, South Station, JFK/UMass, Quincy Center, Braintree, North Station, Malden Center, and Porter, plus the rest of the Fairmount Line and three stations in Rhode Island. In addition, Mansfield station, which I last saw while it was still under construction, fully opened in 2019. I’ll be fully vaccinated in a few days, and weather permitting, I still have plenty to do and see.


Automatic generation and validation of train schedules

Passenger railroads throughout the world have mechanisms to generate timetables for both service and capital planning purposes. The way I’ve done this in the past is with the Mk. 1 eyeball: you come up with a schedule, maybe draw some stringline diagrams to determine minimum separations, and then shift around the run times to ensure that there are enough resources at each crossing to allow for the desired schedule. Sometimes, of course, this doesn’t work, and it’s painstaking and laborious, and nearly impossible to answer questions like “What’s the minimum investment (in tracks and switches, or in increasing speeds) to allow for the schedule we want?” Actual railroads don’t do it this way, of course — their networks are much more complex, and they have more constraints than are necessarily obvious from aerial photography. They use software to validate their timetables, and in many cases will use linear-programming libraries to find the schedule that maximizes equipment utilization, minimizes capital investment, or optimizes some other criterion.

Last week, the MBTA Fitchburg Line schedules were announced for the spring rating. The Fitchburg Line has been under construction for the entire month of April, so with the majority of the line being bustituted, the MBTA and its contractors chose not to publish the spring schedule at the same time as the other lines. A co-worker of mine lives in West Concord; before COVID-19 she regularly took the Fitchburg Line for her daily commute, and we had an email exchange about how service could be improved over the one-train-per-hour schedule that has been introduced. This line is mostly double-tracked, except for short single-track segments in Fitchburg and Waltham which constrain the schedules that can be operated. I noted to my colleague that I didn’t know much about SAT solvers, but I thought it obvious (at least to a computer scientist) that this question could be represented as a satisfiability problem, for which there are lots of known techniques and libraries to perform the computation. (The boolean satisfiability problem restricted to clauses of three literals, called “3SAT”, is NP-complete and widely believed to be computationally intractable in the general case; someone who found a way to solve it efficiently would instantly win all of the major prizes in computer science and operations research. In the meantime, there’s been a lot of research into making solvers faster even though there is no known categorically efficient algorithm.)
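To make the "satisfiability" framing a little more concrete, here is a toy sketch in Python. It is not how a real SAT or SMT solver works (it simply tries every possibility), and the occupancy and buffer times are numbers I made up for illustration rather than anything measured on the Fitchburg Line. It asks: if outbound trains enter a single-track segment at minute 0 of every cycle, at which minutes can an inbound train enter without the two occupying the segment at the same time?

```python
OCCUPANCY = 7   # minutes a train holds the single-track segment (assumed)
BUFFER = 2      # minutes to clear and reset the route (assumed)

def conflicts(inbound_entry: int, headway: int) -> bool:
    """True if the inbound train collides with an outbound train entering at minute 0.

    Both directions repeat every `headway` minutes, so only the offset
    between the two entry times, taken modulo the headway, matters.
    """
    need = OCCUPANCY + BUFFER
    gap = inbound_entry % headway
    # Too close behind the outbound train, or too close ahead of the next one.
    return gap < need or headway - gap < need

def feasible_offsets(headway: int) -> list[int]:
    """Every inbound entry minute that satisfies the single-track constraint."""
    return [b for b in range(headway) if not conflicts(b, headway)]

print(feasible_offsets(30))
print(feasible_offsets(15))
```

With these made-up numbers, a 30-minute headway leaves the inbound train a window of entry minutes from 9 through 21, while a 15-minute headway leaves none at all: the constraint set is unsatisfiable, and the only fixes are a shorter occupancy time or more track.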

This question was bugging me over the course of the week, so I did a Google search to see if I could come up with some plausible code that might work. I first got distracted into looking at a constraint solver called “kiwi” (which is a reimplementation of an academic solver called “Cassowary” designed for use in responsive user-interface implementations), but found that it was too limited to even figure cycle times, which was my starting point (not even looking at single-track constraints like the Fitchburg or the Old Colony). I went back to Google and added some keywords to suggest that I was interested in how railroads use solvers, and got some different results.
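For reference, the cycle-time arithmetic I was starting from is simple enough to do in a few lines of Python. This is only a sketch of the usual back-of-the-envelope calculation; the run and turn times in the example are placeholders, not figures for any actual MBTA line.

```python
import math

def cycle_time(run_out: int, run_in: int, turn_outer: int, turn_city: int) -> int:
    """Minutes for one trainset to make a round trip and be ready to leave again."""
    return run_out + turn_outer + run_in + turn_city

def trainsets_needed(cycle: int, headway: int) -> int:
    """Trainsets required to hold a given headway: the cycle time rounded up to whole headways."""
    return math.ceil(cycle / headway)

# Placeholder numbers: 90 minutes each way, 15 minutes to turn at each end.
cycle = cycle_time(run_out=90, run_in=90, turn_outer=15, turn_city=15)
print(cycle, trainsets_needed(cycle, 60), trainsets_needed(cycle, 30))
```

In this example the cycle is 210 minutes, which takes four trainsets for hourly service and seven for half-hourly service. The single-track constraints are a separate matter that this calculation ignores entirely.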

For whatever reason, I landed on Y. Isobe, et al., “Automatic Generation of Train Timetables from Mesoscopic Railway Models by SMT-Solver”, published in IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences in 2019, for which the first author maintains a web page (probably what I first saw in the Google results, rather than the paper itself). The research was a collaboration between a government institute, AIST, and the East Japan Railway Company (better known as “JR-East”). “SMT” is an acronym for “Satisfiability Modulo Theories”, a generalization of the boolean satisfiability problem (plain old “SAT”) to include richer data types, such as sets and arrays, along with integers and real numbers, and relations such as set-inclusion and numerical inequalities. This sounded like not only the exact sort of thing to be looking at, but quite possibly someone had already done the (not inconsiderable) work of proving that it worked on a model of an actual railroad.

After skimming the paper the first time, I realized that I did not understand their railroad infrastructure model, or perhaps I do not understand how JR-East designs railroad infrastructure. I figured I’d work through the examples first, but immediately ran into snags (after first installing the prerequisites). It took an entire evening to figure out that the code was written for an older release of the Z3 SMT solver, and the current version of Z3 has a slightly different output format that needed adjustments in the parser. The code for the paper is written in OCaml, a language I do not know, so I had to figure out enough of the debugger to figure out where the parse was failing, and then learn enough OCaml syntax to figure out what the failing code was looking for and how to make it look for something different.

The examples, as it turns out, didn’t help. Once I fixed the parser, I could run the examples through the solver and it would generate the expected output, looking very much like the solutions shown in the paper. (I should maybe look more closely at the code, because it does a nice job of generating stringlines directly to PDF or SVG using OCaml bindings for the cairo graphics library, and maybe I could steal that even if I can’t make the solver work.) The difficulty was not with the solver itself, but trying to wrap my head around the way it models railroad infrastructure — what the authors call a “mesoscopic model”. (The term isn’t theirs, it’s cited to an earlier paper that I haven’t read.)

In this paper’s version of a “mesoscopic model”, there are two kinds of fixed objects: “links”, which are just given arbitrary unique numbers, and “structures”, which can be stations or interstations. There’s the first loose cobble to trip you up: you’d think a “link” would correspond to the track between stations — i.e., an interstation — but no, that’s a “structure”. Rereading section 2 of the paper doesn’t really clarify matters, but staring at the examples some more makes it clear that a “link” is more like the interlocking at a station throat. The structure model assumes that every track can connect to every other track at such a junction. My first attempt to model the Worcester Line — chosen because it’s the most familiar to me and has capital construction projects that I could model the effects of — foundered on this issue: while there are many universal crossovers on the newer part of the line, not every station has one, no station except Framingham has one at both ends, and when I looked at how train routings were specified, it’s a sequence of “links”, not stations, so I would have to explicitly specify which track each service pattern would use, which is one of the things I had expected the solver to figure out for me.
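Here is roughly how I ended up picturing the relationship between the two kinds of object, written as a Python sketch. To be clear, this is my own mental model and not the input format or data structures of the paper's tool: the class, the field names, and the idea that each structure sits between exactly two numbered links are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Structure:
    """A station or an interstation; hypothetical fields, not the tool's format."""
    name: str
    kind: str     # "station" or "interstation"
    tracks: int
    link_a: int   # numbered link (think: station throat) at one end
    link_b: int   # numbered link at the other end

# A made-up three-structure line: two stations joined by a single-track interstation.
LINE = [
    Structure("Station A", "station", 2, 1, 2),
    Structure("A-B", "interstation", 1, 2, 3),
    Structure("Station B", "station", 2, 3, 4),
]

def structures_on_route(route: list[int]) -> list[Structure]:
    """Resolve a routing, given as a sequence of link numbers, into structures."""
    by_ends = {frozenset((s.link_a, s.link_b)): s for s in LINE}
    return [by_ends[frozenset(pair)] for pair in zip(route, route[1:])]

print([s.name for s in structures_on_route([1, 2, 3, 4])])
print([s.name for s in structures_on_route([4, 3, 2, 1])])
```

The point of the sketch is the last function: a train's path is written in terms of the numbered junctions, and the stations and interstations fall out of that, which is the opposite of how I would naturally describe a service pattern.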

After looking again at the major worked example in the paper, which deals with sequencing local and Shinkansen trains on the single-track Tazawako Line, I figured that I should start with a single-track MBTA line, and perhaps things would become clearer. Even there I had trouble, because it seemed like, in the example, the trains could only pass each other at stations, and JR had built nearly every station with two or three platform tracks. (In fact, it was only just now, as I am writing this, that I looked at the Wikipedia article and saw that the paper’s authors had modeled two passing sidings on the line as stations! The solver doesn’t distinguish between the two types of “structure”, but the presentation in the paper and in the solver’s graphical output does. And oh, by the way, all of the comments in the example data are written in Japanese, which I can neither read nor display.) I figured the shortest and least complex of those would be the Greenbush Line, so I spent an evening panning through Google Maps using the “measure distance” tool to figure out how long the interstations were and how much of each was single- vs. double-tracked.

That got me into a more fundamental issue, which I freely admit to just fudging as a means of playing with the tool to see if I could get anything interesting out of it. All of the intervals given in the structure model are in units of time, not distance. This makes sense for stations, but when you want to apply it to tracks, you run into the issue that the time is going to depend on the ultimate routing: a train that approaches an interlocking with a green signal is going to pass through it faster than a train that has a restrictive signal, and there’s nothing in the model that allows you to tell the solver that. In fact, unless forced to look for a better schedule, the solver would often come up with bizarre delays for no obvious reason, because Z3 isn’t an optimizer: it finds a solution that satisfies the constraints, but not necessarily the best one. The higher-level timetable solver offers some manual knobs that allow you to force this, in particular a max_time parameter, but unfortunately it’s global. The basic assumption is that not only do you already know the route, but you have already modeled the rolling stock’s performance on the route and you know exactly how much time it is expected to take for all phases of the trip. (And probably also that you’re using equipment which has decent acceleration and braking performance in the first place, i.e., EMUs.)
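The difference between "satisfies the constraints" and "best" is easy to show with a toy in Python. Nothing here uses Z3 or the timetable tool; `any_solution` is a stand-in that deliberately returns a valid but wasteful answer, and the times are invented. The `max_time` argument only mimics the general idea of a global upper bound, not the tool's actual parameter.

```python
ARRIVE = 5          # minute the train reaches the station before the single track
TRAVERSE = 6        # minutes needed to run through the single-track segment
BLOCKED = (10, 18)  # opposing train holds the segment during [10, 18)
MAX_DWELL = 30      # longest dwell the search will consider

def satisfies(dwell: int, max_time: int) -> bool:
    """Check one candidate dwell against every constraint."""
    depart = ARRIVE + dwell
    clear = depart + TRAVERSE
    no_overlap = clear <= BLOCKED[0] or depart >= BLOCKED[1]
    return dwell >= 1 and no_overlap and clear <= max_time

def any_solution(max_time: int):
    """Return some feasible dwell, with no promise about which one."""
    # Searching from the longest dwell downward imitates a solver that
    # hands back a correct schedule containing a pointless delay.
    for dwell in range(MAX_DWELL, 0, -1):
        if satisfies(dwell, max_time):
            return dwell
    return None

def tighten(max_time: int):
    """Lower the bound until the problem is unsatisfiable; keep the last answer."""
    best = None
    while (dwell := any_solution(max_time)) is not None:
        best = dwell
        max_time = ARRIVE + dwell + TRAVERSE - 1
    return best

print(any_solution(60), tighten(60))
```

Left alone with a generous bound, the stand-in solver holds the train for 30 minutes; squeezing the bound until nothing satisfies it any more gets the dwell down to 13 minutes, which is the earliest the opposing train allows. With a single global bound this only works when there is one quantity to squeeze, which is exactly the limitation of a global knob.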

So rather than try to simulate an HSP46 hauling four bilevel coaches, without specific knowledge of track speed limits and minimum separations, I just reverse-engineered times from the published schedules. There is an object in the model which represents a single run of a train, and you can override the times specified in the structure part of the model if you have trains with different performance characteristics or stopping patterns, but I did not explore that aspect — if I did anything more with this solver, it would probably involve creating a higher-level language to describe lines and trains which could then output the enormously redundant input language of the timetable solver. (And at that point, I’d probably be close to ripping apart all of the OCaml bits and interfacing Z3 to my train acceleration model directly. I don’t think I’ll get there, because that’s a full-time job.)
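The reverse-engineering step amounts to subtracting adjacent timetable entries. A minimal Python sketch, using invented station names and times rather than anything copied from a published schedule:

```python
def minutes(hhmm: str) -> int:
    """Convert an 'H:MM' timetable entry to minutes after midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)

# Hypothetical stop times for one train (not from any real MBTA schedule).
TIMETABLE = [("Station A", "7:00"), ("Station B", "7:08"), ("Station C", "7:19")]

def run_times(stops):
    """Scheduled minutes between each pair of consecutive stops."""
    return [
        (a, b, minutes(time_b) - minutes(time_a))
        for (a, time_a), (b, time_b) in zip(stops, stops[1:])
    ]

for origin, destination, elapsed in run_times(TIMETABLE):
    print(origin, destination, elapsed)
```

Note that a difference taken this way lumps the dwell time at the intermediate station and any schedule padding in with the actual running time, so it is an upper bound on what the train needs rather than a performance figure.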

It was then that I learned, rather to my surprise, that the published schedules don’t work. Clearly they must, in reality, since the schedule is what Keolis actually operates, but the solver couldn’t satisfy the schedule as I had reverse-engineered it. Unfortunately, the solver can’t tell you what’s wrong: it just outputs “unsatisfied” and you have to hack away at the model description until you get something that works enough to generate a schedule, and then look at the graphical output to see where delays are being inserted to account for errors in the model. I did finally get my Greenbush Line model to work, although it didn’t end up looking entirely like the MBTA schedule it was based on.

Sometimes it’s not an error in the model; there are some significant limitations in the solver as well. The most important of these is that it can’t handle turning trains, whether on the platform (for a mid-route short turn) or at a stub-end terminal. I was able to make a model of the Fitchburg Line sort of work, by including Westminster Layover as a “station” and making all trains run through to Westminster, where it doesn’t matter if it’s an hour to the next run, rather than turning on the platform at Wachusett, but when I tried to model short turns at Littleton, there was simply no way to tell the solver “no, this specific train has ten minutes to turn around and must immediately head back whence it came, it can’t just sit there blocking the platform while three other trains go by”. I am certain that this requirement (and round trips in general) can be implemented in a solver like this, by adding additional constraints, but again, modifying the solver logic is a job for a professional (and someone who actually knows both OCaml and SMT solvers). The problem is equally significant at the city terminals: the model wants to have all trains arrive at platform 1 and depart from platform 2, and that’s a physical impossibility — just not one that you can encode in the model without explicitly assigning tracks to trains and manually generating separate “inbound 1”, “inbound 2”, etc. — which again is something I wasn’t interested in doing by hand.
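
For what it's worth, the turnaround requirement itself is easy to state as an inequality, even if wiring it into the solver is not. The sketch below only checks a finished schedule after the fact; it is not part of the solver, and the run numbers, the dictionary layout, and the 15-minute upper bound are my own assumptions (the ten-minute minimum is the figure from the Littleton example above).

```python
def turn_violations(turns, arrive, depart, min_turn=600, max_turn=900):
    """List the equipment turns whose layover falls outside [min_turn, max_turn].

    turns  -- (inbound_run, outbound_run) pairs worked by the same trainset
    arrive -- run name -> arrival time at the turn-back platform, in seconds
    depart -- run name -> departure time from that platform, in seconds
    max_turn is an assumed bound standing in for "must immediately head back".
    """
    bad = []
    for inbound, outbound in turns:
        layover = depart[outbound] - arrive[inbound]
        if not (min_turn <= layover <= max_turn):
            bad.append((inbound, outbound, layover))
    return bad

# Hypothetical run numbers and times, purely for illustration.
arrive = {"401": 3600}
depart = {"402": 4200, "404": 7200}
print(turn_violations([("401", "402")], arrive, depart))  # ten-minute turn
print(turn_violations([("401", "404")], arrive, depart))  # sits for an hour
```

In a real implementation the same inequality would presumably become an extra constraint on the solver's time variables rather than a check run afterwards.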

One thing that the solver can do is work with “periodic” schedules; i.e., those that are repeated the exact same way throughout the day at a specified interval. It can’t figure the period for you, but if you give it a period, it will add in the modular arithmetic in the SMT problem definition to ensure that multiple trips of the same set of trains don’t conflict with each other. This makes it easy to figure out how “clockface-compatible” a particular service model is: if it is still satisfiable with period = 3600, you can run each train hourly. If it works with period = 1800, you can run two per hour, and so on. (If it only works with period = 4500 then unfortunately you’ve got a service that repeats every 75 minutes, ugh.) This inspired me to look more closely at the Dorchester bottleneck on the Old Colony lines. All trains pass through Quincy Center, and nearly all trains stop at both Quincy Center and JFK/UMass (which was never intended to have a station, but it just barely turned out to be possible to squeeze one in east of the Red Line tracks). I could take my Greenbush model, delete everything south of Quincy Center, and just see directly what the capacity of that bottleneck was.
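
The period arithmetic, and the kind of modular check it implies, can be sketched in a few lines of Python. This is my own illustration rather than the solver's actual encoding: the function names are made up, and the occupancy intervals in the usage lines are hypothetical. The idea is that two trains which each occupy a shared track for some interval, and which both repeat every period seconds, conflict if those intervals overlap modulo the period.

```python
def headway_minutes(period_s):
    """Repetition period in seconds -> minutes between successive runs."""
    return period_s / 60

def trains_per_hour(period_s):
    """Repetition period in seconds -> runs of each train per hour."""
    return 3600 / period_s

def periodic_conflict(a, b, period):
    """True if two occupancies of one track overlap once both repeat.

    a and b are (start, end) times in seconds, treated as half-open
    intervals, and each is repeated every `period` seconds.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    a_len, b_len = a_end - a_start, b_end - b_start
    if a_len > period or b_len > period:
        return True  # longer than the period: collides with its own repeat
    # b starts while a is still there, or a starts while b is still there.
    return ((b_start - a_start) % period < a_len
            or (a_start - b_start) % period < b_len)

for period in (3600, 1800, 4500, 720, 600):
    print(period, headway_minutes(period), trains_per_hour(period))

# Hypothetical: two trains each need a single-track section for 300 seconds.
print(periodic_conflict((0, 300), (300, 600), 720))
print(periodic_conflict((0, 300), (300, 600), 500))
```

With a 720-second period the two occupancies fit; shrink the period to 500 seconds and the next repetition of the first train arrives while the second is still on the single track.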

Stringline diagram showing maximal Old Colony service

This stringline shows a solution for service every 12 minutes between Quincy Center and South Station. Note that the actual trip takes 20 minutes (1200 seconds) in one direction but only 18 minutes the other way.

By fiddling with the period parameter, I was able to get five trains per hour (12-minute headways, which is to say, a 720-second repetition period) to work with the existing model based on slow diesel trains. I then took Alon Levy’s simulation of trip times with a modern EMU instead of antiquated diesel trains, and made an interesting discovery: although modern equipment cuts the travel time in half (to 10 minutes from 20), it doesn’t help with the bottleneck: a repetition period of 600 seconds (10-minute headways, 6 trains per hour) doesn’t work. But, if you could somehow double the Dorchester single-track, then look what’s possible:

Stringline showing 10 trains per hour with Dorchester double-tracking

This stringline shows what happens if you increase frequency to 10 trains per hour by twinning the single-track south of JFK/UMass station.

That (extremely expensive) intervention doubles the capacity of the line, opening the prospect for frequent service as far as Brockton and South Weymouth. At ten trains per hour, you could have service every 15 minutes to Brockton and every half hour to Kingston and Greenbush (assuming other bottlenecks along those lines were relieved — I haven’t checked that the schedule would work because I haven’t actually encoded the other Old Colony branches). This does get into another issue with the railway solver: the period parameter should really be an attribute of the train, not global, because you’d like to jointly solve multiple services with different frequencies that all share a common bottleneck without laboriously open-coding the whole pattern of repetition. (Ideally, you’d like to be able to solve the entire network as a single model!)
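
As a rough sketch of what per-train periods would buy you, the Python below expands services with different periods over their common repeat cycle and reports the tightest gap between consecutive trains at a shared bottleneck. Everything here is assumed for illustration: the offsets are invented, the function names are mine, and it only checks departure spacing at one point rather than solving anything. The 360-second figure is just 3600 divided by the ten trains per hour discussed above.

```python
from math import lcm  # multi-argument lcm needs Python 3.9 or later

def expand(services):
    """Expand {name: (offset_s, period_s)} over the least common multiple
    of all the periods; return (cycle_length, sorted (time, name) events)."""
    cycle = lcm(*(period for _, period in services.values()))
    events = []
    for name, (offset, period) in services.items():
        for k in range(cycle // period):
            events.append(((offset + k * period) % cycle, name))
    return cycle, sorted(events)

def min_separation(services):
    """Smallest gap in seconds between consecutive trains, wrapping around."""
    cycle, events = expand(services)
    times = [t for t, _ in events]
    if len(times) == 1:
        return cycle
    return min((times[(i + 1) % len(times)] - t) % cycle
               for i, t in enumerate(times))

# Invented offsets: Brockton every 15 minutes, the other two half-hourly.
services = {
    "Brockton": (0, 900),
    "Kingston": (360, 1800),
    "Greenbush": (1260, 1800),
}
print(expand(services))
print(min_separation(services))  # compare against 3600 / 10 = 360 seconds
```

With these particular offsets the tightest gap is 360 seconds, so the pattern would fit a ten-train-per-hour bottleneck; moving Greenbush to offset 900 would put it on top of a Brockton train and the separation would drop to zero.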

Anyway, for those who are interested, you can see my version of the original “RW-Solver” code on my GitHub fork, which includes some of the kluged model files I’ve been playing with.
