SREcon Americas 2023, a report

Attention conservation notice: 6,200 words about conferences you didn’t attend and idiosyncratic constraints that make it unlikely I will attend again any time soon. My apologies that this report is so long, but it would take much too long to make it shorter. If you’re just looking for a summary of the actual program content, skip to the third section, no hard feelings.

The week of March 20–24 I attended a professional conference for the first time since the fall of 2019, Usenix’s SREcon Americas 2023. I am writing this report for several different audiences, so depending on which one you are in, you may find parts of it redundant or uninteresting; apologies in advance. I’m going to start with the background (my professional background as well as the tech-industry trends that gave rise to this conference), then talk a little about my travel experience and the venue itself, before moving on to a discussion of the conference program.

Background

For those who don’t know me, or who have only ever glanced at my social media, you might not know what I actually do. My day job is at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), the largest interdisciplinary research laboratory at MIT, with about 110 Principal Investigators (faculty and research scientists) and over 1,000 active members and affiliates. I work in The Infrastructure Group, a team of about a dozen people which is responsible for providing computing, storage, and networking infrastructure to the Lab. This is an unusual situation: most university departments, labs, and centers are simply not large enough to afford a support group of this size, and many have no shared computing platform other than what is provided by the university’s central IT group, if even that. For the last 26 years, I have run the network for CSAIL (and its predecessor, the Laboratory for Computer Science), a stark contrast to people in the tech industry who might expect to spend 26 months at a single employer.

“I run the network” is what I usually say, without elaboration, when asked what I do. That isn’t particularly complete, but actually explaining what that means isn’t usually required or wanted; years ago, I found that many people had a misapprehension that I was involved somehow in running Microsoft Windows servers, but as ordinary people have become more distanced from operating systems and server technologies, that’s less of an issue. What I actually think of as my primary job is literally running the network: specifying and configuring the routers, switches, firewalls, and wireless access points that provide Internet access in CSAIL’s physical building and remote data center site. That additionally means I handle our relationship with the central IT group as far as our connection to campus and to the outside world goes, but it also means I’m responsible for a bunch of servers that provide fundamental network services: DNS servers, DHCP servers, the provisioning database from which they get their data, and the antique artisanally crafted Perl scripts that provide a self-service interface to those services. I also run our network authentication service, our user account management system (of which I am also the principal developer), and one of our three network storage platforms. As the longest-tenured staff member in our group, I also help out with the budgeting and try to serve as a repository of our institutional memory.

For most of the years from 1998 through 2019, I attended Usenix’s premier winter conference, known as “LISA”, an acronym for “Large Installation System Administration” — the origin story says that when asked to define “large” for the first LISA conference back in the late 1980s, the program chair said “at least five computers”. Back then, advanced computers of the sort that Usenix members cared about, running the Unix operating system, were still thin on the ground; a company might have one or two large VAX minicomputers, with a bunch of terminals or a modem pool for remote access, but Digital and other companies like Sun, Hewlett-Packard, Apollo, and even IBM were selling companies, government agencies, and universities numerous “engineering workstations” designed to run Unix and to be connected together in a local-area network. LISA was started to help develop a vendor-neutral community and good practices around administering networks of many Unix systems. By the late 1980s, with the introduction of the 32-bit Intel 386 processor, it became practical to use regular (but high end) desktop computers to run Unix and perform many of the functions of these much more expensive workstations, and in a few short years, with the appearance of the free-to-use and free-to-copy 386BSD and Linux operating systems, inexpensive PCs came to dominate the Unix workstation and eventually server market, aside from a few application areas like high-performance storage or graphics that required specialized hardware.

This shift enabled the “Dot-Com Boom” of the late 1990s: all of the major web servers were developed on and for Unix systems; the databases and storage systems that were required for the first generation of e-commerce sites ran on Unix; and there was a huge boom in both the number of companies that had big networks of Unix systems and the number of people employed in administering those systems. There was a clear need for better system administration practices, something that would allow administration to scale, especially as new startups like Hotmail demonstrated the feasibility of serving millions of simultaneous users on thousands of small rack-mounted PCs rather than a smaller number of much larger and more expensive computers. (Hotmail at this time was one of the biggest such operations — when I saw their data center, Google was still a small startup — and rather than buying servers in cases, Hotmail mounted bare motherboards directly to sliding metal shelves. Hotmail was also at the time the largest public user of the FreeBSD operating system, of which I was then a developer.)

LISA boomed with the Internet boom, adding workshops and a multi-day program of tutorials; eventually the conference program expanded to five tracks with nearly 2,000 attendees. LISA survived the “Dot-Com Crash” following the first boom, I think mainly thanks to having hotel contracts already signed three years in advance, and became one of the few places where people engineering web services and people operating traditional data centers and engineering computing systems would regularly meet and interact.

Then in 2006, Amazon launched what became known as “cloud computing”: rather than owning their own servers, network equipment, and storage, businesses — especially web sites — could simply rent them from Amazon. This was initially less of an engineering shift (though it later became one) and more of a finance play: just as a business will often prefer to lease an aircraft or a warehouse or a retail storefront, even though they will have to pay a premium compared to the cash cost of just buying the property, with cloud computing you can rent a server by the hour instead of having to pay for it all up front, and as with those other examples, the business doesn’t have to carry those depreciating assets on its books and doesn’t have to hold reserves for their ultimate replacement — what both Wall Street and Silicon Valley investors consider a more “efficient use of capital”. Amazon would own all the servers, would own the data center, would pay for power and cooling and network connectivity, and would allow you to change how much you use almost instantly to match your actual business needs — they took away the risk of buying more servers, more disks, or a bigger network connection than your business actually required. This was great for start-ups, because they could start out small and buy more service as and when customer revenue demanded. It was also very attractive to many large enterprises, who could outsource a “non-core” part of their business, especially if they were facing a major facility upgrade or relocation, freeing up the capital (and real estate) that would have gone into a new data center to be used for something else.

For technology companies, the calculus was different: they were competing in the marketplace for talented engineers and developers and had to offer very generous pay, benefits, and often equity compensation. There was a business imperative to optimize their productivity, by building systems and practices that would allow them to make changes on the level of an entire data center as easily as (or even more easily than) they could on their own laptop. A number of companies, most notably Google (we’ll get back to them shortly), invested heavily in developing their own services and platforms for internal use to increase developer productivity. Many of these were implemented as “web services”, or what now are mistakenly called “APIs”: services that speak web protocols to exchange small blobs of data representing requests and responses, which don’t care whether the client is a browser, a stand-alone application, an embedded device, or another web service. Still higher-level services were needed to “orchestrate” these arrays of services, to make sure that all of the required services are running, to deploy new instances and terminate old ones when the code gets updated, to increase the processing capacity when things get busy and shut idle servers down when no longer needed.

This resulted in a clear bifurcation of the LISA audience and the LISA program. There were still a good number of us in attendance who represented the old audience of education, enterprise IT, research, and government, but we were vastly outnumbered by the people working either for tech companies or for the operations groups of banks, media companies, and large industrial concerns — all of which had their own large-scale application development groups and were trying to support more and faster development with fewer and fewer operators. I can recall as early as 2010 noticing that the LISA program was less interesting and had less “business value” for me than it had ten years previously — I would not have been invited in 2010 to give the talk I did give in 2005, about CSAIL’s move to our then-new building. I got more value from the social side of the conference (the “hallway track” and the Birds-of-a-Feather sessions) than I did from the actual conference sessions. I still kept on going, because where else was I going to go?

This bifurcation was made particularly obvious by the introduction of a new term, and arguably a new concept, in the system administration literature: “DevOps”. This was the idea, which was quite radical at the time, that application developers — or at least, application development teams — ought to operate their own infrastructure, not just for testing but the actual public web services that users of their web site interact with. This was done by treating the underlying infrastructure that these services run on as yet another component of a software system that could be modified programmatically, making operations more like programming (and as a side effect, deprofessionalizing the actual maintenance and operation of the real underlying physical servers, which was now outsourced). This seems like a reasonable approach (although many developers objected) if you’re stuck in the tech mindset of “more, faster, cheaper”, but for us? We don’t have “developers”! (Well, we have a developer, for internal applications, supporting our thousand-person user base.) Another way of looking at it is that we have eight hundred “developers”, but they’re called “graduate students” and they are working individually on eight hundred individual products called “Ph.D. Theses”. There is no shared objective or profit motive, and unless they crash out, we have to live with them, and they with us, for six to eight years, during which time they would like us to please not rip the guts out from under their research, thank you very much.

Google was in a particularly influential position, as a major sponsor of LISA and Usenix conferences generally, as the employer of a large minority of LISA attendees and invited speakers, and of course as the operator of both a public cloud computing infrastructure and numerous internal platforms that support the vast reach of its services. Internally, Google had been developing a set of organizational structures and practices that came to be called “site reliability engineering” or SRE. This garnered almost as much buzz as DevOps (because if Google is doing it…). In 2014, Usenix launched a new conference, called SREcon, and two years later, O’Reilly Media published Site Reliability Engineering: How Google Runs Production Systems, which I am told has been O’Reilly’s best-selling book title for seven years straight. (Looking at O’Reilly’s web site you wouldn’t even know they were a publisher of actual ink-on-dead-tree books; I had to go to Amazon to look up the title!) SREcon was so successful that Usenix ended up running it three times a year, on different continents, something they had never done before. This drew a lot of the audience away from LISA, and with it, talk and paper submissions declined precipitously (the refereed paper track was abolished shortly thereafter), and so did the sponsorship that made LISA financially practical for Usenix as an organization.

I continued to attend LISA, as did several other people I had come to know over the years, because SREcon was clearly not pitched toward our professional needs or line of business. (I often wondered in those years if anyone attended SREcon who didn’t already work for Google.) On occasion, when LISA would be held in Boston — Usenix had a policy of regularly rotating between eastern and western North America — I could use my travel budget to attend another conference, since the airfare and lodging were the most significant expenses, rather than the registration fee.

I last attended LISA in 2019, when it was at the Hyatt Regency San Francisco Embarcadero. That was the last ever in-person LISA conference, as it turned out: LISA 2020 was canceled due to the COVID-19 pandemic, and LISA’21 — shifted to the summer rather than LISA’s traditional late-autumn slot — was a virtual-only event that I didn’t feel the need to watch. After LISA’21 concluded, Usenix decided to end the conference. (A retrospective was published in the Usenix magazine ;login: in August, 2021.) By that time, I had really been attending LISA for the people more than for the technical content, and I saw little reason to expect to travel to any conference again.

So why am I writing this? Why did I think it was worth attending a conference that was very much not targeted at me?

Travel and venue

Some time in mid-February, I was sitting alone in my office, and I had opened up my phone for some reason and randomly looked at Twitter. I stumbled across a tweet from @usenix advertising the SREcon Americas 2023 program. For whatever reason, rather than ignoring it, I clicked on the link and scrolled through the listings. I think I copied and pasted a quote from one of the talk descriptions to our work Slack, because it sounded interesting, and mused about possibly going — since I hadn’t done any work travel for three years it seemed like an opportune moment. Once given the go-ahead, I started looking into the mechanics of travel.

Before the pandemic, I had an MIT Travel Card, which meant I did not have to front the Institute thousands of dollars for conference registration, airfares, and lodging. The program said I had until February 28 to get the “early bird” registration and reserve a room in the conference hotel — but when I went to Concur to try to book a flight, it failed with an odd error message. It turned out that, since I had not traveled in three years, the MIT Travel Office had decided that my (unexpired) card had been lost and canceled it. I would have two weeks to try to get a new travel card, during which time I could reconsider whether I still wanted to go or not (again, I wasn’t going to put it on my own card, for which the bill would need to be paid before I would even be eligible for reimbursement). It took a while to find the right contact address for the Travel Office (during which time I had other more important things on my plate), but they confirmed that the card had been canceled due to inactivity and they’d have to order a new one. (Unbeknownst to me, registrations were running below target and Usenix had extended the “early bird” pricing and hotel block until March 3, but this didn’t enter into my calculus at all.)

SREcon Americas is slotted into the Usenix calendar for the end of March. Normally, I would not consider going to a conference near the end of March, because the World Figure Skating Championships are held every year during the last or next-to-last week of March. This year, however, that event was held in Japan, and I won’t fly trans-Pacific, so I unusually had no conflicting travel plans. (The next three years will be in Montreal, here in Boston, and Prague, so there’s no chance I’ll attend SREcon again or do any other business travel in March before 2027 at the earliest.)

One factor that had me reconsidering while waiting for my travel card to be reissued was the venue. This conference was being held at the Hyatt Regency Santa Clara and the adjacent Santa Clara Convention Center, a facility that I remembered from having previously attended FAST there. It is a dreadful location, at the intersection of two traffic sewers in the middle of low-rise Silicon Valley sprawl. The principal value in having a conference there, at least in the Before Times, is that more than half of the attendees would be driving there anyway. (Transit access is about what you’d expect from Santa Clara, and although the VTA Light Rail has a stop within walking distance of the hotel, it doesn’t go to either airport, and in fact there is absolutely nothing of any value within walking distance; when I was there for FAST, I had to walk half a mile to the nearest sandwich shop for lunch, but for SREcon the sponsors paid for lunch every day.) On my trip to FAST, I had flown into SFO and taken a shuttle van ride down to Santa Clara, but this time I did not want to spend so much time sitting in traffic, so I decided to stick to San Jose flights instead.

My replacement travel card arrived on February 28th, so after trudging through the rain to pick it up at the Travel Office, I set to booking. I first hit the conference registration, still not noticing that the deadline had been extended (I could have saved a few dollars if I had noticed, because there was a discount available for educational institutions had I been willing to wait for an email round-trip). I booked the (extremely expensive) hotel next, and then opened Concur to retry the flight search. Searching by schedule, all of the options looked terrible and expensive, but before I booked the least-bad expensive flight option, Concur asked me if I had considered this other itinerary that was only half the cost. It was actually better by most standards than anything I had seen, except that on the return I would have to get up at 6 AM and endure a four-hour layover in Denver. (This was still better because it didn’t involve flying hundreds of miles out of the way to make a connection in Los Angeles, Seattle, or Houston! I’m pretty sure that there is at least one direct round trip between Boston and San Jose but I didn’t find it — maybe the return flight is a red-eye?)

I ended up flying on United, as you can tell from the Denver connection, except for the outbound DEN–SJC sector, which was on a United Express RJ. (I think a flying time of over two hours on a tiny CRJ700 with a single lav and inadequate overhead space is a bit much.) MIT’s policy with respect to getting a tolerable seat, or indeed to getting a seat that comes with a full carry-on allowance on United, is unfortunately unclear and apparently left up to individual departments to figure out for themselves. As a result, I ended up spending $250 out-of-pocket to get better seats and priority boarding, and I have no idea whether CSAIL will reimburse it. The equipment on the other sectors was a B737-800 outbound, and an A320 then a B737-900 on the return; only the initial BOS–DEN leg had any in-flight entertainment, leaving me to greatly miss the moving map on the other three legs.

I’ll mention at this point that starting in late February and continuing to the present, I have been suffering from some sort of condition — by the time you read this I’ll have finally seen the orthopedist, who was scheduling three weeks out — that makes sitting in a chair for any length of time quite painful, especially in the morning: not exactly the ideal conditions to fly for six hours, then sit in a chair at a conference venue for eight hours a day, then fly back home for another six hours (plus four hours in Denver airport). Of course, there’s also still a pandemic going on, but I saw very little evidence that anyone around me was paying attention: I was the only person in sight wearing a mask, whether in the airport, on the plane, or at the hotel. (I’ll give Usenix credit for including information about the hotel’s air filtration in the program, but aside from eating and drinking I kept my mask on whenever in a public space for the duration of the trip.)

Conference program

I told several people that I was justifying my travel as “professional development”: even if nothing at SREcon was remotely connected to what I actually do for my employer, it would help me keep up to date with what is going on in the tech industry, because many trends (like cloud storage, infrastructure-as-code, continuous integration/deployment, containers, “serverless”, and “microservices”) that start out there ultimately become something we have to deal with (or work around).

The conference program was fairly neatly divided into two tracks, numbered rather than named: track 1 was more “social” and track 2 was more “technical”, and I spent most of my time in track 1. The opening plenaries set the theme (historically you’d call them “keynotes” for that reason, but “keynote” in tech conferences now seems to mean some Big Name Speaker With Something To Sell Who Nonetheless Got Paid To Speak); the phrase “sociotechnical systems” was repeated frequently throughout the program, emphasizing that this conference was not just about software but also about the organizational structures that produce it and make it operational as A Thing You Can Pay For (and that’s as reliable as your Internet connection). I had to learn a bit of new jargon to understand some of the presenters, like “toil” (which is apparently what they call “actually typing stuff in a shell” now), “SLO” (“service level objective”), and “LFI” (an initialism for “learning from incidents”). Track 1 was in fact quite heavy on incident response, incident analysis, and retrospectives, which was more interesting than I expected it to be, although still short on take-home value given the size of our organization.

After the plenaries and a mid-morning snack (I had gotten up too late for the free breakfast and needed to take my medication with food), the first session I attended was about a content-delivery-network failure at Facebook, caused by a testing scenario. In this case, the CDN was ringing with requests for images, as image encoding servers were marked down, their backup servers became overloaded, the primary servers recovered and were marked up, the backup servers failed their health checks, and so on. The solution the engineers finally found was to tell their load balancers to pretend that all servers were healthy — sure, some servers were overloaded and wouldn’t be able to answer requests, but this would allow the load to be spread out evenly across the network, stopping the oscillation and eventually making the pretense true.
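To make the mitigation concrete, here is a minimal sketch of my own (not anything the presenters showed; the passed_last_health_check attribute is an invented placeholder): a load balancer told to ignore health checks keeps every backend in the eligible pool, trading some failed requests on overloaded servers for an even spread of load that lets the oscillation die out.

    def pick_backends(backends, force_all_healthy=False):
        """Return the set of backends the load balancer will send traffic to."""
        if force_all_healthy:
            # Pretend everything is healthy: some requests will land on
            # overloaded servers and fail, but no subset of servers gets a
            # thundering herd, so the ringing stops.
            return list(backends)
        healthy = [b for b in backends if b.passed_last_health_check]
        # If nothing looks healthy, fall back to the full pool rather than
        # dropping all traffic.
        return healthy or list(backends)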

I moved over to track 2 for the next talk, because I had a personal interest in the subject: how one Mastodon instance (hachyderm.io) handled the sudden influx of tens of thousands of new users who were leaving Twitter in response to its new Main Character For Life. Hachyderm had been serving a few thousand users on a single server in someone’s basement, but scaling to 30,000 users required significantly more resources — particularly since the all-volunteer operations team was committed to doing as much as possible in public and without depending on proprietary value-added services. The original server was also, as it turned out, failing. And oh, by the way, Mastodon is an inconsistently documented Ruby on Rails monolith. The Hachyderm team wanted to avoid putting all their eggs in one basket as far as both physical location and infrastructure providers go, which led to some interesting decisions, like running in three different clouds, and running NFS over a transatlantic network connection. (As someone who runs NFS over a wide-area network within a single small state, I would not recommend that! They eventually moved all of the media files from NFS to an object store.) The astonishing thing was that they made this work with all volunteer labor, with an average (per the presenters) of only two hours per volunteer per week.

Lunch each day was in what used to be called the “vendor expo” but is apparently now the “sponsor showcase”. I had never heard of the vast majority of the companies exhibiting; when I mentioned this to Tom Limoncelli in a later chat, he suggested that the majority of the companies hadn’t even existed three years ago, before the pandemic. I didn’t look closely at most of the booths, but I’m given to understand a large fraction of them were actually selling products to ease incident response; another bunch were selling metrics products, either tools to collect more metrics or services to handle all of that data collection for you. The first day it was pouring rain, so I was glad enough to take advantage of the free lunch rather than walking half a mile to Togo’s for a sandwich.

After lunch on Tuesday, I was back in track 1 with a talk from two Datadog engineers about how, for once, it wasn’t DNS. This was actually one of the more technical talks in track 1, getting into the weeds over how a little oddity in their internal metrics after redeploying a Kubernetes pod led them through DNS down into the Amazon Elastic Network Adapter metrics and eventually to a gRPC configuration problem that was causing their DNS resolvers to get flooded. It was a great detective story and I’d recommend going back to watch the video once it’s uploaded.

That was followed by Nick Travaglini’s talk about how a test of a NORAD computer system almost started World War III in 1979 — thankfully averted because some eagle-eyed staffers noticed that the timestamp on the supposed “Soviet attack” was off. This was not original research, in the sense of academic historians — everything in the talk was taken from open media and Congressional reports — but a story well told and unfamiliar even to many of us who lived through it.

The day’s final session focused on the actual practice of incident response. First, two people from jeli.io discussed the difference between being an incident commander and being an incident analyst from the perspective of having performed both jobs. Both deprecated the term “incident commander” — a collocation taken from the military-style hierarchies of rank that are the norm in the public-safety field whence the concept derives — and preferred “incident coordinator”. They noted that actually managing incident response in a large team calls for a very different set of skills compared to analyzing the event post facto, conducting interviews and retrospectives; some people might be better suited for one role over the other, and asking people to do both is a recipe for burnout. This was followed by Chad Todd from CrowdStrike, who discussed his Lund University master’s thesis research about effective practices for handovers between incident responders during long-running incidents, expanding on a literature that had heretofore largely focused on the healthcare industry (e.g., nursing shift changes).

Wednesday morning started with an actual breakfast, to my surprise: the program said “Continental Breakfast”, but when I got down to the meeting hall, I found chafing dishes heaped high with scrambled eggs and roasted new potatoes, as well as crispy bacon and the expected pastries and beverages. I was utterly uninterested in either of the first talks of the day, and I’m not even sure what I actually did. I perked up for the second talk in track 2, although I was misled by the title and failed to read the abstract — “high priority” and “happy queues” primed me to think of overbearing VIPs and ticketing systems, rather than the actual subject, correctly specifying the queues in a job queuing system. (The key, as it turns out, is that the queues should be distinguished not by programmers’ subjective notion of “priority”, but by how soon the job must be completed to meet a customer’s service level expectation.)
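To make that concrete, here is a minimal sketch of the idea as I understood it; the queue names and deadlines are mine, not the speaker’s. Each queue is named for the latency promise it carries, and a job is routed by its deadline rather than by anyone’s gut feeling about importance.

    # Queues named for how quickly their jobs must complete, smallest first.
    QUEUE_DEADLINES_SECONDS = {
        "within_30_seconds": 30,
        "within_5_minutes": 300,
        "within_1_hour": 3600,
        "within_1_day": 86400,
    }

    def queue_for(max_latency_seconds: int) -> str:
        """Pick the tightest queue whose promise still covers the job's deadline."""
        for name, deadline in sorted(QUEUE_DEADLINES_SECONDS.items(),
                                     key=lambda item: item[1]):
            if max_latency_seconds <= deadline:
                return name
        return "within_1_day"  # anything slower still goes in the slowest queue

    # For example, a password-reset email that must arrive within five minutes:
    # queue_for(300) -> "within_5_minutes"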

I bailed entirely on the second session of the day, and spent the time watching the previous day’s figure skating from Japan, then headed back down for free lunch. The first talk, by Lorin Hochstein, is very difficult to summarize — and reviewing the slides didn’t help — but he titled it “Why This Stuff Is Hard”, which gives you a flavor of the level of abstraction it’s pitched at. (It was a good talk for after lunch, talking about big concepts rather than in the weeds, but you’ll need to wait for the video.) After that, I switched back over to track 2 to learn how the Wikimedia Foundation uses Network Error Logging to learn about users’ connectivity problems directly from their web browsers, often in close to real time. It was an interesting idea, and I was quite surprised to learn that many users are able to report connectivity errors when they occur, over the same (presumably unreliable) network. Although an official W3C specification, NEL is only implemented in Chromium-based browsers; Firefox has thus far resisted implementation, rightly considering it to have significant privacy implications.
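For the curious, a site opts in to NEL with a pair of HTTP response headers; the sketch below shows roughly what they look like, with a made-up group name and reporting endpoint rather than Wikimedia’s actual configuration. Report-To declares where reports should be sent, and NEL asks the browser to deliver network-error reports to that group.

    # Illustrative NEL opt-in headers, written as a Python dict for convenience.
    NEL_OPT_IN_HEADERS = {
        "Report-To": ('{"group": "network-errors", "max_age": 86400, '
                      '"endpoints": [{"url": "https://reports.example.org/nel"}]}'),
        "NEL": '{"report_to": "network-errors", "max_age": 86400}',
    }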

For Wednesday’s final session, it was more incidents: a short talk about actually doing some statistical analysis on a large corpus of after-incident reports, and a longer talk about practices and skills that incident responders who are not incident commanders (same language caveat) can develop to help their teams work more effectively. This was followed by a conference banquet (again paid for by sponsors) which was loud and confusing: there was apparently Real Food that I did not find until I noped out of the noise and excessive crowding and tried to find the most convenient exit. (I had three small plates of hors d’œuvres, and each time had to sit at a different table because my seat was cleared and taken by someone else before I made it back from the catering stations — but I probably would have been happier just getting a sandwich from the hotel cafe and eating in my room.) I did not attend the Lightning Talks; indeed, I did not even see them on the schedule, and in any case if I didn’t finish watching the figure skating that night, NBC would take it away. (Literally, the replays were only available for two days!)

Thursday morning began with an audience-participation talk analogizing incident response to musical improvisation, which I honestly did not pay much attention to as I was busy reading my work mail for nearly the entire talk. This was followed by a talk that I can’t make sense of from the published abstract, but that I remember as being interesting and researchy; you’ll have to wait for the video and judge for yourself.

After the mid-morning coffee break, the second session was again divided into two short talks and one full-length talk. Austin Parker’s “The Revolution Will Not Be Terraformed” was the most direct attack on the deprofessionalization of operations in the entire conference, taking lessons from Kropotkin and Bookchin to encourage the audience to resist commoditization of their field. This was unfortunately followed by a completely useless (but at least short) talk from a bank in Singapore which repeated DevOps and SRE bromides that even I was thoroughly familiar with. The long talk of the session was from a Shopify engineer talking about both failed and successful interventions to reduce cloud resource consumption in the face of budget tightening. The biggest takeaway from this seemed to be that they were wasting a lot of money on fragmentation, the capacity they were paying for but couldn’t use because the resource requirements of their services were not integral divisors of the sizes of their cloud provider’s virtual machine instances. (Yet again this seems like a recapitulation of a long-known issue people would learn about in an undergraduate CS curriculum.)
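As a back-of-the-envelope illustration of the fragmentation problem (the numbers here are invented for illustration, not Shopify’s): if every pod of a service asks for 6 vCPUs, only two of them fit on a 16-vCPU instance, so a quarter of the machine is paid for but unusable, while a pod size that divides evenly wastes nothing.

    def wasted_fraction(instance_vcpus: int, pod_vcpus: int) -> float:
        """Fraction of an instance left stranded by identically-sized pods."""
        pods_per_instance = instance_vcpus // pod_vcpus
        return 1 - (pods_per_instance * pod_vcpus) / instance_vcpus

    print(wasted_fraction(16, 6))  # 0.25 -- 4 of 16 vCPUs paid for but unusable
    print(wasted_fraction(16, 4))  # 0.0  -- an even divisor wastes nothing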

I don’t remember anything at all about Thursday’s after-lunch session other than that I was there, and the slides aren’t up yet to tickle my memory. I may have bailed on the second talk in the session; neither track’s abstracts sound like something I would have been interested in.

Like the opening plenary session on Tuesday morning, Thursday’s closing plenary was split into two 45-minute talks. In the first talk, two engineers at a bank walked through the process of developing service-level objectives that their management and developer teams would actually believe meant something, which took some trial and error: they had to find metrics that actually correlated well with already accepted high-level measures of customer experience. The final talk of the conference, and probably the talk that most fit the mold of “classic LISA closing plenary”, was “Hell Is Other Platforms”, a riff on Sartre’s No Exit in both title and content. I would highly recommend watching the video once it’s uploaded.

So that’s the formal program. Like other Usenix conferences, SREcon has BoFs — “Birds of a Feather” sessions — where attendees can get together in a meeting room and discuss topics of their own choosing, organized by writing session topics on a blank schedule posted in the meeting hall. There were BoFs on both Tuesday and Wednesday; while the Wednesday topics were uninteresting to me, I attended three sessions on Tuesday after our free dinner in the expo hall. The first session was actually a guy pitching his early-stage startup which proposed to use GPT-3 to analyze terminal session transcripts created by operators and incident responders to help turn them into operations “runbooks”. The language model was invoked after the fact to create a human-readable description of what the command in question was doing, and a human operator could edit these or add more annotations such as specific script fragments that could be executed. The second BoF was about PostgreSQL, a database server of which I operate several instances for both personal and CSAIL systems — but it turned out to be something of a bust when the speaker and I were the only attendees.

My final BoF was organized by Brian Sebby, a sysadmin from Argonne National Lab who, like me, was a long-time LISA attendee and had come to SREcon to see if it had any value for his organization. He titled the session “LISA Refugees”, if I recall correctly — unlike the formal program there’s no digital record of the BoF sessions — and we were joined by fellow LISA regulars Tom Limoncelli (author or coauthor of several notable O’Reilly titles on system administration) and Matthew Barr; later on, we were joined by the Usenix board delegate for SREcon, Laura Nolan. We had a wide-ranging discussion about how the needs of our (Brian’s and my) communities were not really represented very well, and how the announcement of LISA’s end had promised that Usenix would try to offer more programming of interest to operations-only organizations. We discussed some of the other conferences we had considered attending, and some of the areas where there was still meaningful overlap. Tom suggested that we ought to consider ourselves as being engaged in the emerging practice of “platform engineering” — there are now enough people at enough companies building the technological platforms on which “product” sits that there is a movement to recognize this work as a distinct discipline. We also discussed a number of other topics, including how even some big-name web shops are actually pretty “enterprisey” in their service architecture, and how at CSAIL we now have a lot of students doing summer internships in tech companies and being taken aback at the low level of abstraction provided by our long-evolved service offering. We’re not far at all from the “cloud generation” entering grad school and lacking a basic understanding of servers and files, if we haven’t already passed that point. It was a good discussion, even if it came to no particular conclusion.

So that was my trip. If I have any take-home from all of that (6,100 words!), it would be that my MIT colleagues on the academic side — particularly in systems and HCI areas — really should engage more with this community. Will I attend SREcon again? Probably not, if the “Americas” edition continues to be held in March, but if the 2027 Worlds get awarded to Australia, maybe? Assuming we all still have jobs by then, and haven’t been replaced by GPT-42.

Other people’s recipes: Torrone morbido

Yes! It’s back! Doing my part to move more of my quality content off of Twitter and onto a platform I control (well, pay for, anyway). Torrone is an Italian confection, popular for the winter holidays, which is widely available …

Question 1 passed, so now what?

Given the ongoing issues with Twitter, as detailed in my previous post, I’m trying to move more of my mid-length writing back onto the blog so I’m not generating as much free content for the South African emerald-mine heir. This is an experiment; we’ll see how it goes.

Massachusetts voters have approved the so-called “Fair Share Amendment”, which provides for a 4% surtax on incomes over $1 million. Before this vote, the Massachusetts constitution required a flat income tax, without regard to the marginal utility of money. The legislature had previously put amendments before voters permitting (but not requiring) a progressive income tax, including at least once since I moved here in 1994, but the previous attempts — made at a time when Barbara Anderson and the Reaganite “tax revolt” were still alive — went down to defeat. This was actually the second try at the “millionaires’ tax”; the first one was struck from the ballot by the Supreme Judicial Court, which ruled that it was improperly drafted and violated the constitution’s prohibition on combining multiple subjects in a single referendum. The proponents went back and were able to get a revised amendment through the legislature and onto the ballot, and that’s what we approved on November 8.

The constitutional infirmity with the original amendment (that should have been on the ballot in 2018) was that it directed how the new revenue was supposed to be spent. The new text leaves it up to the legislature to spend the money, although it is supposed to be spent exclusively on transportation and education. The state will begin collecting this money in January 2023, so the legislature could appropriate as much as half of the revenue in the current fiscal year.

The Tufts Center for State Policy Analysis estimates that the state will collect between $1.3 and $2 billion in calendar year 2023 (although this will be split across two state fiscal years). As soon as the legislature meets, they will start working on the FY24 state budget, but the new governor, once she’s inaugurated, can submit a supplemental state budget for FY23 that would appropriate the new revenue.

Note that the very open-ended way the amendment was worded does not require that the revenue actually be used for new spending — indeed this was one of the criticisms of the amendment raised by opponents. The legislature is perfectly free to simply alter which revenue accounts existing transportation and education appropriations come from, and then reallocate the original funds to some other purpose (including tax cuts if they so desire, and our state legislature is still full of conservative anti-tax types, and not just those with an (R) after their names).

I don’t really have many strong opinions about how education is funded in Massachusetts, other than that property taxes are obviously inequitable and the state really ought to equalize funding per pupil. (Again, the amendment’s text is quite general, and the legislature could spend all of the new revenue on subsidies for UMass and nothing on transportation or local schools if they wanted to.) I’m instead going to concentrate on transportation.

Currently, the MBTA is funded by several sources: aside from fares and other “own source” revenue, the original funding mechanism was the local assessment paid by all cities and towns in the MBTA district. (Fall River voted this election to join the MBTA district, a prerequisite for the start of rail service next year.) The 80s “tax revolt” put paid to the local assessment as a major source of revenue, although the T does bond against that revenue. (The municipalities don’t directly pay the assessment: it’s subtracted from transfer payments they would otherwise receive as state aid.) As a result, the primary source of the MBTA’s non-operating revenue is direct grants from the state, which come in two forms: “contract assistance”, which is an annual appropriation by the legislature, and one cent of the state sales tax, which flows automatically to the T and is subject to a guaranteed minimum level, allowing the authority to effectively bond against it. In FY23, the contract assistance was $187 million (of which $60 million is bond authorization dedicated to capital programs), and the state sales tax is projected to be around $1.2 billion. (The sales tax coming in much better than projections for the last three years has been a significant positive for the MBTA’s budget; many other transit agencies around the country were not so lucky and received substantial subsidy cutbacks because their revenues were not guaranteed.) The state’s numerous Regional Transit Agencies (RTAs) get no sales-tax money and depend entirely on fares and annual state appropriations — over which there is considerable fighting in the legislature every budget cycle — which currently total around $90 million.

So now that there is this new revenue source, how much of it should go to transportation vis-à-vis education, and how should that be structured? I would argue that the first priority should be to create a structure similar to the MBTA’s sales tax dedication: pick a fraction of the surtax revenue and automatically transfer it without further appropriation to specific transportation agencies, with a guaranteed minimum that allows for multi-year planning across budget cycles and administrations. Specifically, I would propose the following allocations:

  • 25% or $250 million to the MBTA
  • 10% or $100 million to the RTAs, using a formula that rewards ridership
  • 15% or $150 million to a new east-west passenger rail authority

In the longer term, this would substitute for the existing (lower) appropriations; in FY23, a special appropriations bill could provide “top-up” funds from the first half-year of revenues. As a condition, the MBTA and the RTAs should be required to implement means-tested fares, using some program that the state already administers for eligibility, and EOHHS (the Executive Office of Health and Human Services) should be required to administer it for all of the agencies on a cost-recovery basis. The MBTA estimates (as of October) that implementing a means-tested fare program would cost between $46 and $58 million annually, which is easily covered by the increased subsidy. (At the low end of the Tufts CSPA projection, the MBTA would get $325 million, or $138 million above its current state assistance.)
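For anyone who wants to check the arithmetic in that last parenthetical, here is my reconstruction using figures quoted earlier in this post (I am reading “current state assistance” as the FY23 contract assistance):

    low_end_surtax = 1_300_000_000      # Tufts CSPA low-end estimate for CY2023
    mbta_share = 0.25 * low_end_surtax  # 325,000,000 under the proposed 25% allocation
    fy23_contract_assistance = 187_000_000
    print(mbta_share - fy23_contract_assistance)  # 138,000,000 -- comfortably above
                                                  # the $46-58M cost of means-tested fares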

What else should the MBTA be expected to do with the money? Obviously, close the structural budget deficit first and foremost, including all the new safety hires and more realistic salaries for rail and bus operators and maintenance personnel than were included in the recent 5-year pro-forma. On the capital side, the legislature should finally require electrification of the regional rail network by a date certain — I propose 2035 as a reasonable compromise between rolling-stock lifetimes and when I’m expecting to retire. I would also like to see the MBTA reduce its pass multiplier, reducing the costs of fare collection and inspection by making monthly or even annual passes the default for more riders. (Note that the MBTA currently projects a $208m budget deficit in FY24, so even the entire $138m wouldn’t be enough to close it, but the high end of Tufts CSPA projections would. It’s possible that a fare increase will be necessary, which could be paired with the means-tested fare program to reduce the impact on lower-income riders.)

I know the Healey administration’s transition team has put transportation in some good hands (there are few people I’d trust more than Monica Tibbitts-Nutt after watching her for four years on the MBTA’s former Fiscal and Management Control Board), but the state legislature is chock full of suburbanites with windshield brain, and actually getting this program passed will require some lobbying — even if it does free up nearly $300 million for them to spend on their own pet projects.

The Twitter That Was

Attention conservation notice: 6,000 words about the decline of a social-media platform, none of which are particularly original or well-informed.

Since Elon Musk took control of Twitter at the end of October, like many people I’ve had to ponder a number of questions: Why am I here? Am I still getting what I wanted out of this experience? Would I be better off just logging off? What alternative is there should Musk execute a Controlled Flight Into Terrain? These are hard questions to answer, and even waiting 24 hours to download my account archive hasn’t really made it any easier. I’m not a facile writer, and it’s taken me quite a bit of effort (and powering through a lot of distractions) to get even this much in one place, and it’s probably going to be pretty disjointed. (Generating words has never been a problem for me; it’s getting them in the right order that so often eludes me.)

According to my Twitter data download, I joined Twitter on March 24, 2012. I had entirely refused to be involved with any “social media” until that time; I thought it somewhere between harmful and pointless (perhaps both). But it was a time when I was losing a number of work colleagues I liked, and Twitter had made a few design choices that at least made me willing to consider it (unlike Facebook).

Structurally, Twitter’s model of “following” was much more to my taste than the faux “friend”ship of Facebook and its work-alikes (now largely forgotten). You could “follow” someone on Twitter and see what they had to say, with no expectation of reciprocity. A “favorite” in the original Twitter was more like a bookmark; it wasn’t a shadow-retweet like “likes” are today. And even things we take for granted on Twitter now weren’t part of the data model in 2012: it was all just plain text. Retweets were just posts that started with “RT” and a handle — they were often edited to add comments or just to fit into 140 characters.

That, in combination with an open API, made it possible for there to be multiple, even open-source, user interfaces to Twitter, something that wasn’t possible with Facebook. That allowed me to exclusively use a terminal-mode Twitter client for some time, keeping their tracking cookies out of my browser and avoiding the excessive use of addictive smartphone apps. This didn’t last long: within a few years, Twitter started making life difficult for third-party user interfaces and restricted full use of the API to only “official” clients. However, I still keep the terminal-mode client running 24×7 under script, so I have a running log (as rendered text, rather than JSON objects) of my entire timeline. (With the API restrictions, it can no longer show inbound notifications and frequently hits rate limits.)

Frustratingly, although the account archive does include all of my tweets and retweets, and all of my “likes” (even from back when they were still “favorites”), the following, follower, block, and mute lists are all uninformative: they contain no dates, are not sorted in any obvious order, and do not identify the user being referred to. I had hoped, when I requested the archive, to be able to look back at my follows in chronological order and that just isn’t data they seem to keep. (I find this particularly surprising given the amount of advertising-related data they do keep.) The archive does tell me that I have blocked 13,178 accounts (!), and muted 553, but doesn’t give me a count of either followers or following. (grep works, though: 1194 and 1008.) So the archive seems on the one hand excessive for normal uses and on the other annoyingly incomplete.

The first thing I ever tweeted was a link to a blog post by Mike Konczal (@rortybomb), just a month before he closed up shop at his old free-tier WordPress.com blog. While I was never particularly tuned in to the “blogosphere”, as the big (largely political) bloggers called their community, it’s clear that I was actually reading quite a lot of them, using Opera’s (RIP) built-in RSS reader. (I kept using Opera long past its sell-by date just to have that reader in my home browser.) Many of my first month’s tweets were links to, or inspired by, blogs I was reading at that time: Language Log, Three-Toed Sloth (Cosma Shalizi), Antick Musings (Andrew Wheeler, the former SFBC editor), SCOTUSblog, Jack Balkin’s “Balkinization”, and the now-deleted airline pilot blog Flight Level 390.

My parents were still living in Massachusetts at the time, so I also tweeted a bit about my father’s dog Mocha, and our Sunday dinners, which I really miss since they moved (five times since 2012). Mocha was a two-year-old rescue mutt when my father got him, and in 2012 he was hit by a car — thankfully my parents could afford the veterinary care to save him. He’s still alive today but quite old for a dog and spends most of his time sleeping, except when barking at the neighbors’ golf carts. That spring, my cousin the airline pilot got married, and I tweeted about my trip to San Francisco and the North Bay (her Scottish husband worked in tech and they lived in SF at the time). Around the same time, I was apparently reading Alon Levy, although I didn’t leave any trace of how or why, because it was another five years before I got involved with “Transit Twitter” (after a revelatory trip to Helsinki). That summer, I drove my parents up to Carol Noonan’s Stone Mountain Arts Center in Brownfield, Maine, to see a Knots & Crosses reunion show, and tweeted about it. In that early period, I often went days without tweeting anything — of course there were no Threads back then.

As I mentioned, the impetus for signing up with Twitter in the first place was a number of my colleagues all leaving at the same time. Because of a bad decision I made in 2001, I am stuck living out in the suburbs and as a consequence have no social life; I had hoped to be able to keep up with what my former coworkers were doing through Twitter. That very quickly failed to work out: most of those colleagues used Twitter seldom if at all, and even by setting up the phone app to notify me whenever they tweeted something, genuine interaction was infrequent and overwhelmed by the “firehose” of news and politics that most regular users end up with. Journalists were the first large-scale adopters of Twitter as a medium — both to advertise their work and also to cultivate sources — and the new-user suggestions at the time were very heavily weighted towards high-volume news, “entertainment”, and tech-industry accounts.

I did make an effort to follow a number of my then-current colleagues, but very few of them were or are active even daily; like my ex-coworkers, they were largely not using Twitter for interaction. Many of them were students, and at least my corner of Comp Sci Twitter largely uses it as one vehicle among many to promote their research (and for faculty, their students). There’s nothing wrong with this, but I really appreciate those people who actually use Twitter for something other than professional advancement — whether it’s Rod Brooks flaming autonomous-vehicle boosters, Mark Handley and Dave Andersen posting COVID stats for places I don’t live, or Hari Balakrishnan flaming about The Cricket. Other people I wish would tweet more, if for no other reason than to reassure me that things are still going OK for them.

Coming into Twitter in 2012, I already had a substantial variety of interests, as you can deduce from some of the blogs I mentioned above: I was nearing the end of my particular interest in broadcasting facilities, but SF, language, science, and international sports had been abiding interests of mine since the 1980s, and I gravitated towards many of those communities on Twitter. I ended up following far more economists than I would ever have expected, as well as more wildlife biologists and even more fantasy authors. (With I think one exception, the authors I follow are not the ones I read much if at all: I have a thing about knowing too much about authors.)

I created a WordPress blog in 2013, and immediately started using my own Twitter account, which probably had a hundred followers, to promote it. It wasn’t my first blog, but I had given up on trying to maintain blog software locally — there were too many diseconomies of scale compared to simply paying WordPress.com $100 a year to deal with PHP security holes and spam. (In fact, my very first blog post was about that precise decision.) Many of the things I posted in the early years on the blog would today probably be Twitter threads, but that was before Twitter actually implemented threads in their data model, let alone the mobile clients, so there was a much clearer distinction between “where researched, long-form writing goes” and “where offhand, slice-of-life observations go”.

One of the first things I started blogging was recipe walk-throughs, originally as photo essays, and in conjunction with this I started following a bunch of cookbook authors, so that I’d see when they had something interesting coming out (and also to make it easier to tag them when I wanted to ask questions about one of their recipes). I followed a bunch of musicians early on, too, especially the ones whose new projects I probably wouldn’t find out about through traditional radio. I quickly realized that most of these accounts were run by publicists and not actually connected to the artists, which makes them read quite oddly — but there were once some substantial exceptions. I think Rosanne Cash may be the only one left; most of the others have simply left the platform. (In 2013 I was still regularly commenting on what my music player was playing when the mood struck, something I almost never do now.)

Looking back at my very earliest follows: of course the first people I followed were my few friends and those colleagues. I followed the official accounts of a bunch of radio shows I listened to, some defunct like Studio 360, and others still on the air like the incredible Ideas from CBC Radio. The first authors I followed were Diane Duane and Tom Limoncelli, and the first non-trade-publication journalist was Lyse Doucet, the BBC’s Canadian-born chief international correspondent (and a fellow descendant of Acadiens). Since I was still in the radio hobby, I followed a bunch of people who either were radio engineers or who wrote about it. The first musical artist I followed was Catie Curtis, followed very soon after by others from the folk scene like Patty Larkin, David Wilcox, Jonatha Brooke, and Lucy Kaplansky. I continued to add many other people in linguistics, economics, radio, and science; at one point I tried to follow all of the BBC announcers presenting on the World Service (many of whom have since left).

At some point, I made an effort to find and follow as many faculty and grad students (or former students) from our lab as I could — and I very quickly found that the Pareto principle applied to my academic colleagues as well as my co-workers: the vast majority of them said almost nothing, and certainly not nearly enough to break through the chatter in a busy timeline. Somehow, I think probably through #MarchMammalMadness, I ended up latching on to the wonderful world of “bio twitter”.

Then came November 8, 2016.

Like many people around the world, I recoiled in horror when the revolting Donald Trump won the presidency. I found many like-minded people on Twitter, but more importantly, it gave me a way to follow the day’s news without having to actually listen to the news — which had become intolerable as every A-block for three straight years started with a Trump sound bite or “hey look what racist garbage Trump tweeted today”. Thanks to blocks and the setting to disable auto image loading on mobile, I was able to avoid actually seeing much of Trump’s vomit on the platform itself, while still keeping abreast of what harm the administration was doing day to day. I didn’t bother following the big-name political reporters: I could be sure that the other reporters I followed would retweet if they said anything notable.

The Trump election also made for a meaner and less pleasant Twitter experience for a lot of people, and looking at accounts I followed before that election, I see that a fair number of them simply dropped off — either became entirely lurkers, or just stopped using the site — around 2017-19. But for me, 2017 was a year when I significantly broadened my interests.

This actually goes back to February of 2016, when I was driving to work one morning and saw a billboard advertising the World Figure Skating Championships. I had been a big fan of the sport as a kid (solely in a spectating role; I never learned to skate) but had dropped off after I moved to Boston and couldn’t watch Canadian television any more. I started paying attention to the Winter Olympics again in 2010, but it wasn’t until the Worlds came to Boston that I realized I could actually go see a major international competition in person. I was able to buy tickets for some events at the 2016 Worlds, and I bought the program book — which included an advertisement for the 2017 Worlds in Helsinki.

I’m not going to recapitulate my history as an exchange student in Finland, but suffice it to say that I thought enough time had passed (28 years) that I would be comfortable traveling there again and not have to feel embarrassed at my lack of facility with the language. Having not visited any European country in this millennium, I was bowled over, especially by the transportation infrastructure, and made a series of blog posts about it after I got back — in addition to taking thousands of pictures of the skating and running out of space on my laptop’s tiny SSD. My experience made me feel comfortable enough going back to Finland that August for the fabled World Science Fiction Convention: something that I had heard about, but assumed was only for really hard-core fans and not for someone as ill-read as me. (It was not a coincidence that both events were in Helsinki that year: it was the centenary of Finnish independence and substantial grants were available to bring events of international importance to the country.)

My blogging about both trips to Finland got me into Transit Twitter, which has a pretty big overlap with Urban Planning Twitter and Housing Twitter. I also started to get into Information Security Twitter, although I’m not sure what the impetus was — it may well have been something that a FreeBSD developer retweeted. I started following a lot of athletes, I think in the lead-up to the 2018 Olympics, although it’s hard to be sure. (I followed a bunch of SF-related accounts around the same time, so Worldcon 75 seems likely.)

I started getting more politically engaged in 2018 thanks to Maciej Ceglowski (@Pinboard), who ran independent fundraising campaigns in both 2018 and 2020 to try to get Democrats elected in “winnable” seats that the national apparatchiki had deemed not worth pursuing — but the only real success was Jared Golden in ME-02.

Ah, yes, 2020. The year of the pandemic, when every journalist (and economist) was doing double duty as an epidemiologist. I found a very small number of people who actually were virus experts, like Trevor Bedford and Emma Hodcroft, but mostly I was relieved to be able to watch the news while I was stuck at home, alone, because the virus had pushed Trump out of the A-block. I started watching BBC World News over lunch, since I couldn’t go into the office even if I had a need or desire to do so, and probably blocked more dishonest bloviators than any year before or since. I finally gave up on The Atlantic, which had succumbed to terminal Washington brain after its unfortunate relocation from Boston some years previously. (Sorry, Ed Yong!) Of course, the pandemic canceled all of my 2020 travel plans and much of 2021’s as well, starting with the World Figure Skating Championships that were supposed to start in Montreal the week the travel restrictions started. (The 2020 Worlds were rescheduled to 2024.) Similarly, the 2021 Bobsled & Skeleton World Championships were to have been in Lake Placid (at a newly renovated facility!) and were also postponed. The practical upshot of all this was that I had a lot of time stuck at home, alone, unable to travel and with nothing much else to do for leisure except constantly scroll Twitter.

It was at this point that I realized I was in real danger of an overuse injury if I didn’t start to limit my “phone time”. I used the “Digital well-being and parental controls” feature on my phone to limit my use of the Twitter app to just two hours a day. I later tried to crank this down to one hour, but found myself constantly bypassing the restriction because I wanted to tweet something before I forgot what it was. I also started using Web Twitter much more — I had previously only used it to do things that are difficult on the app, like posting long threads summarizing MBTA board meetings back when those were in-person. Since Web Twitter can be scrolled with a mouse or the keyboard, it’s less stressful for the forearms than holding a phone up in front of my face and tapping on the touchscreen.

There aren’t that many pleasant surprises in my experience on Twitter. The communities I found myself connected with were the biggest of them, of course, along with the willingness of some Big Names (but not New York Times reporters) to actually engage with their readers. Beyond that would be the wide variety of friendly and helpful bots — thanks to some really really bad reporting in 2016, “bot” somehow got attached to the idea of an underpaid Moldovan kid posting election disinformation under a thousand different identities, but there are actually a lot of honest bots that automatically post legitimately interesting content. Among the best was @_everybird_, which has now sadly retired, but still going (for now) are Joe Sondow’s @EmojiTetra; @pomological, which posts early-20th-century fruit pictures; the various “every lot”, “every tract”, and “every USPS” bots; @sansculottides, which tweets the current date in the French Republican calendar; @tinycarebot, which reminds you to take care of yourself; and especially @hourlykitten, which at the top of every hour posts a freely-licensed photo of a kitten from Flickr.

On the negative side, there were a few surprises. I’ve already touched on how much time I ended up spending on the app, which came to be rather concerning, and has certainly taken away from the time I could be spending doing anything productive (from cycling to reading to writing). It’s definitely made it much more difficult to get out of bed in the morning: way easier to just grab the phone and scroll for an hour than to actually brush my teeth, let alone getting my bike kit on and going for a chilly morning ride. I think it’s also made my vision worse, although it’s hard to prove that it’s anything other than natural presbyopia setting in as I approach 50, combined with a really crappy fit on my most recent pair of glasses.

Another surprise was the number of “bad bots” that exist solely to steal other people’s content — often scraped from Reddit, complete with erroneous captions, but with the original creator’s identity filed off. There’s a whole ecosystem of these, such that accounts have popped up to warn people about them, provide accurate descriptions of the scraped artwork, and maintain records of how frequently these accounts get banned and then recreated under a slightly different name. (@PicPedant is one re-identify-er that I follow; there is also @HoaxEye.)

A different genre of “bot” is structured as a “honey trap”: an account with a random female-coded name tweets out three stolen pictures of an attractive woman then follows a thousand or more randomly selected users in the hope they will follow back. The account then goes silent for months before launching a spam campaign (which eventually gets them banned). Similarly annoying are the “Kibo” bots — these search constantly for any mention of keywords they are programmed to trigger on, and then interact with those tweets in some unhelpful way, like retweeting to an unexpected audience or causing another account to reply with spam or abuse. (I have named these after James “Kibo” Perry, who in the days of yore would search the Usenet feed on world.std.com for any mention of his name, and join the conversation. He was less annoying.)

This sort of behavior is something that Twitter really ought to have had the capacity to detect and block, but never seemed to manage. Some of it seems to have accelerated since Musk’s takeover, as if the bad actors are testing the limits of the (now greatly diminished) trust & safety team.

A surprise that’s hard to characterize as either good or bad is the level of data about advertising that’s included in a Twitter account archive. There’s a file, ad-impressions.js, that lists every ad Twitter has ever presented to your account, along with the exact targeting information specified by the advertiser — even including the advertiser’s names for the prospect lists they uploaded for the campaign.
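
If you want to poke at that file yourself, here is a minimal sketch (assuming the archive wraps the JSON in the usual JavaScript assignment on the first line, and that you have jq installed; I am not vouching for any particular field names, so inspect the keys before digging further):

# strip the "window.YTD.ad_impressions.part0 = " wrapper so jq can parse the JSON array
# (adjust the path to wherever the file sits in your unpacked archive)
sed '1s/^[^=]*= *//' data/ad-impressions.js > ad-impressions.json
jq 'length' ad-impressions.json        # how many ad impressions the archive recorded
jq '.[0] | keys' ad-impressions.json   # look at the structure of one record before going deeper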

[It was at this point, a week ago, that I had to stop writing, and when I picked it back up I was not sure what direction I wanted to take.]

As I said, it was my original hope when I signed up for a Twitter account that I would be able to use it to keep up with former colleagues and coworkers who had moved on to other jobs and institutions. That largely didn’t happen; while there are a few of these people who are (or were) regularly active on Twitter, their use of the platform has been quite different from mine: with maybe one or two exceptions I can think of, largely lurker-ish and to a much greater degree, “professional”. There are a few people I’ve enabled mobile notifications for, so I actually see every time they tweet (even when they immediately delete the tweet afterward), and I think it’s fair to conclude that they’re much quieter and share far, far less of their personal lives than I do — and I don’t have much of a personal life to share. I’d feel better if I saw more of these people complaining about the MBTA, or bad business travel experiences, or even posting cute cat pictures, but they don’t.

As you might conclude from that, Twitter certainly hasn’t done anything for the loneliness, either. I have met a grand total of one person thanks to Twitter, which is far fewer even than Usenet; everyone else I follow who I have met is someone I had some prior connection with. Twitter isn’t a substitute for companionship, however much I might like it to be, and I’m still stuck off here in my own isolated corner of the world. (It certainly does not help at all that the only people I ever do meet these days are in their mid-20s. That may not have been so much of a problem when I moved out to the suburbs at the age of 28, totally oblivious to the social isolation, but it is a very big issue now.)

Some of the communities that have given me value from Twitter have been moving to Mastodon. There are some real issues with this platform that lead me to think it’s not going to be an effective long-term solution to the collapse of Twitter under the overbearing weight of its new billionaire owner. I’m going to use “Mastodon” as a shorthand, because it’s by far the most popular, but technically Mastodon is a specific implementation (software package) of an open protocol called “ActivityPub”, and the whole intercommunicating network of servers using this protocol is referred to as “the Fediverse” (because it’s “federated”, or decentralized and operated by many cooperating but independent server owners).

Twitter itself has never been more than barely profitable. The company has been able to raise capital to keep going in the hope that something will come along that makes it more profitable, but at least that has come largely in the form of equity rather than the huge debt that Musk’s leveraged buyout has saddled the company with (and which is likely to be its, and Musk’s, downfall). Mastodon, however, has an enormous missing money problem: without any meaningful way to sell contextual advertising, Mastodon server operators have few options to raise revenue — donations, subscription fees, or just volunteerism are three common options today. Some very large servers may be able to sell sufficient advertising to support their operations, but as Twitter demonstrates, even with a very large audience, the costs grow faster than the revenue does. Many people have laughed at Musk’s insinuation that he can fund Twitter through subscription fees after scaring away all the blue-chip advertisers, and with good reason.

You might ask why this matters: if people are willing to volunteer their time or their money to run a Mastodon server, who’s to say that there’s any money “missing”? I think there are a number of reasons why relying on decentralized, volunteer labor and donated resources is a problem for large-scale adoption of Mastodon.

The first and most significant is that moderation is a difficult, time-consuming task. There are communities where volunteer moderation works, but they have a few features in common: the number of participants is small, the participants share a common purpose, and there is usually a formal body or person that is empowered to make the final decisions if the moderators get it wrong. Any community even a tenth the size of Twitter, if it is to be effectively moderated, is going to require a significant amount of full-time community management, and someone has to pay for that somehow.

Mastodon’s design makes this much harder, because moderation decisions have to be made by every server operator with respect not only to their own users but with respect to every other server, from the smallest single-user server (which can be spun up by anyone at any time) to big servers with thousands of users. Because moderation decisions are made by individual server operators, policies are guaranteed to be inconsistent. Because federation decisions are also made at the server level, users who find a server with a satisfactory moderation policy may find their ability to communicate with users on other servers substantially limited — indeed, users on any server may find themselves “islanded” based on something they have no control over (the behavior of other users), or may be unable to find a server that federates with all of the servers for the users they want to communicate with.

Black astrophysicist and bestselling author Chanda Prescod-Weinstein posted a thread on why this is a problem, especially for minoritized communities. The kicker.

At the level of an individual server, Mastodon’s design is not scalable. That’s probably not inherent to the ActivityPub protocol, but actually available implementations require significant investment (either engineering effort to rework the design, or simply throwing compute resources at the existing code until it works or completely breaks). Mastodon has been around, under the radar, for several years without attracting a significant user base; the Twitter exodus (exoMusk?) has highlighted the difference in a design for a few thousand users with low fan-out and the design required to serve a few million users with high fan-out. In a reply to Dave Guarino, Dan Hon writes about how the low-cost tiers of his hosted Mastodon server were inadequate, requiring him to upgrade to $19 a month — for one user with a few thousand followers. That does not bode well for a government or a news organization that needs to host hundreds of official accounts broadcasting information to potentially millions of followers on thousands of federated servers.

Twitter is able to handle this sort of load (now, unlike ten years ago) because it has both made significant capital investments (building distributed data centers, completely rewriting its core code base including both the user interface and the message routing system underneath it) and has (or had, until Musk fired them all) a significant operations staff who were knowledgeable about the implementation and could respond to issues before they caused major outages. (When I signed up in 2012, the “fail whale” was still a thing, although on its way out — that was Twitter’s custom version of the “500 Server Error” response that every Ruby on Rails app generates if the application code raises an unhandled exception or fails to start in a reasonable time.)

Twitter has been able to make significant architectural changes — like making tweets longer, threadable, and deletable — because of its centralized but globally distributed infrastructure. Mastodon and the Fediverse could conceivably evolve in similar ways, making the software more scalable, but it’s a substantial lift without multiple large engineering teams. The likely best case for this involves most Twitter refugees landing on one or a small number of Mastodon servers, which gets us back to the issue of the missing money. If a million ex-Twitter users land on masto.social (or pick your favorite other instance), are they going to bring in sufficient revenue (either direct payments or advertising) to even be able to pay for server operations, let alone engineering effort to scale their server to that level? The fact that many Mastodon servers and hosting providers are closed to new users and customers at the moment suggests that they’re having trouble doing that now, with only the cognoscenti trying to make the jump away from Twitter.

It is part of the nature of the Fediverse that there is no central list of servers or directory of users: servers can and do come and go at arbitrary times, and the only notification to anyone is that a user on that server starts subscribing to the feed of a user on a different server. As a result, there is no way to directly search for anything or anyone globally: you can search the users on the server you’re using, and their contacts on other servers, but that’s it unless you know the correct remote server to search. There’s no global identity; at best, we could end up with Big Name users on an instance that their employer, or talent agency, or publicist runs. The first of those options at least works the same as email, and would allow employers to enforce a stronger separation of “work” and “play” than they currently are able to do with centralized identity on Twitter. (Sometimes the first public notice of a journalist’s new employer has come when their Twitter handle suddenly changed!)

This creates a Sybil problem, however: not only can anyone create an account impersonating a famous person or brand (what Twitter’s @verified program was created to handle, in response to a lawsuit and subsequent consent decree with the Federal Trade Commission), but for almost no money (the cost of a domain registration and hosting) anyone can create a whole Mastodon instance to impersonate a brand — and every other individual Mastodon server operator will have to decide whether to federate with it or not. Different server operators will have different policies and will undoubtedly be subject to different legal regimes regarding parody, defamation, and trademark laws. (Almost all of The Discourse about the collapse of Twitter had completely ignored the millions of users in other countries where monetization is difficult — these are effectively subsidized by Twitter revenue in the US but in a federated system may be left out in the cold.) Supposing Merck & Co., the giant pharmaceutical company, wants to start its own instance; will Mastodon instances in Europe be required to refuse federation with it on the grounds that Merck KGaA is “the real Merck” in Europe?

Nicholas Weaver points out the underlying issue with identity. Twitter got along perfectly well with lots of pseudonymous users; indeed, many Twitter users follow a good number of them. “James Medlock” is a real person but that’s not their actual name; they’re not impersonating anyone so there’s no reason that it should matter what the government calls them. But for public officials, institutions, media outlets, firms, and brands, it matters greatly that the public is not confused about who speaks for them. For any entity trying to do customer support over social media (which if they’re smart, they stopped doing last week), it is absolutely essential that customers be able to tell legitimate support accounts from scammers. Scams have real-world effects: Someone spent $8 this past week to create a fake “verified” Twitter account claiming to be Eli Lilly (using Musk’s new “pay for Twitter Blue and get this free hat blue badge” policy) for a hoax post about insulin prices that tanked the company’s stock. (As someone who has received more than his share of pharma ads on Twitter, this must have sent shock waves through the industry; it’s not just Musk setting his own money on fire here.)

So it’s clear that under Musk, Twitter doesn’t necessarily have any better answer to the impersonation problem than the Fediverse does — but because most Twitter users have a reliable pre-Musk history of accounts they follow and are followed by, new hoaxes are at least plausibly discernible. (Historically, the way this sort of hoax was run was to compromise an existing verified user’s account and change the display name, because in the old system, verification was tied to the account’s @-handle and not to its “name”, allowing once-verified users to change their names without losing the badge.) That account history also means that the discovery problem is temporarily papered over by the continued existence of Twitter itself: people leaving Twitter are using Twitter-based apps to find the Mastodon handles of their (presumed reliable) Twitter contacts, and so they’re not currently falling victim to the Sybils — but it will be a much larger problem if Twitter completely collapses and there’s no other widely agreed source of online identity.

People can put their Fediverse identities on their web sites, for sure, but the @foobar@baz.quux format is not very friendly for many of the other channels through which people distribute contact information — spoken on the radio, written on a sign or a note card, in six-inch type on a transit bus — advertising really needs a flat, easy to type “AOL Keyword”, not a structured hierarchical naming system. (And yes, that’s something of a surprise coming from me of all people; my thinking has evolved on that subject, as I’ve learned to consider social and user-interface concerns and not just the “purely technical” aspects of a design. I don’t think Musk has.)

OK, so you’re 6,000 words into this essay and probably wondering when I’m going to get around to what I, personally, am going to do about this.

Attention is a finite resource. Even in my ADHD-addled brain, attention is limited (and I guess I have ADHD Twitter to thank for the understanding that I display classic symptoms of adult ADHD, because no medical professional ever has). I already spend too much time scrolling Twitter; I do not use any other social network (most of which are even more evil than Twitter — that’s how I made that choice) and cannot have additional apps demanding even more of my time and attention. Nor do I particularly desire to bridge between two networks — although before Twitter closed their API you could have built an application that seamlessly integrated Twitter and ActivityPub and RSS and Jabber and Slack and all manner of other text-oriented publish/subscribe mechanisms.

For the moment, that means sticking with the devil I know, Twitter. There may come a time — probably long after it’s clear to the rest of my network — that the social center of the word people has decisively moved to another platform, and assuming it’s not run by Facebook Meta I’ll probably switch over completely to that, whatever it may turn out to be. But my follow network extends in a lot of different directions, and it seems possible that not everyone will end up in the same place, and then I’ll have to choose who to abandon and which connections to maintain. I don’t relish the prospect. Quoting @inthefade:

today’s timeline feels like we’re all standing around a hospital bed waiting for grandma to die. unfortunately, grandma held the family together and when she finally dies, we scatter like leaves in the wind

(link) (hat tip: Doug Newman)


More comments on the MBTA’s capital plan

Since my last post, the state legislature has gotten down to work in earnest on the FY23 budget, but unfortunately, due to work commitments, I have not had time to dig into the Senate version of the budget before the logrolling started. I did, however, have time to review the Boston Region MPO’s five-year Transportation Improvement Plan, part of a federally-mandated public process for transportation agencies that receive federal subsidies. My comments were published as part of the MPO board materials for the May 26 meeting. I also watched the MBTA board’s Audit and Finance committee meeting, at which the final CIP was previewed, and needless to say, the T gave no sign of having responded to public comment in any meaningful or constructive way. They do say that they will publish the public comments in the summer some time.

Because of those work commitments, I was a bit behind schedule in leaving a voice message for the board (my preferred approach, since it forces the board to ignore me in near-real time rather than simply never reading my comments), and there is now a day-before cutoff for voicemail. There might well be a similar cutoff for email, given the early hour that chair Betsy Taylor likes to have these meetings. I sent email anyway, but since the T refuses to publish the written comments the board receives, I am publishing the text of my comment below. You’ll note (as it was intended for a voicemail) that it’s more focused on commuter rail than on the laundry list of projects I commented on in the official public engagement for the CIP.

[salutation and introductory material deleted]

Unpowered, locomotive-hauled coaches have been functionally obsolete in passenger service since the 1980s. There is no situation in which it makes sense for the MBTA to be purchasing new coaches at any time in the future: all future rolling-stock procurements MUST be modern self-propelled equipment, not unpowered coaches. If, due to management’s foot-dragging on electrification, additional diesel-powered rolling stock is required, domestic manufacturers are ready and able to supply EPA Tier IV compliant diesel multiple-unit vehicles which can operate on the Old Colony lines, including South Coast Rail, which obviates the need for any additional coach purchases beyond the procurements the Authority has already awarded.

This management team has had two and a half years to make progress on electrification. As far as this CIP indicates, they have absolutely NOTHING to show for it — and this board has certainly done nothing to hold them to your predecessors’ commitment. In that time, the MBTA has announced ONLY ONE of seven required high-platform projects on the Providence Line, none (of two) on the Fairmount Line, none (of two) on the Stoughton branch, and one (of fifteen) on the Eastern Route. Meanwhile, plans have moved forward for high platforms at seven stations on the Worcester Line (although not the most important one, Back Bay), which benefits me and my neighbors by speeding up our trains, but is a complete strategic mismatch with the Authority’s ostensible priorities for electrification.

The previous Secretary of Transportation was under the mistaken impression that high platforms were solely an accessibility issue. They are not. A standard platform height (whether low or high) is a “can we buy modern rolling stock in a competitive procurement” issue. I urge the board to direct management to adjust its priorities accordingly. If that means we get electrification on the Worcester Line by the end of this decade, I won’t complain — but the course the Authority has set in this CIP is to not electrify anything before the end of the decade, and that does not serve the interests of riders or the Commonwealth.


Comments on the MBTA’s FY23-27 Capital Investment Plan

At its last board meeting in March, the MBTA released its first five-year Capital Investment Plan (CIP) since the COVID-19 pandemic sent the agency scrambling. This marks a significant change from prior CIP cycles, in which the MBTA’s and MassDOT’s projects have been combined together in a single state CIP for all transportation investments. The old CIPs, done by MassDOT on MassDOT’s schedule, made public comment pointless, since the plan had to be approved by the end of June, only a few weeks after it was published in late May, and staff responses to comments would not be published until late summer, when it was too late to have any effect.

What follows is my submission, lightly edited for formatting, to save you the effort of using the public records law to find out what I said (since the MBTA will only publish comments in aggregated form).


I will begin my comments with some general remarks. The timing, structure, and presentation of this CIP are a significant improvement over recent years, and the introduction of information about project phasing in particular is a welcome addition. On the content, however, it is still quite disappointing that the staff evidently decided to “wait the old board out” and make no progress whatever on the electrification of the commuter rail system that was directed by the FMCB. We should, had the Authority not decided to drag its feet, now be seeing significant capital programs for new vehicles, platform upgrades on the Providence and Fairmount lines, and design contracts for electrification on the Fairmount Line and the Eastern Route. None of these are anywhere to be found in this document.

In particular, the Authority was directed to proceed without delay to a pilot of electric multiple unit service on the Providence Line — a line which has six low-platform stations that all must be upgraded to high platforms in order to support modern rolling stock. In one of the FMCB’s last actions before its termination, the board also published a document describing ways to improve worker productivity and safety, which highlighted, among other issues, the problem of “traps” on the Authority’s existing, obsolete commuter-rail equipment. Nonetheless, senior management has allowed the staff to continue to treat low-level platforms as solely an accessibility issue and not a sine qua non for modernization of the service (and a necessity for purchasing modern off-the-shelf multiple-unit rolling stock).

It is still somewhat difficult to figure out exactly what some of these projects actually are. I would suggest that every project over some threshold dollar value (say, $25 million?) should have a project page on mbta.com and the CIP document should link to project pages whenever they exist so that the public may provide more informed comment.

My comments on individual projects and groups of projects follow, indexed by project ID.

P1108
Erroneously categorized as commuter rail. What is the division of responsibility between the MBTA and municipalities for these street improvements? How does this differ from P1113?
P1005a, P1005b
Strongly support these bus priority projects. Center-running bus lanes with dedicated stations are among the most effective investments the MBTA can make for bus passengers.
P0940
Please clarify the locations and time scale involved. The budget seems quite low, based on recent MBTA commuter rail construction projects, so assuming this is a design-only project, there should be other projects within this CIP that would fund the construction — otherwise it’s anything but “early action”.
P0906
You told us in February that you were going to destroy the trolleybus infrastructure. I support the modernization of the North Cambridge trolleybus network as this project proposes, and the replacement of the existing fleet (and expansion of emissions-free service) using extended-range battery trolleybuses similar to those deployed in Dayton, San Francisco, and Seattle. But there’s nothing else in this CIP to suggest that you have even considered that (unsurprising given the lies told by staff at the last public meeting).
P0889
Obviously this project is in progress, but “South Station Expansion” is unnecessary and a waste of money given the significant inefficiencies of the current South Station terminal operation. Fix the operational problems (such as slow clearance of platforms, unreliable diesel locomotives, and low-speed turnouts) before pouring more concrete. (And obviously the North-South Rail Link tunnel should be built and would completely obviate any need for more surface platform capacity at South Station.)
P0869
This was approved by the board more than a year ago. What is holding up the final conveyance?
P0752
Blandin Ave. does not cross the Worcester Line. Is this project actually on the Framingham Secondary, a freight line connecting Framingham and Mansfield?
P0705c
If you’re going to destroy the trolleybus infrastructure, why do you still need a duct bank in front of Mt. Auburn Cemetery? What purpose would it serve?
P0692
One hopes that “capital improvements” includes upgrades to Sharon substation to support electrification of Providence and Fairmount commuter rail.
P0261
Strongly support reconstruction of stations on this line to reduce dwell times and improve passenger safety and accessibility. Unclear to me that restoring a third track for the full length offers significant benefits, because no operational model is specified and it’s not clear how it relates to rail transformation. (A three-track line is not especially useful for frequent, all-day bidirectional service as called for by the previous board — you would need to restore all four tracks for that; Amtrak or FRA grants should pay for it if the only benefit is to infrequent intercity trains.) Support proceeding with the design process so that some of these constraints can be fleshed out in public.
P0214
Support completing this project. The existing project page on mbta.com has not been updated since 2020 and needs to explain where the project is at and what the revised timeline is.
P0206
The “Foxboro Pilot” seems to be dead and probably not coming back, but I guess you’ve already spent most of the money….
P0164
Needs complete description.
P1101, P1010, P1011, P0920, P0921
Strong support.
P1002
Arborway is being replaced by 2027. Why build permanent buildings at the old location now?
P0952
Layover facilities should be located at the ends of the lines; rolling stock should not be stored in Boston.
P0863
Planning for maintenance and layover facilities should ensure that they are capable of handling articulated (non-separable) multiple-unit sets of at least 330 feet in length, to unconstrain choices of future rolling stock. Construction at Readville must accommodate electrification.
P0671, P0671a, P0671b
While recognizing the need for swift action to replace Quincy garage, the cost of $3.35 million per bus is unacceptable, and designs for subsequent bus facility replacements like Arborway must be constrained to a more reasonable amount.
P0671c
Do not support destruction of trolleybus infrastructure — at a time when all major builders can supply extended-range battery trolleybuses — merely to satisfy Vehicle Engineering’s unjustified desire for all buses to be identical. Safety and accessibility in the Harvard bus tunnel require left-hand doors.
P0609
Sorry, what does GLX have to do with something 20 miles away in Billerica?
P0515
Support. This is the only item in the CIP that demonstrates recognition of the requirement to electrify commuter rail service.
P1150
Where? For only $10 million, at MBTA costs that’s like one station’s worth of high platform.
P1025
Support. When I visited in 2021, the garage was both nearly empty and visibly in very poor condition. Lynn would do much better with transit-oriented development on this site to take advantage of more frequent regional rail service. In the mean time, the garage needs either to be properly maintained or to be taken down.
P1009
Where?
P0970
Attleboro station requires high-level platforms. $1.2m really ought to be enough, but on the basis of recent projects that’s low by about an order of magnitude.
P0890
Support.
P0761, P0689c
I recall numerous presentations to the previous board about bus stop amenities, and discussions of a new contract for modern shelters. Is this all that’s become of that extensive discussion? Another priority that management has just dropped the ball on?
P0395
Strongly support the completion of this project, which will relieve a significant bottleneck on the Worcester Line.
P0179, P0178, P0174
Support.
P0173
You’re just going to have to go back and put in full-length platforms, which FTA should never have allowed you to leave off this project.
P0170
Strongly support the revised scope of this project with platforms serving both tracks.
P0168, P0129, P0117
Support.
P1152, P0652
No no no no no. Unpowered coaches are generations-obsolete technology and the MBTA should not be planning to still be operating them in 2050, regardless of motive power. To the extent new diesel-hauled equipment is necessary as a result of management’s foot-dragging on electrification, the Authority should be purchasing dual-mode (diesel/electric), single-level, articulated multiple-unit vehicles with a passenger capacity between 250 and 400, with a manufacturer option to remove the diesel prime mover at mid-life overhaul. Such vehicles could be deployed immediately on the Old Colony lines (including South Coast Rail), allowing the 67 obsolete coaches (81 including the SCR order) to be moved to other lines until high-platform and electrification construction has progressed sufficiently. This is sufficient to retire all currently active BTC-1C, BTC-1A, and BTC-3 coaches.

This style and passenger capacity “right-sizes” vehicles for the future all-day bidirectional service, optimizing the use of equipment and personnel by eliminating separate compartments on most trains, improving acceleration, and reducing dwell times, thereby allowing equipment to cycle faster. All major domestic builders offer families of multiple-unit equipment with a variety of power sources, allowing for parts and training commonality and reducing maintenance expense.

I cannot emphasize this enough: there is no “leapfrog” move available here. The MBTA must electrify its commuter rail network, it must do so using standard 25 kV overhead catenary, and it must purchase modern rolling stock and construct uniform high platforms. Additional investment in the current operating model, inherited from the freight railroads 50 years ago, is unacceptable.

The IIJA includes funding programs to support mainline rail electrification, grade crossing elimination, and station accessibility; the MBTA must aggressively pursue these opportunities as they are opened for applications.

P0893
I understand that this contract has already been executed; otherwise the same comments apply as for P1152.
P0653
See my comments above with respect to buses and bus facilities.
P0369
Strong support for the type 10 program and related capital improvements to bring the Green Line closer to industry standard for light rail facilities. This will improve service reliability and reduce operating costs over the lifetime of these vehicles.
P0362
The public deserves more frequent status updates regarding this procurement, quarterly at a minimum.
P0866
Strong support for Red-Blue Connector; continued design should be funded to support applications for discretionary FTA grants.

A big PostgreSQL upgrade

I don’t write here very often about stuff I do at work, but I just finished a project that was a hassle all out of scale with the actual utility, and I wanted to document it for posterity, in the hope that the next time I need to do this, maybe I’ll have some better notes.

Introduction

A bit of backstory. I have three personal PostgreSQL database “clusters” (what PostgreSQL calls a single server, for some reason unknown to me), plus I manage two-and-a-half more at work. All of them run FreeBSD, and all of them are installed from a custom package repository (so that, among other things, we can configure them to use the right Kerberos implementation). These systems were all binary-upgraded from PostgreSQL 9.4 to 9.6 in the summer of 2020, but all are much older than that — one of the databases started out 20 years ago running PostgreSQL 7.0, and another one was at some point ported over from a 32-bit Debian server (although not by binary upgrade). We have a shared server for folks who want an easy but capable backend for web sites that’s managed by someone else, and that has about 50 GB of user data in it (including, apparently, an archive of 4chan from a decade or so ago that was used for a research project). For my personal servers, I have applications that depend on two of the database “clusters” (one on my home workstation and another on my all-purpose server), and the third “cluster” is used for development of work projects. At work, in addition to the big shared server, we have a couple of core infrastructure applications (account management and DNS/DHCP) that depend on the other “cluster” — these are separate to avoid dependency loops — and the “half” was originally supposed to be an off-site replication target for that infrastructure server, but since I never managed that, I could use it for testing the upgrade path.

Now, as I said, we were running PostgreSQL 9.6. As of December 1, that version was still supported in the FreeBSD ports collection, and so we could build packages for it, but it had just gone end-of-life as far as the PostgreSQL Global Development Project was concerned. The FreeBSD ports maintainers have a history and practice of not keeping old versions of packages that are no longer supported upstream — unlike Linux there’s no big corporate entity selling support contracts and funding continued maintenance of obsolete software packages and therefore no “megafreeze”. So it was clear that we needed to get off of 9.6 and onto something modern — preferably 14.1, the most recent release, so we don’t have to do this again for another few years. But if I got stuck anywhere in the upgrade process, I wanted it to be as recent a release train as I could possibly get onto. For that reason, I decided to step through each major release using the binary pg_upgrade process, identifying and resolving issues at a point where it would still be relatively easy to roll back if I needed to do some manual tweaking of the database contents (which as it happened did turn out to be necessary).

All but one of these databases are small enough that it would be practical to upgrade them by using a full dump and restore procedure, but of course the 50GB shared database is too big for that. I wanted to maximize my chances of finding any pitfalls before having to upgrade that database, which meant the same pg_upgrade in-place binary upgrade process for all of them. Running pg_upgrade on FreeBSD is a bit involved, because different PostgreSQL versions cannot be installed together in the same filesystem, but this part of the procedure is fairly well documented in other sources online. I have two separate package build systems, one for work and one for personal, because the work one doesn’t need to bother with time-consuming stuff like web browsers and X, whereas the personal one is what’s used by all my workstations so it has all of that. In both cases, though, package repositories are just served directly from the repository poudriere creates after every package build.

Building packages

Because poudriere builds packages in a clean environment, there is no difficulty in building a package set that includes multiple PostgreSQL releases. Where the challenge comes in, however, is those packages for which PostgreSQL (or more specifically, postgresql-client) is an upward dependency — they can only be built against one PostgreSQL version, either the default one defined in the FreeBSD ports framework, or (most relevant for my case) the one set in /usr/local/etc/poudriere.d/make.conf in the DEFAULT_VERSIONS variable. poudriere has a concept of “package sets”, packages built with different build parameters but otherwise from the same sources and build environment, which makes it easy to build the six different package repositories required for this project: we can just create a pgsql10-make.conf, pgsql11-make.conf, and so on, and then use the -z setname option to poudriere bulk to build each repository.
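
Concretely, the per-set configuration amounts to a handful of tiny files in /usr/local/etc/poudriere.d/, along these lines (a sketch rather than a verbatim copy of my files; the version tokens are whatever your ports tree accepts):

# make.conf (the default set) pins the production default:
DEFAULT_VERSIONS+= pgsql=9.6
# pgsql10-make.conf overrides it for the "pgsql10" set:
DEFAULT_VERSIONS+= pgsql=10
# ...and so on through pgsql14-make.conf:
DEFAULT_VERSIONS+= pgsql=14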

Now, one of the things poudriere is good at, by design, is figuring out which packages in a repository need to be rebuilt based on current configuration and package sources — so we don’t need to actually rebuild all of the packages six times. First, I added all the newer PostgreSQL packages (postgresql{10,11,12,13,14}-{client,server,contrib,docs}) to my package list, and built my regular package set with my regular make.conf and all of those versions included. (I lie: actually I’ve been doing this for quite some time.) Then, I made six copies of my repository (I could have used hard links to avoid copying, but I had the disk space available) using cp -pRP, after first checking the poudriere manual page to verify where the setname goes in the path. (For the record, it’s jail-portstree-setname.) Then I could step through each setname with poudriere bulk and only rebuild those packages which depended on postgresql-client. All I needed to do was make these additional sets available through my web server and I would be ready to go.
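
Putting that together, the whole dance looks roughly like this (a sketch with a hypothetical jail called prod13amd64, the default ports tree, the default poudriere data directory, and a made-up package list path; substitute your own names):

# copy the existing repository once per PostgreSQL set; directory names are jail-portstree-setname
cd /usr/local/poudriere/data/packages
for set in pgsql10 pgsql11 pgsql12 pgsql13 pgsql14; do
    cp -pRP prod13amd64-default prod13amd64-default-${set}
done
# each bulk run then rebuilds only the packages whose PostgreSQL dependency changed
for set in pgsql10 pgsql11 pgsql12 pgsql13 pgsql14; do
    poudriere bulk -j prod13amd64 -p default -z ${set} -f /usr/local/etc/poudriere.d/pkglist
done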

Because I knew this was going to be a time-consuming project, I chose to freeze my ports tree after the Thanksgiving holiday: I could either manage a regular package update cycle (which I usually do once a month) or I could do the database upgrade, not both.

The “easy” systems

For obvious reasons, I started out working on my personal machines; they all have slightly different configurations, with their own motley collections of databases, loaded extensions, and client applications. They were set up at different times and were not under any sort of configuration management, so they were all subject to some amount of configuration drift, but none support remote access, so I didn’t have to worry about synchronizing pg_hba.conf or TLS certificates — which was a concern for the work servers, which had exclusively remote clients. And since I was the only direct user of these machines, it was easy for me to make the upgrades in lockstep on all three servers, so I wouldn’t get stranded with different machines requiring different PostgreSQL releases. (That wouldn’t have been a crisis on any of those machines, but would have been a bigger issue at work where the package repository configuration is under configuration management.)

The whole process actually was pretty easy, at least at first, and worked pretty much as the various how-to articles suggest: create a directory tree where you can install the old packages, stop the old server, initdb, run pg_upgrade, start the new server, and do whatever pg_upgrade told you to do. This is not automatable! You have to actually pay attention to what pg_upgrade says, which will vary depending on database configuration, what extensions are loaded, and on the specific contents of the database cluster, in addition to which “old” and “new” PostgreSQL releases are targeted. (You must always use the pg_upgrade supplied with the new server release.) I’ll give a full rundown of the process at the end of this post.

The first showstopper issue I ran into is that PostgreSQL 12 dropped support for WITH OIDS. If you’re not familiar, in early versions of PostgreSQL (indeed, even before it was called that), every row in every table would automatically get a column called oid, which was a database-wide unique numerical identifier. There are a whole bunch of reasons why this turned out to scale poorly, but the most important of these was that the original implementation stored these identifiers in an int32 on disk, so if you had more than four billion tuples in your database, they would no longer be unique (and nothing in the implementation would enforce uniqueness, because that was too expensive). The oid type served a useful function in the internals of the database, but by PostgreSQL 9.x, actually using the oid column was deprecated, and the default was changed to create new tables WITHOUT OIDS.

It should not surprise you, if you’ve read this far, that some of these databases dated back to PostgreSQL 8, and therefore were created WITH OIDS, even if they didn’t actually use the implicit oid column for anything. (I had to carefully check, because some of my applications actually did use them in the past, but I was able to convince myself that all of those mistakes had been fixed years ago.) None of this was an issue until I got to the step of upgrading from PostgreSQL 11 to 12 — because PostgreSQL 12 entirely dropped support for WITH OIDS tables: you can’t upgrade from 11 to 12 without first either dropping the old tables or using ALTER TABLE ... SET WITHOUT OIDS while running the old release of the server. pg_upgrade can’t patch this over for you. On more than one occasion, I got the 12.x packages installed only to have pg_upgrade fail and have to roll back to 11.x.

The first time this happened, I was almost ready to give up, but I was able to find a howto on the web with the following extremely helpful bit of SQL to find all of the WITH OIDS tables in a database:

SELECT 'ALTER TABLE "' || n.nspname || '"."' || c.relname || '" SET WITHOUT OIDS;'
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE 1=1
  AND c.relkind = 'r'
  AND c.relhasoids = true
  AND n.nspname <> 'pg_catalog'
ORDER BY n.nspname, c.relname;

(Hint: replacing the semicolon at the end with \gexec will directly execute the DDL statements returned by the query, so you don’t have to cut and paste.) Note that this procedure must be run on every database in the cluster, using the database superuser account, to get rid of any remaining OIDs.
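
Since that has to be repeated for every database, a little loop saves some tedium; a sketch, assuming the query above is saved in a file I’ll call find_oids.sql (with the \gexec trick applied) and that the pgsql superuser can connect locally without a password:

# enumerate every connectable database, then run the OID hunt (and, via \gexec, the
# generated ALTER TABLE ... SET WITHOUT OIDS statements) in each one
for db in $(su pgsql -c "psql -At -c 'SELECT datname FROM pg_database WHERE datallowconn' template1"); do
    echo "=== ${db} ==="
    su pgsql -c "psql -f find_oids.sql ${db}"
done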

Another important thing I ran into is that some PostgreSQL server-side extensions define new data types, and the code in those extensions must be available to the old server implementation for pg_upgrade. The easiest way to ensure this is to make sure that the old versions of all of the extension packages are installed in the same temporary filesystem as the old server packages are. In my case this was easy because I was only using extensions included in the postgresql-contrib package, which in the process above was built for every version I was stepping through.

Once I fixed the WITH OIDS issue, I completed the upgrade to 12.x and let it burn in for a day before continuing on with 13 and 14, so the whole process took about a week, but I was confident that I could do it in under four hours for the work servers if I could just deal with the OID issue.

The hard systems

I used the query above to check all of my work databases for OIDful tables, and there were quite a bunch. I was able to confirm that the ones for our internal applications were just ancient cruft. Frankly, most of the data in our shared server is also ancient cruft, so I largely did the same there, but several of the tables belonged to someone who was still an active user, and so I asked first. (He told me I could just drop the whole schema, which was convenient.) Finally, I was ready, and sent out an announcement that our internal applications would be shut down and the shared database server would be unavailable for some unknown number of hours this week. The process ended up taking about 2½ hours, most of which was spent copying 50 GB of probably-dead data.

pg_upgrade has two modes of operation: normally, it copies all of the binary row-data files from the old version to the new version, which allows you to restart the old database with the old data if necessary. There is another mode, wherein pg_upgrade hard-links the row-data files between the two versions; this is much faster, and obviously uses much less space, but at the cost of not being able to easily roll back to the old server version if something goes wrong. All of our servers use ZFS, so a rollback is less painful than a full restore from backups would be, but it’s still much better if I don’t have to exercise that option. On the big shared server, it would simply take too long (and too much space) to copy all of the data for every upgrade, but it made sense to copy the data for the upgrade from 9.6 (old-production) to 10.x, and then link for each successive upgrade, guaranteeing that I could restart the old production database at any point in the process but not worrying so much about the intermediate steps that would be overtaken by events in short order.
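
Since everything is on ZFS anyway, a snapshot right before each link-mode step makes even that scenario recoverable. A sketch, with a made-up dataset name (rootvg/pgsql) standing in for wherever your data directories actually live:

# cheap safety net immediately before a link-mode (-k) pg_upgrade run
zfs snapshot rootvg/pgsql@pre-upgrade-to-12
# if that step goes badly: stop the server, roll back, and sort things out
# service postgresql stop && zfs rollback rootvg/pgsql@pre-upgrade-to-12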

Those other servers are included in our configuration management, which I had to stop during the upgrade process (otherwise it would keep on trying to revert the package repository to the production one and then reinstall the old PostgreSQL packages). This also required paying more attention to the server configuration files, since those were managed and I didn’t want to start the database server without the correct configuration or certificates (having had some painful and confusing recent experiences with this). I had to stop various network services and cron jobs on half a dozen internal servers, and call out to our postmaster to keep the mail system from trying to talk to the account database while it was down (all of these applications are held together with chewing gum and sticky tape, so if someone tried to create a mailing-list while the account database was down, the result would likely be an inconsistent state rather than a clean error). I started by copying the existing network and accounts database to the off-site server, so that I could run through the complete upgrade process on real data but on a server nobody was relying on. (I initially tried to use pg_basebackup for this, but it didn’t work, and I fell back to good old tar.) It was in running through this process that I discovered I had neglected to account for a few pieces of our managed configuration. That dealt with, I then proceeded to the production account and network database, and finally the big shared database full of junk.
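
The tar fallback is nothing sophisticated; roughly this, with the database stopped on the source first, offsite.example.org standing in for the real hostname, and the assumption that the pgsql user has the same uid on both machines (tar -p preserves ownership):

# on the production server, ship the 9.6 data directory to the off-site box
service postgresql stop
tar -C /usr/local/pgsql -cf - data96 | ssh offsite.example.org 'tar -C /usr/local/pgsql -xpf -'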

The actual upgrade procedure

Note that as a consequence of having previously upgraded from PostgreSQL 9.4 to 9.6, our package builds override PG_USER and PG_UID to their old values; thus, the database superuser is called pgsql and not postgres as in the current stock FreeBSD packages. The procedure assumes that you are typing commands (and reading their output!) at a root shell.

Preparatory steps on each server

Before doing anything, check /etc/rc.conf and /usr/local/pgsql to verify that there is nothing special about the database configuration. Most importantly, check for a postgresql_data setting in rc.conf: if this is set, it will need to be changed for every step in the upgrade. Also, check for any custom configuration files in the data directory itself; these will need to be copied or merged into the new server configuration. (Because of the way our configuration management works, I could just copy these, except for two lines that needed to be appended to postgresql.conf.)

# Create a scratch dataset (mounted at /tmp/pg_upgrade) to hold a copy of
# the base system plus the old PostgreSQL packages for pg_upgrade to run.
zfs create rootvg/tmp/pg_upgrade
# Install the already-built world into the scratch tree.
cd /usr/src
make -s installworld DESTDIR=/tmp/pg_upgrade
mount -t devfs none /tmp/pg_upgrade/dev

Note that these steps will be different for servers running in a jail — you will probably have to do all of this from the outside. The mount of a devfs inside the destination tree is probably unnecessary; it’s not a complete chroot environment and the package setup scripts aren’t needed.

At this point, I would stop configuration management and set downtime in monitoring so the on-call person doesn’t get paged.
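
In our case that amounts to something like the following; Puppet is inferable from the postgresql_puppet_extras.conf file in the next step, and the monitoring half depends entirely on your own tooling:

# Keep configuration management from re-pointing the package repo mid-upgrade.
puppet agent --disable "PostgreSQL major-version upgrade in progress"
# Then schedule downtime for this host in your monitoring system, by whatever
# mechanism it provides, so the on-call person is not paged.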

conffiles="cacert.pem keytab pg_hba.conf pg_ident.conf postgresql_puppet_extras.conf server.crt server.key"

This just sets a shell variable, for convenience, listing the managed configuration files that have to be restored before starting the server after each upgrade step. Depending on your environment this may not be necessary at all; if it is, you will almost certainly need to customize the list.

The following is the main loop of the upgrade procedure. You’ll repeat this for each release you need to step through, substituting the appropriate version numbers in each command.

# install the _old_ database release in the temporary install tree
pkg -r /tmp/pg_upgrade/ install -y postgresql96-server postgresql96-contrib
# Update the package repository configuration to point to the repo for the _new_ release
vi /usr/local/etc/pkg/repos/production.conf 
cd /usr/local/pgsql
service postgresql stop
# If and only if you set postgresql_data in rc.conf, update it for the new release
sysrc postgresql_data="/path/to/data/$newvers"
# Try to install the new release. This is known to fail going from 9.6 to 10.
pkg install postgresql10-server postgresql10-contrib
# If the above fails, run "pkg upgrade" and then repeat the "pkg install" command

# Create the new database control files. Will fail if the data directory already exists.
service postgresql initdb
# Check whether the upgrade can work
su pgsql -c 'pg_upgrade -b /tmp/pg_upgrade/usr/local/bin -B /usr/local/bin -d /usr/local/pgsql/data96 -D /usr/local/pgsql/data10 -c'
# The most common reason for this to fail is if the locale is misconfigured.
# You may need to set postgresql_initdb_flags in rc.conf to fix this, but you
# will have to delete the data10 directory and redo starting from the initdb.

# Same as the previous command without "-c". Use "-k" instead for link mode.
su pgsql -c 'pg_upgrade -b /tmp/pg_upgrade/usr/local/bin -B /usr/local/bin -d /usr/local/pgsql/data96 -D /usr/local/pgsql/data10'

# pg_upgrade will likely have output some instructions, but we need to start
# the server first, which means fixing the configuration.
for a in ${conffiles}; do cp -pRP data96/$a data10/; done
tail -2 data96/postgresql.conf  >> data10/postgresql.conf
service postgresql start

# If pg_upgrade told you to update extensions, do that now:
su pgsql -c 'psql -f update_extensions.sql template1'
# If pg_upgrade told you to rebuild indexes, do that now:
su pgsql -c 'psql -f reindex_hash.sql template1'

# For a big database, this can be interrupted once it gets to "medium"
# (make sure to let it complete once you have gotten to the final version).
su pgsql -c ./analyze_new_cluster.sh 
# If the new version is 14.x or above, run the following command instead:
su pgsql -c 'vacuumdb --all --analyze-in-stages'

Before moving on to the next version in your upgrade path, you should probably check that the server is running properly and authenticating connections in accordance with whatever policy you’ve defined.
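
A quick smoke test between steps might look like this; the remote connection string is a made-up example, standing in for whichever client host and authentication policy (TLS certificates, Kerberos keytab) you actually care about:

service postgresql status
su pgsql -c 'psql -Atc "select version()"'
# From a client host, confirm that authentication still works end-to-end;
# the host and database names below are placeholders.
psql "host=db.example.org dbname=accounts sslmode=verify-full" -c 'select 1'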

I followed this procedure for all of my servers, and ran into only one serious issue — of course it was on the big 50GB shared server. pg_upgrade -c fails to diagnose the case where the database contains a custom aggregate of the sort described in this mailing-list thread, and the upgrade process errors out when loading the schema into 14.x. The only fix is to drop the aggregates in question (after first reinstalling and starting the 13.x server) and then recreate them the “new way” after completing the upgrade. Thankfully this was easy for me, but it might not have been.
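
For illustration, the shape of the workaround looks roughly like this, using the textbook array_accum aggregate (built on array_append, whose polymorphic signature changed in 14) as a stand-in for the actual aggregates involved; the database name is made up:

# On the still-running 13.x server, drop the offending aggregate:
su pgsql -c 'psql -d bigdb -c "DROP AGGREGATE array_accum(anyelement)"'
# ...run the 13.x-to-14.x pg_upgrade step as above, then recreate the
# aggregate against the new signature of array_append:
su pgsql -c 'psql -d bigdb -c "CREATE AGGREGATE array_accum(anycompatible) (sfunc = array_append, stype = anycompatiblearray)"'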

After all that, once you’ve verified that your applications are all functioning, you are almost done: rebuild the production package set, restart configuration management, and remove the temporary install tree (there’s nothing in it that is needed any more).
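
The tail end of that cleanup, in the same hedged spirit as the other sketches above (dataset name as before, Puppet assumed):

# Re-enable configuration management and remove the scratch install tree.
puppet agent --enable
umount /tmp/pg_upgrade/dev
zfs destroy rootvg/tmp/pg_upgrade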

Posted in FreeBSD

The Turnpike Extension is too wide

For two decades, my homeward commute (when I’m driving, and these days when I’m also not working from home) has been the same: head south on Mass. Ave. to Newbury St., take the on-ramp formerly known as “exit 21”, and then the Mass. Pike west to Natick. This on-ramp has always been dangerous, with very limited visibility and no merge zone; I’ve narrowly avoided crashes innumerable times by either slamming on the brakes or flooring the accelerator. I didn’t avoid a crash once, many years ago, and got rear-ended by someone roaring down the ramp behind me as I stopped for heavy traffic. This two-mile stretch of highway is four lanes in each direction, with no shoulder; you can see the area I’m talking about on this map (from Google Maps):

This design made a modicum of sense in the original configuration of the Turnpike Extension, from 1963 until the introduction of physically separated E-ZPass lanes in the 2000s: with a mainline barrier toll at the Allston interchange as well as exit and entrance tolls there, westbound traffic was frequently slowed or stopped as far back as Comm. Ave. if not all the way to Copley Square. But when MassDOT implemented all-electronic tolling on the Turnpike in the 2010s, the Allston interchange was reconfigured to have three full-speed through lanes in both directions — which eliminated the bottleneck at Allston. To be clear, this change didn’t increase the capacity of the Turnpike; it simply meant that traffic was not stopped in Allston in addition to being stopped at the Copley (eastbound) and Newton Corner (westbound) lane drops at times of congestion. MassDOT is now engaged in a process that will hopefully replace the outmoded and wasteful Allston interchange, which was designed the way it was to support a connection to the never-built I-695 Inner Belt freeway.

The Prudential Center, one of the first major freeway air-rights projects in the US, was built in conjunction with the Turnpike in 1961–63. The John Hancock Tower and Copley Place followed in the 1970s and 1980s, and the state has long been looking to generate even more revenue from its valuable Back Bay real estate. Since late 2020, two air-rights projects have been under construction between Mass. Ave. and Beacon Street — with a third one under contract but yet to begin — and as a result there has been a lane restriction in both directions between Copley and Beacon St. to allow the construction crews to safely install the buildings’ foundations. After much hue and cry over how much of a traffic impact this would have, the end result has been … nothing. (Granted that traffic has been reduced somewhat as a result of the pandemic, but not so much as that.) With the bottlenecks through Allston and Copley already being limited to three lanes, the traffic capacity simply isn’t limited by the work zone — at least not any more than it is limited by the unsafe merges that were always there. Eastbound traffic still backs up at the Copley lane drop, and westbound traffic doesn’t back up until Market St. or even closer to Newton Corner. On the westbound side, there simply isn’t that much traffic entering at Copley Square or Mass. Ave. and exiting at Allston — that’s simply not a route that makes sense for most of the trips that could conceivably use it — so the traffic that takes exit 127 westbound is traffic that came from I-93 or points east, and without a toll barrier on the exit ramp, that traffic can queue along the whole length of the ramp without backing onto the mainline Turnpike.

This aerial photo shows the air-rights parcels (under-construction parcels shaded in purple, future parcel in green):

(I may be misremembering the parcel numbers, and didn’t bother to look them up, so they’re not labeled on the map.)

One of the most controversial aspects of the project to replace the Allston interchange (which is also supposed to include a bus/train station and significant new construction on the Harvard University-owned former railyard) has been the area called “the throat”, where MassDOT is trying to thread a two-track railway and twelve lanes of freeway (eight lanes of the Turnpike, four of Soldiers Field Road) in a very narrow stretch of land between the Boston University campus and the Charles River, without impacting traffic on the MBTA Worcester commuter rail line or on the Turnpike or on Soldiers Field Road during construction, and without any permanent structures in the Charles, and without unduly limiting use of the Dr. Paul Dudley White path, which parallels Soldiers Field Road. That section is highlighted in the map below:

Community advocates have largely recognized this as a fool’s errand, and contrary to the city’s and state’s climate goals to boot. Fixing the interchange and removing the viaduct through the throat, and of course building the new train station, are all recognized as positives, but even with a Federal Highway Administration waiver for limited shoulder widths, it has proved impossible to squeeze all of these roadways into the space allowed without either elevating one over the other or building into the Charles. Of course, the “impossibility” is a sham: it’s only “impossible” because both MassDOT (through the effort of former MassDOT secretary Stephanie Pollack) and the Department of Cars and Roads Conservation and Recreation have refused to countenance any reduction in freeway capacity. It is quite clear from the results of the current work zone that the Turnpike is two lanes too wide between Allston and Copley, and it would be a tremendous boon for safety and the environment if it was reduced to six lanes with full shoulders and safe merges at the entrance ramps. Likewise, although I do not have as much direct knowledge, the bottlenecks on Soldiers Field Road are all downstream, on Storrow Drive, especially at the Bowker Interchange and at Leverett Circle: Soldiers Field Road could stand to be just two lanes wide in this stretch.

These width reductions (24 feet on the Turnpike, 20 feet on Soldiers Field Road) would have a very limited impact on traffic, if DOT and DCR stopped their obstructionism, and the result would be a much cheaper, more constructable Allston project with bigger buffers between the freeway traffic and park users.

Posted in Transportation

A busy week for transportation legislation

It’s Memorial Day weekend, and in Massachusetts that means two things: the state legislature is debating the budget and the state’s draft Capital Investment Plan has been published for comment. In Washington, the Biden Administration has published its official budget request for Congress (which will go right in the trash; Congress writes its own budget), but the administration’s so-called “American Jobs Plan” is being debated (and torn to shreds) by Congress, and in addition, the five-year surface transportation authorization — which was given a one-year extension last year because of the pandemic — is also being debated.

As a result, the transportation funding situation for the coming year is even less clear than it usually is. The US Department of Transportation released a glossy brochure attempting to explain the President’s budget request, which called out some worthwhile initiatives but largely failed to clarify which programs were associated with which funding requests. Amtrak released its own glossy brochure, explaining to congressional delegations what sort of service enhancements it was thinking about (if only Congress would appropriate more money) — largely devoid of HSR or any particularly compelling program for capital investment or frequent service.

I pointed out on Twitter a few weeks ago that Joe Biden is both the first and last president of his generation — the Silent Generation. He took the oath of office as a United States Senator just a few days after I was born; his political formation is quite different from both the G.I. Generation who preceded him in birth and the Baby Boom Generation who preceded him to the presidency. This is reflected in his speechmaking, in his non-confrontational approach to governing, and especially in his approach to policy: he is going to defer to Congress, and while he will privately jawbone Manchin and Sinema, ultimately he is going to sign whatever Congress passes, and he isn’t going to make a big deal of it if some of his major legislative initiatives founder in the Senate.

While I do not doubt the sincerity of Biden’s support for the proposals his administration has put forward, I believe he values the appearance of bipartisanship more than he values livable cities or indeed an inhabitable planet, and fundamentally, when the autoists in Congress finish chewing up and spitting out his infrastructure proposals, turning them into more of the same polluter-subsidizing climate arson, he will acquiesce without protest. His only “red line”, so far as I can perceive, is his refusal to increase taxes on polluters — who by and large make less than $150,000 a year and drive a light-duty truck. The rich are simply not numerous enough for any taxes aimed at changing their behavior to have any meaningful effect on the existential crisis of our time. We need more and heavier sticks, like a carbon tax, like a national VAT, that would actually be paid by the vast majority of people (myself included), in order to incentivize a meaningful amount of behavioral change.

OK, enough about the situation in Washington — what’s going on on Beacon Hill?

The Senate unanimously passed its version of the state budget this past Thursday. I believe there is one more procedural vote to come, this Tuesday, and then it will go back to the House, which will take another procedural vote, and then both houses will appoint members of a conference committee to hammer out the numerous small differences between the chambers. (Of course, the conferees know who they are already, even though they haven’t been formally appointed, and through this whole process, the staff of both chambers has been tracking the points of disagreement so they know what needs to be resolved.) This whole process might take a week or two, then both chambers will vote again to accept the conference report (by a veto-proof majority) and deliver the final engrossed text to the governor’s desk, at which point he will veto a bunch of items, and then both chambers will have a day-long vote-a-thon to override most or all of the individual vetoes. Hopefully they can get this all done by the end of June, at which point the legislature will have to take up the FY21 close-out supplemental budget (which reconciles the budget that they passed back in January with what the state actually spent).

The budget as passed by the Senate does not include any revenues from the Biden administration’s “American Rescue Plan”, for the simple reason that the Baker administration doesn’t yet have guidance from Washington on how the state is allowed to use the money. This is also an issue for the Capital Investment Plan, which I’ll discuss next. (The MBTA budget assumes the availability of ARP funds to cover operating expenses, but is not itself subject to legislative approval.) It does include a significant drawdown of the state’s “Rainy Day Fund”, which will presumably be reversed in a supplemental budget once the federal guidance is received on eligible expenses. (ARP funds are generally available for allocation through the end of calendar 2023, but cannot be used to reduce state tax rates, fund pension obligations, or various other things state legislatures might want to do, so each federal agency charged with disbursing ARP money has to go through a rulemaking or similar procedure to issue official guidance about which expenses are or are not eligible and what the state must certify in order to access the funds — the previous Coronavirus relief bills, “CARES” and “CRRSAA”, had similar administrative complications but subtly different requirements.)

Without debate, the Senate adopted an amendment by Sen. Joe Boncore, chair of the Transportation committee, which restructures and increases the fee charged for TNC (i.e., Uber and Lyft) rides, adds additional reporting measures, and creates a trust fund into which the state portion of the fee revenue is paid. Most significantly, the Boncore amendment creates a “public transit access fee”, an additional twenty-cent charge for trips that both begin and end within “the fourteen cities and towns”, and paid into a segregated account to support a low-income fare program to be established by the MBTA. (Which are “the fourteen cities and towns”? They are the communities in the original, inner-core MBTA district, which still receives the majority of bus and rapid transit service.) The amendment passed 37–2 on a non-recorded standing vote, but it remains to be seen whether this language will survive the conference committee; if it does, expect the governor to veto it. (It’s likely that the legislature will override the veto if it gets that far.) Unlike with the bond bill back in January, the legislature will remain in session after the budget is passed, so there is no possibility of a pocket veto.

I should note here that neither House nor Senate budget includes provisions for an MBTA board of directors, to replace the current five-member, unpaid Fiscal and Management Control Board, which expires at the end of June. The governor’s budget as filed included such language, but it was dropped from the budget by House Ways & Means, and it was also left out of the Senate budget. (Senate Minority Leader Bruce Tarr proposed an amendment to restore the governor’s language, but it was “bundled” with several other amendments and rejected on a voice vote.) There are companion House and Senate bills in the Transportation Committee which would establish a new board, but thus far, with only five weeks left of the FMCB’s legal existence, the committee has not chosen to advance either one. The bills are H.3542 by Rep. Meschino and S.2266 by Sen. Boncore; the language in both looked the same on spot-checking but I did not do a line-by-line comparison. In addition to expanding the board to seven members, S.2266 would provide for some local representation, by giving the existing municipally-appointed Advisory Board the right to appoint one MBTA board member, and would authorize an annual stipend of $12,000 for each board member except the Secretary of Transportation. While this is not the exact structure I would prefer, time is of the essence and I would like to see one of these bills reported out of committee within days to allow for a smooth transition from the old board to the new.

Finally, on to the state’s Capital Investment Plan (CIP), which the MassDOT board and the FMCB voted to release for public comment last Monday, six weeks before the deadline for it to be adopted and generally much too late for any public comments to make a significant impact. As with last year, significant uncertainties related to the aftermath of the pandemic and the availability of federal support have been used to justify a one-year “maintenance of effort” CIP rather than the five-year CIP the law requires. These uncertainties do not get the state out of its federally-mandated five-year State Transportation Improvement Plan, through which all federal grant programs flow, so we still have some idea of what might be funded in the out years simply because it has to be programmed in the STIP; the Boston Region MPO will meet on Thursday to endorse its FY22-26 TIP, which is the largest regional contribution to the STIP, and this will then flow through to the state CIP, which both the MBTA and MassDOT boards must formally adopt at their joint meeting at the end of June. The state and the federal government operate on different fiscal years (the state’s is July to June and the feds use October to September), which means the exact alignment of the two plans depends on the exact scheduling: some SFY22 projects are funded with FFY21 obligations.

One thing we do know about the Rescue Plan is that it includes an additional $175 million of funding to recipients of Federal Transit Administration capital improvement grants in fiscal year 2019 which have not yet entered revenue service. Up to 40% of this could go to the Green Line Extension — except that the GLX project is running under budget and may not need any more money. The text of the act makes the distribution of funds non-discretionary, but the agency will have to determine what Congress intended by this provision in the case of funds to be distributed being in excess of obligations. The MBTA and FTA are in discussions to see how the GLX money could be reallocated — the original funding agreement includes a clawback provision for Cambridge and Somerville if their local contributions turn out not to be required, but if the surplus comes in beyond that amount, after all contractor claims are resolved, the MBTA would like to use the money for other priorities. This should be resolved by the time the boards vote on the CIP in four weeks, so it’s likely that there will be some transfers of these funds in the final CIP that aren’t shown in the draft.

Having said all of that, and in the knowledge that my comments will have no meaningful effect on the process, I still chose to email the state with my comments, which will probably get a formal reply from the staff some time in September. Here is what I said, lightly edited for presentation here:


I will begin my comments with some process issues.

  • While the “accessible” PDF version of the draft is definitely more accessible than the “story map” (which has undocumented requirements for computer hardware and is difficult to navigate or resize), it is still lacking in some basic information, such as the actual location (at least city or town) of projects in the MBTA section of the plan. Many of the “project descriptions” are quite cryptic, even for someone who regularly attends/watches the board meetings, and need a more complete explanation. That said, the breakdown of programmed expenditures in the last four columns of Appendix A is an appropriate and helpful way to present the status of a project in the absence of an itemized plan for the out years.
  • To repeat my comments from previous years, the schedule for comment is far too rushed. Anyone who has followed this process over time knows that all of the important decisions have been made by the staff already, sometimes months ago, and as a result there is almost zero chance that public comment will result in any changes to the draft before the boards vote to adopt it in a few weeks. The capital planning process needs to be open and transparent, and that starts with publishing the universe of projects and assigned priorities well before the end of the fiscal year so that members of the public can develop reasoned arguments about which should be advanced or delayed.
  • While I am sympathetic to the desire to do a short-term capital plan given the uncertainties over whether Congress will pass an infrastructure bill, it is unfortunate that the draft CIP only shows one year, and does not show projects in the out years that might have an opportunity to be accelerated if additional funding is made available. This is important information and citizens deserve to have at least some details so that we can make a case to our representatives and before the various boards. Many agency priorities have changed and numerous projects have been accelerated, so it is not possible to refer back to the FY20-24 CIP for information about FY23 and FY24 projects.

I have one comment regarding Highway Division programs: I am disappointed that the Weston Route 30 reconstruction and bike/ped safety improvements project was not programmed by the Boston Region MPO and said so in comments on the draft TIP. Should additional funding become available before the end of FY22 I urge that consideration be given to programming this important project.

My remaining comments are all regarding the MBTA section of the draft.

  • Should the Green Line Extension come in under budget (as suggested at last Monday’s board meeting), and should FTA allow the MBTA to reprogram the FFGA funds, I strongly support funding the Red/Blue Connector and/or advancing the bus facility modernization program by replacing Arborway garage.
  • However, I remain unalterably opposed to the destruction of the North Cambridge trolleybus network as currently proposed by MBTA staff. Trolleybuses are inherently more efficient than battery buses, do not require supplemental diesel heaters, and are already zero-emissions vehicles; North Cambridge has been a trolleybus carhouse since it was converted from streetcars in the 1950s (when most Mattapan carlines were dieselized, contributing to today’s environmental injustice in that neighborhood). The Transportation Bond Bill specifically authorized “transit-supportive infrastructure” program funds to be used for trolleybus infrastructure, including electrification, and the MBTA should be making plans to expand North Cambridge trolleybus service to other nearby bus routes such as the 68, 69, and 77, by extending the trolley wire and/or acquiring battery-trolleybuses with in-motion charging.
  • The continued progress on making commuter rail stations fully accessible is laudable. However, I continue to be concerned that upgrading stations to full high-level platforms is being approached solely as an accessibility issue, and thus being advanced piecemeal, rather than as a significant constraint on operations, staffing rationalization, and competitive rolling stock procurement — as was obliquely pointed out by Alistair Sawers in his presentation before the boards last Monday. While I strongly support completion of the current platform accessibility projects (all of them on the Worcester Line), future investments in platform upgrades need to be done more strategically.
  • In particular, given the response of the FMCB to Mr. Sawers’ presentation — specifically, endorsing the idea of proceeding with a traditional procurement for EMU rolling stock — construction of high-level platforms on the remaining Providence and Fairmount Line stations needs to be prioritized, packaged as a single unit of design to control costs, and put out to bid ASAP, preferably in FY22, to ensure that these lines will be able to use standard rolling stock purchased in a competitive marketplace rather than bespoke trains with nonstandard multi-level boarding. Platform upgrades on other lines should be prioritized on a line-by-line basis, so that remaining diesel lines can be converted to remote door operation and the reprocurement of the operating contract can go to bid without the burden of unnecessary assistant conductors.
  • The placeholder commuter-rail project labeled “future fleet” should obviously be reprogrammed as an explicit EMU procurement. The General Court has made it quite clear that it is the policy of this Commonwealth to electrify the commuter rail network, using overhead catenary electrification and EMU rolling stock, and has authorized nearly a billion dollars in bond issuance over the next decade to put it into practice. It is time for the MBTA and MassDOT to get in line.

Project-specific comments:

  • P0170: station design for full access to both platforms should be advanced.
  • P0261: the description says “3rd track feasibility study” but other MBTA documents and presentations have implied that the third track was actually going to be progressed to design and eventual construction. Please clarify.
  • P0650 and others: since coach availability has been an issue recently, even during the period of pandemic-reduced schedules, I support continued lifetime extension and overhauls of legacy rolling stock to keep this equipment running while electrification is being pursued at the greatest practical speed.
  • P0863: strongly support construction of a south-side maintenance facility, but caution that the design needs to be able to accommodate articulated EMUs which are several times longer than legacy coaches, so as not to constrain rolling stock procurement.
  • P1009: what FTA compliance actions are these? For a $57mn program this needs to be spelled out explicitly.
  • P1011: GLX hasn’t even finished constructing the maintenance facility, and you’re already looking to spend $12m to modify it?
Posted in Transportation

Weekend excursion: Stations of the B&M New Hampshire Main Line/MBTA Lowell Line

As I’ve neared the end of this series of posts, I’ve gotten a bit better at procrastinating, so most of the photos this post is based on (see the associated photo gallery) were taken a month ago now, and I’m drawing a lot on unreliable memory and aerial photos (and a bit of Wikipedia) to bring this together. It’s an interesting time to be writing about this line, for a number of reasons I’ll try to articulate.

As the title suggests, today’s Lowell Line was historically the Boston & Maine’s New Hampshire Main Line, with passenger service north through Nashua, Manchester, and Concord into the White Mountains and through Vermont to Montreal. In Chelmsford, north of Lowell, the line connects with the Stony Brook Railroad and becomes part of Pan Am’s freight main line from Western Massachusetts to Maine. There have been discussions on and off about re-extending commuter rail service to Nashua (where the historic B&M station apparently still stands) and even Manchester, but the discussions have always foundered on New Hampshire’s refusal to subsidize anything other than private automobiles. Recently, Amtrak released a map of possible service extensions which included service as far as Concord — Amtrak, unlike the MBTA, has both the legal right to operate on any railroad and a mandate to provide interstate service, and already operates Downeaster service along the line as far as Woburn.

As I described in more detail in my survey of the Western Route, the B&M planned in the 1950s to run all longer-distance services north of Boston via the NHML, with trains to Haverhill and Portland using the “Wildcat” Branch in Wilmington to access the northern part of the Western Route. This made a good amount of sense (and still does, which is why the Downeaster does so) because the NHML is the highest capacity line on the ex-B&M network, and the second-highest-capacity on the entire MBTA system: it’s the only North Side line with no single-track bottlenecks and no drawbridges other than at North Station; even slow diesel trains can maintain decent speeds because the stations are few in number and fairly widely spaced. All of the stations have platforms for both tracks, allowing bidirectional service without scheduling difficulties, although with the exception of the recently constructed Anderson RTC/Woburn station, they are all low-platform (all except West Medford and closed-for-demolition Winchester Center have mini-highs).

Which then brings me to the saga of Winchester Center, one of the two stations that got me started on this series of travels back in March. Winchester was scheduled for accessibility upgrades, with final design nearly complete and construction supposed to be put to bid in the second half of this year, when regular inspections early this year revealed safety issues with the old station’s platforms. Rather than perform emergency repairs, the MBTA chose to simply demolish the old station early, while commuter-rail ridership was low due to the pandemic, and remove the demolition from the scope of the reconstruction contract, reducing the cost and allowing construction to proceed more quickly. While I did not make it to Winchester Center in time to see the old platforms, I did get pictures of the demolition work in progress. (Not literally in progress, though, because I made my visit on Easter Sunday when no work was taking place.)

So with all that out of the way, let’s go station-by-station. With the historic stops in East Cambridge, Somerville, and Medford Hillside all long gone, the first stop on the modern Lowell Line is at West Medford. The station is located next to the West Medford post office (in fact the inbound shelter looks to be attached to the side of the building) and it is inaccessible, with only low-level platforms on both tracks. A few years ago, the MBTA’s system-wide accessibility program rated West Medford one of the highest priority stations to receive full accessibility upgrades, but I haven’t seen anything to indicate that this has been advanced in the capital program since then, not even as far as a 10% design. In the 2018 passenger counts, about 600 people a day used West Medford — which is pretty good for a commuter rail station but only the fourth-busiest suburban station on this line. Much of the station’s popularity can be explained by its assignment to the inner-core fare zone, zone 1A, so travel to North Station costs only as much as a subway fare and is much faster than taking the bus to Sullivan and then transferring to the Orange Line into town. (It will be interesting to see how the popularity of this stop changes when the Green Line Extension opens, since it will operate much more frequently and offer bus connections closer to West Medford.)

The route runs on a viaduct through much of Winchester; Wedgemere station is located near the south end of that viaduct, where the railroad crosses the Aberjona River at the north end of Upper Mystic Lake. In the middle of a wealthy residential neighborhood and without practical bus connections, Wedgemere gets a surprising amount of traffic compared to its 120-stall town-owned parking lot, about 300 riders in 2018. With the closure of Winchester Center station, only four tenths of a mile to the north, Wedgemere is currently the only station in the town of Winchester, but with much more development within walking distance, Winchester Center had about 50% more traffic.

North of Winchester Center, a long-abandoned branch once led to downtown Woburn, with the main line running through a largely industrial area on the east edge of the town, before crossing under Route 128 into a truck-oriented wasteland of industrial parks. At the Route 128 overpass, Mishawum station was formerly the primary station serving Woburn, located between two toxic-waste cleanup sites, “Wells G & H” and “Industri-plex”. It used to be accessible, and was upgraded with a ramp system on the inbound platform and mini-highs on both platforms before being abandoned in favor of a new station half a mile deeper into auto-dominated industrial-park hell. The former parking lot, shared with the Woburn Logan Express, has turned into a bank office building and a Dave and Buster’s. The station still stands, and still seems to be receiving some maintenance, but at some point in the last decade, the mini-high platforms were partially demolished to reuse the folding steel platform edge at another station. As a result, Mishawum is the only MBTA station to have been accessible, and then made inaccessible. As late as 2018, long after the new station was opened, Mishawum was (apparently illegally?) still being served as a flag stop by a handful of trains a day; the 2018 traffic counts (32 passengers a day) are the most recent mention of any kind I can find of it. The town of Woburn apparently wants to see service maintained at Mishawum, because as we shall see, its replacement is even farther from where any humans can be found without a steel exoskeleton, whereas there is a residential neighborhood not too far southwest of Mishawum. But it’s no longer shown on public schedules, and with the MBTA’s slow diesel trains it’s really too close to the new station to even be a flag stop. Even with electrification, a new station at Montvale Ave. or Salem St. would have a much larger catchment of Woburn residents and result in a more appropriate interstation.

The new station in question is of course Anderson Regional Transportation Center, which is an enormous ocean of parking, nearly 2,000 spaces, accessible only via an unwalkable car sewer with a direct exit off I-93, connected to a combination bus stop and train station, and owned and operated by Massport. Of course, it hardly matters that it’s unwalkable, because in the middle of this toxic waste site (the Woburn Industri-plex Superfund site) there’s nothing you’d want to walk to or from. For train facilities, the station has two overhead pedestrian bridges, one connecting the high-level center platform to the second floor of the station building, and the other, at the far northern end of the platform, connecting to the northwest edge of one of the enormous parking lots. In addition to the MBTA commuter trains, the Downeaster stops here, and presumably if the proposed Amtrak service to Concord ever gets off the ground, it would as well (and probably Lowell, too). When I visited, the parking lots were barren, and Logan Express bus service had been suspended due to the pandemic. Despite the horrible location, the station definitely got plenty of use, with more than 1,200 passengers a day in 2018. (One wonders how many of those passengers are actually driving down I-93 from New Hampshire.)

The next station north, Wilmington, is where the Wildcat Branch diverges to the north as the main line heads north-northwest. The turnout is located just north of the outbound platform, resulting in offset platforms. The single-track Wildcat only connects to the outbound track, but a universal crossover south of the station allows access to both tracks; passenger service using the Wildcat does not currently make a stop at Wilmington, so it matters little that the branch only serves one platform. There is a 200-space MBTA-owned parking lot on the east side of the tracks, but this is far too small to account for the average daily ridership of 575; there is also an apartment complex, “Metro at Wilmington Station”, at the south end of the inbound low-level platform.

There’s a long interstation, about 6 miles, between Wilmington and North Billerica, but the line runs through wooded, low-density areas nearly the entire length. Just south of North Billerica is the B&M’s former maintenance yard, now an industrial park called Iron Horse Park, a 553-acre Superfund toxic waste site, including numerous landfills and former waste lagoons, which are contaminated with a variety of solvents, heavy metals, asbestos, and pesticides. Iron Horse Park is in its 37th year of EPA-supervised cleanup, partially funded by the MBTA, which made the mistake of acquiring 150 acres of the property in the 1970s as it began the process of taking over the B&M’s commuter rail operations. The MBTA’s new backup rail operations center is being constructed in a less restricted part of the park.

I actually went to North Billerica station first, before heading down to Iron Horse Park. It’s another two-track station with low-level platforms and mini-highs, made slightly more interesting by its 19th-century station building (although it’s been extensively renovated, to the point that I had figured it was new-old-style rather than Actually Old when I visited). The station has two surface parking lots, operated by the Lowell RTA, with 540 spaces between them, and is also served by two LRTA bus routes, helping to explain its over 900 daily riders in the 2018 statistics. As the sun was setting, I did not make it all the way to Lowell on my initial trip, but returned a week later as part of a wrap-up trip that also included stops in Worcester, Lawrence, Rowley, and Newburyport.

At Lowell, I found LRTA’s exceedingly expensive and aggressively human-enforced parking garage, located over the rail line and next to LRTA’s central bus hub. Google Maps initially wanted to take me into the west garage entrance, which was blocked with Jersey barriers; the entrance that was nominally open turned out (unlike at every other RTA garage) not to be equipped with automatic ticketing and payment systems, and the human who was supposed to sell me a ticket was not in their booth. I moved on, not wanting to spend $8 to park for 15 minutes, stopping to take a few quick pictures of the bus hub and the commuter-rail platform, but was chased away by an LRTA employee in a pickup truck. The platform here is a low-level center platform, between the westernmost pair of tracks, with a half-length high-level platform accessed from the 700-space garage, which is built across the tracks. The line quickly narrows to two tracks north of the station before crossing the Pawtucket Canal, and narrows to a single track at the wye with the Stony Brook. There is no layover facility on the Lowell Line, so trains entering and leaving service must do so at Boston Engine Terminal in Somerville; the lack of such a facility is one of the major constraints on increasing service on the line (because there is little room to store the additional trainsets that would be required). In 2018, more than 1,500 people a day used the station.

That concludes the March–April run of MBTA station “weekend excursions”, but the project as a whole is far from complete: in addition to the new stations currently under construction (six stations of South Coast Rail, to open 2024; New Chelsea, opening later this year; New Natick Center and Central Falls–Pawtucket, opening next year) there still remain all of the stations that I avoided because they were in crowded areas and there’s still a pandemic on: Boston Landing, Lansdowne, Back Bay, Ruggles, Forest Hills, South Station, JFK/UMass, Quincy Center, Braintree, North Station, Malden Center, and Porter, plus the rest of the Fairmount Line and three stations in Rhode Island. In addition, Mansfield station, which I last saw while it was still under construction, fully opened in 2019. I’ll be fully vaccinated in a few days, and weather permitting, I still have plenty to do and see.

Posted in Transportation