
PUBLIB List is Very Messed Up

The PUBLIB and Web4Lib discussion lists have been down all week–since at least Monday. I’m the co-moderator for the PUBLIB list and on the editorial board of Web4Lib. The lists will not come back up until OCLC (our host for both) figures out why they are misbehaving, and that may take a while. I thought I’d post this here because these lists do not have a “status page” of the kind that has become popular for some websites (for example, Dreamhost and Fastmail both have status blogs hosted somewhere safe from their main ops).

The lists died on Monday morning, which has happened before. On Wednesday we began to see some bounce messages–usually a sign that the list is coming back to life. However, this was merely the list going into its manic phase. On Thanksgiving eve I was flooded with so many bounced email messages (over 49,000!) that they nearly overran my personal email queue–even worse, I couldn’t log in to the administrative interface to remove my email address from the list or take any other action. During this period I sent several pleas for assistance to the two email support addresses I have on file, but–I have to say this–never got a response.

Yesterday OCLC sent a message to Roy Tennant, which he forwarded to me. I include portions of it below so you can understand what’s going on.

Meanwhile, this has been a really good reminder of the price of free, and perhaps also a reminder of some of the challenges I face as I move into a new work environment. It’s wonderful that PUBLIB has a home, and I appreciate what OCLC does for PUBLIB and Web4Lib. But it’s frustrating not to be fully in the loop with a service for which I’m at least partially responsible; it’s also frustrating not to have more influence over the accountability of this service. In some ways, I’m getting a good taste of the customer’s point of view.

From OCLC:

“As you were aware of the fact that our listserv [sic; PUBLIB and Web4Lib run on Mailman] was down for a few days this week, after restarting the services yesterday morning (Wednesday 11-22-2006) all the services were back online. Late last evening/night we ran into some major issues with the listserv as it started to spew over 120,000+ emails. This brought our mailhost to a complete halt. There were more emails in the queue than our server could handle. At that point we took the decision to stop the listserv in order to clear the queue but the volume of emails continued to overflow the queue.

“We tried to stem the emails coming from the server but it was quite unsuccessful. The ratio of send-to-receive was approximately 1:1000.

“At one point the number of emails sent had stalled because the queue was saturated. At about 110,000 emails in the queue … the service used by the application to scan and send email, was restarted to try to clear the queue. The queue didn’t clear but instead continued to grow.

“All listservs [sic] are currently down and will be down for at least few more days before we can investigate the root cause and find the appropriate resolution. We have few possible solutions and will explore them as we try and resolve this at the earliest. … “
