Kind of off-topic, but Mailman related
I back up my server using "borg" to send it to a different server. I keep 7 dailies, 4 weeklies and 6 monthlies. When I used Mailman 2, my borg backup size stabilized and never seemed to grow much. But since switching to Mailman 3, my backup size has been growing, seemingly without bounds, even though the amount of disk space I'm using on the Mailman server isn't growing much. I suspect this might be because the way it's storing stuff in PostgreSQL instead of in the file system is fooling borg's deduplication methodology. Does anybody else have any experience with borg and Mailman 3? Should I exclude PostgreSQL from the backup and do a separate pg_dumpall?
-- Paul Tomblin
Paul Tomblin via Mailman-users writes:
> I back up my server using "borg" to send it to a different server. I keep 7 dailies, 4 weeklies and 6 monthlies. When I used Mailman 2, my borg backup size stabilized and never seemed to grow much. But since switching to Mailman 3, my backup size has been growing, seemingly without bounds,
Do your lists enable archives? If so and you're using HyperKitty, the archives are also stored in Postgres. Depending on whether you backed up archives in Mailman 2 and whether that was conceptually part of your Mailman backup (vs your website backup, for example), that could explain some of a size difference, though I don't see offhand how it would explain consistent growth since conceptually mail archives are append only.
How long have you been using this system with Mailman 3? I can't imagine that it's going to grow without bound for very long (but if you have been observing Mailman 3 for 6 months and it's still growing linearly I'd revise that guess). The question in my mind for now is how big might it get.
> even though the amount of disk space I'm using on the Mailman server isn't growing much.
I suspect that because Mailman 3 is storing things in tables that can cut across lists, many of them with small records (indexes and relations), the segments that borg detects and deduplicates are going to be smaller than they were with Mailman 2. So borg's deduplication algorithm just isn't going to work well no matter what you do. I think you will find the size of the incremental backups will stabilize, but at a substantially larger size than the Mailman 2 backups. Also, Mailman 3 does keep more data (the User and Address databases) than Mailman 2 did.
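To make that intuition concrete, here is a toy sketch of why scattered small updates hurt deduplication more than appends do. It is not borg's algorithm: borg uses content-defined chunking, while this uses fixed-size chunks, and the chunk size, function names and data are all made up for illustration.

```python
import hashlib
import os

CHUNK_SIZE = 4096  # hypothetical chunk size for the toy model, not borg's


def chunk_hashes(data, size=CHUNK_SIZE):
    """Return the set of SHA-256 digests of fixed-size chunks of data."""
    return {
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    }


def new_chunks(old, new):
    """Count chunks in `new` that a deduplicating store has not seen in `old`."""
    return len(chunk_hashes(new) - chunk_hashes(old))


def flip_byte(data, offset):
    """Return a copy of data with one byte changed at the given offset."""
    out = bytearray(data)
    out[offset] ^= 0xFF
    return bytes(out)


def demo():
    base = os.urandom(64 * CHUNK_SIZE)  # stand-in for a data file

    # Append-only growth (like a mail archive): one new chunk at the end.
    appended = base + os.urandom(CHUNK_SIZE)

    # Scattered small updates (like rows and indexes in a table file):
    # change a single byte in every fourth chunk.
    scattered = base
    for chunk_index in range(0, 64, 4):
        scattered = flip_byte(scattered, chunk_index * CHUNK_SIZE)

    return new_chunks(base, appended), new_chunks(base, scattered)


if __name__ == "__main__":
    appended_count, scattered_count = demo()
    print("new chunks after append:", appended_count)
    print("new chunks after scattered edits:", scattered_count)
```

In this toy model the append adds one chunk to the store, while 16 one-byte edits add 16 full chunks, even though far fewer bytes actually changed.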
> Should I exclude PostgreSQL from the backup and do a separate pg_dumpall?
I would expect that to result in a larger backup than you're seeing now, even with compression.
-- GNU Mailman consultant (installation, migration, customization) Sirius Open Source https://www.siriusopensource.com/ Software systems consulting in Europe, North America, and Japan
On Fri, May 29, 2026, at 1:09 AM, Stephen J. Turnbull wrote:
> How long have you been using this system with Mailman 3? I can't imagine that it's going to grow without bound for very long (but if you have been observing Mailman 3 for 6 months and it's still growing linearly I'd revise that guess). The question in my mind for now is how big might it get.
I back up the full server at once, not just Mailman or the web server separately. I’ve been running Mailman 3 on it since January. The current disk use of my entire server is around 35 GB. The borg backup is somewhere around 600 GB. The old borg backup when I was using Mailman 2 was less than 100 GB (based on the fact that the partition I’m backing up into was smaller than that; since switching, I’ve been doing an lvresize of it every week or so).
-- Paul Tomblin
Paul Tomblin via Mailman-users writes:
> I back up the full server at once [...] I’ve been running Mailman 3 on it since January. The current disk use of my entire server is around 35 GB. The borg backup is somewhere around 600 GB.
OK, if you just did tar backups (uncompressed!) you should see (7 + 4 + 6) x 35 GB = 595 GB, no? So a deduplicated backup of 600 GB is just nuts.
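That arithmetic as a small sketch, assuming every kept snapshot is a full, uncompressed, non-deduplicated copy. The function name and defaults are made up for illustration; the keep counts and the 35 GB figure come from this thread.

```python
def naive_backup_size_gb(snapshot_gb, keeps=(7, 4, 6)):
    """Total size if every kept snapshot were a full uncompressed copy.

    keeps holds the number of dailies, weeklies and monthlies retained.
    """
    return sum(keeps) * snapshot_gb


if __name__ == "__main__":
    # 17 full copies of a 35 GB server
    print(naive_backup_size_gb(35))
```

The same function gives the ceiling for any smaller estimate of the data that actually changes, for example `naive_backup_size_gb(15)`.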
BLUF: I think you have a borg vs. DB problem, not a borg vs. Mailman problem. You should ask the borg developers, because they're the ones who understand their algorithms and software. I would imagine that (1) it's possible to tune borg to keep its own overhead reasonable even if the use case is outside of its primary design, and (2) the borg devs have very likely had experience with "database backups bloat quickly". They can tell you how to configure PostgreSQL to be more borg-friendly, or vice-versa, or to configure borg around some other backup method for PostgreSQL.
A few thoughts since I already thought them:
By "my entire server" do you mean the /var partition, or everything the OS can access? If the latter, about 20GB of that is just operating system and server software, no? Which presumably doesn't change very often in a 6-month period? I'd guess you're looking at most 15GB of potentially variable data, so 255GB in uncompressed tarballs. Even if the 35GB is just /var, I just don't see how you could have a Mailman installation generating 500GB of new backup in 5 months unless the traffic is truly gargantuan.
One possibility for a fraction of that bloat is that you have the prototype ("write-only to maildir") archiver enabled. In that case $var_dir/archives/prototype could be collecting GB fairly quickly. Since it's write-only, there's no loss to disabling it and deleting it. But it's append-only so should deduplicate well, and unless mail traffic is exploding it should pretty quickly stabilize to contribute a constant amount to a stable archive size -- it wouldn't explain your 600-lb gorilla. I could be wrong if you have enough traffic, so check that.
If you want to continue the conversation here, I'd need du data on the Mailman installation (especially Mailman's $var_dir), on Mailman's $log_dir (which is $var_dir/logs by default but can be configured elsewhere, often /var/log/mailman3), and on the PostgreSQL database Mailman is using (usually /var/lib/postgresql/$version/main). It would also help to know how big your Mailman 2 archives were, if you kept them. But I really don't think I can help much.
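If du is awkward to run, a rough stand-in can be sketched in Python. This is only an approximation: it sums apparent file sizes rather than allocated disk blocks, so its numbers can differ from du's, and the function name is made up. Pass the directories you care about on the command line (run it as a user who can read them).

```python
import os
import sys


def dir_size_bytes(path):
    """Sum the apparent sizes of regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if os.path.islink(full_path):
                continue
            try:
                total += os.lstat(full_path).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return total


if __name__ == "__main__":
    # Usage: python dirsize.py DIR [DIR ...]
    for directory in sys.argv[1:]:
        size_gb = dir_size_bytes(directory) / 1e9
        print(f"{size_gb:10.2f} GB  {directory}")
```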
> The old borg backup when I was using Mailman 2 was less than 100 GB
That sounds right given the schedule of "keeps".
Steve