Libelle Mail Archive Manual
Performance
Data volumes
This section briefly shows how big our archive is and how it got
that way. It is intended to give you some basis for estimating
your own e-mail volumes and the disk space the archive is likely
to occupy.
We started our live archive in 2008 by loading a backlog of
around 40,000 manually saved e-mails collected during the
previous 10 years. Currently (2010) we receive an average of just
under 100 messages a day. Much of this volume is due to our
membership of two moderately busy mailing lists (the Wine and
SpamAssassin users lists). At this rate it has taken us two years
to build the total archive to just under 100,000 messages. This
amount of data occupies 3 GB of disk space when stored in a
PostgreSQL database.
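As a rough guide, the figures above work out to a little over 30 KB
of database space per message. The sketch below scales that to your
own volumes. It is only an illustration and not part of Libelle Mail
Archive: the function names are made up for the example, 3 GB is
treated as 3 x 1024^3 bytes, and it assumes your messages are about
the same average size as ours.

```python
# Rough disk space estimate scaled from the figures quoted above.
# Assumption: your average message size is similar to ours.

OUR_MESSAGES = 100_000          # "just under 100,000 messages"
OUR_BYTES = 3 * 1024**3         # "3 GB", taken here as 3 GiB


def bytes_per_message() -> float:
    """Average database space used per archived message."""
    return OUR_BYTES / OUR_MESSAGES


def estimate_archive_gb(backlog: int, per_day: float, years: float) -> float:
    """Estimated database size in GB after loading a backlog and
    then receiving per_day messages a day for the given years."""
    total_messages = backlog + per_day * 365 * years
    return total_messages * bytes_per_message() / 1024**3


if __name__ == "__main__":
    # Hypothetical site: 10,000 message backlog, 250 messages a day, 3 years.
    print(f"{bytes_per_message() / 1024:.1f} KB per message")
    print(f"{estimate_archive_gb(10_000, 250, 3):.2f} GB")
```

Treat the result as an order-of-magnitude figure: mail with many
large attachments will need noticeably more space per message.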
Run times
Our archive runs on an old IBM NetVista PC which has an 866 MHz
Pentium III CPU and 512 MB of RAM. All backups are written to a
USB 2.0 disk drive.
Loader
We run the loader daily. It takes well under a minute to load the
day's messages. The exact time depends on network response
because the loader checks that sender domains are valid. With
sender DNS checks turned on and a local caching DNS server, the
loader handles between 6 and 10 e-mails a second.
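The quoted throughput can be turned into an estimated run time for
your own daily volume. This is a minimal sketch using only the 6 to
10 e-mails a second measured on our hardware. The function name is
hypothetical, and your rate will differ with CPU speed and DNS
response.

```python
# Loader run time estimate from the throughput quoted above.
SLOW_RATE = 6.0    # e-mails per second, lower end of our measurements
FAST_RATE = 10.0   # e-mails per second, upper end of our measurements


def loader_seconds(messages: int) -> tuple[float, float]:
    """Return (best case, worst case) load time in seconds."""
    return messages / FAST_RATE, messages / SLOW_RATE


if __name__ == "__main__":
    best, worst = loader_seconds(100)   # roughly our daily volume
    print(f"{best:.0f} to {worst:.0f} seconds")
```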
Backups
- A full PostgreSQL backup takes under 30 minutes. This is
the form that must be used to restore the database after a
PostgreSQL version upgrade that involves a change in the
database structure, e.g. from 8.x to 9.x.
- A full backup using MABackup, such as occurs after using
MAUpdate to delete one or more e-mails from the archive, is
slightly slower than the PostgreSQL backup, taking just under 40
minutes.
- A weekly incremental backup using MABackup adds 500-650
e-mails to the backup files. An incremental backup run copies
the new e-mails at around 25 per second and all the addresses
in the archive at around 2500 per second. The entire
incremental backup takes between 25 and 45 seconds to
complete.
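The incremental backup figures above can be combined into a simple
run time estimate: new e-mails are copied at about 25 a second and
every address in the archive at about 2500 a second. The sketch
below is an illustration only. The function name and the address
count of 25,000 are assumptions made for the example, and it ignores
any fixed startup cost.

```python
# Incremental backup run time estimate from the rates quoted above.
EMAILS_PER_SECOND = 25.0
ADDRESSES_PER_SECOND = 2500.0


def incremental_backup_seconds(new_emails: int, addresses: int) -> float:
    """Time to copy the new e-mails plus all addresses in the archive."""
    return new_emails / EMAILS_PER_SECOND + addresses / ADDRESSES_PER_SECOND


if __name__ == "__main__":
    # 25,000 addresses is a hypothetical figure, not a measured one.
    for new_emails in (500, 650):
        seconds = incremental_backup_seconds(new_emails, 25_000)
        print(f"{new_emails} new e-mails: about {seconds:.0f} seconds")
```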
Searching
All the searches listed below were done over a local network
with MASearch running on a 1.6 GHz Core Duo with 1 GB of RAM.
- Getting a list of all addresses in order to select a
specific address from the list took 2 seconds. Selecting the 79
messages with this address that were sent or received in the
last month took under 2 seconds.
- Getting a list of all 670 messages ever sent to or received
from another address by using a partial match on the personal
name took 3 seconds.
- Getting a list of all 327 messages sent or received during
2009 where the body text contained "LK8000" took 25
seconds.
- Selecting all messages sent or received in 2009 took 18
seconds.
- Selecting all the e-mails in the archive took 23
seconds.
Archive maintenance
All the actions listed below were done over a local network
with MAUpdate running on a 1.6 GHz Core Duo with 1 GB of RAM.
- Program startup, which involves reading all the addresses
on file, takes about 3 seconds.
- The time needed to identify and list the titles of all
messages associated with an address varies a lot, from under a
second for a single message to 90 seconds for the 28,000+
messages received from a busy mailing list.
- If an address has more than a screenful of message titles,
the messages must be retrieved before making changes to the
address details or deleting any messages. This takes about the
same time as building the titles list did.