GNOME Message Management System (Storage mechanism)
- From: Scott Wimer <scottw dev cgibuilder com>
- To: gnome-list gnome org, gnome-hackers nuclecu unam mx
- Subject: GNOME Message Management System (Storage mechanism)
- Date: Sun, 18 Apr 1999 04:58:08 -0700 (PDT)
This used to be the mail client thread. But, we started discussing
stuff well beyond just a simple mail client.
Here's my take on a decent Message storage approach. I'm trying to
optimize a certain number of activities here:
  fast Message storage
  ease of Message grouping, allowing multiple groupings
  ease of Message ordering
  scalable Message group sizes
  minimal space wasted
  fast Message retrieval
  ability to regenerate corrupted indexes (for ordering and grouping)
  fast Message searching abilities
In most cases, the end user is not going to care about or see the
storage mechanism used. They'll only see how well it implements the
above features.
I think this storage architecture should work rather well for meeting
the above goals. Here we go. Some of this has shown up in earlier
emails; this is mostly a condensed version of that material, with an
expansion on a couple of points.
Each Message is given a unique Message ID.
Message Composition
  <Message ID>
  [group1 [,groupN]* ]*
  Original Message Header
  Original Message Body
Message ID Composition
  (date of arrival)-(random 10 character string)
  The date of arrival is a 32 bit integer, time since epoch
  The random string is open for discussion; basically, it is just
  there to make the Message IDs unique.
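
To make the two layouts above concrete, here is a rough Python sketch of generating a Message ID and wrapping an incoming message in the stored form. The function names, the alphanumeric character set for the random string, and the choice to put the ID and the group list on one line each are assumptions for illustration; none of them are settled by this proposal.

```python
import secrets
import string
import time

# Assumed character set for the random part; the proposal leaves this open.
ALPHABET = string.ascii_letters + string.digits


def make_message_id(arrival=None, length=10):
    """Return '<seconds since epoch>-<random string>'."""
    if arrival is None:
        arrival = int(time.time())
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{arrival}-{suffix}"


def compose_message(message_id, groups, raw_message):
    """Prefix the original header and body with the ID line and the group line."""
    # Assumption: group names contain no commas or newlines.
    return f"{message_id}\n{','.join(groups)}\n{raw_message}"


def parse_message(text):
    """Split a stored message back into (message_id, groups, raw_message)."""
    message_id, group_line, raw_message = text.split("\n", 2)
    groups = [g for g in group_line.split(",") if g]
    return message_id, groups, raw_message
```

Because the group list travels inside the Message file itself, the group Indexes are not the only record of which Message belongs where.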
Message Storage begins at a single directory on the disk
  Softlinks out to other directories are allowed
  This simplifies the code for accessing the Messages themselves
  This simplifies people moving their Message store location
Messages are stored across several directories under the base dir
  The target directory is chosen by a hash of the Message ID
  Multiple levels of hashing are allowed
  This reduces directory size
Each Message is written to a separate file
  File name = Message ID
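
As an illustration of picking the target directory, the following Python sketch maps a Message ID to a file path under the base directory. The use of SHA-1, the fanout of 10 directories per level, and the `Store` subdirectory name (taken from the layout at the end of this message) are assumptions; any stable hash would do.

```python
import hashlib
from pathlib import Path


def message_path(base, message_id, levels=1, fanout=10):
    """Return the file path for message_id under base/Store.

    One digest byte is used per hashing level, so each level picks one
    of `fanout` numbered directories.
    """
    digest = hashlib.sha1(message_id.encode("ascii")).digest()
    parts = [str(digest[level] % fanout) for level in range(levels)]
    return Path(base, "Store", *parts, message_id)


# Hypothetical usage: the same ID always maps to the same directory.
print(message_path("Message/Base", "924436539-j3902mdnijH"))
print(message_path("Message/Base", "924436539-j3902mdnijH", levels=2))
```

Since the path is a pure function of the Message ID and the number of levels, a Message can be located without consulting any Index.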
The Main Message Index can contain pointers to sub indexes
  Allows for sub groupings
An Index exists for each Grouping defined
A Message may be Indexed under multiple Groups
A Message Index entry looks like:
  <Message ID> <Subject> <Location>
  The Location is the path to the Message file
  This is relative to the base Message directory
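
Since every Message file carries its own ID and group list, a corrupted Index could in principle be regenerated by scanning the store. The Python sketch below shows one way that might work. It assumes the ID and the group list each occupy one line at the top of the file, that the original header follows immediately, and that Index fields are tab-separated (the Subject can contain spaces); all three are assumptions, as are the function names.

```python
import os
from email.parser import HeaderParser


def format_entry(message_id, subject, location):
    """Render one Index entry; the tab delimiter is an assumption."""
    return "\t".join((message_id, subject, location))


def rebuild_indexes(base):
    """Scan base/Store and return {group: [(id, subject, location), ...]}."""
    indexes = {}
    store = os.path.join(base, "Store")
    # followlinks=True because softlinks out to other directories are allowed.
    for dirpath, _dirs, files in os.walk(store, followlinks=True):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                message_id = fh.readline().rstrip("\n")
                group_line = fh.readline()
                groups = [g.strip() for g in group_line.split(",") if g.strip()]
                # The rest of the file is the original header and body.
                headers = HeaderParser().parse(fh, headersonly=True)
                subject = headers.get("Subject", "")
            # Location is stored relative to the base Message directory.
            location = os.path.relpath(path, base)
            for group in groups:
                entry = (message_id, subject, location)
                indexes.setdefault(group, []).append(entry)
    return indexes
```

A full scan like this is slow on a large store, so it would only be a recovery path, not part of normal startup.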
Some thoughts on scaling this system. I'll try to be brief, since
you're no doubt tired of reading. :)
If a Message directory grows to have more than a thousand or so
entries in it, accessing each individual Message will be slow, since
most directory lookups are linear. This system can be made
self-balancing, though. On startup, we could check whether any directory
has more than some arbitrary number of entries in it, say 600. If it
does, we would add another 10 numbered entries to the directory
and then rehash the Messages currently stored. This shouldn't be
as slow as it sounds, since we will probably be able to relink the
Message files into the new directories. So it's just a bunch of
directory table updates (and we're not letting these directories get
huge on us), without a lot of copying happening.
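
A minimal Python sketch of that rebalancing step follows. It reads the "10 numbered entries" as subdirectories created inside the overfull directory, which is one interpretation of the paragraph above; the function names, the SHA-1 hash, and the `depth` parameter (which digest byte the new level uses) are assumptions as well.

```python
import hashlib
import os


def bucket(message_id, depth, fanout=10):
    """Pick the numbered subdirectory for message_id at hashing level depth."""
    digest = hashlib.sha1(message_id.encode("ascii")).digest()
    return str(digest[depth] % fanout)


def split_directory(directory, depth, limit=600, fanout=10):
    """Push Messages one level down if directory holds more than limit files.

    Returns True if the directory was split, False if it was left alone.
    """
    files = [entry.name for entry in os.scandir(directory) if entry.is_file()]
    if len(files) <= limit:
        return False
    for n in range(fanout):
        os.makedirs(os.path.join(directory, str(n)), exist_ok=True)
    for name in files:
        target = os.path.join(directory, bucket(name, depth, fanout), name)
        # rename() only rewrites directory entries when source and target
        # are on the same filesystem, so no Message data is copied.
        os.rename(os.path.join(directory, name), target)
    return True
```

Two caveats with this sketch: the Location fields in the Indexes would need updating after a split, and a rename across filesystems (possible if a softlinked directory is involved) would fail rather than relink.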
The end result is a storage system that looks like this:
~/Message/Base/Group1.db
~/Message/Base/GroupN.db
~/Message/Base/Store/[0-9]
~/Message/Base/Store/[0-9]/924436539-j3902mdnijH
~/Message/Base/Store/[0-9]/924432923-02df23ds93j
~/Message/Base/Store/[0-9]/924433610-923oiUIHli8
Comments?
--
Scott Wimer
play ---> scottw@cgibuilder.com http://www.cgibuilder.com/
work ---> scottw@corp.earthlink.net http://www.earthlink.net/