Importing and Exporting E-Mail (informative)

E-mail import has been implemented using various programs since the first LysKOM server became operational. Protocol version 10 introduces many aux-items, a large part of which are intended to let mail importers preserve more information about the original message. At the time of writing, there is one mail importer (komimportmail) that is designed to take full advantage of all the new aux-items.

E-mail export has never been used seriously. The first person to design and implement an exporter gets to rewrite this appendix based on his or her experiences.

Importing e-mail

The main job of the mail importer is to figure out where to deliver mail, how to handle MIME encoding and structure, and how to deal with threading. In the process, it creates one or more texts and a number of aux-items.

Recipients

Although a mail message contains To and CC headers, they are not really useful when importing as it is the envelope recipients, not the header recipients, that should be used. To understand this, consider a mail where the To header contains a personal mail address. The mail is received using a tool like procmail and forwarded to the LysKOM importer. The envelope address will be correct, but the To header will still contain the personal address.

The komimportmail importer uses addresses like "number@server", where number is the conference number of the recipient and server is the mail domain reserved for the LysKOM importer. For backwards compatibility with earlier importers, the number may be prefixed with a "p". Instead of the number, komimportmail can accept a name, as long as the name can be resolved to exactly one conference or letterbox. Before looking up the name, every underscore and period is translated into a space.
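The address rules above can be sketched as follows. This is a minimal illustration, not the komimportmail implementation: the function name resolve_recipient and the lookup_name callback (assumed to return a list of matching conference numbers for a name) are hypothetical.

```python
import re


def resolve_recipient(envelope_address, lookup_name):
    """Map an envelope address to a conference number, or None.

    lookup_name is a hypothetical callback that takes a name and returns
    a list of matching conference/letterbox numbers.
    """
    if "@" not in envelope_address:
        return None
    local = envelope_address.rsplit("@", 1)[0]

    # "number" or, for backwards compatibility, "p" followed by the number.
    match = re.fullmatch(r"p?(\d+)", local)
    if match:
        return int(match.group(1))

    # Otherwise treat the local part as a name: underscores and periods
    # become spaces before the lookup.
    name = local.translate(str.maketrans("_.", "  "))
    candidates = lookup_name(name)
    # Accept the name only if it resolves to exactly one recipient.
    return candidates[0] if len(candidates) == 1 else None
```

An ambiguous or unknown name yields None, which lets the caller decide whether to bounce the mail.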

Care should be taken when a mail is received more than once. This can happen if a mail is addressed to more than one address. For example, assume that a mail is sent to john.q.public@example.com and sven.svensson@exempel.se. Two different mail servers handle the two recipients, but both eventually decide to forward the mail to the LysKOM importer (but for different conferences). The LysKOM importer will receive the mail twice, with different envelope recipients.

A solution is to keep a database containing a mapping from Message-ID to LysKOM text number for imported messages. If a message is seen more than once, the message is not imported. Instead, recipients are added to the existing text.

On the other hand, this introduces a security hole: a person who knows the Message-ID of an interesting imported mail can send a forged mail with the same Message-ID and thereby add themselves or some open conference as a recipient. Perhaps the importer should check for matching contents before adding recipients.
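One way to combine the Message-ID database with a contents check is sketched below. The class ImportedMessages and its methods are hypothetical, the storage is an in-memory dictionary where a real importer would need something persistent, and comparing a SHA-256 digest of the raw body is only one possible definition of "matching contents" (bodies that were altered in transit, for example by a mailing list footer, would not match).

```python
import hashlib


class ImportedMessages:
    """In-memory stand-in for the importer's Message-ID database."""

    def __init__(self):
        # Message-ID -> (LysKOM text number, digest of the imported body)
        self._entries = {}

    def record(self, message_id, text_no, body):
        """Remember that message_id was imported as text_no."""
        digest = hashlib.sha256(body).hexdigest()
        self._entries[message_id] = (text_no, digest)

    def classify(self, message_id, body):
        """Return ("new", None), ("duplicate", text_no) or ("mismatch", text_no)."""
        entry = self._entries.get(message_id)
        if entry is None:
            return ("new", None)
        text_no, digest = entry
        if hashlib.sha256(body).hexdigest() == digest:
            # Same mail seen again: safe to add recipients to text_no.
            return ("duplicate", text_no)
        # Known Message-ID but different contents: possibly forged.
        return ("mismatch", text_no)
```

Only the "duplicate" result should lead to recipients being added to the existing text; how to treat a "mismatch" is a policy decision.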

The importer needs to be careful not to deliver messages to conferences that do not allow messages, even though the server might not complain.

For mail delivery to work for any conference, the importer has to use a privileged person, or it will be unable to deliver mail to secret conferences. A potential problem is that this leaks secret information from the server. For the time being, the komimportmail importer avoids this problem by using an unprivileged person and requiring the members of secret conferences to invite the importer if they want e-mail import to work.

Threading

The importer should do its best to thread messages. When the importer sees a new message it needs to look at the In-Reply-To header to see what the message is a reply to. If the In-Reply-To header does not exist, or if it exists but does not contain a valid Message-ID, the last valid Message-ID of a References header may be used instead.
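The header selection described above might look like this, using Python's standard email package. The function name parent_message_id is hypothetical, and the regular expression is a deliberately loose assumption about what counts as a valid Message-ID.

```python
import re
from email import message_from_string

# Loose approximation of a Message-ID: "<something@something>".
MSGID_RE = re.compile(r"<[^<>\s]+@[^<>\s]+>")


def parent_message_id(msg):
    """Return the Message-ID this mail replies to, or None."""
    # Prefer In-Reply-To if it contains something that looks like a Message-ID.
    in_reply_to = MSGID_RE.findall(str(msg.get("In-Reply-To", "")))
    if in_reply_to:
        return in_reply_to[0]
    # Otherwise fall back to the last valid Message-ID in References.
    references = MSGID_RE.findall(str(msg.get("References", "")))
    return references[-1] if references else None


example = message_from_string(
    "In-Reply-To: your mail of Tuesday\n"
    "References: <first@example.com> <second@example.com>\n"
    "\n"
    "body\n"
)
parent = parent_message_id(example)
```

In the example the In-Reply-To header holds free text rather than a Message-ID, so the last entry of References is chosen.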

If the Message-ID of a previously imported e-mail is found, the new text should be made a comment of the replied-to text.

If the Message-ID is of the form defined below (see Message-ID), and it refers to a text exported from this server, the new text should be made a comment of the replied-to text.

This means that the importer will probably have to maintain its own database of imported texts that maps the Message-ID to the text number in the LysKOM database. There is no other way to find the text number for a particular imported text. Fortunately, this is exactly the same database we need to solve the multiple reception problem described above.

It has been noted that messages on some mailing lists arrive in peculiar order, with replies before the original messages. Perhaps this is due to moderation. A smart importer should be prepared to handle this, by adding a comment link when the original message eventually arrives.

One possible solution is to add a new kind of entry to the Message-ID database, mapping a Message-ID to a list of text numbers that should become comments to the message when it is imported.
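A sketch of such a database follows. The class ThreadIndex is hypothetical and keeps everything in memory; it returns the comment links that the importer should then create on the server, as (comment text number, commented text number) pairs.

```python
from collections import defaultdict


class ThreadIndex:
    """Message-ID database with support for replies that arrive early."""

    def __init__(self):
        self.text_no = {}                 # Message-ID -> imported text number
        self.pending = defaultdict(list)  # missing Message-ID -> waiting text numbers

    def add(self, message_id, text_no, parent_id=None):
        """Record an imported text and return the comment links to create."""
        links = []
        self.text_no[message_id] = text_no

        if parent_id is not None:
            if parent_id in self.text_no:
                # The original is already imported: link immediately.
                links.append((text_no, self.text_no[parent_id]))
            else:
                # The original has not arrived yet: remember the reply.
                self.pending[parent_id].append(text_no)

        # Link any earlier replies that were waiting for this message.
        for waiting in self.pending.pop(message_id, []):
            links.append((waiting, text_no))
        return links
```

When a reply is imported before its original, add returns no links for the reply and later returns the deferred link when the original is added.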

MIME issues

An importer should try to handle e-mail messages containing MIME appendices as intelligently as possible. As the current LysKOM model lacks hierarchical structuring inside articles, appendices should probably be imported as comments or footnotes to the main message.

One would think that it is easy to convert the hierarchical MIME structure to a corresponding LysKOM comment tree. However, this would require creating empty interior nodes to attach some comments to.

Therefore, the komimportmail importer currently uses a rather naive algorithm: All leaf parts are found. The first one gets to be the main text, and the rest are included as comments to it.
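The naive leaf-part algorithm can be approximated with the standard email package as shown below. This is an illustration of the idea rather than the komimportmail code; the function name split_main_and_comments is hypothetical.

```python
from email import message_from_string


def split_main_and_comments(msg):
    """Return (main_part, [comment_parts]) for a parsed e-mail message."""
    # walk() visits the MIME tree depth-first; parts that are not
    # multipart containers are the leaves.
    leaves = [part for part in msg.walk() if not part.is_multipart()]
    return leaves[0], leaves[1:]


raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "main\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "extra\n"
    "--b--\n"
)
main, comments = split_main_and_comments(message_from_string(raw))
```

A message without MIME structure has exactly one leaf, itself, so it becomes the main text with no comments.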

Appendices encoded with Base64 or Quoted-Printable should be decoded.
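In Python, for example, the standard email package undoes the transfer encoding when get_payload is called with decode=True. The helper name decoded_text and the fallback to ISO 8859-1 for unknown charsets are assumptions made for this sketch.

```python
from email import message_from_string


def decoded_text(part):
    """Return the body of a leaf MIME part as a string."""
    # decode=True reverses Base64 or Quoted-Printable according to the
    # Content-Transfer-Encoding header and returns bytes.
    raw = part.get_payload(decode=True)
    charset = part.get_content_charset() or "us-ascii"
    try:
        return raw.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: fall back to a charset that never fails.
        return raw.decode("iso-8859-1")


example = message_from_string(
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)
text = decoded_text(example)
```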

When creating aux-items like mx-author, text coded using the method in RFC 2047 should be decoded.
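As an illustration, Python's email.header module can decode RFC 2047 encoded words. The wrapper name decode_rfc2047 is hypothetical.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value):
    """Decode RFC 2047 encoded words in a header value to a plain string."""
    # decode_header splits the value into (bytes, charset) chunks;
    # make_header reassembles them into a Header that converts to str.
    return str(make_header(decode_header(value)))


author = decode_rfc2047("=?iso-8859-1?q?Sm=F6rg=E5sbord?=")
```

The decoded string is what would be stored in an aux-item such as mx-author.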

Exporting e-mail

As of this writing, an experimental e-mail exporter exists, but it is a fairly recent creation. The author of this document knows very little about how it works, so this section contains very little information.

Message-ID

A standard for Message-ID creation has been established. The general format is:

    text-no.port.exporter.randomness@server

The different parts are explained below:

text-no
    The text number of the text that was exported.
port, server
    The canonical-name aux-item defines a unique name for the LysKOM system that the text was exported from. The server and port fields are set from it. If no port is specified in the canonical-name, the port part is set to the empty string.
exporter
    The name of the software that exported the text.
randomness
    A string that ensures that the Message-ID is unique even if the same text is exported several times. This could contain a random string, a sequence number, or something else that makes the Message-ID unique.
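The following sketch builds and parses Message-IDs in this format. Several details are assumptions and not part of the standard described here: that the canonical-name has the form "server" or "server:port", that the exporter name contains no period, that a random hexadecimal string is used for the randomness part, and that the result is wrapped in angle brackets as in a Message-ID header. The function names are hypothetical.

```python
import re
import secrets


def build_message_id(text_no, canonical_name, exporter):
    """Build a Message-ID of the form text-no.port.exporter.randomness@server."""
    # Assumed canonical-name syntax: "server" or "server:port".
    server, _, port = canonical_name.partition(":")
    randomness = secrets.token_hex(8)
    # port is the empty string when the canonical-name has no port.
    return f"<{text_no}.{port}.{exporter}.{randomness}@{server}>"


MESSAGE_ID_RE = re.compile(r"<?(\d+)\.(\d*)\.([^.@]+)\.([^@]+)@([^@>]+)>?")


def parse_message_id(message_id):
    """Return the parts of an exported Message-ID as a dict, or None."""
    match = MESSAGE_ID_RE.fullmatch(message_id)
    if match is None:
        return None
    text_no, port, exporter, randomness, server = match.groups()
    return {
        "text_no": int(text_no),
        "port": port,
        "exporter": exporter,
        "randomness": randomness,
        "server": server,
    }
```

An importer could use parse_message_id on an In-Reply-To value and compare the server and port parts with its own server's canonical-name to decide whether the reply refers to a text exported from this server.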