Paul Hoffman / IMC
2003-11-26 18:57:57 UTC
Greetings again. At the Minneapolis meeting, I proposed that if
people were interested in John's proposal to encode the addresses in
a message as UTF-8, they might be interested in making all the
headers UTF-8. (The proposal was initially sparked by Pete Resnick.)
Following the thread from the past few weeks, I have come up with the
following strawman. If no one finds any huge problems with this, I'll
turn it into an Internet Draft in a few weeks.
All comments welcome!
--Paul Hoffman
- The dual motivations are to allow UTF-8 everywhere in the headers and
to not bounce any messages just because they originated with UTF-8
headers.
- Allows current users who have all-ASCII mailbox names to step up
to UTF-8 headers easily.
- Updated sending MUAs will create all headers in UTF-8.
- Transmission is protected by a new ESMTP command, UTF-8-HEADERS.
- Everyone who has a UTF-8 mailbox name MUST also have an all-ASCII
mailbox name that is equivalent.
- The terminal SMTP server is responsible for knowing whether or not the
message store can handle UTF-8 headers.
- If a receiving SMTP server does not support UTF-8-HEADERS, the sending
SMTP client downgrades all headers and continues to send the message.
- Free text fields are downgraded using quoted-printable encoding; the
charset SHOULD be UTF-8. Downgrading MUST be done only if necessary.
- Downgrading email addresses that only contain UTF-8 in the domain name
is done with IDNA.
- For every address in a message with a UTF-8 mailbox name, the mail
initiator tries to create a mapping in a new header, Address-map:. A
message only has one Address-map: header; the header has a string of
maps. The header is only for addresses that have a UTF-8 mailbox name;
it SHOULD NOT be used for addresses that have all-ASCII mailbox names,
even if those addresses have UTF-8 domain names.
- If the initiator has a UTF-8 mailbox name, the initiator MUST also
have an all-ASCII mailbox, and the all-ASCII address MUST appear in the
map header.
- If the initiator knows the mapping for any recipient (through caching
or an address book), they SHOULD put it in the map header. If they
don't include a mapping and the message hits a non-UTF-8-HEADERS
SMTP server, the message will bounce.
- The Address-map: header is downgraded using Base64 for mailbox
names, IDNA for domain names.
- Example:
Address-map: José@example.com,jose-***@example.com;
törbjø***@fältström.se,***@fältström.se
If passed to a non-UTF-8-HEADERS system, this header gets downgraded
to:
Address-map: Sm9zw6k=@example.com,jose-***@example.com;
dMO2cmJqw7hybg==@xn--fltstrm-5wa1o.se,***@xn--fltstrm-5wa1o.se
- Intermediate SMTP servers MAY change the values in the Address-map:
header (such as to add one that is missing or to correct a mapping), but
SHOULD only do so for local domains. This might be a bad idea and might
be removed.
- Terminal SMTP servers should write messages addressed to either the
UTF-8 address or the all-ASCII address into the same mailbox, but this
is not mandatory.
- POP and IMAP might be updated to allow one request to bring in two or
more mailboxes; otherwise, users will have to do two separate requests.
- Digital certificates for addresses that have UTF-8 LHSs should contain
both addresses; this is already supported in PKIX and OpenPGP.
- Other headers that include mailbox names and domain names will need
further definition for downgrading.
- MUAs are encouraged to cache address mappings they see, probably with
a user-settable time-to-live.
- Terminal SMTP servers MAY look into the headers of a message to
determine whether they should upgrade a downgraded set of headers to
UTF-8. This is easy to determine: if the Address-map: header contains
only ASCII, it was downgraded. Upgrading is particularly useful on
bounce messages caused by bad mappings.
- It might be good to have a protocol for determining mappings, but it
is not defined here.
- It might be better to have just one mapping per Address-map: header
and have multiple Address-map: headers per message.
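The Address-map: downgrade and the "only ASCII means downgraded" check
described above could be sketched roughly as follows in Python. This is
an illustration only, not part of the strawman: the function names are
hypothetical, the pair syntax (UTF-8 address, comma, all-ASCII address,
with pairs separated by semicolons) is inferred from the example, and
Python's built-in "idna" codec (IDNA 2003) stands in for whatever IDNA
implementation a real SMTP client or server would use.

```python
import base64


def _split(addr):
    """Split an address at its last '@' into (mailbox, domain)."""
    local, _, domain = addr.strip().rpartition("@")
    return local, domain


def _idna_down(domain):
    # ASCII domains pass through unchanged; others become xn-- labels.
    return domain.encode("idna").decode("ascii")


def _idna_up(domain):
    return domain.encode("ascii").decode("idna")


def is_downgraded(value):
    """Per the strawman: an all-ASCII Address-map: value was downgraded."""
    return value.isascii()


def downgrade_map(value):
    """Downgrade an Address-map: value (assumed 'utf8,ascii; utf8,ascii')."""
    pairs = []
    for pair in value.split(";"):
        utf8_addr, ascii_addr = pair.split(",")
        u_local, u_domain = _split(utf8_addr)
        a_local, a_domain = _split(ascii_addr)
        # Base64 for the UTF-8 mailbox name, IDNA for both domain names.
        b64 = base64.b64encode(u_local.encode("utf-8")).decode("ascii")
        pairs.append(
            f"{b64}@{_idna_down(u_domain)},{a_local}@{_idna_down(a_domain)}"
        )
    return "; ".join(pairs)


def upgrade_map(value):
    """Reverse downgrade_map(); assumes the value really was downgraded."""
    pairs = []
    for pair in value.split(";"):
        b64_addr, ascii_addr = pair.split(",")
        b_local, b_domain = _split(b64_addr)
        a_local, a_domain = _split(ascii_addr)
        u_local = base64.b64decode(b_local).decode("utf-8")
        pairs.append(
            f"{u_local}@{_idna_up(b_domain)},{a_local}@{_idna_up(a_domain)}"
        )
    return "; ".join(pairs)
```

With this sketch, the mailbox name José downgrades to Sm9zw6k= and the
domain fältström.se to xn--fltstrm-5wa1o.se, matching the example in the
list. The sketch treats the first address of each pair as the UTF-8 one
and the second as the all-ASCII one, so only the first mailbox name is
ever Base64-encoded or decoded.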
--Paul Hoffman, Director
--Internet Mail Consortium