Comments on and FWD: I-D
John C Klensin
2004-01-30 17:41:12 UTC
Hi. This draft is an update on the SMTP approach to mailbox
internationalization.

Three things:

* I am copying the SMTP list on this because everything
is getting connected to everything else, as my on-list
discussion with Nathaniel about the "trace fields in the
envelope" proposal illustrates. Please, to preserve
everyone's sanity, do not start parallel discussions or
cross-post -- this one is clearly an IMAA issue, at
least for historical reasons.

* There is one more proposal I-D coming in this series.
As a one-line summary, it outlines ("specifies" would be
too strong at this stage, but that is the intent) the
encapsulation model that this draft suggests for
downgrading.

* Very high-level summary of changes from the -01
version: I have become convinced that, if we are going
to have an internationalized structure for email --not
merely a collection of kludges and workarounds-- we are
going to need to make some rather basic changes. They
include SMTP extensions for i18n addresses and alternate
addresses and UTF-8 headers as well as UTF-8 headers, in
some form, themselves. I've picked up on Paul's UTF-8
header proposal because it seems sensible and nothing
else is on the table, but the proposal announced below
eliminates the need for alternate address headers and
all of the mucking around in the message payload by MTAs
they imply. And I'm convinced that we should have
_one_ header to cover this rather than a collection.
I.e., the model is that one is internationalized or not,
rather than one in which internationalization is done
piecemeal and incrementally, with all of the profiling,
multiplicative cases, and long-term cruft that would
imply.

Please do not react to what you think is in this proposal
without reading it or, especially, based only on what you think
it is about from reading the above. That is a waste of your
time and that of everyone on the list, as previous rounds of
such discussions amply demonstrate.

best,
john



---------- Forwarded Message ----------
Date: Friday, 30 January, 2004 11:49 -0500
From: Internet-***@ietf.org
To: IETF-Announce
Subject: I-D ACTION:draft-klensin-emailaddr-i18n-02.txt

A New Internet-Draft is available from the on-line
Internet-Drafts directories.


Title : Internationalization of Email Addresses
Author(s) : J. Klensin
Filename : draft-klensin-emailaddr-i18n-02.txt
Pages : 30
Date : 2004-1-29

Internationalization of electronic mail addresses is, if
anything, more important than the already-completed effort for
domain names. In most of the contexts in which they are used,
domain names can be hidden within or as part of various types of
references. Email addresses, by contrast, are crucial: use of
names of people or organizations as, or as part of, the email
local part is, for obvious reasons, a well-established tradition
on the network. Preventing people from spelling their names
correctly is, in the long term, inexcusable. At the same time,
email addresses pose a number of special problems -- they are
more difficult than simple domain names in some respects, but
actually easier in others. This document discusses the issues
with internationalization of email addresses, explains why some
obvious approaches are incompatible with the definitions and use
of Internet mail, and proposes a solution that is likely to
serve users and the network well for the long term.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-klensin-emailaddr-i18n-02.txt

To remove yourself from the IETF Announcement list, send a
message to ietf-announce-request with the word unsubscribe in
the body of the message.

Internet-Drafts are also available by anonymous FTP. Login with
the username "anonymous" and a password of your e-mail address.
After logging in, type "cd internet-drafts" and then
"get draft-klensin-emailaddr-i18n-02.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
***@ietf.org.
In the body type:
"FILE /internet-drafts/draft-klensin-emailaddr-i18n-02.txt".

NOTE: The mail server at ietf.org can return the document in
MIME-encoded form by using the "mpack" utility. To use this
feature, insert the command "ENCODING mime" before the "FILE"
command. To decode the response(s), you will need "munpack" or
a MIME-compliant mail reader. Different MIME-compliant mail
readers exhibit different behavior, especially when dealing with
"multipart" MIME messages (i.e. documents which have been split
up into multiple messages), so check your local documentation on
how to manipulate these messages.


Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

---------- End Forwarded Message ----------
Martin Duerst
2004-02-02 20:56:37 UTC
Hello John,

I have read your draft. Congratulations, I think it is
extremely well written and well argued.

I have the following comments:

- Editorial: occasionally, please make shorter paragraphs.
It makes things a bit easier to read.

- Name of the extension: Something a little bit more specific
than just "I18N" would probably be better.

- 3.4.1: Quite a bit of text is spent on the idea to upgrade
Web browsers to automagically detect and convert special
(e.g. ACE-like) encoding forms in the context of Webmail.
This is a total non-starter, and you should just say so.
What I have not seen discussed, but is much more feasible
(although it's still much better to not have to do it) is
to upgrade the webmail software on the server side to
convert things from ACE or whatever to actual characters
(or numeric character references if the charset used in
the Web page has limitations) and back.
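
A minimal sketch of the server-side conversion suggested in this comment, assuming Python and its standard-library `idna` codec (IDNA 2003, domain names only; nothing here implies an ACE for local parts). The function names and the sample domain are illustrative, not taken from the draft.

```python
def ace_to_unicode(domain: str) -> str:
    """Convert an ACE ("xn--") domain into actual characters."""
    return domain.encode("ascii").decode("idna")


def unicode_to_ace(domain: str) -> str:
    """Convert a domain with non-ASCII labels back into its ACE form."""
    return domain.encode("idna").decode("ascii")


def for_page(text: str, page_charset: str) -> str:
    """Prepare text for a Web page served in page_charset.

    Characters the charset cannot represent are replaced by numeric
    character references (e.g. "&#252;") instead of being lost.
    """
    return text.encode(page_charset, "xmlcharrefreplace").decode(page_charset)
```

The point of the sketch is that the conversion happens once, in the webmail application, so the browser only ever sees ordinary characters or character references.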

- Editorial: I don't think you have used the term ACE (ASCII
compatible encoding), but it would be more precise in some cases
(instead of writing something like 'IDN/punicode-like').

- Editorial: In point 2. of 4.1, it's unclear how many alternatives
there are, or what goes logically together.

- Regarding point 3. in 4.1, and point 7.2, I think allowing
anything else but UTF-8 should not only be strongly discouraged,
it should be forbidden. There are some reasons for this:
- Does the current LHS allow anything else than US-ASCII?
- Not having a globally consistent way to interpret the
bytes in there as characters creates all kinds of problems.
- For security reasons (e.g. to avoid smuggling of
syntax-relevant US-ASCII characters in overlong encodings), one
may want to run various checks on UTF-8. This won't
work with arbitrary byte sequences.
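
To illustrate the security point, here is a small sketch, assuming Python, whose strict UTF-8 decoder rejects overlong forms. The helper name `is_valid_utf8` is hypothetical; a real MTA would apply further checks on top of this.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True only if raw is well-formed UTF-8.

    Python's strict decoder rejects overlong encodings, so a byte
    sequence that tries to hide an ASCII character such as "@" or "/"
    in a multi-byte form fails this check.
    """
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# 0xC1 0x80 is an overlong two-byte form of "@" (U+0040).
smuggled_at = b"user\xc1\x80example"
# 0xC0 0xAF is an overlong two-byte form of "/" (U+002F).
smuggled_slash = b"user\xc0\xafname"
```

If the local part were allowed to carry arbitrary bytes, no such check could be defined, because there would be no agreed way to decide which sequences are malformed.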

- Editorial: 4.3, first paragraph, overstriking: This indeed has been
used in the past. Is it still used? Where? I would tone this down
a bit more, e.g. change "that there is a long history" ->
"that historically, there has been some use".

- Editorial: 4.3 "it SHOULD first verify that the string is
valid for a domain name according to IDNA rules": There should
be a reference to the relevant section/operation/flags in IDNA.
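
As a rough illustration of such a verification step, the sketch below assumes Python's standard-library `idna` codec, which implements the RFC 3490 conversion to ASCII. Which IDNA operation and flags the draft actually intends is exactly the open question in this comment, so this is one possible reading, not the draft's definition. The function name is hypothetical.

```python
def is_valid_idna_domain(domain: str) -> bool:
    """Check that domain can be converted to ASCII by the 'idna' codec.

    The codec raises UnicodeError (or a subclass) for prohibited
    characters, empty labels, and labels that are too long.
    """
    try:
        domain.encode("idna")
    except UnicodeError:
        return False
    return True
```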

- 6., point 4.: Yes, please bundle UTF-8-HEADERS and your proposal,
and make 8BITMIME mandatory. This will strongly simplify
implementation and deployment.
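
A sketch of what the bundling would mean for a client, assuming Python: the sender proceeds only if the server's EHLO response advertises every keyword in the bundle. `8BITMIME` is the existing extension keyword; `UTF8HEADERS` and `I18N` are placeholder keywords used here for illustration only, since the final names are not settled.

```python
def parse_ehlo_keywords(response: str) -> set[str]:
    """Extract extension keywords from a multi-line EHLO reply."""
    keywords = set()
    lines = response.splitlines()
    # The first line carries the server greeting, not an extension.
    for line in lines[1:]:
        if len(line) < 4 or not line.startswith("250"):
            continue
        rest = line[4:].strip()
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


# Placeholder keyword names (assumptions), apart from 8BITMIME.
REQUIRED_BUNDLE = {"8BITMIME", "UTF8HEADERS", "I18N"}


def supports_i18n_bundle(response: str) -> bool:
    """True if the server advertises every extension in the bundle."""
    return REQUIRED_BUNDLE <= parse_ehlo_keywords(response)
```

With the extensions bundled, the client has a single yes/no decision instead of one code path per combination of extensions.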

- 7.1: allowing punycode in domain names with UTF-8 LHS seems
no big harm. Mail should still be delivered in such a case.
But the question (more for Paul's draft than for yours) is
whether MUAs should convert this to something readable.
Probably saying MAY is the right thing here, to mark the
use of punycode in this case as an occasional option, not
something that should be used in general.

- 7.2: See above.

- 7.3: First a stupid question: What characters are excluded by
prohibiting quoted strings or characters? If it's things
such as spaces, quotes, and so on, excluding them is probably
not a bad idea.
Rather than using restrictions, it may be good to give some
guidelines.
Some guidelines or restrictions may be needed for right-to-
left characters/bidirectional issues.

- 7.4: See above; yes, please require 8BITMIME.

- 7.5: Yes, if this extension and 8BITMIME are in use,
RFC 2047 encoding should be dropped.
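
For comparison, the sketch below shows what RFC 2047 encoding does to a header value today, using Python's standard `email.header` module. The helper names and the sample name are illustrative. With UTF-8 headers carried over an 8-bit-clean path, the value would simply be sent as its UTF-8 bytes and neither step would be needed.

```python
from email.header import Header, decode_header


def rfc2047_encode(text: str) -> str:
    """Wrap non-ASCII header text in an RFC 2047 encoded-word."""
    return Header(text, "utf-8").encode()


def rfc2047_decode(value: str) -> str:
    """Reverse rfc2047_encode, returning the original characters."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts)


encoded = rfc2047_encode("Jürgen")   # pure ASCII, begins with "=?utf-8?"
raw_utf8 = "Jürgen".encode("utf-8")  # what a UTF-8 header would carry
```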

- 8: Editorial: "accomodate local character sets": Is this sets
in the strict mathematical sense (in this case I suggest
using 'character repertoires') or does it include encodings
(then I would suggest 'character encodings' or maybe 'charset's).
"some character script other than ASCII": Latin, Cyrillic,...
are scripts. ASCII isn't a script.

- 8: Ideas such as SEPARATOR="...": In my opinion, don't.
Scripts around the world have imported/accepted punctuation from
other scripts (in particular Latin) much better than letters.


Regards, Martin.