Comments on and FWD: draft-klensin-emailaddr-i18n-02.txt

John C Klensin

2004-02-13 19:47:46 UTC

Charles, and Martin,

Sorry for not responding to Martin's note sooner, but now I get
to answer both at once :-)

--On Thursday, 12 February, 2004 21:34 +0000 Charles Lindsey

I have read your draft. Congratulations, I think it is
extremely well written and well argued.

Yes, I too have read this (though it was a long time before I
had an opportunity to sit down and study it carefully - hence
this delayed response). And I agree that it makes a most
persuasive case for solving several problems, not just the
local-part one, by allowing UTF-8 into headers. And that
is really the only viable way to proceed. So I have just a
few niggles and issues to raise below. Of course, there is
still much detail to be filled in.

Thanks. And "of course".

- Name of the extension: Something a little bit more specific
than just "I18N" would probably be better.

Indeed. There are two extensions under discussion (though
they might eventually be bundled together). So let us call
them
    I18N-ENVELOPES
    UTF-8-HEADERS

Actually, there are a bunch of such extensions. Another one is
"get the trace fields out of the message headers, so we don't
have the potential of different (relay and other) MTAs inserting
things with different codings in the headers". And another one
is the existing 8BITMIME. My guess is that there will be more
before we get finished with this (although I haven't found
others to suggest). Now, what we know about "lots of options"
is that we get into serious interoperability problems. You pick
two, I pick three, and we have only one of them in common, so
the mail doesn't go through and people start talking about
"profiles" and "pro formas". I don't think we should go there
if we can possibly avoid it, so part of my agenda involves the
hope that we can figure out what is needed for
internationalization and then say "if you need
internationalization, then you need the following list of
changes" -- but as _one_ ESMTP option (a bundle, if you prefer),
not an "internationalization profile".
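
The "you pick two, I pick three" problem can be made concrete with a
small sketch. The option names below are only the labels used in this
thread (plus a hypothetical TRACE-FIELDS), not registered ESMTP
keywords, and the functions are illustrative assumptions rather than
anything specified in the draft.

```python
# Illustrative only: option names are the informal labels from this
# thread (TRACE-FIELDS is hypothetical), not registered ESMTP keywords.
NEEDED_FOR_I18N = frozenset({"I18N-ENVELOPES", "UTF-8-HEADERS", "8BITMIME"})


def can_send_piecemeal(sender_opts, receiver_opts, needed=NEEDED_FOR_I18N):
    """With independent options, every needed one must be common to both."""
    common = set(sender_opts) & set(receiver_opts)
    return needed <= common


def can_send_bundled(sender_has_bundle, receiver_has_bundle):
    """With one bundled option, there is a single yes/no question."""
    return sender_has_bundle and receiver_has_bundle


# "You pick two, I pick three, and we have only one of them in common."
you = {"I18N-ENVELOPES", "8BITMIME"}
me = {"UTF-8-HEADERS", "8BITMIME", "TRACE-FIELDS"}
print(sorted(you & me))             # the single shared option
print(can_send_piecemeal(you, me))  # the mail does not go through
```

With a single bundle, the only failure mode is "the other side does
not have it", which is far easier to diagnose than a partial overlap.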

That said, the over-general name has its origin in the -00
version of this draft, back when the alternative seemed to be an
MUA-only IMAA and before any of us had really sorted through the
issues that started leading to, e.g., "UTF-8 in the headers"
proposals.

- Regarding point 3. in 4.1, and point 7.2, I think allowing
anything else but UTF-8 should not only be strongly
discouraged, it should be forbidden.

Absolutely so! No doubt about it! The eventual standard will
contain wording like "the use of UTF-8 is REQUIRED", "it is
FORBIDDEN to use other charsets".

I prefer that type of answer (see my comments about profiles
above and elsewhere). Fewer choices equals better
interoperability. But I think it is irresponsible to discard
the other options before thinking about their strengths and
weaknesses.

The bad news is that this will not stop people (especially
the Chinese, but the French can be a bit awkward too) from
doing it. The trick is to make sure that the people who
insist on violating the standard all do so in a consistent
manner, and one that can be distinguished from the real
thing - at the same time without admitting in the standard
that such a thing might be possible.
So the way to do that is to ensure that there is a header
somewhere that could have contained a "charset=..."
parameter, and then to point out in very strong terms that
there is NO such parameter in that header, and that
it is FORBIDDEN to include one. That should give them the
hint to do the "right thing".

I have my doubts that this is either necessary, or sufficient if
it is necessary. And I think it would be more realistic to
write a gateway spec that says, more or less, "if you are going
to inject your stuff into the public Internet, it needs to look
like this; otherwise, it will probably be trashed beyond
recognition" (there are, of course, some statements equivalent
to that in 2821). The advantage of this over a header or
parameter (or the absence thereof) is that it delineates the
strange things that are done "within an environment" or "for
local use" from what is interchangeable over the Internet. But
this is precisely an example of "exploring options" that I
implied above.

2. Three Models for Transition
1. Avoid any more infrastructure changes than are
absolutely necessary. ................
2. Use a transport-level negotiation model of some
variety to ensure that the recipient machine can and will
accept the format and structure of the message and
options being sent. ........
3. Start a conversation about discarding more or less
everything and moving toward a "next generation" of
Internet mail in the hope of a huge gain in elegance,
capability, or other functions.

4. Do it as an "experimental protocol" (this is not
exclusive with the others, of course, and it is not an
excuse for not designing it properly). The advantage of this
approach is that you can ignore the naysayers and
nit-pickers on the grounds that "it is only for those who
want to try it out"; it is more easily forgotten if it fails
ignominiously; and you can afford not to be backward
compatible, especially in areas where you believe that no
current software implements (or ever implemented) that
obsolete feature, or that 99.999% of existing
implementations already allow what you want.

Yes, mostly. Note that, if any type of ESMTP negotiation is
needed, the 99.999% argument does apply. With such negotiation,
the argument for backward compatibility need not significantly
apply -- even though there are huge arguments for maintaining as
much compatibility as is reasonably possible, including the
all-important issues of getting things actually deployed. And,
without such negotiation, things rapidly deteriorate toward the
justifications for your next strategy.
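
As a rough illustration of what "ESMTP negotiation" means here, the
sketch below scans an EHLO reply for a single extension keyword and
decides whether the extended form can be used. The keyword "I18N" is
only the placeholder name discussed in this thread, and the function
names and return values are hypothetical.

```python
def ehlo_keywords(reply):
    """Collect extension keywords from a multi-line EHLO reply.

    The first reply line carries the server's domain and greeting, so
    only the following "250-..." / "250 ..." lines are examined.
    """
    keywords = set()
    for line in reply.splitlines()[1:]:
        if len(line) < 4 or not line.startswith("250"):
            continue
        rest = line[4:].strip()
        if rest:
            # Keywords are case-insensitive; parameters may follow.
            keywords.add(rest.split()[0].upper())
    return keywords


def negotiate(reply, needs_i18n, keyword="I18N"):
    """Decide how to proceed; "I18N" is a placeholder keyword name."""
    if not needs_i18n:
        return "plain"
    return "use-extension" if keyword in ehlo_keywords(reply) else "fall-back"


reply = "250-mx.example.org Hello\r\n250-8BITMIME\r\n250 I18N\r\n"
print(negotiate(reply, needs_i18n=True))
```

What "fall-back" should mean (bounce, encapsulate, or downgrade) is
exactly the open question discussed elsewhere in this thread; the
sketch deliberately leaves it undecided.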

5. "Just do it". This, of course, is an extremely bad
strategy from an IETF point of view. Indeed, it is not a
strategy at all (so let us call it a "pseudo strategy").
But the point is that this pseudo-strategy is already
in widespread use. Indeed, I would think that 50% of
innovations were first introduced because "people tried it
and it worked", and then it was perhaps standardized later,
but reluctantly so because the feature had been poorly
designed but was now too entrenched to change.
But the danger is that this is just what will happen
in the case of 8-bit headers of any form UNLESS we
step in soon to provide a "respectable" way of doing it.
Indeed, it is already happening, and maybe it is already too
late.

I am certainly sympathetic to this position.

3.4.2 Relay environment
..... If internationalized addresses are important to the
destination host, its administrators will choose
lower-preference MX hosts or other relays that can support
internationalized addresses.

Here you make an argument why the receiving MTA needs, and
can be expected to have, I18N capability.

3.4.3 Internationalizing the Sender

But here you seem to say that you do not expect the sending
MTA to need that capability. I find this odd.
Now this difference might make good sense if all you have is
the I18N-ENVELOPE extension, but if you have the
UTF-8-HEADERS extension as well, then it is not so simple.
Suppose that Alice communicates with Bøb (observe the 'funny'
spelling of Bøb, and assume that his email address is also
Since Bøb advertises an I18N address, we may assume that he
has an I18N-enabled MUA, and is reached through an
I18N-enabled MTA. Even if Alice is not so enabled, there may
...
Note that Address-Map headers would be of no assistance in
this scenario. Neither would encapsulation (NAIUI), nor would
the "friendly gateway" suggested in 3.4.3 (though it might be
fine in the Alice->Bøb direction).
No, I think bouncing to Bøb is the best solution since it is,
after all, Bøb's problem.

I think this analysis is precisely correct, and hope that others
will study and understand it. The text which you cite was not
rethought and rewritten between -01 and -02 and obviously should
have been. Encapsulation might help if Alice's MUA can do it
locally for outgoing messages, but, since implementing UTF-8
headers is probably less trouble, I wouldn't expect to see a lot
of that option. I should note that there is an important dispute
in the community between

* Those who believe that some judicious message-bouncing
in situations like this will cause software to be
upgraded and/or users to make other arrangements (of
software, providers, or the addresses they are using) in
fairly short order.

* Those who believe that bouncing any message that could
possibly have been delivered will irritate users and
cause any possible extension mechanism to fail.

I think that, for this case, the arguments are stronger for the
first position, partially because the latter position tends to
lead to either "no progress" or horrible kludges, but others
have reached the opposite conclusion.

regards,
john