Comments on draft-klensin-email-i18n-message-00.txt
Charles Lindsey
2004-03-05 14:24:12 UTC
Time this got commented on. The topic was essentially how to encapsulate a
message with UTF-8 in its headers so that it could be tunneled through
present transport channels and be reconstituted at the far end.

Jon proposes two encapsulations:

J1. Message Encapsulation

I would like to divide this into two cases:

J1a. Using message/rfc822

Suppose message/rfc822 is extended in the obvious way to allow headers in
the message to be in UTF-8 (and also, if the message is a part of a
multipart, the body part headers, such as Content-Type/Disposition, too).

That is still not good enough because the rules for
Content-Transfer-Encoding only allow for the body-part of the
message/rfc822 to be encoded (in QP or Base64). So those UTF-8 headers
would still be at the mercy of the transport medium.

But help is at hand, assuming that the transport supports 8BITMIME.
Officially speaking, 8BITMIME is not obliged to support 8bit characters
except in the body part of that message/rfc822 (because that is the only
place where you are allowed to say C-T-E: 8bit). But in practice, because
transports never look inside the body of messages, any transport
supporting 8BITMIME would certainly convey those headers correctly
(because it would actually require extra work on the part of the
implementor to make it not work). So it is a pretty safe bet.

J1b. Using application/news-transmission

This type is already registered with IANA and, although designed for News,
it should work perfectly well for Email. Essentially, it just bundles the
whole message/article up into a bunch of bytes (headers and all) and sends
it as the body of the message with whatever CTE you choose.
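
For illustration, the J1b idea can be sketched in Python with the standard
email package. This is only a sketch under assumptions, not something taken
from the draft: the function names, the outer Subject text and the choice of
Base64 as the CTE are made up for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def encapsulate(raw_message: bytes, sender: str, recipient: str) -> bytes:
    """Wrap a whole message (headers and all) as one opaque body.

    Hypothetical helper: the outer headers are plain ASCII, and the inner
    message, including any UTF-8 headers, travels Base64-encoded.
    """
    outer = EmailMessage()
    outer["From"] = sender
    outer["To"] = recipient
    outer["Subject"] = "Encapsulated message"  # placeholder text
    outer.set_content(
        raw_message,
        maintype="application",
        subtype="news-transmission",
        cte="base64",
    )
    return outer.as_bytes()


def decapsulate(outer_bytes: bytes) -> bytes:
    """Recover the original bytes at the far end."""
    outer = message_from_bytes(outer_bytes, policy=policy.default)
    return outer.get_payload(decode=True)
```

Because the inner message is treated as a bunch of bytes, nothing on the
path needs to understand the UTF-8 headers; only the two end points do.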

Note however, that neither of those methods gives any help for UTF-8 in
the envelope (and specifically in the RCPT TO).


J2. Mail Transaction Encapsulation

This encapsulates a complete ESMTP transaction, including both envelope
and bodies, and sends it as an application/batch-SMTP, as described in RFC
2442. I would regard this as a better solution than J1[ab], because it
deals with the envelope, and the details given in RFC 2442 seem just fine.
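
As a rough illustration of why this covers the envelope, here is a Python
sketch that assembles the client side of an SMTP transaction as a byte
string suitable for use as an application/batch-SMTP body. It is a
simplification under assumptions: the function name and parameters are
invented for the example, UTF-8 in the envelope is assumed to be
acceptable to the receiving end, and RFC 2442 itself should be consulted
for the exact set of commands it requires.

```python
from typing import List


def build_batch_smtp(helo_name: str, mail_from: str,
                     rcpt_to: List[str], message: bytes) -> bytes:
    """Serialise envelope plus message as one batch of SMTP commands.

    Hypothetical helper; the envelope addresses are encoded as UTF-8,
    which is the point of the exercise.
    """
    commands = [f"EHLO {helo_name}", f"MAIL FROM:<{mail_from}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in rcpt_to]
    commands.append("DATA")
    head = "\r\n".join(commands).encode("utf-8") + b"\r\n"

    # Make sure the message ends with CRLF, then dot-stuff it so that a
    # line starting with "." cannot be mistaken for the end of DATA.
    body = message if message.endswith(b"\r\n") else message + b"\r\n"
    lines = body[:-2].split(b"\r\n")
    stuffed = [b"." + ln if ln.startswith(b".") else ln for ln in lines]
    data = b"\r\n".join(stuffed) + b"\r\n"

    return head + data + b".\r\nQUIT\r\n"
```

The resulting bytes would then be sent as the body of an ordinary message,
with whatever CTE the path requires, exactly as in J1b.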

The only snag is that RFC 2442 is an Informational RFC, and hence could
not be referenced in a Standards-Track document (though it might be OK in
an Experimental RFC). That problem does not look insurmountable.
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133 Web: http://www.cs.man.ac.uk/~chl
Email: ***@clerew.man.ac.uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
John C Klensin
2004-03-05 18:04:57 UTC
Charles,

To comment on one point, my assumption has been that, if we
decide that the transaction encapsulation is useful and
appropriate, we would just go to the effort to either move RFC
2442 to Proposed along with it, or would obsolete or update 2442
and incorporate its key features into a revision of this draft.
There are interoperable implementations of 2442. My
retrospective guess as to why it was made Experimental (I don't
remember) is that the community wasn't convinced it was
necessary and a good idea. If we need it for this purpose, the
questions of necessity and desirability are immediately answered.

The other difficulty with using either 8BitMIME and an 8bit
C-T-E or news encoding is the issue that Keith has raised
repeatedly in other contexts: all sorts of trash gets inserted
into headers, especially trace fields, by all sorts of actors.
While the type of brute-force, "take whatever is there and
encapsulate it" approach of this draft won't solve that problem,
it runs a much lower risk of making things worse than assuming
that some stray 8bit characters are really UTF-8.

john


n***@mrochek.com
2004-03-05 22:51:51 UTC
Post by John C Klensin
To comment on one point, my assumption has been that, if we
decide that the transaction encapsulation is useful and
appropriate, we would just go to the effort to either move RFC
2442 to Proposed along with it, or would obsolete or update 2442
and incorporate its key features into a revision of this draft.
There are interoperable implementations of 2442. My
retrospective guess as to why it was made Experimental
RFC 2442 is informational, not experimental. (I note in passing that one of the
author's names is misspelled in the RFC Index; I'll send in a note about that
right away.)
Post by John C Klensin
(I don't
remember) is that the community wasn't convinced it was
necessary and a good idea. If we need it for this purpose, the
questions of necessity and desirability are immediately answered.
Informational status was chosen because that was the status deemed appropriate
for media type specifications at that point in time. The reason you don't recall
any concerns that this wasn't a good idea is simple: There weren't any.
I clearly recall the meeting where I presented this specification, and
I also recall that there were no objections or concerns raised. (The
reason I remember this so clearly is that I was quite surprised to encounter
so little resistance to the idea.)

Things have changed, and now the belief is that these sorts of registrations
belong on the standards track. But it goes without saying that such changes
in belief don't retroactively change the status of existing documents.

Ned
Charles Lindsey
2004-03-08 11:18:21 UTC
Post by John C Klensin
Charles,
To comment on one point, my assumption has been that, if we
decide that the transaction encapsulation is useful and
appropriate, we would just go to the effort ...
Sure, the problem is easily fixed if we decide to follow the RFC 2442 route.
Post by John C Klensin
The other difficulty with using either 8BitMIME and an 8bit
C-T-E or news encoding is the issue that Keith has raised
repeatedly in other contexts: all sorts of trash gets inserted
into headers, especially trace fields, by all sorts of actors.
While the type of brute-force, "take whatever is there and
encapsulate it" approach of this draft won't solve that problem,
it runs a much lower risk of making things worse than assuming
that some stray 8bit characters are really UTF-8.
I don't see that using RFC 2442 as opposed to other methods of
encapsulation makes any difference here.

Suppose a message passes through servers A, B, C, D and E, where each is
determined to add, at the least, its own Received header. Suppose A and B
are upgraded to support UTF8-HEADERS, but C is not.

By the time B comes to deal with it, it will have acquired
    Received: by A
    Received: by B
and when B encapsulates it for sending to C, these headers will be
included in the encapsulated version, just like any other header.

Now C and D (and maybe E) add
    Received: by C
    Received: by D
etc. to the headers of the _outer_ message (whatever other headers are
included in that outer message, especially possible Receiveds from A and
B, will depend on exactly how the protocol gets written).

It is not clear whether the encapsulation will be undone by D (if it has
the capability) or by the end point E. Either way, the undoing process
will need to ensure that Received headers from A, B, C and D are all
present in the final product, just as they would have been in a normal
ASCII-only message sent over that same route.

I don't see that the method of encapsulation affects this either way. In
both cases the protocol needs to state what reconstruction is to be done.
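
To make the reconstruction step concrete, here is a small Python sketch of
one possible rule: on undoing the encapsulation, copy the Received headers
that accumulated on the outer message onto the front of the inner one. The
function name is hypothetical, the outer headers are assumed to be plain
ASCII, and a real protocol would have to say what happens to the other
outer headers as well.

```python
from email import message_from_bytes


def restore_trace(outer_bytes: bytes, inner_bytes: bytes) -> bytes:
    """Prepend the outer message's Received headers to the inner message.

    Received headers are added at the top, newest first, so the outer
    ones (C, D) belong in front of the inner ones (A, B).
    """
    outer = message_from_bytes(outer_bytes)
    hops = []
    for value in outer.get_all("Received", []):
        # Unfold continuation lines and normalise whitespace.
        unfolded = " ".join(str(value).split())
        # Assumption: outer trace headers contain ASCII only.
        hops.append(b"Received: " + unfolded.encode("ascii") + b"\r\n")
    return b"".join(hops) + inner_bytes
```

With the A to E example above, the result carries Received headers for D,
C, B and A in that order, which is what a normal ASCII-only message sent
over the same route would show. The same rule applies whichever method of
encapsulation carried the inner message.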
n***@mrochek.com
2004-03-05 23:27:45 UTC
Post by Charles Lindsey
The only snag is that RFC 2442 is an Informational RFC, and hence could
not be referenced in a Standards-Track document (though it might be OK in
an Experimental RFC). That problem does not look insurmountable.
You have this exactly backwards. References to informational RFCs are allowed
in standards track RFCs, modulo certain restrictions that have yet to be
formalized. (There is a draft on this out there somewhere, I believe.) If
the informational status of a document is seen to be a problem, it is likely
that it can be changed with minimal effort.

Normative references to experimental protocols, however, are generally not
allowed. And moving an experimental document to the standards track is,
generally speaking, a more difficult proposition, as whatever issue that led it
to be classified as experimental has to be dealt with. And given what it has
historically taken to force a specification to experimental this can be a very
nontrivial proposition. (This has changed in more recent times - the
IESG has begun to use experimental status more than it used to.)

Ned