5266 lines
206 KiB
Plaintext
5266 lines
206 KiB
Plaintext
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Network Working Group N. Borenstein, Bellcore
|
|||
|
Request for Comments: 1341 N. Freed, Innosoft
|
|||
|
June 1992
|
|||
|
|
|||
|
|
|||
|
|
|||
|
MIME (Multipurpose Internet Mail Extensions):
|
|||
|
|
|||
|
|
|||
|
Mechanisms for Specifying and Describing
|
|||
|
the Format of Internet Message Bodies
|
|||
|
|
|||
|
|
|||
|
Status of this Memo
|
|||
|
|
|||
|
This RFC specifies an IAB standards track protocol for the
|
|||
|
Internet community, and requests discussion and suggestions
|
|||
|
for improvements. Please refer to the current edition of
|
|||
|
the "IAB Official Protocol Standards" for the
|
|||
|
standardization state and status of this protocol.
|
|||
|
Distribution of this memo is unlimited.
|
|||
|
|
|||
|
Abstract
|
|||
|
|
|||
|
RFC 822 defines a message representation protocol which
|
|||
|
specifies considerable detail about message headers, but
|
|||
|
which leaves the message content, or message body, as flat
|
|||
|
ASCII text. This document redefines the format of message
|
|||
|
bodies to allow multi-part textual and non-textual message
|
|||
|
bodies to be represented and exchanged without loss of
|
|||
|
information. This is based on earlier work documented in
|
|||
|
RFC 934 and RFC 1049, but extends and revises that work.
|
|||
|
Because RFC 822 said so little about message bodies, this
|
|||
|
document is largely orthogonal to (rather than a revision
|
|||
|
of) RFC 822.
|
|||
|
|
|||
|
In particular, this document is designed to provide
|
|||
|
facilities to include multiple objects in a single message,
|
|||
|
to represent body text in character sets other than US-
|
|||
|
ASCII, to represent formatted multi-font text messages, to
|
|||
|
represent non-textual material such as images and audio
|
|||
|
fragments, and generally to facilitate later extensions
|
|||
|
defining new types of Internet mail for use by cooperating
|
|||
|
mail agents.
|
|||
|
|
|||
|
This document does NOT extend Internet mail header fields to
|
|||
|
permit anything other than US-ASCII text data. It is
|
|||
|
recognized that such extensions are necessary, and they are
|
|||
|
the subject of a companion document [RFC -1342].
|
|||
|
|
|||
|
A table of contents appears at the end of this document.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page i]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
1 Introduction
|
|||
|
|
|||
|
Since its publication in 1982, RFC 822 [RFC-822] has defined
|
|||
|
the standard format of textual mail messages on the
|
|||
|
Internet. Its success has been such that the RFC 822 format
|
|||
|
has been adopted, wholly or partially, well beyond the
|
|||
|
confines of the Internet and the Internet SMTP transport
|
|||
|
defined by RFC 821 [RFC-821]. As the format has seen wider
|
|||
|
use, a number of limitations have proven increasingly
|
|||
|
restrictive for the user community.
|
|||
|
|
|||
|
RFC 822 was intended to specify a format for text messages.
|
|||
|
As such, non-text messages, such as multimedia messages that
|
|||
|
might include audio or images, are simply not mentioned.
|
|||
|
Even in the case of text, however, RFC 822 is inadequate for
|
|||
|
the needs of mail users whose languages require the use of
|
|||
|
character sets richer than US ASCII [US-ASCII]. Since RFC
|
|||
|
822 does not specify mechanisms for mail containing audio,
|
|||
|
video, Asian language text, or even text in most European
|
|||
|
languages, additional specifications are needed
|
|||
|
|
|||
|
One of the notable limitations of RFC 821/822 based mail
|
|||
|
systems is the fact that they limit the contents of
|
|||
|
electronic mail messages to relatively short lines of
|
|||
|
seven-bit ASCII. This forces users to convert any non-
|
|||
|
textual data that they may wish to send into seven-bit bytes
|
|||
|
representable as printable ASCII characters before invoking
|
|||
|
a local mail UA (User Agent, a program with which human
|
|||
|
users send and receive mail). Examples of such encodings
|
|||
|
currently used in the Internet include pure hexadecimal,
|
|||
|
uuencode, the 3-in-4 base 64 scheme specified in RFC 1113,
|
|||
|
the Andrew Toolkit Representation [ATK], and many others.
|
|||
|
|
|||
|
The limitations of RFC 822 mail become even more apparent as
|
|||
|
gateways are designed to allow for the exchange of mail
|
|||
|
messages between RFC 822 hosts and X.400 hosts. X.400 [X400]
|
|||
|
specifies mechanisms for the inclusion of non-textual body
|
|||
|
parts within electronic mail messages. The current
|
|||
|
standards for the mapping of X.400 messages to RFC 822
|
|||
|
messages specify that either X.400 non-textual body parts
|
|||
|
should be converted to (not encoded in) an ASCII format, or
|
|||
|
that they should be discarded, notifying the RFC 822 user
|
|||
|
that discarding has occurred. This is clearly undesirable,
|
|||
|
as information that a user may wish to receive is lost.
|
|||
|
Even though a user's UA may not have the capability of
|
|||
|
dealing with the non-textual body part, the user might have
|
|||
|
some mechanism external to the UA that can extract useful
|
|||
|
information from the body part. Moreover, it does not allow
|
|||
|
for the fact that the message may eventually be gatewayed
|
|||
|
back into an X.400 message handling system (i.e., the X.400
|
|||
|
message is "tunneled" through Internet mail), where the
|
|||
|
non-textual information would definitely become useful
|
|||
|
again.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 1]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
This document describes several mechanisms that combine to
|
|||
|
solve most of these problems without introducing any serious
|
|||
|
incompatibilities with the existing world of RFC 822 mail.
|
|||
|
In particular, it describes:
|
|||
|
|
|||
|
1. A MIME-Version header field, which uses a version number
|
|||
|
to declare a message to be conformant with this
|
|||
|
specification and allows mail processing agents to
|
|||
|
distinguish between such messages and those generated
|
|||
|
by older or non-conformant software, which is presumed
|
|||
|
to lack such a field.
|
|||
|
|
|||
|
2. A Content-Type header field, generalized from RFC 1049
|
|||
|
[RFC-1049], which can be used to specify the type and
|
|||
|
subtype of data in the body of a message and to fully
|
|||
|
specify the native representation (encoding) of such
|
|||
|
data.
|
|||
|
|
|||
|
2.a. A "text" Content-Type value, which can be used to
|
|||
|
represent textual information in a number of
|
|||
|
character sets and formatted text description
|
|||
|
languages in a standardized manner.
|
|||
|
|
|||
|
2.b. A "multipart" Content-Type value, which can be
|
|||
|
used to combine several body parts, possibly of
|
|||
|
differing types of data, into a single message.
|
|||
|
|
|||
|
2.c. An "application" Content-Type value, which can be
|
|||
|
used to transmit application data or binary data,
|
|||
|
and hence, among other uses, to implement an
|
|||
|
electronic mail file transfer service.
|
|||
|
|
|||
|
2.d. A "message" Content-Type value, for encapsulating
|
|||
|
a mail message.
|
|||
|
|
|||
|
2.e An "image" Content-Type value, for transmitting
|
|||
|
still image (picture) data.
|
|||
|
|
|||
|
2.f. An "audio" Content-Type value, for transmitting
|
|||
|
audio or voice data.
|
|||
|
|
|||
|
2.g. A "video" Content-Type value, for transmitting
|
|||
|
video or moving image data, possibly with audio as
|
|||
|
part of the composite video data format.
|
|||
|
|
|||
|
3. A Content-Transfer-Encoding header field, which can be
|
|||
|
used to specify an auxiliary encoding that was applied
|
|||
|
to the data in order to allow it to pass through mail
|
|||
|
transport mechanisms which may have data or character
|
|||
|
set limitations.
|
|||
|
|
|||
|
4. Two optional header fields that can be used to further
|
|||
|
describe the data in a message body, the Content-ID and
|
|||
|
Content-Description header fields.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 2]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
MIME has been carefully designed as an extensible mechanism,
|
|||
|
and it is expected that the set of content-type/subtype
|
|||
|
pairs and their associated parameters will grow
|
|||
|
significantly with time. Several other MIME fields, notably
|
|||
|
including character set names, are likely to have new values
|
|||
|
defined over time. In order to ensure that the set of such
|
|||
|
values is developed in an orderly, well-specified, and
|
|||
|
public manner, MIME defines a registration process which
|
|||
|
uses the Internet Assigned Numbers Authority (IANA) as a
|
|||
|
central registry for such values. Appendix F provides
|
|||
|
details about how IANA registration is accomplished.
|
|||
|
|
|||
|
Finally, to specify and promote interoperability, Appendix A
|
|||
|
of this document provides a basic applicability statement
|
|||
|
for a subset of the above mechanisms that defines a minimal
|
|||
|
level of "conformance" with this document.
|
|||
|
|
|||
|
HISTORICAL NOTE: Several of the mechanisms described in
|
|||
|
this document may seem somewhat strange or even baroque at
|
|||
|
first reading. It is important to note that compatibility
|
|||
|
with existing standards AND robustness across existing
|
|||
|
practice were two of the highest priorities of the working
|
|||
|
group that developed this document. In particular,
|
|||
|
compatibility was always favored over elegance.
|
|||
|
|
|||
|
2 Notations, Conventions, and Generic BNF Grammar
|
|||
|
|
|||
|
This document is being published in two versions, one as
|
|||
|
plain ASCII text and one as PostScript. The latter is
|
|||
|
recommended, though the textual contents are identical. An
|
|||
|
Andrew-format copy of this document is also available from
|
|||
|
the first author (Borenstein).
|
|||
|
|
|||
|
Although the mechanisms specified in this document are all
|
|||
|
described in prose, most are also described formally in the
|
|||
|
modified BNF notation of RFC 822. Implementors will need to
|
|||
|
be familiar with this notation in order to understand this
|
|||
|
specification, and are referred to RFC 822 for a complete
|
|||
|
explanation of the modified BNF notation.
|
|||
|
|
|||
|
Some of the modified BNF in this document makes reference to
|
|||
|
syntactic entities that are defined in RFC 822 and not in
|
|||
|
this document. A complete formal grammar, then, is obtained
|
|||
|
by combining the collected grammar appendix of this document
|
|||
|
with that of RFC 822.
|
|||
|
|
|||
|
The term CRLF, in this document, refers to the sequence of
|
|||
|
the two ASCII characters CR (13) and LF (10) which, taken
|
|||
|
together, in this order, denote a line break in RFC 822
|
|||
|
mail.
|
|||
|
|
|||
|
The term "character set", wherever it is used in this
|
|||
|
document, refers to a coded character set, in the sense of
|
|||
|
ISO character set standardization work, and must not be
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 3]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
misinterpreted as meaning "a set of characters."
|
|||
|
|
|||
|
The term "message", when not further qualified, means either
|
|||
|
the (complete or "top-level") message being transferred on a
|
|||
|
network, or a message encapsulated in a body of type
|
|||
|
"message".
|
|||
|
|
|||
|
The term "body part", in this document, means one of the
|
|||
|
parts of the body of a multipart entity. A body part has a
|
|||
|
header and a body, so it makes sense to speak about the body
|
|||
|
of a body part.
|
|||
|
|
|||
|
The term "entity", in this document, means either a message
|
|||
|
or a body part. All kinds of entities share the property
|
|||
|
that they have a header and a body.
|
|||
|
|
|||
|
The term "body", when not further qualified, means the body
|
|||
|
of an entity, that is the body of either a message or of a
|
|||
|
body part.
|
|||
|
|
|||
|
Note : the previous four definitions are clearly circular.
|
|||
|
This is unavoidable, since the overal structure of a MIME
|
|||
|
message is indeed recursive.
|
|||
|
|
|||
|
In this document, all numeric and octet values are given in
|
|||
|
decimal notation.
|
|||
|
|
|||
|
It must be noted that Content-Type values, subtypes, and
|
|||
|
parameter names as defined in this document are case-
|
|||
|
insensitive. However, parameter values are case-sensitive
|
|||
|
unless otherwise specified for the specific parameter.
|
|||
|
|
|||
|
FORMATTING NOTE: This document has been carefully formatted
|
|||
|
for ease of reading. The PostScript version of this
|
|||
|
document, in particular, places notes like this one, which
|
|||
|
may be skipped by the reader, in a smaller, italicized,
|
|||
|
font, and indents it as well. In the text version, only the
|
|||
|
indentation is preserved, so if you are reading the text
|
|||
|
version of this you might consider using the PostScript
|
|||
|
version instead. However, all such notes will be indented
|
|||
|
and preceded by "NOTE:" or some similar introduction, even
|
|||
|
in the text version.
|
|||
|
|
|||
|
The primary purpose of these non-essential notes is to
|
|||
|
convey information about the rationale of this document, or
|
|||
|
to place this document in the proper historical or
|
|||
|
evolutionary context. Such information may be skipped by
|
|||
|
those who are focused entirely on building a compliant
|
|||
|
implementation, but may be of use to those who wish to
|
|||
|
understand why this document is written as it is.
|
|||
|
|
|||
|
For ease of recognition, all BNF definitions have been
|
|||
|
placed in a fixed-width font in the PostScript version of
|
|||
|
this document.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 4]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
3 The MIME-Version Header Field
|
|||
|
|
|||
|
Since RFC 822 was published in 1982, there has really been
|
|||
|
only one format standard for Internet messages, and there
|
|||
|
has been little perceived need to declare the format
|
|||
|
standard in use. This document is an independent document
|
|||
|
that complements RFC 822. Although the extensions in this
|
|||
|
document have been defined in such a way as to be compatible
|
|||
|
with RFC 822, there are still circumstances in which it
|
|||
|
might be desirable for a mail-processing agent to know
|
|||
|
whether a message was composed with the new standard in
|
|||
|
mind.
|
|||
|
|
|||
|
Therefore, this document defines a new header field, "MIME-
|
|||
|
Version", which is to be used to declare the version of the
|
|||
|
Internet message body format standard in use.
|
|||
|
|
|||
|
Messages composed in accordance with this document MUST
|
|||
|
include such a header field, with the following verbatim
|
|||
|
text:
|
|||
|
|
|||
|
MIME-Version: 1.0
|
|||
|
|
|||
|
The presence of this header field is an assertion that the
|
|||
|
message has been composed in compliance with this document.
|
|||
|
|
|||
|
Since it is possible that a future document might extend the
|
|||
|
message format standard again, a formal BNF is given for the
|
|||
|
content of the MIME-Version field:
|
|||
|
|
|||
|
MIME-Version := text
|
|||
|
|
|||
|
Thus, future format specifiers, which might replace or
|
|||
|
extend "1.0", are (minimally) constrained by the definition
|
|||
|
of "text", which appears in RFC 822.
|
|||
|
|
|||
|
Note that the MIME-Version header field is required at the
|
|||
|
top level of a message. It is not required for each body
|
|||
|
part of a multipart entity. It is required for the embedded
|
|||
|
headers of a body of type "message" if and only if the
|
|||
|
embedded message is itself claimed to be MIME-compliant.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 5]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
4 The Content-Type Header Field
|
|||
|
|
|||
|
The purpose of the Content-Type field is to describe the
|
|||
|
data contained in the body fully enough that the receiving
|
|||
|
user agent can pick an appropriate agent or mechanism to
|
|||
|
present the data to the user, or otherwise deal with the
|
|||
|
data in an appropriate manner.
|
|||
|
|
|||
|
HISTORICAL NOTE: The Content-Type header field was first
|
|||
|
defined in RFC 1049. RFC 1049 Content-types used a simpler
|
|||
|
and less powerful syntax, but one that is largely compatible
|
|||
|
with the mechanism given here.
|
|||
|
|
|||
|
The Content-Type header field is used to specify the nature
|
|||
|
of the data in the body of an entity, by giving type and
|
|||
|
subtype identifiers, and by providing auxiliary information
|
|||
|
that may be required for certain types. After the type and
|
|||
|
subtype names, the remainder of the header field is simply a
|
|||
|
set of parameters, specified in an attribute/value notation.
|
|||
|
The set of meaningful parameters differs for the different
|
|||
|
types. The ordering of parameters is not significant.
|
|||
|
Among the defined parameters is a "charset" parameter by
|
|||
|
which the character set used in the body may be declared.
|
|||
|
Comments are allowed in accordance with RFC 822 rules for
|
|||
|
structured header fields.
|
|||
|
|
|||
|
In general, the top-level Content-Type is used to declare
|
|||
|
the general type of data, while the subtype specifies a
|
|||
|
specific format for that type of data. Thus, a Content-Type
|
|||
|
of "image/xyz" is enough to tell a user agent that the data
|
|||
|
is an image, even if the user agent has no knowledge of the
|
|||
|
specific image format "xyz". Such information can be used,
|
|||
|
for example, to decide whether or not to show a user the raw
|
|||
|
data from an unrecognized subtype -- such an action might be
|
|||
|
reasonable for unrecognized subtypes of text, but not for
|
|||
|
unrecognized subtypes of image or audio. For this reason,
|
|||
|
registered subtypes of audio, image, text, and video, should
|
|||
|
not contain embedded information that is really of a
|
|||
|
different type. Such compound types should be represented
|
|||
|
using the "multipart" or "application" types.
|
|||
|
|
|||
|
Parameters are modifiers of the content-subtype, and do not
|
|||
|
fundamentally affect the requirements of the host system.
|
|||
|
Although most parameters make sense only with certain
|
|||
|
content-types, others are "global" in the sense that they
|
|||
|
might apply to any subtype. For example, the "boundary"
|
|||
|
parameter makes sense only for the "multipart" content-type,
|
|||
|
but the "charset" parameter might make sense with several
|
|||
|
content-types.
|
|||
|
|
|||
|
An initial set of seven Content-Types is defined by this
|
|||
|
document. This set of top-level names is intended to be
|
|||
|
substantially complete. It is expected that additions to
|
|||
|
the larger set of supported types can generally be
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 6]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
accomplished by the creation of new subtypes of these
|
|||
|
initial types. In the future, more top-level types may be
|
|||
|
defined only by an extension to this standard. If another
|
|||
|
primary type is to be used for any reason, it must be given
|
|||
|
a name starting with "X-" to indicate its non-standard
|
|||
|
status and to avoid a potential conflict with a future
|
|||
|
official name.
|
|||
|
|
|||
|
In the Extended BNF notation of RFC 822, a Content-Type
|
|||
|
header field value is defined as follows:
|
|||
|
|
|||
|
Content-Type := type "/" subtype *[";" parameter]
|
|||
|
|
|||
|
type := "application" / "audio"
|
|||
|
/ "image" / "message"
|
|||
|
/ "multipart" / "text"
|
|||
|
/ "video" / x-token
|
|||
|
|
|||
|
x-token := <The two characters "X-" followed, with no
|
|||
|
intervening white space, by any token>
|
|||
|
|
|||
|
subtype := token
|
|||
|
|
|||
|
parameter := attribute "=" value
|
|||
|
|
|||
|
attribute := token
|
|||
|
|
|||
|
value := token / quoted-string
|
|||
|
|
|||
|
token := 1*<any CHAR except SPACE, CTLs, or tspecials>
|
|||
|
|
|||
|
tspecials := "(" / ")" / "<" / ">" / "@" ; Must be in
|
|||
|
/ "," / ";" / ":" / "\" / <"> ; quoted-string,
|
|||
|
/ "/" / "[" / "]" / "?" / "." ; to use within
|
|||
|
/ "=" ; parameter values
|
|||
|
|
|||
|
Note that the definition of "tspecials" is the same as the
|
|||
|
RFC 822 definition of "specials" with the addition of the
|
|||
|
three characters "/", "?", and "=".
|
|||
|
|
|||
|
Note also that a subtype specification is MANDATORY. There
|
|||
|
are no default subtypes.
|
|||
|
|
|||
|
The type, subtype, and parameter names are not case
|
|||
|
sensitive. For example, TEXT, Text, and TeXt are all
|
|||
|
equivalent. Parameter values are normally case sensitive,
|
|||
|
but certain parameters are interpreted to be case-
|
|||
|
insensitive, depending on the intended use. (For example,
|
|||
|
multipart boundaries are case-sensitive, but the "access-
|
|||
|
type" for message/External-body is not case-sensitive.)
|
|||
|
|
|||
|
Beyond this syntax, the only constraint on the definition of
|
|||
|
subtype names is the desire that their uses must not
|
|||
|
conflict. That is, it would be undesirable to have two
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 7]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
different communities using "Content-Type:
|
|||
|
application/foobar" to mean two different things. The
|
|||
|
process of defining new content-subtypes, then, is not
|
|||
|
intended to be a mechanism for imposing restrictions, but
|
|||
|
simply a mechanism for publicizing the usages. There are,
|
|||
|
therefore, two acceptable mechanisms for defining new
|
|||
|
Content-Type subtypes:
|
|||
|
|
|||
|
1. Private values (starting with "X-") may be
|
|||
|
defined bilaterally between two cooperating
|
|||
|
agents without outside registration or
|
|||
|
standardization.
|
|||
|
|
|||
|
2. New standard values must be documented,
|
|||
|
registered with, and approved by IANA, as
|
|||
|
described in Appendix F. Where intended for
|
|||
|
public use, the formats they refer to must
|
|||
|
also be defined by a published specification,
|
|||
|
and possibly offered for standardization.
|
|||
|
|
|||
|
The seven standard initial predefined Content-Types are
|
|||
|
detailed in the bulk of this document. They are:
|
|||
|
|
|||
|
text -- textual information. The primary subtype,
|
|||
|
"plain", indicates plain (unformatted) text. No
|
|||
|
special software is required to get the full
|
|||
|
meaning of the text, aside from support for the
|
|||
|
indicated character set. Subtypes are to be used
|
|||
|
for enriched text in forms where application
|
|||
|
software may enhance the appearance of the text,
|
|||
|
but such software must not be required in order to
|
|||
|
get the general idea of the content. Possible
|
|||
|
subtypes thus include any readable word processor
|
|||
|
format. A very simple and portable subtype,
|
|||
|
richtext, is defined in this document.
|
|||
|
multipart -- data consisting of multiple parts of
|
|||
|
independent data types. Four initial subtypes
|
|||
|
are defined, including the primary "mixed"
|
|||
|
subtype, "alternative" for representing the same
|
|||
|
data in multiple formats, "parallel" for parts
|
|||
|
intended to be viewed simultaneously, and "digest"
|
|||
|
for multipart entities in which each part is of
|
|||
|
type "message".
|
|||
|
message -- an encapsulated message. A body of
|
|||
|
Content-Type "message" is itself a fully formatted
|
|||
|
RFC 822 conformant message which may contain its
|
|||
|
own different Content-Type header field. The
|
|||
|
primary subtype is "rfc822". The "partial"
|
|||
|
subtype is defined for partial messages, to permit
|
|||
|
the fragmented transmission of bodies that are
|
|||
|
thought to be too large to be passed through mail
|
|||
|
transport facilities. Another subtype,
|
|||
|
"External-body", is defined for specifying large
|
|||
|
bodies by reference to an external data source.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 8]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
image -- image data. Image requires a display device
|
|||
|
(such as a graphical display, a printer, or a FAX
|
|||
|
machine) to view the information. Initial
|
|||
|
subtypes are defined for two widely-used image
|
|||
|
formats, jpeg and gif.
|
|||
|
audio -- audio data, with initial subtype "basic".
|
|||
|
Audio requires an audio output device (such as a
|
|||
|
speaker or a telephone) to "display" the contents.
|
|||
|
video -- video data. Video requires the capability to
|
|||
|
display moving images, typically including
|
|||
|
specialized hardware and software. The initial
|
|||
|
subtype is "mpeg".
|
|||
|
application -- some other kind of data, typically
|
|||
|
either uninterpreted binary data or information to
|
|||
|
be processed by a mail-based application. The
|
|||
|
primary subtype, "octet-stream", is to be used in
|
|||
|
the case of uninterpreted binary data, in which
|
|||
|
case the simplest recommended action is to offer
|
|||
|
to write the information into a file for the user.
|
|||
|
Two additional subtypes, "ODA" and "PostScript",
|
|||
|
are defined for transporting ODA and PostScript
|
|||
|
documents in bodies. Other expected uses for
|
|||
|
"application" include spreadsheets, data for
|
|||
|
mail-based scheduling systems, and languages for
|
|||
|
"active" (computational) email. (Note that active
|
|||
|
email entails several securityconsiderations,
|
|||
|
which are discussed later in this memo,
|
|||
|
particularly in the context of
|
|||
|
application/PostScript.)
|
|||
|
|
|||
|
Default RFC 822 messages are typed by this protocol as plain
|
|||
|
text in the US-ASCII character set, which can be explicitly
|
|||
|
specified as "Content-type: text/plain; charset=us-ascii".
|
|||
|
If no Content-Type is specified, either by error or by an
|
|||
|
older user agent, this default is assumed. In the presence
|
|||
|
of a MIME-Version header field, a receiving User Agent can
|
|||
|
also assume that plain US-ASCII text was the sender's
|
|||
|
intent. In the absence of a MIME-Version specification,
|
|||
|
plain US-ASCII text must still be assumed, but the sender's
|
|||
|
intent might have been otherwise.
|
|||
|
|
|||
|
RATIONALE: In the absence of any Content-Type header field
|
|||
|
or MIME-Version header field, it is impossible to be certain
|
|||
|
that a message is actually text in the US-ASCII character
|
|||
|
set, since it might well be a message that, using the
|
|||
|
conventions that predate this document, includes text in
|
|||
|
another character set or non-textual data in a manner that
|
|||
|
cannot be automatically recognized (e.g., a uuencoded
|
|||
|
compressed UNIX tar file). Although there is no fully
|
|||
|
acceptable alternative to treating such untyped messages as
|
|||
|
"text/plain; charset=us-ascii", implementors should remain
|
|||
|
aware that if a message lacks both the MIME-Version and the
|
|||
|
Content-Type header fields, it may in practice contain
|
|||
|
almost anything.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 9]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
It should be noted that the list of Content-Type values
|
|||
|
given here may be augmented in time, via the mechanisms
|
|||
|
described above, and that the set of subtypes is expected to
|
|||
|
grow substantially.
|
|||
|
|
|||
|
When a mail reader encounters mail with an unknown Content-
|
|||
|
type value, it should generally treat it as equivalent to
|
|||
|
"application/octet-stream", as described later in this
|
|||
|
document.
|
|||
|
|
|||
|
5 The Content-Transfer-Encoding Header Field
|
|||
|
|
|||
|
Many Content-Types which could usefully be transported via
|
|||
|
email are represented, in their "natural" format, as 8-bit
|
|||
|
character or binary data. Such data cannot be transmitted
|
|||
|
over some transport protocols. For example, RFC 821
|
|||
|
restricts mail messages to 7-bit US-ASCII data with 1000
|
|||
|
character lines.
|
|||
|
|
|||
|
It is necessary, therefore, to define a standard mechanism
|
|||
|
for re-encoding such data into a 7-bit short-line format.
|
|||
|
This document specifies that such encodings will be
|
|||
|
indicated by a new "Content-Transfer-Encoding" header field.
|
|||
|
The Content-Transfer-Encoding field is used to indicate the
|
|||
|
type of transformation that has been used in order to
|
|||
|
represent the body in an acceptable manner for transport.
|
|||
|
|
|||
|
Unlike Content-Types, a proliferation of Content-Transfer-
|
|||
|
Encoding values is undesirable and unnecessary. However,
|
|||
|
establishing only a single Content-Transfer-Encoding
|
|||
|
mechanism does not seem possible. There is a tradeoff
|
|||
|
between the desire for a compact and efficient encoding of
|
|||
|
largely-binary data and the desire for a readable encoding
|
|||
|
of data that is mostly, but not entirely, 7-bit data. For
|
|||
|
this reason, at least two encoding mechanisms are necessary:
|
|||
|
a "readable" encoding and a "dense" encoding.
|
|||
|
|
|||
|
The Content-Transfer-Encoding field is designed to specify
|
|||
|
an invertible mapping between the "native" representation of
|
|||
|
a type of data and a representation that can be readily
|
|||
|
exchanged using 7 bit mail transport protocols, such as
|
|||
|
those defined by RFC 821 (SMTP). This field has not been
|
|||
|
defined by any previous standard. The field's value is a
|
|||
|
single token specifying the type of encoding, as enumerated
|
|||
|
below. Formally:
|
|||
|
|
|||
|
Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
|
|||
|
"8BIT" / "7BIT" /
|
|||
|
"BINARY" / x-token
|
|||
|
|
|||
|
These values are not case sensitive. That is, Base64 and
|
|||
|
BASE64 and bAsE64 are all equivalent. An encoding type of
|
|||
|
7BIT requires that the body is already in a seven-bit mail-
|
|||
|
ready representation. This is the default value -- that is,
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 10]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
"Content-Transfer-Encoding: 7BIT" is assumed if the
|
|||
|
Content-Transfer-Encoding header field is not present.
|
|||
|
|
|||
|
The values "8bit", "7bit", and "binary" all imply that NO
|
|||
|
encoding has been performed. However, they are potentially
|
|||
|
useful as indications of the kind of data contained in the
|
|||
|
object, and therefore of the kind of encoding that might
|
|||
|
need to be performed for transmission in a given transport
|
|||
|
system. "7bit" means that the data is all represented as
|
|||
|
short lines of US-ASCII data. "8bit" means that the lines
|
|||
|
are short, but there may be non-ASCII characters (octets
|
|||
|
with the high-order bit set). "Binary" means that not only
|
|||
|
may non-ASCII characters be present, but also that the lines
|
|||
|
are not necessarily short enough for SMTP transport.
|
|||
|
|
|||
|
The difference between "8bit" (or any other conceivable
|
|||
|
bit-width token) and the "binary" token is that "binary"
|
|||
|
does not require adherence to any limits on line length or
|
|||
|
to the SMTP CRLF semantics, while the bit-width tokens do
|
|||
|
require such adherence. If the body contains data in any
|
|||
|
bit-width other than 7-bit, the appropriate bit-width
|
|||
|
Content-Transfer-Encoding token must be used (e.g., "8bit"
|
|||
|
for unencoded 8 bit wide data). If the body contains binary
|
|||
|
data, the "binary" Content-Transfer-Encoding token must be
|
|||
|
used.
|
|||
|
|
|||
|
NOTE: The distinction between the Content-Transfer-Encoding
|
|||
|
values of "binary," "8bit," etc. may seem unimportant, in
|
|||
|
that all of them really mean "none" -- that is, there has
|
|||
|
been no encoding of the data for transport. However, clear
|
|||
|
labeling will be of enormous value to gateways between
|
|||
|
future mail transport systems with differing capabilities in
|
|||
|
transporting data that do not meet the restrictions of RFC
|
|||
|
821 transport.
|
|||
|
|
|||
|
As of the publication of this document, there are no
|
|||
|
standardized Internet transports for which it is legitimate
|
|||
|
to include unencoded 8-bit or binary data in mail bodies.
|
|||
|
Thus there are no circumstances in which the "8bit" or
|
|||
|
"binary" Content-Transfer-Encoding is actually legal on the
|
|||
|
Internet. However, in the event that 8-bit or binary mail
|
|||
|
transport becomes a reality in Internet mail, or when this
|
|||
|
document is used in conjunction with any other 8-bit or
|
|||
|
binary-capable transport mechanism, 8-bit or binary bodies
|
|||
|
should be labeled as such using this mechanism.
|
|||
|
|
|||
|
NOTE: The five values defined for the Content-Transfer-
|
|||
|
Encoding field imply nothing about the Content-Type other
|
|||
|
than the algorithm by which it was encoded or the transport
|
|||
|
system requirements if unencoded.
|
|||
|
|
|||
|
Implementors may, if necessary, define new Content-
|
|||
|
Transfer-Encoding values, but must use an x-token, which is
|
|||
|
a name prefixed by "X-" to indicate its non-standard status,
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 11]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
e.g., "Content-Transfer-Encoding: x-my-new-encoding".
|
|||
|
However, unlike Content-Types and subtypes, the creation of
|
|||
|
new Content-Transfer-Encoding values is explicitly and
|
|||
|
strongly discouraged, as it seems likely to hinder
|
|||
|
interoperability with little potential benefit. Their use
|
|||
|
is allowed only as the result of an agreement between
|
|||
|
cooperating user agents.
|
|||
|
|
|||
|
If a Content-Transfer-Encoding header field appears as part
|
|||
|
of a message header, it applies to the entire body of that
|
|||
|
message. If a Content-Transfer-Encoding header field
|
|||
|
appears as part of a body part's headers, it applies only to
|
|||
|
the body of that body part. If an entity is of type
|
|||
|
"multipart" or "message", the Content-Transfer-Encoding is
|
|||
|
not permitted to have any value other than a bit width
|
|||
|
(e.g., "7bit", "8bit", etc.) or "binary".
|
|||
|
|
|||
|
It should be noted that email is character-oriented, so that
|
|||
|
the mechanisms described here are mechanisms for encoding
|
|||
|
arbitrary byte streams, not bit streams. If a bit stream is
|
|||
|
to be encoded via one of these mechanisms, it must first be
|
|||
|
converted to an 8-bit byte stream using the network standard
|
|||
|
bit order ("big-endian"), in which the earlier bits in a
|
|||
|
stream become the higher-order bits in a byte. A bit stream
|
|||
|
not ending at an 8-bit boundary must be padded with zeroes.
|
|||
|
This document provides a mechanism for noting the addition
|
|||
|
of such padding in the case of the application Content-Type,
|
|||
|
which has a "padding" parameter.
|
|||
|
|
|||
|
The encoding mechanisms defined here explicitly encode all
|
|||
|
data in ASCII. Thus, for example, suppose an entity has
|
|||
|
header fields such as:
|
|||
|
|
|||
|
Content-Type: text/plain; charset=ISO-8859-1
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
This should be interpreted to mean that the body is a base64
|
|||
|
ASCII encoding of data that was originally in ISO-8859-1,
|
|||
|
and will be in that character set again after decoding.
|
|||
|
|
|||
|
The following sections will define the two standard encoding
|
|||
|
mechanisms. The definition of new content-transfer-
|
|||
|
encodings is explicitly discouraged and should only occur
|
|||
|
when absolutely necessary. All content-transfer-encoding
|
|||
|
namespace except that beginning with "X-" is explicitly
|
|||
|
reserved to the IANA for future use. Private agreements
|
|||
|
about content-transfer-encodings are also explicitly
|
|||
|
discouraged.
|
|||
|
|
|||
|
Certain Content-Transfer-Encoding values may only be used on
|
|||
|
certain Content-Types. In particular, it is expressly
|
|||
|
forbidden to use any encodings other than "7bit", "8bit", or
|
|||
|
"binary" with any Content-Type that recursively includes
|
|||
|
other Content-Type fields, notably the "multipart" and
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 12]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
"message" Content-Types. All encodings that are desired for
|
|||
|
bodies of type multipart or message must be done at the
|
|||
|
innermost level, by encoding the actual body that needs to
|
|||
|
be encoded.
|
|||
|
|
|||
|
NOTE ON ENCODING RESTRICTIONS: Though the prohibition
|
|||
|
against using content-transfer-encodings on data of type
|
|||
|
multipart or message may seem overly restrictive, it is
|
|||
|
necessary to prevent nested encodings, in which data are
|
|||
|
passed through an encoding algorithm multiple times, and
|
|||
|
must be decoded multiple times in order to be properly
|
|||
|
viewed. Nested encodings add considerable complexity to
|
|||
|
user agents: aside from the obvious efficiency problems
|
|||
|
with such multiple encodings, they can obscure the basic
|
|||
|
structure of a message. In particular, they can imply that
|
|||
|
several decoding operations are necessary simply to find out
|
|||
|
what types of objects a message contains. Banning nested
|
|||
|
encodings may complicate the job of certain mail gateways,
|
|||
|
but this seems less of a problem than the effect of nested
|
|||
|
encodings on user agents.
|
|||
|
|
|||
|
NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-
|
|||
|
TRANSFER-ENCODING: It may seem that the Content-Transfer-
|
|||
|
Encoding could be inferred from the characteristics of the
|
|||
|
Content-Type that is to be encoded, or, at the very least,
|
|||
|
that certain Content-Transfer-Encodings could be mandated
|
|||
|
for use with specific Content-Types. There are several
|
|||
|
reasons why this is not the case. First, given the varying
|
|||
|
types of transports used for mail, some encodings may be
|
|||
|
appropriate for some Content-Type/transport combinations and
|
|||
|
not for others. (For example, in an 8-bit transport, no
|
|||
|
encoding would be required for text in certain character
|
|||
|
sets, while such encodings are clearly required for 7-bit
|
|||
|
SMTP.) Second, certain Content-Types may require different
|
|||
|
types of transfer encoding under different circumstances.
|
|||
|
For example, many PostScript bodies might consist entirely
|
|||
|
of short lines of 7-bit data and hence require little or no
|
|||
|
encoding. Other PostScript bodies (especially those using
|
|||
|
Level 2 PostScript's binary encoding mechanism) may only be
|
|||
|
reasonably represented using a binary transport encoding.
|
|||
|
Finally, since Content-Type is intended to be an open-ended
|
|||
|
specification mechanism, strict specification of an
|
|||
|
association between Content-Types and encodings effectively
|
|||
|
couples the specification of an application protocol with a
|
|||
|
specific lower-level transport. This is not desirable since
|
|||
|
the developers of a Content-Type should not have to be aware
|
|||
|
of all the transports in use and what their limitations are.
|
|||
|
|
|||
|
NOTE ON TRANSLATING ENCODINGS: The quoted-printable and
|
|||
|
base64 encodings are designed so that conversion between
|
|||
|
them is possible. The only issue that arises in such a
|
|||
|
conversion is the handling of line breaks. When converting
|
|||
|
from quoted-printable to base64 a line break must be
|
|||
|
converted into a CRLF sequence. Similarly, a CRLF sequence
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 13]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
in base64 data should be converted to a quoted-printable
|
|||
|
line break, but ONLY when converting text data.
|
|||
|
|
|||
|
NOTE ON CANONICAL ENCODING MODEL: There was some
|
|||
|
confusion, in earlier drafts of this memo, regarding the
|
|||
|
model for when email data was to be converted to canonical
|
|||
|
form and encoded, and in particular how this process would
|
|||
|
affect the treatment of CRLFs, given that the representation
|
|||
|
of newlines varies greatly from system to system. For this
|
|||
|
reason, a canonical model for encoding is presented as
|
|||
|
Appendix H.
|
|||
|
|
|||
|
5.1 Quoted-Printable Content-Transfer-Encoding
|
|||
|
|
|||
|
The Quoted-Printable encoding is intended to represent data
|
|||
|
that largely consists of octets that correspond to printable
|
|||
|
characters in the ASCII character set. It encodes the data
|
|||
|
in such a way that the resulting octets are unlikely to be
|
|||
|
modified by mail transport. If the data being encoded are
|
|||
|
mostly ASCII text, the encoded form of the data remains
|
|||
|
largely recognizable by humans. A body which is entirely
|
|||
|
ASCII may also be encoded in Quoted-Printable to ensure the
|
|||
|
integrity of the data should the message pass through a
|
|||
|
character-translating, and/or line-wrapping gateway.
|
|||
|
|
|||
|
In this encoding, octets are to be represented as determined
|
|||
|
by the following rules:
|
|||
|
|
|||
|
Rule #1: (General 8-bit representation) Any octet,
|
|||
|
except those indicating a line break according to the
|
|||
|
newline convention of the canonical form of the data
|
|||
|
being encoded, may be represented by an "=" followed by
|
|||
|
a two digit hexadecimal representation of the octet's
|
|||
|
value. The digits of the hexadecimal alphabet, for this
|
|||
|
purpose, are "0123456789ABCDEF". Uppercase letters must
|
|||
|
be
|
|||
|
used when sending hexadecimal data, though a robust
|
|||
|
implementation may choose to recognize lowercase
|
|||
|
letters on receipt. Thus, for example, the value 12
|
|||
|
(ASCII form feed) can be represented by "=0C", and the
|
|||
|
value 61 (ASCII EQUAL SIGN) can be represented by
|
|||
|
"=3D". Except when the following rules allow an
|
|||
|
alternative encoding, this rule is mandatory.
|
|||
|
|
|||
|
Rule #2: (Literal representation) Octets with decimal
|
|||
|
values of 33 through 60 inclusive, and 62 through 126,
|
|||
|
inclusive, MAY be represented as the ASCII characters
|
|||
|
which correspond to those octets (EXCLAMATION POINT
|
|||
|
through LESS THAN, and GREATER THAN through TILDE,
|
|||
|
respectively).
|
|||
|
|
|||
|
Rule #3: (White Space): Octets with values of 9 and 32
|
|||
|
MAY be represented as ASCII TAB (HT) and SPACE
|
|||
|
characters, respectively, but MUST NOT be so
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 14]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
represented at the end of an encoded line. Any TAB (HT)
|
|||
|
or SPACE characters on an encoded line MUST thus be
|
|||
|
followed on that line by a printable character. In
|
|||
|
particular, an "=" at the end of an encoded line,
|
|||
|
indicating a soft line break (see rule #5) may follow
|
|||
|
one or more TAB (HT) or SPACE characters. It follows
|
|||
|
that an octet with value 9 or 32 appearing at the end
|
|||
|
of an encoded line must be represented according to
|
|||
|
Rule #1. This rule is necessary because some MTAs
|
|||
|
(Message Transport Agents, programs which transport
|
|||
|
messages from one user to another, or perform a part of
|
|||
|
such transfers) are known to pad lines of text with
|
|||
|
SPACEs, and others are known to remove "white space"
|
|||
|
characters from the end of a line. Therefore, when
|
|||
|
decoding a Quoted-Printable body, any trailing white
|
|||
|
space on a line must be deleted, as it will necessarily
|
|||
|
have been added by intermediate transport agents.
|
|||
|
|
|||
|
Rule #4 (Line Breaks): A line break in a text body
|
|||
|
part, independent of what its representation is
|
|||
|
following the canonical representation of the data
|
|||
|
being encoded, must be represented by a (RFC 822) line
|
|||
|
break, which is a CRLF sequence, in the Quoted-
|
|||
|
Printable encoding. If isolated CRs and LFs, or LF CR
|
|||
|
and CR LF sequences are allowed to appear in binary
|
|||
|
data according to the canonical form, they must be
|
|||
|
represented using the "=0D", "=0A", "=0A=0D" and
|
|||
|
"=0D=0A" notations respectively.
|
|||
|
|
|||
|
Note that many implementation may elect to encode the
|
|||
|
local representation of various content types directly.
|
|||
|
In particular, this may apply to plain text material on
|
|||
|
systems that use newline conventions other than CRLF
|
|||
|
delimiters. Such an implementation is permissible, but
|
|||
|
the generation of line breaks must be generalized to
|
|||
|
account for the case where alternate representations of
|
|||
|
newline sequences are used.
|
|||
|
|
|||
|
Rule #5 (Soft Line Breaks): The Quoted-Printable
|
|||
|
encoding REQUIRES that encoded lines be no more than 76
|
|||
|
characters long. If longer lines are to be encoded with
|
|||
|
the Quoted-Printable encoding, 'soft' line breaks must
|
|||
|
be used. An equal sign as the last character on a
|
|||
|
encoded line indicates such a non-significant ('soft')
|
|||
|
line break in the encoded text. Thus if the "raw" form
|
|||
|
of the line is a single unencoded line that says:
|
|||
|
|
|||
|
Now's the time for all folk to come to the aid of
|
|||
|
their country.
|
|||
|
|
|||
|
This can be represented, in the Quoted-Printable
|
|||
|
encoding, as
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 15]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Now's the time =
|
|||
|
for all folk to come=
|
|||
|
to the aid of their country.
|
|||
|
|
|||
|
This provides a mechanism with which long lines are
|
|||
|
encoded in such a way as to be restored by the user
|
|||
|
agent. The 76 character limit does not count the
|
|||
|
trailing CRLF, but counts all other characters,
|
|||
|
including any equal signs.
|
|||
|
|
|||
|
Since the hyphen character ("-") is represented as itself in
|
|||
|
the Quoted-Printable encoding, care must be taken, when
|
|||
|
encapsulating a quoted-printable encoded body in a multipart
|
|||
|
entity, to ensure that the encapsulation boundary does not
|
|||
|
appear anywhere in the encoded body. (A good strategy is to
|
|||
|
choose a boundary that includes a character sequence such as
|
|||
|
"=_" which can never appear in a quoted-printable body. See
|
|||
|
the definition of multipart messages later in this
|
|||
|
document.)
|
|||
|
|
|||
|
NOTE: The quoted-printable encoding represents something of
|
|||
|
a compromise between readability and reliability in
|
|||
|
transport. Bodies encoded with the quoted-printable
|
|||
|
encoding will work reliably over most mail gateways, but may
|
|||
|
not work perfectly over a few gateways, notably those
|
|||
|
involving translation into EBCDIC. (In theory, an EBCDIC
|
|||
|
gateway could decode a quoted-printable body and re-encode
|
|||
|
it using base64, but such gateways do not yet exist.) A
|
|||
|
higher level of confidence is offered by the base64
|
|||
|
Content-Transfer-Encoding. A way to get reasonably reliable
|
|||
|
transport through EBCDIC gateways is to also quote the ASCII
|
|||
|
characters
|
|||
|
|
|||
|
!"#$@[\]^`{|}~
|
|||
|
|
|||
|
according to rule #1. See Appendix B for more information.
|
|||
|
|
|||
|
Because quoted-printable data is generally assumed to be
|
|||
|
line-oriented, it is to be expected that the breaks between
|
|||
|
the lines of quoted printable data may be altered in
|
|||
|
transport, in the same manner that plain text mail has
|
|||
|
always been altered in Internet mail when passing between
|
|||
|
systems with differing newline conventions. If such
|
|||
|
alterations are likely to constitute a corruption of the
|
|||
|
data, it is probably more sensible to use the base64
|
|||
|
encoding rather than the quoted-printable encoding.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 16]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
5.2 Base64 Content-Transfer-Encoding
|
|||
|
|
|||
|
The Base64 Content-Transfer-Encoding is designed to
|
|||
|
represent arbitrary sequences of octets in a form that is
|
|||
|
not humanly readable. The encoding and decoding algorithms
|
|||
|
are simple, but the encoded data are consistently only about
|
|||
|
33 percent larger than the unencoded data. This encoding is
|
|||
|
based on the one used in Privacy Enhanced Mail applications,
|
|||
|
as defined in RFC 1113. The base64 encoding is adapted
|
|||
|
from RFC 1113, with one change: base64 eliminates the "*"
|
|||
|
mechanism for embedded clear text.
|
|||
|
|
|||
|
A 65-character subset of US-ASCII is used, enabling 6 bits
|
|||
|
to be represented per printable character. (The extra 65th
|
|||
|
character, "=", is used to signify a special processing
|
|||
|
function.)
|
|||
|
|
|||
|
NOTE: This subset has the important property that it is
|
|||
|
represented identically in all versions of ISO 646,
|
|||
|
including US ASCII, and all characters in the subset are
|
|||
|
also represented identically in all versions of EBCDIC.
|
|||
|
Other popular encodings, such as the encoding used by the
|
|||
|
UUENCODE utility and the base85 encoding specified as part
|
|||
|
of Level 2 PostScript, do not share these properties, and
|
|||
|
thus do not fulfill the portability requirements a binary
|
|||
|
transport encoding for mail must meet.
|
|||
|
|
|||
|
The encoding process represents 24-bit groups of input bits
|
|||
|
as output strings of 4 encoded characters. Proceeding from
|
|||
|
left to right, a 24-bit input group is formed by
|
|||
|
concatenating 3 8-bit input groups. These 24 bits are then
|
|||
|
treated as 4 concatenated 6-bit groups, each of which is
|
|||
|
translated into a single digit in the base64 alphabet. When
|
|||
|
encoding a bit stream via the base64 encoding, the bit
|
|||
|
stream must be presumed to be ordered with the most-
|
|||
|
significant-bit first. That is, the first bit in the stream
|
|||
|
will be the high-order bit in the first byte, and the eighth
|
|||
|
bit will be the low-order bit in the first byte, and so on.
|
|||
|
|
|||
|
Each 6-bit group is used as an index into an array of 64
|
|||
|
printable characters. The character referenced by the index
|
|||
|
is placed in the output string. These characters, identified
|
|||
|
in Table 1, below, are selected so as to be universally
|
|||
|
representable, and the set excludes characters with
|
|||
|
particular significance to SMTP (e.g., ".", "CR", "LF") and
|
|||
|
to the encapsulation boundaries defined in this document
|
|||
|
(e.g., "-").
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 17]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Table 1: The Base64 Alphabet
|
|||
|
|
|||
|
Value Encoding Value Encoding Value Encoding Value
|
|||
|
Encoding
|
|||
|
0 A 17 R 34 i 51 z
|
|||
|
1 B 18 S 35 j 52 0
|
|||
|
2 C 19 T 36 k 53 1
|
|||
|
3 D 20 U 37 l 54 2
|
|||
|
4 E 21 V 38 m 55 3
|
|||
|
5 F 22 W 39 n 56 4
|
|||
|
6 G 23 X 40 o 57 5
|
|||
|
7 H 24 Y 41 p 58 6
|
|||
|
8 I 25 Z 42 q 59 7
|
|||
|
9 J 26 a 43 r 60 8
|
|||
|
10 K 27 b 44 s 61 9
|
|||
|
11 L 28 c 45 t 62 +
|
|||
|
12 M 29 d 46 u 63 /
|
|||
|
13 N 30 e 47 v
|
|||
|
14 O 31 f 48 w (pad) =
|
|||
|
15 P 32 g 49 x
|
|||
|
16 Q 33 h 50 y
|
|||
|
|
|||
|
The output stream (encoded bytes) must be represented in
|
|||
|
lines of no more than 76 characters each. All line breaks
|
|||
|
or other characters not found in Table 1 must be ignored by
|
|||
|
decoding software. In base64 data, characters other than
|
|||
|
those in Table 1, line breaks, and other white space
|
|||
|
probably indicate a transmission error, about which a
|
|||
|
warning message or even a message rejection might be
|
|||
|
appropriate under some circumstances.
|
|||
|
|
|||
|
Special processing is performed if fewer than 24 bits are
|
|||
|
available at the end of the data being encoded. A full
|
|||
|
encoding quantum is always completed at the end of a body.
|
|||
|
When fewer than 24 input bits are available in an input
|
|||
|
group, zero bits are added (on the right) to form an
|
|||
|
integral number of 6-bit groups. Output character positions
|
|||
|
which are not required to represent actual input data are
|
|||
|
set to the character "=". Since all base64 input is an
|
|||
|
integral number of octets, only the following cases can
|
|||
|
arise: (1) the final quantum of encoding input is an
|
|||
|
integral multiple of 24 bits; here, the final unit of
|
|||
|
encoded output will be an integral multiple of 4 characters
|
|||
|
with no "=" padding, (2) the final quantum of encoding input
|
|||
|
is exactly 8 bits; here, the final unit of encoded output
|
|||
|
will be two characters followed by two "=" padding
|
|||
|
characters, or (3) the final quantum of encoding input is
|
|||
|
exactly 16 bits; here, the final unit of encoded output will
|
|||
|
be three characters followed by one "=" padding character.
|
|||
|
|
|||
|
Care must be taken to use the proper octets for line breaks
|
|||
|
if base64 encoding is applied directly to text material that
|
|||
|
has not been converted to canonical form. In particular,
|
|||
|
text line breaks should be converted into CRLF sequences
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 18]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
prior to base64 encoding. The important thing to note is
|
|||
|
that this may be done directly by the encoder rather than in
|
|||
|
a prior canonicalization step in some implementations.
|
|||
|
|
|||
|
NOTE: There is no need to worry about quoting apparent
|
|||
|
encapsulation boundaries within base64-encoded parts of
|
|||
|
multipart entities because no hyphen characters are used in
|
|||
|
the base64 encoding.
|
|||
|
|
|||
|
6 Additional Optional Content- Header Fields
|
|||
|
|
|||
|
6.1 Optional Content-ID Header Field
|
|||
|
|
|||
|
In constructing a high-level user agent, it may be desirable
|
|||
|
to allow one body to make reference to another.
|
|||
|
Accordingly, bodies may be labeled using the "Content-ID"
|
|||
|
header field, which is syntactically identical to the
|
|||
|
"Message-ID" header field:
|
|||
|
|
|||
|
Content-ID := msg-id
|
|||
|
|
|||
|
Like the Message-ID values, Content-ID values must be
|
|||
|
generated to be as unique as possible.
|
|||
|
|
|||
|
6.2 Optional Content-Description Header Field
|
|||
|
|
|||
|
The ability to associate some descriptive information with a
|
|||
|
given body is often desirable. For example, it may be useful
|
|||
|
to mark an "image" body as "a picture of the Space Shuttle
|
|||
|
Endeavor." Such text may be placed in the Content-
|
|||
|
Description header field.
|
|||
|
|
|||
|
Content-Description := *text
|
|||
|
|
|||
|
The description is presumed to be given in the US-ASCII
|
|||
|
character set, although the mechanism specified in [RFC-
|
|||
|
1342] may be used for non-US-ASCII Content-Description
|
|||
|
values.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 19]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7 The Predefined Content-Type Values
|
|||
|
|
|||
|
This document defines seven initial Content-Type values and
|
|||
|
an extension mechanism for private or experimental types.
|
|||
|
Further standard types must be defined by new published
|
|||
|
specifications. It is expected that most innovation in new
|
|||
|
types of mail will take place as subtypes of the seven types
|
|||
|
defined here. The most essential characteristics of the
|
|||
|
seven content-types are summarized in Appendix G.
|
|||
|
|
|||
|
7.1 The Text Content-Type
|
|||
|
|
|||
|
The text Content-Type is intended for sending material which
|
|||
|
is principally textual in form. It is the default Content-
|
|||
|
Type. A "charset" parameter may be used to indicate the
|
|||
|
character set of the body text. The primary subtype of text
|
|||
|
is "plain". This indicates plain (unformatted) text. The
|
|||
|
default Content-Type for Internet mail is "text/plain;
|
|||
|
charset=us-ascii".
|
|||
|
|
|||
|
Beyond plain text, there are many formats for representing
|
|||
|
what might be known as "extended text" -- text with embedded
|
|||
|
formatting and presentation information. An interesting
|
|||
|
characteristic of many such representations is that they are
|
|||
|
to some extent readable even without the software that
|
|||
|
interprets them. It is useful, then, to distinguish them,
|
|||
|
at the highest level, from such unreadable data as images,
|
|||
|
audio, or text represented in an unreadable form. In the
|
|||
|
absence of appropriate interpretation software, it is
|
|||
|
reasonable to show subtypes of text to the user, while it is
|
|||
|
not reasonable to do so with most nontextual data.
|
|||
|
|
|||
|
Such formatted textual data should be represented using
|
|||
|
subtypes of text. Plausible subtypes of text are typically
|
|||
|
given by the common name of the representation format, e.g.,
|
|||
|
"text/richtext".
|
|||
|
|
|||
|
7.1.1 The charset parameter
|
|||
|
|
|||
|
A critical parameter that may be specified in the Content-
|
|||
|
Type field for text data is the character set. This is
|
|||
|
specified with a "charset" parameter, as in:
|
|||
|
|
|||
|
Content-type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
Unlike some other parameter values, the values of the
|
|||
|
charset parameter are NOT case sensitive. The default
|
|||
|
character set, which must be assumed in the absence of a
|
|||
|
charset parameter, is US-ASCII.
|
|||
|
|
|||
|
An initial list of predefined character set names can be
|
|||
|
found at the end of this section. Additional character sets
|
|||
|
may be registered with IANA as described in Appendix F,
|
|||
|
although the standardization of their use requires the usual
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 20]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
IAB review and approval. Note that if the specified
|
|||
|
character set includes 8-bit data, a Content-Transfer-
|
|||
|
Encoding header field and a corresponding encoding on the
|
|||
|
data are required in order to transmit the body via some
|
|||
|
mail transfer protocols, such as SMTP.
|
|||
|
|
|||
|
The default character set, US-ASCII, has been the subject of
|
|||
|
some confusion and ambiguity in the past. Not only were
|
|||
|
there some ambiguities in the definition, there have been
|
|||
|
wide variations in practice. In order to eliminate such
|
|||
|
ambiguity and variations in the future, it is strongly
|
|||
|
recommended that new user agents explicitly specify a
|
|||
|
character set via the Content-Type header field. "US-ASCII"
|
|||
|
does not indicate an arbitrary seven-bit character code, but
|
|||
|
specifies that the body uses character coding that uses the
|
|||
|
exact correspondence of codes to characters specified in
|
|||
|
ASCII. National use variations of ISO 646 [ISO-646] are NOT
|
|||
|
ASCII and their use in Internet mail is explicitly
|
|||
|
discouraged. The omission of the ISO 646 character set is
|
|||
|
deliberate in this regard. The character set name of "US-
|
|||
|
ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only.
|
|||
|
The character set name "ASCII" is reserved and must not be
|
|||
|
used for any purpose.
|
|||
|
|
|||
|
NOTE: RFC 821 explicitly specifies "ASCII", and references
|
|||
|
an earlier version of the American Standard. Insofar as one
|
|||
|
of the purposes of specifying a Content-Type and character
|
|||
|
set is to permit the receiver to unambiguously determine how
|
|||
|
the sender intended the coded message to be interpreted,
|
|||
|
assuming anything other than "strict ASCII" as the default
|
|||
|
would risk unintentional and incompatible changes to the
|
|||
|
semantics of messages now being transmitted. This also
|
|||
|
implies that messages containing characters coded according
|
|||
|
to national variations on ISO 646, or using code-switching
|
|||
|
procedures (e.g., those of ISO 2022), as well as 8-bit or
|
|||
|
multiple octet character encodings MUST use an appropriate
|
|||
|
character set specification to be consistent with this
|
|||
|
specification.
|
|||
|
|
|||
|
The complete US-ASCII character set is listed in [US-ASCII].
|
|||
|
Note that the control characters including DEL (0-31, 127)
|
|||
|
have no defined meaning apart from the combination CRLF
|
|||
|
(ASCII values 13 and 10) indicating a new line. Two of the
|
|||
|
characters have de facto meanings in wide use: FF (12) often
|
|||
|
means "start subsequent text on the beginning of a new
|
|||
|
page"; and TAB or HT (9) often (though not always) means
|
|||
|
"move the cursor to the next available column after the
|
|||
|
current position where the column number is a multiple of 8
|
|||
|
(counting the first column as column 0)." Apart from this,
|
|||
|
any use of the control characters or DEL in a body must be
|
|||
|
part of a private agreement between the sender and
|
|||
|
recipient. Such private agreements are discouraged and
|
|||
|
should be replaced by the other capabilities of this
|
|||
|
document.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 21]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
NOTE: Beyond US-ASCII, an enormous proliferation of
|
|||
|
character sets is possible. It is the opinion of the IETF
|
|||
|
working group that a large number of character sets is NOT a
|
|||
|
good thing. We would prefer to specify a single character
|
|||
|
set that can be used universally for representing all of the
|
|||
|
world's languages in electronic mail. Unfortunately,
|
|||
|
existing practice in several communities seems to point to
|
|||
|
the continued use of multiple character sets in the near
|
|||
|
future. For this reason, we define names for a small number
|
|||
|
of character sets for which a strong constituent base
|
|||
|
exists. It is our hope that ISO 10646 or some other
|
|||
|
effort will eventually define a single world character set
|
|||
|
which can then be specified for use in Internet mail, but in
|
|||
|
the advance of that definition we cannot specify the use of
|
|||
|
ISO 10646, Unicode, or any other character set whose
|
|||
|
definition is, as of this writing, incomplete.
|
|||
|
|
|||
|
The defined charset values are:
|
|||
|
|
|||
|
US-ASCII -- as defined in [US-ASCII].
|
|||
|
|
|||
|
ISO-8859-X -- where "X" is to be replaced, as
|
|||
|
necessary, for the parts of ISO-8859 [ISO-
|
|||
|
8859]. Note that the ISO 646 character sets
|
|||
|
have deliberately been omitted in favor of
|
|||
|
their 8859 replacements, which are the
|
|||
|
designated character sets for Internet mail.
|
|||
|
As of the publication of this document, the
|
|||
|
legitimate values for "X" are the digits 1
|
|||
|
through 9.
|
|||
|
|
|||
|
Note that the character set used, if anything other than
|
|||
|
US-ASCII, must always be explicitly specified in the
|
|||
|
Content-Type field.
|
|||
|
|
|||
|
No other character set name may be used in Internet mail
|
|||
|
without the publication of a formal specification and its
|
|||
|
registration with IANA as described in Appendix F, or by
|
|||
|
private agreement, in which case the character set name must
|
|||
|
begin with "X-".
|
|||
|
|
|||
|
Implementors are discouraged from defining new character
|
|||
|
sets for mail use unless absolutely necessary.
|
|||
|
|
|||
|
The "charset" parameter has been defined primarily for the
|
|||
|
purpose of textual data, and is described in this section
|
|||
|
for that reason. However, it is conceivable that non-
|
|||
|
textual data might also wish to specify a charset value for
|
|||
|
some purpose, in which case the same syntax and values
|
|||
|
should be used.
|
|||
|
|
|||
|
In general, mail-sending software should always use the
|
|||
|
"lowest common denominator" character set possible. For
|
|||
|
example, if a body contains only US-ASCII characters, it
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 22]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
should be marked as being in the US-ASCII character set, not
|
|||
|
ISO-8859-1, which, like all the ISO-8859 family of character
|
|||
|
sets, is a superset of US-ASCII. More generally, if a
|
|||
|
widely-used character set is a subset of another character
|
|||
|
set, and a body contains only characters in the widely-used
|
|||
|
subset, it should be labeled as being in that subset. This
|
|||
|
will increase the chances that the recipient will be able to
|
|||
|
view the mail correctly.
|
|||
|
|
|||
|
7.1.2 The Text/plain subtype
|
|||
|
|
|||
|
The primary subtype of text is "plain". This indicates
|
|||
|
plain (unformatted) text. The default Content-Type for
|
|||
|
Internet mail, "text/plain; charset=us-ascii", describes
|
|||
|
existing Internet practice, that is, it is the type of body
|
|||
|
defined by RFC 822.
|
|||
|
|
|||
|
7.1.3 The Text/richtext subtype
|
|||
|
|
|||
|
In order to promote the wider interoperability of simple
|
|||
|
formatted text, this document defines an extremely simple
|
|||
|
subtype of "text", the "richtext" subtype. This subtype was
|
|||
|
designed to meet the following criteria:
|
|||
|
|
|||
|
1. The syntax must be extremely simple to parse,
|
|||
|
so that even teletype-oriented mail systems can
|
|||
|
easily strip away the formatting information and
|
|||
|
leave only the readable text.
|
|||
|
|
|||
|
2. The syntax must be extensible to allow for new
|
|||
|
formatting commands that are deemed essential.
|
|||
|
|
|||
|
3. The capabilities must be extremely limited, to
|
|||
|
ensure that it can represent no more than is
|
|||
|
likely to be representable by the user's primary
|
|||
|
word processor. While this limits what can be
|
|||
|
sent, it increases the likelihood that what is
|
|||
|
sent can be properly displayed.
|
|||
|
|
|||
|
4. The syntax must be compatible with SGML, so
|
|||
|
that, with an appropriate DTD (Document Type
|
|||
|
Definition, the standard mechanism for defining a
|
|||
|
document type using SGML), a general SGML parser
|
|||
|
could be made to parse richtext. However, despite
|
|||
|
this compatibility, the syntax should be far
|
|||
|
simpler than full SGML, so that no SGML knowledge
|
|||
|
is required in order to implement it.
|
|||
|
|
|||
|
The syntax of "richtext" is very simple. It is assumed, at
|
|||
|
the top-level, to be in the US-ASCII character set, unless
|
|||
|
of course a different charset parameter was specified in the
|
|||
|
Content-type field. All characters represent themselves,
|
|||
|
with the exception of the "<" character (ASCII 60), which is
|
|||
|
used to mark the beginning of a formatting command.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 23]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Formatting instructions consist of formatting commands
|
|||
|
surrounded by angle brackets ("<>", ASCII 60 and 62). Each
|
|||
|
formatting command may be no more than 40 characters in
|
|||
|
length, all in US-ASCII, restricted to the alphanumeric and
|
|||
|
hyphen ("-") characters. Formatting commands may be preceded
|
|||
|
by a forward slash or solidus ("/", ASCII 47), making them
|
|||
|
negations, and such negations must always exist to balance
|
|||
|
the initial opening commands, except as noted below. Thus,
|
|||
|
if the formatting command "<bold>" appears at some point,
|
|||
|
there must later be a "</bold>" to balance it. There are
|
|||
|
only three exceptions to this "balancing" rule: First, the
|
|||
|
command "<lt>" is used to represent a literal "<" character.
|
|||
|
Second, the command "<nl>" is used to represent a required
|
|||
|
line break. (Otherwise, CRLFs in the data are treated as
|
|||
|
equivalent to a single SPACE character.) Finally, the
|
|||
|
command "<np>" is used to represent a page break. (NOTE:
|
|||
|
The 40 character limit on formatting commands does not
|
|||
|
include the "<", ">", or "/" characters that might be
|
|||
|
attached to such commands.)
|
|||
|
|
|||
|
Initially defined formatting commands, not all of which will
|
|||
|
be implemented by all richtext implementations, include:
|
|||
|
|
|||
|
Bold -- causes the subsequent text to be in a bold
|
|||
|
font.
|
|||
|
Italic -- causes the subsequent text to be in an italic
|
|||
|
font.
|
|||
|
Fixed -- causes the subsequent text to be in a fixed
|
|||
|
width font.
|
|||
|
Smaller -- causes the subsequent text to be in a
|
|||
|
smaller font.
|
|||
|
Bigger -- causes the subsequent text to be in a bigger
|
|||
|
font.
|
|||
|
Underline -- causes the subsequent text to be
|
|||
|
underlined.
|
|||
|
Center -- causes the subsequent text to be centered.
|
|||
|
FlushLeft -- causes the subsequent text to be left
|
|||
|
justified.
|
|||
|
FlushRight -- causes the subsequent text to be right
|
|||
|
justified.
|
|||
|
Indent -- causes the subsequent text to be indented at
|
|||
|
the left margin.
|
|||
|
IndentRight -- causes the subsequent text to be
|
|||
|
indented at the right margin.
|
|||
|
Outdent -- causes the subsequent text to be outdented
|
|||
|
at the left margin.
|
|||
|
OutdentRight -- causes the subsequent text to be
|
|||
|
outdented at the right margin.
|
|||
|
SamePage -- causes the subsequent text to be grouped,
|
|||
|
if possible, on one page.
|
|||
|
Subscript -- causes the subsequent text to be
|
|||
|
interpreted as a subscript.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 24]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Superscript -- causes the subsequent text to be
|
|||
|
interpreted as a superscript.
|
|||
|
Heading -- causes the subsequent text to be interpreted
|
|||
|
as a page heading.
|
|||
|
Footing -- causes the subsequent text to be interpreted
|
|||
|
as a page footing.
|
|||
|
ISO-8859-X (for any value of X that is legal as a
|
|||
|
"charset" parameter) -- causes the subsequent text
|
|||
|
to be interpreted as text in the appropriate
|
|||
|
character set.
|
|||
|
US-ASCII -- causes the subsequent text to be
|
|||
|
interpreted as text in the US-ASCII character set.
|
|||
|
Excerpt -- causes the subsequent text to be interpreted
|
|||
|
as a textual excerpt from another source.
|
|||
|
Typically this will be displayed using indentation
|
|||
|
and an alternate font, but such decisions are up
|
|||
|
to the viewer.
|
|||
|
Paragraph -- causes the subsequent text to be
|
|||
|
interpreted as a single paragraph, with
|
|||
|
appropriate paragraph breaks (typically blank
|
|||
|
space) before and after.
|
|||
|
Signature -- causes the subsequent text to be
|
|||
|
interpreted as a "signature". Some systems may
|
|||
|
wish to display signatures in a smaller font or
|
|||
|
otherwise set them apart from the main text of the
|
|||
|
message.
|
|||
|
Comment -- causes the subsequent text to be interpreted
|
|||
|
as a comment, and hence not shown to the reader.
|
|||
|
No-op -- has no effect on the subsequent text.
|
|||
|
lt -- <lt> is replaced by a literal "<" character. No
|
|||
|
balancing </lt> is allowed.
|
|||
|
nl -- <nl> causes a line break. No balancing </nl> is
|
|||
|
allowed.
|
|||
|
np -- <np> causes a page break. No balancing </np> is
|
|||
|
allowed.
|
|||
|
|
|||
|
Each positive formatting command affects all subsequent text
|
|||
|
until the matching negative formatting command. Such pairs
|
|||
|
of formatting commands must be properly balanced and nested.
|
|||
|
Thus, a proper way to describe text in bold italics is:
|
|||
|
|
|||
|
<bold><italic>the-text</italic></bold>
|
|||
|
|
|||
|
or, alternately,
|
|||
|
|
|||
|
<italic><bold>the-text</bold></italic>
|
|||
|
|
|||
|
but, in particular, the following is illegal
|
|||
|
richtext:
|
|||
|
|
|||
|
<bold><italic>the-text</bold></italic>
|
|||
|
|
|||
|
NOTE: The nesting requirement for formatting commands
|
|||
|
imposes a slightly higher burden upon the composers of
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 25]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
richtext bodies, but potentially simplifies richtext
|
|||
|
displayers by allowing them to be stack-based. The main
|
|||
|
goal of richtext is to be simple enough to make multifont,
|
|||
|
formatted email widely readable, so that those with the
|
|||
|
capability of sending it will be able to do so with
|
|||
|
confidence. Thus slightly increased complexity in the
|
|||
|
composing software was deemed a reasonable tradeoff for
|
|||
|
simplified reading software. Nonetheless, implementors of
|
|||
|
richtext readers are encouraged to follow the general
|
|||
|
Internet guidelines of being conservative in what you send
|
|||
|
and liberal in what you accept. Those implementations that
|
|||
|
can do so are encouraged to deal reasonably with improperly
|
|||
|
nested richtext.
|
|||
|
|
|||
|
Implementations must regard any unrecognized formatting
|
|||
|
command as equivalent to "No-op", thus facilitating future
|
|||
|
extensions to "richtext". Private extensions may be defined
|
|||
|
using formatting commands that begin with "X-", by analogy
|
|||
|
to Internet mail header field names.
|
|||
|
|
|||
|
It is worth noting that no special behavior is required for
|
|||
|
the TAB (HT) character. It is recommended, however, that, at
|
|||
|
least when fixed-width fonts are in use, the common
|
|||
|
semantics of the TAB (HT) character should be observed,
|
|||
|
namely that it moves to the next column position that is a
|
|||
|
multiple of 8. (In other words, if a TAB (HT) occurs in
|
|||
|
column n, where the leftmost column is column 0, then that
|
|||
|
TAB (HT) should be replaced by 8-(n mod 8) SPACE
|
|||
|
characters.)
|
|||
|
|
|||
|
Richtext also differentiates between "hard" and "soft" line
|
|||
|
breaks. A line break (CRLF) in the richtext data stream is
|
|||
|
interpreted as a "soft" line break, one that is included
|
|||
|
only for purposes of mail transport, and is to be treated as
|
|||
|
white space by richtext interpreters. To include a "hard"
|
|||
|
line break (one that must be displayed as such), the "<nl>"
|
|||
|
or "<paragraph> formatting constructs should be used. In
|
|||
|
general, a soft line break should be treated as white space,
|
|||
|
but when soft line breaks immediately follow a <nl> or a
|
|||
|
</paragraph> tag they should be ignored rather than treated
|
|||
|
as white space.
|
|||
|
|
|||
|
Putting all this together, the following "text/richtext"
|
|||
|
body fragment:
|
|||
|
|
|||
|
<bold>Now</bold> is the time for
|
|||
|
<italic>all</italic> good men
|
|||
|
<smaller>(and <lt>women>)</smaller> to
|
|||
|
<ignoreme></ignoreme> come
|
|||
|
|
|||
|
to the aid of their
|
|||
|
<nl>
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 26]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
beloved <nl><nl>country. <comment> Stupid
|
|||
|
quote! </comment> -- the end
|
|||
|
|
|||
|
represents the following formatted text (which will, no
|
|||
|
doubt, look cryptic in the text-only version of this
|
|||
|
document):
|
|||
|
|
|||
|
Now is the time for all good men (and <women>) to
|
|||
|
come to the aid of their
|
|||
|
beloved
|
|||
|
|
|||
|
country. -- the end
|
|||
|
|
|||
|
Richtext conformance: A minimal richtext implementation is
|
|||
|
one that simply converts "<lt>" to "<", converts CRLFs to
|
|||
|
SPACE, converts <nl> to a newline according to local newline
|
|||
|
convention, removes everything between a <comment> command
|
|||
|
and the next balancing </comment> command, and removes all
|
|||
|
other formatting commands (all text enclosed in angle
|
|||
|
brackets).
|
|||
|
|
|||
|
NOTE ON THE RELATIONSHIP OF RICHTEXT TO SGML: Richtext is
|
|||
|
decidedly not SGML, and must not be used to transport
|
|||
|
arbitrary SGML documents. Those who wish to use SGML
|
|||
|
document types as a mail transport format must define a new
|
|||
|
text or application subtype, e.g., "text/sgml-dtd-whatever"
|
|||
|
or "application/sgml-dtd-whatever", depending on the
|
|||
|
perceived readability of the DTD in use. Richtext is
|
|||
|
designed to be compatible with SGML, and specifically so
|
|||
|
that it will be possible to define a richtext DTD if one is
|
|||
|
needed. However, this does not imply that arbitrary SGML
|
|||
|
can be called richtext, nor that richtext implementors have
|
|||
|
any need to understand SGML; the description in this
|
|||
|
document is a complete definition of richtext, which is far
|
|||
|
simpler than complete SGML.
|
|||
|
|
|||
|
NOTE ON THE INTENDED USE OF RICHTEXT: It is recognized that
|
|||
|
implementors of future mail systems will want rich text
|
|||
|
functionality far beyond that currently defined for
|
|||
|
richtext. The intent of richtext is to provide a common
|
|||
|
format for expressing that functionality in a form in which
|
|||
|
much of it, at least, will be understood by interoperating
|
|||
|
software. Thus, in particular, software with a richer
|
|||
|
notion of formatted text than richtext can still use
|
|||
|
richtext as its basic representation, but can extend it with
|
|||
|
new formatting commands and by hiding information specific
|
|||
|
to that software system in richtext comments. As such
|
|||
|
systems evolve, it is expected that the definition of
|
|||
|
richtext will be further refined by future published
|
|||
|
specifications, but richtext as defined here provides a
|
|||
|
platform on which evolutionary refinements can be based.
|
|||
|
|
|||
|
IMPLEMENTATION NOTE: In some environments, it might be
|
|||
|
impossible to combine certain richtext formatting commands,
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 27]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
whereas in others they might be combined easily. For
|
|||
|
example, the combination of <bold> and <italic> might
|
|||
|
produce bold italics on systems that support such fonts, but
|
|||
|
there exist systems that can make text bold or italicized,
|
|||
|
but not both. In such cases, the most recently issued
|
|||
|
recognized formatting command should be preferred.
|
|||
|
|
|||
|
One of the major goals in the design of richtext was to make
|
|||
|
it so simple that even text-only mailers will implement
|
|||
|
richtext-to-plain-text translators, thus increasing the
|
|||
|
likelihood that multifont text will become "safe" to use
|
|||
|
very widely. To demonstrate this simplicity, an extremely
|
|||
|
simple 35-line C program that converts richtext input into
|
|||
|
plain text output is included in Appendix D.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 28]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7.2 The Multipart Content-Type
|
|||
|
|
|||
|
In the case of multiple part messages, in which one or more
|
|||
|
different sets of data are combined in a single body, a
|
|||
|
"multipart" Content-Type field must appear in the entity's
|
|||
|
header. The body must then contain one or more "body parts,"
|
|||
|
each preceded by an encapsulation boundary, and the last one
|
|||
|
followed by a closing boundary. Each part starts with an
|
|||
|
encapsulation boundary, and then contains a body part
|
|||
|
consisting of header area, a blank line, and a body area.
|
|||
|
Thus a body part is similar to an RFC 822 message in syntax,
|
|||
|
but different in meaning.
|
|||
|
|
|||
|
A body part is NOT to be interpreted as actually being an
|
|||
|
RFC 822 message. To begin with, NO header fields are
|
|||
|
actually required in body parts. A body part that starts
|
|||
|
with a blank line, therefore, is allowed and is a body part
|
|||
|
for which all default values are to be assumed. In such a
|
|||
|
case, the absence of a Content-Type header field implies
|
|||
|
that the encapsulation is plain US-ASCII text. The only
|
|||
|
header fields that have defined meaning for body parts are
|
|||
|
those the names of which begin with "Content-". All other
|
|||
|
header fields are generally to be ignored in body parts.
|
|||
|
Although they should generally be retained in mail
|
|||
|
processing, they may be discarded by gateways if necessary.
|
|||
|
Such other fields are permitted to appear in body parts but
|
|||
|
should not be depended on. "X-" fields may be created for
|
|||
|
experimental or private purposes, with the recognition that
|
|||
|
the information they contain may be lost at some gateways.
|
|||
|
|
|||
|
The distinction between an RFC 822 message and a body part
|
|||
|
is subtle, but important. A gateway between Internet and
|
|||
|
X.400 mail, for example, must be able to tell the difference
|
|||
|
between a body part that contains an image and a body part
|
|||
|
that contains an encapsulated message, the body of which is
|
|||
|
an image. In order to represent the latter, the body part
|
|||
|
must have "Content-Type: message", and its body (after the
|
|||
|
blank line) must be the encapsulated message, with its own
|
|||
|
"Content-Type: image" header field. The use of similar
|
|||
|
syntax facilitates the conversion of messages to body parts,
|
|||
|
and vice versa, but the distinction between the two must be
|
|||
|
understood by implementors. (For the special case in which
|
|||
|
all parts actually are messages, a "digest" subtype is also
|
|||
|
defined.)
|
|||
|
|
|||
|
As stated previously, each body part is preceded by an
|
|||
|
encapsulation boundary. The encapsulation boundary MUST NOT
|
|||
|
appear inside any of the encapsulated parts. Thus, it is
|
|||
|
crucial that the composing agent be able to choose and
|
|||
|
specify the unique boundary that will separate the parts.
|
|||
|
|
|||
|
All present and future subtypes of the "multipart" type must
|
|||
|
use an identical syntax. Subtypes may differ in their
|
|||
|
semantics, and may impose additional restrictions on syntax,
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 29]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
but must conform to the required syntax for the multipart
|
|||
|
type. This requirement ensures that all conformant user
|
|||
|
agents will at least be able to recognize and separate the
|
|||
|
parts of any multipart entity, even of an unrecognized
|
|||
|
subtype.
|
|||
|
|
|||
|
As stated in the definition of the Content-Transfer-Encoding
|
|||
|
field, no encoding other than "7bit", "8bit", or "binary" is
|
|||
|
permitted for entities of type "multipart". The multipart
|
|||
|
delimiters and header fields are always 7-bit ASCII in any
|
|||
|
case, and data within the body parts can be encoded on a
|
|||
|
part-by-part basis, with Content-Transfer-Encoding fields
|
|||
|
for each appropriate body part.
|
|||
|
|
|||
|
Mail gateways, relays, and other mail handling agents are
|
|||
|
commonly known to alter the top-level header of an RFC 822
|
|||
|
message. In particular, they frequently add, remove, or
|
|||
|
reorder header fields. Such alterations are explicitly
|
|||
|
forbidden for the body part headers embedded in the bodies
|
|||
|
of messages of type "multipart."
|
|||
|
|
|||
|
7.2.1 Multipart: The common syntax
|
|||
|
|
|||
|
All subtypes of "multipart" share a common syntax, defined
|
|||
|
in this section. A simple example of a multipart message
|
|||
|
also appears in this section. An example of a more complex
|
|||
|
multipart message is given in Appendix C.
|
|||
|
|
|||
|
The Content-Type field for multipart entities requires one
|
|||
|
parameter, "boundary", which is used to specify the
|
|||
|
encapsulation boundary. The encapsulation boundary is
|
|||
|
defined as a line consisting entirely of two hyphen
|
|||
|
characters ("-", decimal code 45) followed by the boundary
|
|||
|
parameter value from the Content-Type header field.
|
|||
|
|
|||
|
NOTE: The hyphens are for rough compatibility with the
|
|||
|
earlier RFC 934 method of message encapsulation, and for
|
|||
|
ease of searching for the boundaries in some
|
|||
|
implementations. However, it should be noted that multipart
|
|||
|
messages are NOT completely compatible with RFC 934
|
|||
|
encapsulations; in particular, they do not obey RFC 934
|
|||
|
quoting conventions for embedded lines that begin with
|
|||
|
hyphens. This mechanism was chosen over the RFC 934
|
|||
|
mechanism because the latter causes lines to grow with each
|
|||
|
level of quoting. The combination of this growth with the
|
|||
|
fact that SMTP implementations sometimes wrap long lines
|
|||
|
made the RFC 934 mechanism unsuitable for use in the event
|
|||
|
that deeply-nested multipart structuring is ever desired.
|
|||
|
|
|||
|
Thus, a typical multipart Content-Type header field might
|
|||
|
look like this:
|
|||
|
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 30]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
boundary=gc0p4Jq0M2Yt08jU534c0p
|
|||
|
|
|||
|
This indicates that the entity consists of several parts,
|
|||
|
each itself with a structure that is syntactically identical
|
|||
|
to an RFC 822 message, except that the header area might be
|
|||
|
completely empty, and that the parts are each preceded by
|
|||
|
the line
|
|||
|
|
|||
|
--gc0p4Jq0M2Yt08jU534c0p
|
|||
|
|
|||
|
Note that the encapsulation boundary must occur at the
|
|||
|
beginning of a line, i.e., following a CRLF, and that that
|
|||
|
initial CRLF is considered to be part of the encapsulation
|
|||
|
boundary rather than part of the preceding part. The
|
|||
|
boundary must be followed immediately either by another CRLF
|
|||
|
and the header fields for the next part, or by two CRLFs, in
|
|||
|
which case there are no header fields for the next part (and
|
|||
|
it is therefore assumed to be of Content-Type text/plain).
|
|||
|
|
|||
|
NOTE: The CRLF preceding the encapsulation line is
|
|||
|
considered part of the boundary so that it is possible to
|
|||
|
have a part that does not end with a CRLF (line break).
|
|||
|
Body parts that must be considered to end with line breaks,
|
|||
|
therefore, should have two CRLFs preceding the encapsulation
|
|||
|
line, the first of which is part of the preceding body part,
|
|||
|
and the second of which is part of the encapsulation
|
|||
|
boundary.
|
|||
|
|
|||
|
The requirement that the encapsulation boundary begins with
|
|||
|
a CRLF implies that the body of a multipart entity must
|
|||
|
itself begin with a CRLF before the first encapsulation line
|
|||
|
-- that is, if the "preamble" area is not used, the entity
|
|||
|
headers must be followed by TWO CRLFs. This is indeed how
|
|||
|
such entities should be composed. A tolerant mail reading
|
|||
|
program, however, may interpret a body of type multipart
|
|||
|
that begins with an encapsulation line NOT initiated by a
|
|||
|
CRLF as also being an encapsulation boundary, but a
|
|||
|
compliant mail sending program must not generate such
|
|||
|
entities.
|
|||
|
|
|||
|
Encapsulation boundaries must not appear within the
|
|||
|
encapsulations, and must be no longer than 70 characters,
|
|||
|
not counting the two leading hyphens.
|
|||
|
|
|||
|
The encapsulation boundary following the last body part is a
|
|||
|
distinguished delimiter that indicates that no further body
|
|||
|
parts will follow. Such a delimiter is identical to the
|
|||
|
previous delimiters, with the addition of two more hyphens
|
|||
|
at the end of the line:
|
|||
|
|
|||
|
--gc0p4Jq0M2Yt08jU534c0p--
|
|||
|
|
|||
|
There appears to be room for additional information prior to
|
|||
|
the first encapsulation boundary and following the final
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 31]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
boundary. These areas should generally be left blank, and
|
|||
|
implementations should ignore anything that appears before
|
|||
|
the first boundary or after the last one.
|
|||
|
|
|||
|
NOTE: These "preamble" and "epilogue" areas are not used
|
|||
|
because of the lack of proper typing of these parts and the
|
|||
|
lack of clear semantics for handling these areas at
|
|||
|
gateways, particularly X.400 gateways.
|
|||
|
|
|||
|
NOTE: Because encapsulation boundaries must not appear in
|
|||
|
the body parts being encapsulated, a user agent must
|
|||
|
exercise care to choose a unique boundary. The boundary in
|
|||
|
the example above could have been the result of an algorithm
|
|||
|
designed to produce boundaries with a very low probability
|
|||
|
of already existing in the data to be encapsulated without
|
|||
|
having to prescan the data. Alternate algorithms might
|
|||
|
result in more 'readable' boundaries for a recipient with an
|
|||
|
old user agent, but would require more attention to the
|
|||
|
possibility that the boundary might appear in the
|
|||
|
encapsulated part. The simplest boundary possible is
|
|||
|
something like "---", with a closing boundary of "-----".
|
|||
|
|
|||
|
As a very simple example, the following multipart message
|
|||
|
has two parts, both of them plain text, one of them
|
|||
|
explicitly typed and one of them implicitly typed:
|
|||
|
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Subject: Sample message
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: multipart/mixed; boundary="simple
|
|||
|
boundary"
|
|||
|
|
|||
|
This is the preamble. It is to be ignored, though it
|
|||
|
is a handy place for mail composers to include an
|
|||
|
explanatory note to non-MIME compliant readers.
|
|||
|
--simple boundary
|
|||
|
|
|||
|
This is implicitly typed plain ASCII text.
|
|||
|
It does NOT end with a linebreak.
|
|||
|
--simple boundary
|
|||
|
Content-type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
This is explicitly typed plain ASCII text.
|
|||
|
It DOES end with a linebreak.
|
|||
|
|
|||
|
--simple boundary--
|
|||
|
This is the epilogue. It is also to be ignored.
|
|||
|
|
|||
|
The use of a Content-Type of multipart in a body part within
|
|||
|
another multipart entity is explicitly allowed. In such
|
|||
|
cases, for obvious reasons, care must be taken to ensure
|
|||
|
that each nested multipart entity must use a different
|
|||
|
boundary delimiter. See Appendix C for an example of nested
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 32]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
multipart entities.
|
|||
|
|
|||
|
The use of the multipart Content-Type with only a single
|
|||
|
body part may be useful in certain contexts, and is
|
|||
|
explicitly permitted.
|
|||
|
|
|||
|
The only mandatory parameter for the multipart Content-Type
|
|||
|
is the boundary parameter, which consists of 1 to 70
|
|||
|
characters from a set of characters known to be very robust
|
|||
|
through email gateways, and NOT ending with white space.
|
|||
|
(If a boundary appears to end with white space, the white
|
|||
|
space must be presumed to have been added by a gateway, and
|
|||
|
should be deleted.) It is formally specified by the
|
|||
|
following BNF:
|
|||
|
|
|||
|
boundary := 0*69<bchars> bcharsnospace
|
|||
|
|
|||
|
bchars := bcharsnospace / " "
|
|||
|
|
|||
|
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" /
|
|||
|
"_"
|
|||
|
/ "," / "-" / "." / "/" / ":" / "=" / "?"
|
|||
|
|
|||
|
Overall, the body of a multipart entity may be specified as
|
|||
|
follows:
|
|||
|
|
|||
|
multipart-body := preamble 1*encapsulation
|
|||
|
close-delimiter epilogue
|
|||
|
|
|||
|
encapsulation := delimiter CRLF body-part
|
|||
|
|
|||
|
delimiter := CRLF "--" boundary ; taken from Content-Type
|
|||
|
field.
|
|||
|
; when content-type is
|
|||
|
multipart
|
|||
|
; There must be no space
|
|||
|
; between "--" and boundary.
|
|||
|
|
|||
|
close-delimiter := delimiter "--" ; Again, no space before
|
|||
|
"--"
|
|||
|
|
|||
|
preamble := *text ; to be ignored upon
|
|||
|
receipt.
|
|||
|
|
|||
|
epilogue := *text ; to be ignored upon
|
|||
|
receipt.
|
|||
|
|
|||
|
body-part = <"message" as defined in RFC 822,
|
|||
|
with all header fields optional, and with the
|
|||
|
specified delimiter not occurring anywhere in
|
|||
|
the message body, either on a line by itself
|
|||
|
or as a substring anywhere. Note that the
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 33]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
semantics of a part differ from the semantics
|
|||
|
of a message, as described in the text.>
|
|||
|
|
|||
|
NOTE: Conspicuously missing from the multipart type is a
|
|||
|
notion of structured, related body parts. In general, it
|
|||
|
seems premature to try to standardize interpart structure
|
|||
|
yet. It is recommended that those wishing to provide a more
|
|||
|
structured or integrated multipart messaging facility should
|
|||
|
define a subtype of multipart that is syntactically
|
|||
|
identical, but that always expects the inclusion of a
|
|||
|
distinguished part that can be used to specify the structure
|
|||
|
and integration of the other parts, probably referring to
|
|||
|
them by their Content-ID field. If this approach is used,
|
|||
|
other implementations will not recognize the new subtype,
|
|||
|
but will treat it as the primary subtype (multipart/mixed)
|
|||
|
and will thus be able to show the user the parts that are
|
|||
|
recognized.
|
|||
|
|
|||
|
7.2.2 The Multipart/mixed (primary) subtype
|
|||
|
|
|||
|
The primary subtype for multipart, "mixed", is intended for
|
|||
|
use when the body parts are independent and intended to be
|
|||
|
displayed serially. Any multipart subtypes that an
|
|||
|
implementation does not recognize should be treated as being
|
|||
|
of subtype "mixed".
|
|||
|
|
|||
|
7.2.3 The Multipart/alternative subtype
|
|||
|
|
|||
|
The multipart/alternative type is syntactically identical to
|
|||
|
multipart/mixed, but the semantics are different. In
|
|||
|
particular, each of the parts is an "alternative" version of
|
|||
|
the same information. User agents should recognize that the
|
|||
|
content of the various parts are interchangeable. The user
|
|||
|
agent should either choose the "best" type based on the
|
|||
|
user's environment and preferences, or offer the user the
|
|||
|
available alternatives. In general, choosing the best type
|
|||
|
means displaying only the LAST part that can be displayed.
|
|||
|
This may be used, for example, to send mail in a fancy text
|
|||
|
format in such a way that it can easily be displayed
|
|||
|
anywhere:
|
|||
|
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
To: Ned Freed <ned@innosoft.com>
|
|||
|
Subject: Formatted text mail
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-Type: multipart/alternative; boundary=boundary42
|
|||
|
|
|||
|
|
|||
|
--boundary42
|
|||
|
Content-Type: text/plain; charset=us-ascii
|
|||
|
|
|||
|
...plain text version of message goes here....
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 34]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
--boundary42
|
|||
|
Content-Type: text/richtext
|
|||
|
|
|||
|
.... richtext version of same message goes here ...
|
|||
|
--boundary42
|
|||
|
Content-Type: text/x-whatever
|
|||
|
|
|||
|
.... fanciest formatted version of same message goes here
|
|||
|
...
|
|||
|
--boundary42--
|
|||
|
|
|||
|
In this example, users whose mail system understood the
|
|||
|
"text/x-whatever" format would see only the fancy version,
|
|||
|
while other users would see only the richtext or plain text
|
|||
|
version, depending on the capabilities of their system.
|
|||
|
|
|||
|
In general, user agents that compose multipart/alternative
|
|||
|
entities should place the body parts in increasing order of
|
|||
|
preference, that is, with the preferred format last. For
|
|||
|
fancy text, the sending user agent should put the plainest
|
|||
|
format first and the richest format last. Receiving user
|
|||
|
agents should pick and display the last format they are
|
|||
|
capable of displaying. In the case where one of the
|
|||
|
alternatives is itself of type "multipart" and contains
|
|||
|
unrecognized sub-parts, the user agent may choose either to
|
|||
|
show that alternative, an earlier alternative, or both.
|
|||
|
|
|||
|
NOTE: From an implementor's perspective, it might seem more
|
|||
|
sensible to reverse this ordering, and have the plainest
|
|||
|
alternative last. However, placing the plainest alternative
|
|||
|
first is the friendliest possible option when
|
|||
|
mutlipart/alternative entities are viewed using a non-MIME-
|
|||
|
compliant mail reader. While this approach does impose some
|
|||
|
burden on compliant mail readers, interoperability with
|
|||
|
older mail readers was deemed to be more important in this
|
|||
|
case.
|
|||
|
|
|||
|
It may be the case that some user agents, if they can
|
|||
|
recognize more than one of the formats, will prefer to offer
|
|||
|
the user the choice of which format to view. This makes
|
|||
|
sense, for example, if mail includes both a nicely-formatted
|
|||
|
image version and an easily-edited text version. What is
|
|||
|
most critical, however, is that the user not automatically
|
|||
|
be shown multiple versions of the same data. Either the
|
|||
|
user should be shown the last recognized version or should
|
|||
|
explicitly be given the choice.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 35]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7.2.4 The Multipart/digest subtype
|
|||
|
|
|||
|
This document defines a "digest" subtype of the multipart
|
|||
|
Content-Type. This type is syntactically identical to
|
|||
|
multipart/mixed, but the semantics are different. In
|
|||
|
particular, in a digest, the default Content-Type value for
|
|||
|
a body part is changed from "text/plain" to
|
|||
|
"message/rfc822". This is done to allow a more readable
|
|||
|
digest format that is largely compatible (except for the
|
|||
|
quoting convention) with RFC 934.
|
|||
|
|
|||
|
A digest in this format might, then, look something like
|
|||
|
this:
|
|||
|
|
|||
|
From: Moderator-Address
|
|||
|
MIME-Version: 1.0
|
|||
|
Subject: Internet Digest, volume 42
|
|||
|
Content-Type: multipart/digest;
|
|||
|
boundary="---- next message ----"
|
|||
|
|
|||
|
|
|||
|
------ next message ----
|
|||
|
|
|||
|
From: someone-else
|
|||
|
Subject: my opinion
|
|||
|
|
|||
|
...body goes here ...
|
|||
|
|
|||
|
------ next message ----
|
|||
|
|
|||
|
From: someone-else-again
|
|||
|
Subject: my different opinion
|
|||
|
|
|||
|
... another body goes here...
|
|||
|
|
|||
|
------ next message ------
|
|||
|
|
|||
|
7.2.5 The Multipart/parallel subtype
|
|||
|
|
|||
|
This document defines a "parallel" subtype of the multipart
|
|||
|
Content-Type. This type is syntactically identical to
|
|||
|
multipart/mixed, but the semantics are different. In
|
|||
|
particular, in a parallel entity, all of the parts are
|
|||
|
intended to be presented in parallel, i.e., simultaneously,
|
|||
|
on hardware and software that are capable of doing so.
|
|||
|
Composing agents should be aware that many mail readers will
|
|||
|
lack this capability and will show the parts serially in any
|
|||
|
event.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 36]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7.3 The Message Content-Type
|
|||
|
|
|||
|
It is frequently desirable, in sending mail, to encapsulate
|
|||
|
another mail message. For this common operation, a special
|
|||
|
Content-Type, "message", is defined. The primary subtype,
|
|||
|
message/rfc822, has no required parameters in the Content-
|
|||
|
Type field. Additional subtypes, "partial" and "External-
|
|||
|
body", do have required parameters. These subtypes are
|
|||
|
explained below.
|
|||
|
|
|||
|
NOTE: It has been suggested that subtypes of message might
|
|||
|
be defined for forwarded or rejected messages. However,
|
|||
|
forwarded and rejected messages can be handled as multipart
|
|||
|
messages in which the first part contains any control or
|
|||
|
descriptive information, and a second part, of type
|
|||
|
message/rfc822, is the forwarded or rejected message.
|
|||
|
Composing rejection and forwarding messages in this manner
|
|||
|
will preserve the type information on the original message
|
|||
|
and allow it to be correctly presented to the recipient, and
|
|||
|
hence is strongly encouraged.
|
|||
|
|
|||
|
As stated in the definition of the Content-Transfer-Encoding
|
|||
|
field, no encoding other than "7bit", "8bit", or "binary" is
|
|||
|
permitted for messages or parts of type "message". The
|
|||
|
message header fields are always US-ASCII in any case, and
|
|||
|
data within the body can still be encoded, in which case the
|
|||
|
Content-Transfer-Encoding header field in the encapsulated
|
|||
|
message will reflect this. Non-ASCII text in the headers of
|
|||
|
an encapsulated message can be specified using the
|
|||
|
mechanisms described in [RFC-1342].
|
|||
|
|
|||
|
Mail gateways, relays, and other mail handling agents are
|
|||
|
commonly known to alter the top-level header of an RFC 822
|
|||
|
message. In particular, they frequently add, remove, or
|
|||
|
reorder header fields. Such alterations are explicitly
|
|||
|
forbidden for the encapsulated headers embedded in the
|
|||
|
bodies of messages of type "message."
|
|||
|
|
|||
|
7.3.1 The Message/rfc822 (primary) subtype
|
|||
|
|
|||
|
A Content-Type of "message/rfc822" indicates that the body
|
|||
|
contains an encapsulated message, with the syntax of an RFC
|
|||
|
822 message.
|
|||
|
|
|||
|
7.3.2 The Message/Partial subtype
|
|||
|
|
|||
|
A subtype of message, "partial", is defined in order to
|
|||
|
allow large objects to be delivered as several separate
|
|||
|
pieces of mail and automatically reassembled by the
|
|||
|
receiving user agent. (The concept is similar to IP
|
|||
|
fragmentation/reassembly in the basic Internet Protocols.)
|
|||
|
This mechanism can be used when intermediate transport
|
|||
|
agents limit the size of individual messages that can be
|
|||
|
sent. Content-Type "message/partial" thus indicates that
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 37]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
the body contains a fragment of a larger message.
|
|||
|
|
|||
|
Three parameters must be specified in the Content-Type field
|
|||
|
of type message/partial: The first, "id", is a unique
|
|||
|
identifier, as close to a world-unique identifier as
|
|||
|
possible, to be used to match the parts together. (In
|
|||
|
general, the identifier is essentially a message-id; if
|
|||
|
placed in double quotes, it can be any message-id, in
|
|||
|
accordance with the BNF for "parameter" given earlier in
|
|||
|
this specification.) The second, "number", an integer, is
|
|||
|
the part number, which indicates where this part fits into
|
|||
|
the sequence of fragments. The third, "total", another
|
|||
|
integer, is the total number of parts. This third subfield
|
|||
|
is required on the final part, and is optional on the
|
|||
|
earlier parts. Note also that these parameters may be given
|
|||
|
in any order.
|
|||
|
|
|||
|
Thus, part 2 of a 3-part message may have either of the
|
|||
|
following header fields:
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
number=2; total=3;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
|
|||
|
number=2
|
|||
|
|
|||
|
But part 3 MUST specify the total number of parts:
|
|||
|
|
|||
|
Content-Type: Message/Partial;
|
|||
|
number=3; total=3;
|
|||
|
id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
|
|||
|
|
|||
|
Note that part numbering begins with 1, not 0.
|
|||
|
|
|||
|
When the parts of a message broken up in this manner are put
|
|||
|
together, the result is a complete RFC 822 format message,
|
|||
|
which may have its own Content-Type header field, and thus
|
|||
|
may contain any other data type.
|
|||
|
|
|||
|
Message fragmentation and reassembly: The semantics of a
|
|||
|
reassembled partial message must be those of the "inner"
|
|||
|
message, rather than of a message containing the inner
|
|||
|
message. This makes it possible, for example, to send a
|
|||
|
large audio message as several partial messages, and still
|
|||
|
have it appear to the recipient as a simple audio message
|
|||
|
rather than as an encapsulated message containing an audio
|
|||
|
message. That is, the encapsulation of the message is
|
|||
|
considered to be "transparent".
|
|||
|
|
|||
|
When generating and reassembling the parts of a
|
|||
|
message/partial message, the headers of the encapsulated
|
|||
|
message must be merged with the headers of the enclosing
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 38]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
entities. In this process the following rules must be
|
|||
|
observed:
|
|||
|
|
|||
|
(1) All of the headers from the initial enclosing
|
|||
|
entity (part one), except those that start with
|
|||
|
"Content-" and "Message-ID", must be copied, in
|
|||
|
order, to the new message.
|
|||
|
|
|||
|
(2) Only those headers in the enclosed message
|
|||
|
which start with "Content-" and "Message-ID" must
|
|||
|
be appended, in order, to the headers of the new
|
|||
|
message. Any headers in the enclosed message
|
|||
|
which do not start with "Content-" (except for
|
|||
|
"Message-ID") will be ignored.
|
|||
|
|
|||
|
(3) All of the headers from the second and any
|
|||
|
subsequent messages will be ignored.
|
|||
|
|
|||
|
For example, if an audio message is broken into two parts,
|
|||
|
the first part might look something like this:
|
|||
|
|
|||
|
X-Weird-Header-1: Foo
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Subject: Audio mail
|
|||
|
Message-ID: id1@host.com
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: message/partial;
|
|||
|
id="ABC@host.com";
|
|||
|
number=1; total=2
|
|||
|
|
|||
|
X-Weird-Header-1: Bar
|
|||
|
X-Weird-Header-2: Hello
|
|||
|
Message-ID: anotherid@foo.com
|
|||
|
Content-type: audio/basic
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
... first half of encoded audio data goes here...
|
|||
|
|
|||
|
and the second half might look something like this:
|
|||
|
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Subject: Audio mail
|
|||
|
MIME-Version: 1.0
|
|||
|
Message-ID: id2@host.com
|
|||
|
Content-type: message/partial;
|
|||
|
id="ABC@host.com"; number=2; total=2
|
|||
|
|
|||
|
... second half of encoded audio data goes here...
|
|||
|
|
|||
|
Then, when the fragmented message is reassembled, the
|
|||
|
resulting message to be displayed to the user should look
|
|||
|
something like this:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 39]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
X-Weird-Header-1: Foo
|
|||
|
From: Bill@host.com
|
|||
|
To: joe@otherhost.com
|
|||
|
Subject: Audio mail
|
|||
|
Message-ID: anotherid@foo.com
|
|||
|
MIME-Version: 1.0
|
|||
|
Content-type: audio/basic
|
|||
|
Content-transfer-encoding: base64
|
|||
|
|
|||
|
... first half of encoded audio data goes here...
|
|||
|
... second half of encoded audio data goes here...
|
|||
|
|
|||
|
It should be noted that, because some message transfer
|
|||
|
agents may choose to automatically fragment large messages,
|
|||
|
and because such agents may use different fragmentation
|
|||
|
thresholds, it is possible that the pieces of a partial
|
|||
|
message, upon reassembly, may prove themselves to comprise a
|
|||
|
partial message. This is explicitly permitted.
|
|||
|
|
|||
|
It should also be noted that the inclusion of a "References"
|
|||
|
field in the headers of the second and subsequent pieces of
|
|||
|
a fragmented message that references the Message-Id on the
|
|||
|
previous piece may be of benefit to mail readers that
|
|||
|
understand and track references. However, the generation of
|
|||
|
such "References" fields is entirely optional.
|
|||
|
|
|||
|
7.3.3 The Message/External-Body subtype
|
|||
|
|
|||
|
The external-body subtype indicates that the actual body
|
|||
|
data are not included, but merely referenced. In this case,
|
|||
|
the parameters describe a mechanism for accessing the
|
|||
|
external data.
|
|||
|
|
|||
|
When a message body or body part is of type
|
|||
|
"message/external-body", it consists of a header, two
|
|||
|
consecutive CRLFs, and the message header for the
|
|||
|
encapsulated message. If another pair of consecutive CRLFs
|
|||
|
appears, this of course ends the message header for the
|
|||
|
encapsulated message. However, since the encapsulated
|
|||
|
message's body is itself external, it does NOT appear in the
|
|||
|
area that follows. For example, consider the following
|
|||
|
message:
|
|||
|
|
|||
|
Content-type: message/external-body; access-
|
|||
|
type=local-file;
|
|||
|
name=/u/nsb/Me.gif
|
|||
|
|
|||
|
Content-type: image/gif
|
|||
|
|
|||
|
THIS IS NOT REALLY THE BODY!
|
|||
|
|
|||
|
The area at the end, which might be called the "phantom
|
|||
|
body", is ignored for most external-body messages. However,
|
|||
|
it may be used to contain auxilliary information for some
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 40]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
such messages, as indeed it is when the access-type is
|
|||
|
"mail-server". Of the access-types defined by this
|
|||
|
document, the phantom body is used only when the access-type
|
|||
|
is "mail-server". In all other cases, the phantom body is
|
|||
|
ignored.
|
|||
|
|
|||
|
The only always-mandatory parameter for message/external-
|
|||
|
body is "access-type"; all of the other parameters may be
|
|||
|
mandatory or optional depending on the value of access-type.
|
|||
|
|
|||
|
ACCESS-TYPE -- One or more case-insensitive words,
|
|||
|
comma-separated, indicating supported access
|
|||
|
mechanisms by which the file or data may be
|
|||
|
obtained. Values include, but are not limited to,
|
|||
|
"FTP", "ANON-FTP", "TFTP", "AFS", "LOCAL-FILE",
|
|||
|
and "MAIL-SERVER". Future values, except for
|
|||
|
experimental values beginning with "X-", must be
|
|||
|
registered with IANA, as described in Appendix F .
|
|||
|
|
|||
|
In addition, the following two parameters are optional for
|
|||
|
ALL access-types:
|
|||
|
|
|||
|
EXPIRATION -- The date (in the RFC 822 "date-time"
|
|||
|
syntax, as extended by RFC 1123 to permit 4 digits
|
|||
|
in the date field) after which the existence of
|
|||
|
the external data is not guaranteed.
|
|||
|
|
|||
|
SIZE -- The size (in octets) of the data. The
|
|||
|
intent of this parameter is to help the recipient
|
|||
|
decide whether or not to expend the necessary
|
|||
|
resources to retrieve the external data.
|
|||
|
|
|||
|
PERMISSION -- A field that indicates whether or
|
|||
|
not it is expected that clients might also attempt
|
|||
|
to overwrite the data. By default, or if
|
|||
|
permission is "read", the assumption is that they
|
|||
|
are not, and that if the data is retrieved once,
|
|||
|
it is never needed again. If PERMISSION is "read-
|
|||
|
write", this assumption is invalid, and any local
|
|||
|
copy must be considered no more than a cache.
|
|||
|
"Read" and "Read-write" are the only defined
|
|||
|
values of permission.
|
|||
|
|
|||
|
The precise semantics of the access-types defined here are
|
|||
|
described in the sections that follow.
|
|||
|
|
|||
|
7.3.3.1 The "ftp" and "tftp" access-types
|
|||
|
|
|||
|
An access-type of FTP or TFTP indicates that the message
|
|||
|
body is accessible as a file using the FTP [RFC-959] or TFTP
|
|||
|
[RFC-783] protocols, respectively. For these access-types,
|
|||
|
the following additional parameters are mandatory:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 41]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
NAME -- The name of the file that contains the
|
|||
|
actual body data.
|
|||
|
|
|||
|
SITE -- A machine from which the file may be
|
|||
|
obtained, using the given protocol
|
|||
|
|
|||
|
Before the data is retrieved, using these protocols, the
|
|||
|
user will generally need to be asked to provide a login id
|
|||
|
and a password for the machine named by the site parameter.
|
|||
|
|
|||
|
In addition, the following optional parameters may also
|
|||
|
appear when the access-type is FTP or ANON-FTP:
|
|||
|
|
|||
|
DIRECTORY -- A directory from which the data named
|
|||
|
by NAME should be retrieved.
|
|||
|
|
|||
|
MODE -- A transfer mode for retrieving the
|
|||
|
information, e.g. "image".
|
|||
|
|
|||
|
7.3.3.2 The "anon-ftp" access-type
|
|||
|
|
|||
|
The "anon-ftp" access-type is identical to the "ftp" access
|
|||
|
type, except that the user need not be asked to provide a
|
|||
|
name and password for the specified site. Instead, the ftp
|
|||
|
protocol will be used with login "anonymous" and a password
|
|||
|
that corresponds to the user's email address.
|
|||
|
|
|||
|
7.3.3.3 The "local-file" and "afs" access-types
|
|||
|
|
|||
|
An access-type of "local-file" indicates that the actual
|
|||
|
body is accessible as a file on the local machine. An
|
|||
|
access-type of "afs" indicates that the file is accessible
|
|||
|
via the global AFS file system. In both cases, only a
|
|||
|
single parameter is required:
|
|||
|
|
|||
|
NAME -- The name of the file that contains the
|
|||
|
actual body data.
|
|||
|
|
|||
|
The following optional parameter may be used to describe the
|
|||
|
locality of reference for the data, that is, the site or
|
|||
|
sites at which the file is expected to be visible:
|
|||
|
|
|||
|
SITE -- A domain specifier for a machine or set of
|
|||
|
machines that are known to have access to the data
|
|||
|
file. Asterisks may be used for wildcard matching
|
|||
|
to a part of a domain name, such as
|
|||
|
"*.bellcore.com", to indicate a set of machines on
|
|||
|
which the data should be directly visible, while a
|
|||
|
single asterisk may be used to indicate a file
|
|||
|
that is expected to be universally available,
|
|||
|
e.g., via a global file system.
|
|||
|
|
|||
|
7.3.3.4 The "mail-server" access-type
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 42]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
The "mail-server" access-type indicates that the actual body
|
|||
|
is available from a mail server. The mandatory parameter
|
|||
|
for this access-type is:
|
|||
|
|
|||
|
SERVER -- The email address of the mail server
|
|||
|
from which the actual body data can be obtained.
|
|||
|
|
|||
|
Because mail servers accept a variety of syntax, some of
|
|||
|
which is multiline, the full command to be sent to a mail
|
|||
|
server is not included as a parameter on the content-type
|
|||
|
line. Instead, it may be provided as the "phantom body"
|
|||
|
when the content-type is message/external-body and the
|
|||
|
access-type is mail-server.
|
|||
|
|
|||
|
Note that MIME does not define a mail server syntax.
|
|||
|
Rather, it allows the inclusion of arbitrary mail server
|
|||
|
commands in the phantom body. Implementations should
|
|||
|
include the phantom body in the body of the message it sends
|
|||
|
to the mail server address to retrieve the relevant data.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 43]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7.3.3.5 Examples and Further Explanations
|
|||
|
|
|||
|
With the emerging possibility of very wide-area file
|
|||
|
systems, it becomes very hard to know in advance the set of
|
|||
|
machines where a file will and will not be accessible
|
|||
|
directly from the file system. Therefore it may make sense
|
|||
|
to provide both a file name, to be tried directly, and the
|
|||
|
name of one or more sites from which the file is known to be
|
|||
|
accessible. An implementation can try to retrieve remote
|
|||
|
files using FTP or any other protocol, using anonymous file
|
|||
|
retrieval or prompting the user for the necessary name and
|
|||
|
password. If an external body is accessible via multiple
|
|||
|
mechanisms, the sender may include multiple parts of type
|
|||
|
message/external-body within an entity of type
|
|||
|
multipart/alternative.
|
|||
|
|
|||
|
However, the external-body mechanism is not intended to be
|
|||
|
limited to file retrieval, as shown by the mail-server
|
|||
|
access-type. Beyond this, one can imagine, for example,
|
|||
|
using a video server for external references to video clips.
|
|||
|
|
|||
|
If an entity is of type "message/external-body", then the
|
|||
|
body of the entity will contain the header fields of the
|
|||
|
encapsulated message. The body itself is to be found in the
|
|||
|
external location. This means that if the body of the
|
|||
|
"message/external-body" message contains two consecutive
|
|||
|
CRLFs, everything after those pairs is NOT part of the
|
|||
|
message itself. For most message/external-body messages,
|
|||
|
this trailing area must simply be ignored. However, it is a
|
|||
|
convenient place for additional data that cannot be included
|
|||
|
in the content-type header field. In particular, if the
|
|||
|
"access-type" value is "mail-server", then the trailing area
|
|||
|
must contain commands to be sent to the mail server at the
|
|||
|
address given by NAME@SITE, where NAME and SITE are the
|
|||
|
values of the NAME and SITE parameters, respectively.
|
|||
|
|
|||
|
The embedded message header fields which appear in the body
|
|||
|
of the message/external-body data can be used to declare the
|
|||
|
Content-type of the external body. Thus a complete
|
|||
|
message/external-body message, referring to a document in
|
|||
|
PostScript format, might look like this:
|
|||
|
|
|||
|
From: Whomever
|
|||
|
Subject: whatever
|
|||
|
MIME-Version: 1.0
|
|||
|
Message-ID: id1@host.com
|
|||
|
Content-Type: multipart/alternative; boundary=42
|
|||
|
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
name="BodyFormats.ps";
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 44]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
site="thumper.bellcore.com";
|
|||
|
access-type=ANON-FTP;
|
|||
|
directory="pub";
|
|||
|
mode="image";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
name="/u/nsb/writing/rfcs/RFC-XXXX.ps";
|
|||
|
site="thumper.bellcore.com";
|
|||
|
access-type=AFS
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
|
|||
|
--42
|
|||
|
Content-Type: message/external-body;
|
|||
|
access-type=mail-server
|
|||
|
server="listserv@bogus.bitnet";
|
|||
|
expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
|
|||
|
|
|||
|
Content-type: application/postscript
|
|||
|
|
|||
|
get rfc-xxxx doc
|
|||
|
|
|||
|
--42--
|
|||
|
|
|||
|
Like the message/partial type, the message/external-body
|
|||
|
type is intended to be transparent, that is, to convey the
|
|||
|
data type in the external body rather than to convey a
|
|||
|
message with a body of that type. Thus the headers on the
|
|||
|
outer and inner parts must be merged using the same rules as
|
|||
|
for message/partial. In particular, this means that the
|
|||
|
Content-type header is overridden, but the From and Subject
|
|||
|
headers are preserved.
|
|||
|
|
|||
|
Note that since the external bodies are not transported as
|
|||
|
mail, they need not conform to the 7-bit and line length
|
|||
|
requirements, but might in fact be binary files. Thus a
|
|||
|
Content-Transfer-Encoding is not generally necessary, though
|
|||
|
it is permitted.
|
|||
|
|
|||
|
Note that the body of a message of type "message/external-
|
|||
|
body" is governed by the basic syntax for an RFC 822
|
|||
|
message. In particular, anything before the first
|
|||
|
consecutive pair of CRLFs is header information, while
|
|||
|
anything after it is body information, which is ignored for
|
|||
|
most access-types.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 45]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7.4 The Application Content-Type
|
|||
|
|
|||
|
The "application" Content-Type is to be used for data which
|
|||
|
do not fit in any of the other categories, and particularly
|
|||
|
for data to be processed by mail-based uses of application
|
|||
|
programs. This is information which must be processed by an
|
|||
|
application before it is viewable or usable to a user.
|
|||
|
Expected uses for Content-Type application include mail-
|
|||
|
based file transfer, spreadsheets, data for mail-based
|
|||
|
scheduling systems, and languages for "active"
|
|||
|
(computational) email. (The latter, in particular, can pose
|
|||
|
security problems which should be understood by
|
|||
|
implementors, and are considered in detail in the discussion
|
|||
|
of the application/PostScript content-type.)
|
|||
|
|
|||
|
For example, a meeting scheduler might define a standard
|
|||
|
representation for information about proposed meeting dates.
|
|||
|
An intelligent user agent would use this information to
|
|||
|
conduct a dialog with the user, and might then send further
|
|||
|
mail based on that dialog. More generally, there have been
|
|||
|
several "active" messaging languages developed in which
|
|||
|
programs in a suitably specialized language are sent through
|
|||
|
the mail and automatically run in the recipient's
|
|||
|
environment.
|
|||
|
|
|||
|
Such applications may be defined as subtypes of the
|
|||
|
"application" Content-Type. This document defines three
|
|||
|
subtypes: octet-stream, ODA, and PostScript.
|
|||
|
|
|||
|
In general, the subtype of application will often be the
|
|||
|
name of the application for which the data are intended.
|
|||
|
This does not mean, however, that any application program
|
|||
|
name may be used freely as a subtype of application. Such
|
|||
|
usages must be registered with IANA, as described in
|
|||
|
Appendix F.
|
|||
|
|
|||
|
7.4.1 The Application/Octet-Stream (primary) subtype
|
|||
|
|
|||
|
The primary subtype of application, "octet-stream", may be
|
|||
|
used to indicate that a body contains binary data. The set
|
|||
|
of possible parameters includes, but is not limited to:
|
|||
|
|
|||
|
NAME -- a suggested name for the binary data if
|
|||
|
stored as a file.
|
|||
|
|
|||
|
TYPE -- the general type or category of binary
|
|||
|
data. This is intended as information for the
|
|||
|
human recipient rather than for any automatic
|
|||
|
processing.
|
|||
|
|
|||
|
CONVERSIONS -- the set of operations that have
|
|||
|
been performed on the data before putting it in
|
|||
|
the mail (and before any Content-Transfer-Encoding
|
|||
|
that might have been applied). If multiple
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 46]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
conversions have occurred, they must be separated
|
|||
|
by commas and specified in the order they were
|
|||
|
applied -- that is, the leftmost conversion must
|
|||
|
have occurred first, and conversions are undone
|
|||
|
from right to left. Note that NO conversion
|
|||
|
values are defined by this document. Any
|
|||
|
conversion values that that do not begin with "X-"
|
|||
|
must be preceded by a published specification and
|
|||
|
by registration with IANA, as described in
|
|||
|
Appendix F.
|
|||
|
|
|||
|
PADDING -- the number of bits of padding that were
|
|||
|
appended to the bitstream comprising the actual
|
|||
|
contents to produce the enclosed byte-oriented
|
|||
|
data. This is useful for enclosing a bitstream in
|
|||
|
a body when the total number of bits is not a
|
|||
|
multiple of the byte size.
|
|||
|
|
|||
|
The values for these attributes are left undefined at
|
|||
|
present, but may require specification in the future. An
|
|||
|
example of a common (though UNIX-specific) usage might be:
|
|||
|
|
|||
|
Content-Type: application/octet-stream;
|
|||
|
name=foo.tar.Z; type=tar;
|
|||
|
conversions="x-encrypt,x-compress"
|
|||
|
|
|||
|
However, it should be noted that the use of such conversions
|
|||
|
is explicitly discouraged due to a lack of portability and
|
|||
|
standardization. The use of uuencode is particularly
|
|||
|
discouraged, in favor of the Content-Transfer-Encoding
|
|||
|
mechanism, which is both more standardized and more portable
|
|||
|
across mail boundaries.
|
|||
|
|
|||
|
The recommended action for an implementation that receives
|
|||
|
application/octet-stream mail is to simply offer to put the
|
|||
|
data in a file, with any Content-Transfer-Encoding undone,
|
|||
|
or perhaps to use it as input to a user-specified process.
|
|||
|
|
|||
|
To reduce the danger of transmitting rogue programs through
|
|||
|
the mail, it is strongly recommended that implementations
|
|||
|
NOT implement a path-search mechanism whereby an arbitrary
|
|||
|
program named in the Content-Type parameter (e.g., an
|
|||
|
"interpreter=" parameter) is found and executed using the
|
|||
|
mail body as input.
|
|||
|
|
|||
|
7.4.2 The Application/PostScript subtype
|
|||
|
|
|||
|
A Content-Type of "application/postscript" indicates a
|
|||
|
PostScript program. The language is defined in
|
|||
|
[POSTSCRIPT]. It is recommended that Postscript as sent
|
|||
|
through email should use Postscript document structuring
|
|||
|
conventions if at all possible, and correctly.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 47]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
The execution of general-purpose PostScript interpreters
|
|||
|
entails serious security risks, and implementors are
|
|||
|
discouraged from simply sending PostScript email bodies to
|
|||
|
"off-the-shelf" interpreters. While it is usually safe to
|
|||
|
send PostScript to a printer, where the potential for harm
|
|||
|
is greatly constrained, implementors should consider all of
|
|||
|
the following before they add interactive display of
|
|||
|
PostScript bodies to their mail readers.
|
|||
|
|
|||
|
The remainder of this section outlines some, though probably
|
|||
|
not all, of the possible problems with sending PostScript
|
|||
|
through the mail.
|
|||
|
|
|||
|
Dangerous operations in the PostScript language include, but
|
|||
|
may not be limited to, the PostScript operators deletefile,
|
|||
|
renamefile, filenameforall, and file. File is only
|
|||
|
dangerous when applied to something other than standard
|
|||
|
input or output. Implementations may also define additional
|
|||
|
nonstandard file operators; these may also pose a threat to
|
|||
|
security. Filenameforall, the wildcard file search
|
|||
|
operator, may appear at first glance to be harmless. Note,
|
|||
|
however, that this operator has the potential to reveal
|
|||
|
information about what files the recipient has access to,
|
|||
|
and this information may itself be sensitive. Message
|
|||
|
senders should avoid the use of potentially dangerous file
|
|||
|
operators, since these operators are quite likely to be
|
|||
|
unavailable in secure PostScript implementations. Message-
|
|||
|
receiving and -displaying software should either completely
|
|||
|
disable all potentially dangerous file operators or take
|
|||
|
special care not to delegate any special authority to their
|
|||
|
operation. These operators should be viewed as being done by
|
|||
|
an outside agency when interpreting PostScript documents.
|
|||
|
Such disabling and/or checking should be done completely
|
|||
|
outside of the reach of the PostScript language itself; care
|
|||
|
should be taken to insure that no method exists for
|
|||
|
reenabling full-function versions of these operators.
|
|||
|
|
|||
|
The PostScript language provides facilities for exiting the
|
|||
|
normal interpreter, or server, loop. Changes made in this
|
|||
|
"outer" environment are customarily retained across
|
|||
|
documents, and may in some cases be retained semipermanently
|
|||
|
in nonvolatile memory. The operators associated with exiting
|
|||
|
the interpreter loop have the potential to interfere with
|
|||
|
subsequent document processing. As such, their unrestrained
|
|||
|
use constitutes a threat of service denial. PostScript
|
|||
|
operators that exit the interpreter loop include, but may
|
|||
|
not be limited to, the exitserver and startjob operators.
|
|||
|
Message-sending software should not generate PostScript that
|
|||
|
depends on exiting the interpreter loop to operate. The
|
|||
|
ability to exit will probably be unavailable in secure
|
|||
|
PostScript implementations. Message-receiving and
|
|||
|
-displaying software should, if possible, disable the
|
|||
|
ability to make retained changes to the PostScript
|
|||
|
environment. Eliminate the startjob and exitserver commands.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 48]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
If these commands cannot be eliminated, at least set the
|
|||
|
password associated with them to a hard-to-guess value.
|
|||
|
|
|||
|
PostScript provides operators for setting system-wide and
|
|||
|
device-specific parameters. These parameter settings may be
|
|||
|
retained across jobs and may potentially pose a threat to
|
|||
|
the correct operation of the interpreter. The PostScript
|
|||
|
operators that set system and device parameters include, but
|
|||
|
may not be limited to, the setsystemparams and setdevparams
|
|||
|
operators. Message-sending software should not generate
|
|||
|
PostScript that depends on the setting of system or device
|
|||
|
parameters to operate correctly. The ability to set these
|
|||
|
parameters will probably be unavailable in secure PostScript
|
|||
|
implementations. Message-receiving and -displaying software
|
|||
|
should, if possible, disable the ability to change system
|
|||
|
and device parameters. If these operators cannot be
|
|||
|
disabled, at least set the password associated with them to
|
|||
|
a hard-to-guess value.
|
|||
|
|
|||
|
Some PostScript implementations provide nonstandard
|
|||
|
facilities for the direct loading and execution of machine
|
|||
|
code. Such facilities are quite obviously open to
|
|||
|
substantial abuse. Message-sending software should not
|
|||
|
make use of such features. Besides being totally hardware-
|
|||
|
specific, they are also likely to be unavailable in secure
|
|||
|
implementations of PostScript. Message-receiving and
|
|||
|
-displaying software should not allow such operators to be
|
|||
|
used if they exist.
|
|||
|
|
|||
|
PostScript is an extensible language, and many, if not most,
|
|||
|
implementations of it provide a number of their own
|
|||
|
extensions. This document does not deal with such extensions
|
|||
|
explicitly since they constitute an unknown factor.
|
|||
|
Message-sending software should not make use of nonstandard
|
|||
|
extensions; they are likely to be missing from some
|
|||
|
implementations. Message-receiving and -displaying software
|
|||
|
should make sure that any nonstandard PostScript operators
|
|||
|
are secure and don't present any kind of threat.
|
|||
|
|
|||
|
It is possible to write PostScript that consumes huge
|
|||
|
amounts of various system resources. It is also possible to
|
|||
|
write PostScript programs that loop infinitely. Both types
|
|||
|
of programs have the potential to cause damage if sent to
|
|||
|
unsuspecting recipients. Message-sending software should
|
|||
|
avoid the construction and dissemination of such programs,
|
|||
|
which is antisocial. Message-receiving and -displaying
|
|||
|
software should provide appropriate mechanisms to abort
|
|||
|
processing of a document after a reasonable amount of time
|
|||
|
has elapsed. In addition, PostScript interpreters should be
|
|||
|
limited to the consumption of only a reasonable amount of
|
|||
|
any given system resource.
|
|||
|
|
|||
|
Finally, bugs may exist in some PostScript interpreters
|
|||
|
which could possibly be exploited to gain unauthorized
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 49]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
access to a recipient's system. Apart from noting this
|
|||
|
possibility, there is no specific action to take to prevent
|
|||
|
this, apart from the timely correction of such bugs if any
|
|||
|
are found.
|
|||
|
|
|||
|
7.4.3 The Application/ODA subtype
|
|||
|
|
|||
|
The "ODA" subtype of application is used to indicate that a
|
|||
|
body contains information encoded according to the Office
|
|||
|
Document Architecture [ODA] standards, using the ODIF
|
|||
|
representation format. For application/oda, the Content-
|
|||
|
Type line should also specify an attribute/value pair that
|
|||
|
indicates the document application profile (DAP), using the
|
|||
|
key word "profile". Thus an appropriate header field might
|
|||
|
look like this:
|
|||
|
|
|||
|
Content-Type: application/oda; profile=Q112
|
|||
|
|
|||
|
Consult the ODA standard [ODA] for further information.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 50]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
7.5 The Image Content-Type
|
|||
|
|
|||
|
A Content-Type of "image" indicates that the bodycontains an
|
|||
|
image. The subtype names the specific image format. These
|
|||
|
names are case insensitive. Two initial subtypes are "jpeg"
|
|||
|
for the JPEG format, JFIF encoding, and "gif" for GIF format
|
|||
|
[GIF].
|
|||
|
|
|||
|
The list of image subtypes given here is neither exclusive
|
|||
|
nor exhaustive, and is expected to grow as more types are
|
|||
|
registered with IANA, as described in Appendix F.
|
|||
|
|
|||
|
7.6 The Audio Content-Type
|
|||
|
|
|||
|
A Content-Type of "audio" indicates that the body contains
|
|||
|
audio data. Although there is not yet a consensus on an
|
|||
|
"ideal" audio format for use with computers, there is a
|
|||
|
pressing need for a format capable of providing
|
|||
|
interoperable behavior.
|
|||
|
|
|||
|
The initial subtype of "basic" is specified to meet this
|
|||
|
requirement by providing an absolutely minimal lowest common
|
|||
|
denominator audio format. It is expected that richer
|
|||
|
formats for higher quality and/or lower bandwidth audio will
|
|||
|
be defined by a later document.
|
|||
|
|
|||
|
The content of the "audio/basic" subtype is audio encoded
|
|||
|
using 8-bit ISDN u-law [PCM]. When this subtype is present,
|
|||
|
a sample rate of 8000 Hz and a single channel is assumed.
|
|||
|
|
|||
|
7.7 The Video Content-Type
|
|||
|
|
|||
|
A Content-Type of "video" indicates that the body contains a
|
|||
|
time-varying-picture image, possibly with color and
|
|||
|
coordinated sound. The term "video" is used extremely
|
|||
|
generically, rather than with reference to any particular
|
|||
|
technology or format, and is not meant to preclude subtypes
|
|||
|
such as animated drawings encoded compactly. The subtype
|
|||
|
"mpeg" refers to video coded according to the MPEG standard
|
|||
|
[MPEG].
|
|||
|
|
|||
|
Note that although in general this document strongly
|
|||
|
discourages the mixing of multiple media in a single body,
|
|||
|
it is recognized that many so-called "video" formats include
|
|||
|
a representation for synchronized audio, and this is
|
|||
|
explicitly permitted for subtypes of "video".
|
|||
|
|
|||
|
7.8 Experimental Content-Type Values
|
|||
|
|
|||
|
A Content-Type value beginning with the characters "X-" is a
|
|||
|
private value, to be used by consenting mail systems by
|
|||
|
mutual agreement. Any format without a rigorous and public
|
|||
|
definition must be named with an "X-" prefix, and publicly
|
|||
|
specified values shall never begin with "X-". (Older
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 51]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
versions of the widely-used Andrew system use the "X-BE2"
|
|||
|
name, so new systems should probably choose a different
|
|||
|
name.)
|
|||
|
|
|||
|
In general, the use of "X-" top-level types is strongly
|
|||
|
discouraged. Implementors should invent subtypes of the
|
|||
|
existing types whenever possible. The invention of new
|
|||
|
types is intended to be restricted primarily to the
|
|||
|
development of new media types for email, such as digital
|
|||
|
odors or holography, and not for new data formats in
|
|||
|
general. In many cases, a subtype of application will be
|
|||
|
more appropriate than a new top-level type.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 52]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Summary
|
|||
|
|
|||
|
Using the MIME-Version, Content-Type, and Content-Transfer-
|
|||
|
Encoding header fields, it is possible to include, in a
|
|||
|
standardized way, arbitrary types of data objects with RFC
|
|||
|
822 conformant mail messages. No restrictions imposed by
|
|||
|
either RFC 821 or RFC 822 are violated, and care has been
|
|||
|
taken to avoid problems caused by additional restrictions
|
|||
|
imposed by the characteristics of some Internet mail
|
|||
|
transport mechanisms (see Appendix B). The "multipart" and
|
|||
|
"message" Content-Types allow mixing and hierarchical
|
|||
|
structuring of objects of different types in a single
|
|||
|
message. Further Content-Types provide a standardized
|
|||
|
mechanism for tagging messages or body parts as audio,
|
|||
|
image, or several other kinds of data. A distinguished
|
|||
|
parameter syntax allows further specification of data format
|
|||
|
details, particularly the specification of alternate
|
|||
|
character sets. Additional optional header fields provide
|
|||
|
mechanisms for certain extensions deemed desirable by many
|
|||
|
implementors. Finally, a number of useful Content-Types are
|
|||
|
defined for general use by consenting user agents, notably
|
|||
|
text/richtext, message/partial, and message/external-body.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 53]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Acknowledgements
|
|||
|
|
|||
|
This document is the result of the collective effort of a
|
|||
|
large number of people, at several IETF meetings, on the
|
|||
|
IETF-SMTP and IETF-822 mailing lists, and elsewhere.
|
|||
|
Although any enumeration seems doomed to suffer from
|
|||
|
egregious omissions, the following are among the many
|
|||
|
contributors to this effort:
|
|||
|
|
|||
|
Harald Tveit Alvestrand Timo Lehtinen
|
|||
|
Randall Atkinson John R. MacMillan
|
|||
|
Philippe Brandon Rick McGowan
|
|||
|
Kevin Carosso Leo Mclaughlin
|
|||
|
Uhhyung Choi Goli Montaser-Kohsari
|
|||
|
Cristian Constantinof Keith Moore
|
|||
|
Mark Crispin Tom Moore
|
|||
|
Dave Crocker Erik Naggum
|
|||
|
Terry Crowley Mark Needleman
|
|||
|
Walt Daniels John Noerenberg
|
|||
|
Frank Dawson Mats Ohrman
|
|||
|
Hitoshi Doi Julian Onions
|
|||
|
Kevin Donnelly Michael Patton
|
|||
|
Keith Edwards David J. Pepper
|
|||
|
Chris Eich Blake C. Ramsdell
|
|||
|
Johnny Eriksson Luc Rooijakkers
|
|||
|
Craig Everhart Marshall T. Rose
|
|||
|
Patrik Faeltstroem Jonathan Rosenberg
|
|||
|
Erik E. Fair Jan Rynning
|
|||
|
Roger Fajman Harri Salminen
|
|||
|
Alain Fontaine Michael Sanderson
|
|||
|
James M. Galvin Masahiro Sekiguchi
|
|||
|
Philip Gladstone Mark Sherman
|
|||
|
Thomas Gordon Keld Simonsen
|
|||
|
Phill Gross Bob Smart
|
|||
|
James Hamilton Peter Speck
|
|||
|
Steve Hardcastle-Kille Henry Spencer
|
|||
|
David Herron Einar Stefferud
|
|||
|
Bruce Howard Michael Stein
|
|||
|
Bill Janssen Klaus Steinberger
|
|||
|
Olle Jaernefors Peter Svanberg
|
|||
|
Risto Kankkunen James Thompson
|
|||
|
Phil Karn Steve Uhler
|
|||
|
Alan Katz Stuart Vance
|
|||
|
Tim Kehres Erik van der Poel
|
|||
|
Neil Katin Guido van Rossum
|
|||
|
Kyuho Kim Peter Vanderbilt
|
|||
|
Anders Klemets Greg Vaudreuil
|
|||
|
John Klensin Ed Vielmetti
|
|||
|
Valdis Kletniek Ryan Waldron
|
|||
|
Jim Knowles Wally Wedel
|
|||
|
Stev Knowles Sven-Ove Westberg
|
|||
|
Bob Kummerfeld Brian Wideen
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 54]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Pekka Kytolaakso John Wobus
|
|||
|
Stellan Lagerstr.m Glenn Wright
|
|||
|
Vincent Lau Rayan Zachariassen
|
|||
|
Donald Lindsay David Zimmerman
|
|||
|
The authors apologize for any omissions from this list,
|
|||
|
which are certainly unintentional.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 55]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix A -- Minimal MIME-Conformance
|
|||
|
|
|||
|
The mechanisms described in this document are open-ended.
|
|||
|
It is definitely not expected that all implementations will
|
|||
|
support all of the Content-Types described, nor that they
|
|||
|
will all share the same extensions. In order to promote
|
|||
|
interoperability, however, it is useful to define the
|
|||
|
concept of "MIME-conformance" to define a certain level of
|
|||
|
implementation that allows the useful interworking of
|
|||
|
messages with content that differs from US ASCII text. In
|
|||
|
this section, we specify the requirements for such
|
|||
|
conformance.
|
|||
|
|
|||
|
A mail user agent that is MIME-conformant MUST:
|
|||
|
|
|||
|
1. Always generate a "MIME-Version: 1.0" header
|
|||
|
field.
|
|||
|
|
|||
|
2. Recognize the Content-Transfer-Encoding header
|
|||
|
field, and decode all received data encoded with
|
|||
|
either the quoted-printable or base64
|
|||
|
implementations. Encode any data sent that is
|
|||
|
not in seven-bit mail-ready representation using
|
|||
|
one of these transformations and include the
|
|||
|
appropriate Content-Transfer-Encoding header
|
|||
|
field, unless the underlying transport mechanism
|
|||
|
supports non-seven-bit data, as SMTP does not.
|
|||
|
|
|||
|
3. Recognize and interpret the Content-Type
|
|||
|
header field, and avoid showing users raw data
|
|||
|
with a Content-Type field other than text. Be
|
|||
|
able to send at least text/plain messages, with
|
|||
|
the character set specified as a parameter if it
|
|||
|
is not US-ASCII.
|
|||
|
|
|||
|
4. Explicitly handle the following Content-Type
|
|||
|
values, to at least the following extents:
|
|||
|
|
|||
|
Text:
|
|||
|
-- Recognize and display "text" mail
|
|||
|
with the character set "US-ASCII."
|
|||
|
-- Recognize other character sets at
|
|||
|
least to the extent of being able
|
|||
|
to inform the user about what
|
|||
|
character set the message uses.
|
|||
|
-- Recognize the "ISO-8859-*" character
|
|||
|
sets to the extent of being able to
|
|||
|
display those characters that are
|
|||
|
common to ISO-8859-* and US-ASCII,
|
|||
|
namely all characters represented
|
|||
|
by octet values 0-127.
|
|||
|
-- For unrecognized subtypes, show or
|
|||
|
offer to show the user the "raw"
|
|||
|
version of the data. An ability at
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 56]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
least to convert "text/richtext" to
|
|||
|
plain text, as shown in Appendix D,
|
|||
|
is encouraged, but not required for
|
|||
|
conformance.
|
|||
|
Message:
|
|||
|
--Recognize and display at least the
|
|||
|
primary (822) encapsulation.
|
|||
|
Multipart:
|
|||
|
-- Recognize the primary (mixed)
|
|||
|
subtype. Display all relevant
|
|||
|
information on the message level
|
|||
|
and the body part header level and
|
|||
|
then display or offer to display
|
|||
|
each of the body parts
|
|||
|
individually.
|
|||
|
-- Recognize the "alternative" subtype,
|
|||
|
and avoid showing the user
|
|||
|
redundant parts of
|
|||
|
multipart/alternative mail.
|
|||
|
-- Treat any unrecognized subtypes as if
|
|||
|
they were "mixed".
|
|||
|
Application:
|
|||
|
-- Offer the ability to remove either of
|
|||
|
the two types of Content-Transfer-
|
|||
|
Encoding defined in this document
|
|||
|
and put the resulting information
|
|||
|
in a user file.
|
|||
|
|
|||
|
5. Upon encountering any unrecognized Content-
|
|||
|
Type, an implementation must treat it as if it had
|
|||
|
a Content-Type of "application/octet-stream" with
|
|||
|
no parameter sub-arguments. How such data are
|
|||
|
handled is up to an implementation, but likely
|
|||
|
options for handling such unrecognized data
|
|||
|
include offering the user to write it into a file
|
|||
|
(decoded from its mail transport format) or
|
|||
|
offering the user to name a program to which the
|
|||
|
decoded data should be passed as input.
|
|||
|
Unrecognized predefined types, which in a MIME-
|
|||
|
conformant mailer might still include audio,
|
|||
|
image, or video, should also be treated in this
|
|||
|
way.
|
|||
|
|
|||
|
A user agent that meets the above conditions is said to be
|
|||
|
MIME-conformant. The meaning of this phrase is that it is
|
|||
|
assumed to be "safe" to send virtually any kind of
|
|||
|
properly-marked data to users of such mail systems, because
|
|||
|
such systems will at least be able to treat the data as
|
|||
|
undifferentiated binary, and will not simply splash it onto
|
|||
|
the screen of unsuspecting users. There is another sense
|
|||
|
in which it is always "safe" to send data in a format that
|
|||
|
is MIME-conformant, which is that such data will not break
|
|||
|
or be broken by any known systems that are conformant with
|
|||
|
RFC 821 and RFC 822. User agents that are MIME-conformant
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 57]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
have the additional guarantee that the user will not be
|
|||
|
shown data that were never intended to be viewed as text.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 58]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix B -- General Guidelines For Sending Email Data
|
|||
|
|
|||
|
Internet email is not a perfect, homogeneous system. Mail
|
|||
|
may become corrupted at several stages in its travel to a
|
|||
|
final destination. Specifically, email sent throughout the
|
|||
|
Internet may travel across many networking technologies.
|
|||
|
Many networking and mail technologies do not support the
|
|||
|
full functionality possible in the SMTP transport
|
|||
|
environment. Mail traversing these systems is likely to be
|
|||
|
modified in such a way that it can be transported.
|
|||
|
|
|||
|
There exist many widely-deployed non-conformant MTAs in the
|
|||
|
Internet. These MTAs, speaking the SMTP protocol, alter
|
|||
|
messages on the fly to take advantage of the internal data
|
|||
|
structure of the hosts they are implemented on, or are just
|
|||
|
plain broken.
|
|||
|
|
|||
|
The following guidelines may be useful to anyone devising a
|
|||
|
data format (Content-Type) that will survive the widest
|
|||
|
range of networking technologies and known broken MTAs
|
|||
|
unscathed. Note that anything encoded in the base64
|
|||
|
encoding will satisfy these rules, but that some well-known
|
|||
|
mechanisms, notably the UNIX uuencode facility, will not.
|
|||
|
Note also that anything encoded in the Quoted-Printable
|
|||
|
encoding will survive most gateways intact, but possibly not
|
|||
|
some gateways to systems that use the EBCDIC character set.
|
|||
|
|
|||
|
(1) Under some circumstances the encoding used for
|
|||
|
data may change as part of normal gateway or user
|
|||
|
agent operation. In particular, conversion from
|
|||
|
base64 to quoted-printable and vice versa may be
|
|||
|
necessary. This may result in the confusion of
|
|||
|
CRLF sequences with line breaks in text body
|
|||
|
parts. As such, the persistence of CRLF as
|
|||
|
something other than a line break should not be
|
|||
|
relied on.
|
|||
|
|
|||
|
(2) Many systems may elect to represent and store
|
|||
|
text data using local newline conventions. Local
|
|||
|
newline conventions may not match the RFC822 CRLF
|
|||
|
convention -- systems are known that use plain CR,
|
|||
|
plain LF, CRLF, or counted records. The result is
|
|||
|
that isolated CR and LF characters are not well
|
|||
|
tolerated in general; they may be lost or
|
|||
|
converted to delimiters on some systems, and hence
|
|||
|
should not be relied on.
|
|||
|
|
|||
|
(3) TAB (HT) characters may be misinterpreted or
|
|||
|
may be automatically converted to variable numbers
|
|||
|
of spaces. This is unavoidable in some
|
|||
|
environments, notably those not based on the ASCII
|
|||
|
character set. Such conversion is STRONGLY
|
|||
|
DISCOURAGED, but it may occur, and mail formats
|
|||
|
should not rely on the persistence of TAB (HT)
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 59]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
characters.
|
|||
|
|
|||
|
(4) Lines longer than 76 characters may be wrapped
|
|||
|
or truncated in some environments. Line wrapping
|
|||
|
and line truncation are STRONGLY DISCOURAGED, but
|
|||
|
unavoidable in some cases. Applications which
|
|||
|
require long lines should somehow differentiate
|
|||
|
between soft and hard line breaks. (A simple way
|
|||
|
to do this is to use the quoted-printable
|
|||
|
encoding.)
|
|||
|
|
|||
|
(5) Trailing "white space" characters (SPACE, TAB
|
|||
|
(HT)) on a line may be discarded by some transport
|
|||
|
agents, while other transport agents may pad lines
|
|||
|
with these characters so that all lines in a mail
|
|||
|
file are of equal length. The persistence of
|
|||
|
trailing white space, therefore, should not be
|
|||
|
relied on.
|
|||
|
|
|||
|
(6) Many mail domains use variations on the ASCII
|
|||
|
character set, or use character sets such as
|
|||
|
EBCDIC which contain most but not all of the US-
|
|||
|
ASCII characters. The correct translation of
|
|||
|
characters not in the "invariant" set cannot be
|
|||
|
depended on across character converting gateways.
|
|||
|
For example, this situation is a problem when
|
|||
|
sending uuencoded information across BITNET, an
|
|||
|
EBCDIC system. Similar problems can occur without
|
|||
|
crossing a gateway, since many Internet hosts use
|
|||
|
character sets other than ASCII internally. The
|
|||
|
definition of Printable Strings in X.400 adds
|
|||
|
further restrictions in certain special cases. In
|
|||
|
particular, the only characters that are known to
|
|||
|
be consistent across all gateways are the 73
|
|||
|
characters that correspond to the upper and lower
|
|||
|
case letters A-Z and a-z, the 10 digits 0-9, and
|
|||
|
the following eleven special characters:
|
|||
|
|
|||
|
"'" (ASCII code 39)
|
|||
|
"(" (ASCII code 40)
|
|||
|
")" (ASCII code 41)
|
|||
|
"+" (ASCII code 43)
|
|||
|
"," (ASCII code 44)
|
|||
|
"-" (ASCII code 45)
|
|||
|
"." (ASCII code 46)
|
|||
|
"/" (ASCII code 47)
|
|||
|
":" (ASCII code 58)
|
|||
|
"=" (ASCII code 61)
|
|||
|
"?" (ASCII code 63)
|
|||
|
|
|||
|
A maximally portable mail representation, such as
|
|||
|
the base64 encoding, will confine itself to
|
|||
|
relatively short lines of text in which the only
|
|||
|
meaningful characters are taken from this set of
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 60]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
73 characters.
|
|||
|
|
|||
|
Please note that the above list is NOT a list of recommended
|
|||
|
practices for MTAs. RFC 821 MTAs are prohibited from
|
|||
|
altering the character of white space or wrapping long
|
|||
|
lines. These BAD and illegal practices are known to occur
|
|||
|
on established networks, and implementions should be robust
|
|||
|
in dealing with the bad effects they can cause.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 61]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix C -- A Complex Multipart Example
|
|||
|
|
|||
|
What follows is the outline of a complex multipart message.
|
|||
|
This message has five parts to be displayed serially: two
|
|||
|
introductory plain text parts, an embedded multipart
|
|||
|
message, a richtext part, and a closing encapsulated text
|
|||
|
message in a non-ASCII character set. The embedded
|
|||
|
multipart message has two parts to be displayed in parallel,
|
|||
|
a picture and an audio fragment.
|
|||
|
|
|||
|
MIME-Version: 1.0
|
|||
|
From: Nathaniel Borenstein <nsb@bellcore.com>
|
|||
|
Subject: A multipart example
|
|||
|
Content-Type: multipart/mixed;
|
|||
|
boundary=unique-boundary-1
|
|||
|
|
|||
|
This is the preamble area of a multipart message.
|
|||
|
Mail readers that understand multipart format
|
|||
|
should ignore this preamble.
|
|||
|
If you are reading this text, you might want to
|
|||
|
consider changing to a mail reader that understands
|
|||
|
how to properly display multipart messages.
|
|||
|
--unique-boundary-1
|
|||
|
|
|||
|
...Some text appears here...
|
|||
|
[Note that the preceding blank line means
|
|||
|
no header fields were given and this is text,
|
|||
|
with charset US ASCII. It could have been
|
|||
|
done with explicit typing as in the next part.]
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-type: text/plain; charset=US-ASCII
|
|||
|
|
|||
|
This could have been part of the previous part,
|
|||
|
but illustrates explicit versus implicit
|
|||
|
typing of body parts.
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-Type: multipart/parallel;
|
|||
|
boundary=unique-boundary-2
|
|||
|
|
|||
|
|
|||
|
--unique-boundary-2
|
|||
|
Content-Type: audio/basic
|
|||
|
Content-Transfer-Encoding: base64
|
|||
|
|
|||
|
... base64-encoded 8000 Hz single-channel
|
|||
|
u-law-format audio data goes here....
|
|||
|
|
|||
|
--unique-boundary-2
|
|||
|
Content-Type: image/gif
|
|||
|
Content-Transfer-Encoding: Base64
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 62]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
... base64-encoded image data goes here....
|
|||
|
|
|||
|
--unique-boundary-2--
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-type: text/richtext
|
|||
|
|
|||
|
This is <bold><italic>richtext.</italic></bold>
|
|||
|
<nl><nl>Isn't it
|
|||
|
<bigger><bigger>cool?</bigger></bigger>
|
|||
|
|
|||
|
--unique-boundary-1
|
|||
|
Content-Type: message/rfc822
|
|||
|
|
|||
|
From: (name in US-ASCII)
|
|||
|
Subject: (subject in US-ASCII)
|
|||
|
Content-Type: Text/plain; charset=ISO-8859-1
|
|||
|
Content-Transfer-Encoding: Quoted-printable
|
|||
|
|
|||
|
... Additional text in ISO-8859-1 goes here ...
|
|||
|
|
|||
|
--unique-boundary-1--
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 63]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix D -- A Simple Richtext-to-Text Translator in C
|
|||
|
|
|||
|
One of the major goals in the design of the richtext subtype
|
|||
|
of the text Content-Type is to make formatted text so simple
|
|||
|
that even text-only mailers will implement richtext-to-
|
|||
|
plain-text translators, thus increasing the likelihood that
|
|||
|
multifont text will become "safe" to use very widely. To
|
|||
|
demonstrate this simplicity, what follows is an extremely
|
|||
|
simple 44-line C program that converts richtext input into
|
|||
|
plain text output:
|
|||
|
|
|||
|
#include <stdio.h>
|
|||
|
#include <ctype.h>
|
|||
|
main() {
|
|||
|
int c, i;
|
|||
|
char token[50];
|
|||
|
|
|||
|
while((c = getc(stdin)) != EOF) {
|
|||
|
if (c == '<') {
|
|||
|
for (i=0; (i<49 && (c = getc(stdin)) != '>'
|
|||
|
&& c != EOF); ++i) {
|
|||
|
token[i] = isupper(c) ? tolower(c) : c;
|
|||
|
}
|
|||
|
if (c == EOF) break;
|
|||
|
if (c != '>') while ((c = getc(stdin)) !=
|
|||
|
'>'
|
|||
|
&& c != EOF) {;}
|
|||
|
if (c == EOF) break;
|
|||
|
token[i] = '\0';
|
|||
|
if (!strcmp(token, "lt")) {
|
|||
|
putc('<', stdout);
|
|||
|
} else if (!strcmp(token, "nl")) {
|
|||
|
putc('\n', stdout);
|
|||
|
} else if (!strcmp(token, "/paragraph")) {
|
|||
|
fputs("\n\n", stdout);
|
|||
|
} else if (!strcmp(token, "comment")) {
|
|||
|
int commct=1;
|
|||
|
while (commct > 0) {
|
|||
|
while ((c = getc(stdin)) != '<'
|
|||
|
&& c != EOF) ;
|
|||
|
if (c == EOF) break;
|
|||
|
for (i=0; (c = getc(stdin)) != '>'
|
|||
|
&& c != EOF; ++i) {
|
|||
|
token[i] = isupper(c) ?
|
|||
|
tolower(c) : c;
|
|||
|
}
|
|||
|
if (c== EOF) break;
|
|||
|
token[i] = NULL;
|
|||
|
if (!strcmp(token, "/comment")) --
|
|||
|
commct;
|
|||
|
if (!strcmp(token, "comment"))
|
|||
|
++commct;
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 64]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
}
|
|||
|
} /* Ignore all other tokens */
|
|||
|
} else if (c != '\n') putc(c, stdout);
|
|||
|
}
|
|||
|
putc('\n', stdout); /* for good measure */
|
|||
|
}
|
|||
|
It should be noted that one can do considerably better than
|
|||
|
this in displaying richtext data on a dumb terminal. In
|
|||
|
particular, one can replace font information such as "bold"
|
|||
|
with textual emphasis (like *this* or _T_H_I_S_). One can
|
|||
|
also properly handle the richtext formatting commands
|
|||
|
regarding indentation, justification, and others. However,
|
|||
|
the above program is all that is necessary in order to
|
|||
|
present richtext on a dumb terminal.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 65]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix E -- Collected Grammar
|
|||
|
|
|||
|
This appendix contains the complete BNF grammar for all the
|
|||
|
syntax specified by this document.
|
|||
|
|
|||
|
By itself, however, this grammar is incomplete. It refers
|
|||
|
to several entities that are defined by RFC 822. Rather
|
|||
|
than reproduce those definitions here, and risk
|
|||
|
unintentional differences between the two, this document
|
|||
|
simply refers the reader to RFC 822 for the remaining
|
|||
|
definitions. Wherever a term is undefined, it refers to the
|
|||
|
RFC 822 definition.
|
|||
|
|
|||
|
attribute := token
|
|||
|
|
|||
|
body-part = <"message" as defined in RFC 822,
|
|||
|
with all header fields optional, and with the
|
|||
|
specified delimiter not occurring anywhere in
|
|||
|
the message body, either on a line by itself
|
|||
|
or as a substring anywhere.>
|
|||
|
|
|||
|
boundary := 0*69<bchars> bcharsnospace
|
|||
|
|
|||
|
bchars := bcharsnospace / " "
|
|||
|
|
|||
|
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" /
|
|||
|
"_"
|
|||
|
/ "," / "-" / "." / "/" / ":" / "=" / "?"
|
|||
|
|
|||
|
close-delimiter := delimiter "--"
|
|||
|
|
|||
|
Content-Description := *text
|
|||
|
|
|||
|
Content-ID := msg-id
|
|||
|
|
|||
|
Content-Transfer-Encoding := "BASE64" / "QUOTED-
|
|||
|
PRINTABLE" /
|
|||
|
"8BIT" / "7BIT" /
|
|||
|
"BINARY" / x-token
|
|||
|
|
|||
|
Content-Type := type "/" subtype *[";" parameter]
|
|||
|
|
|||
|
delimiter := CRLF "--" boundary ; taken from Content-Type
|
|||
|
field.
|
|||
|
; when content-type is
|
|||
|
multipart
|
|||
|
; There should be no space
|
|||
|
; between "--" and boundary.
|
|||
|
|
|||
|
encapsulation := delimiter CRLF body-part
|
|||
|
|
|||
|
epilogue := *text ; to be ignored upon
|
|||
|
receipt.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 66]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
MIME-Version := 1*text
|
|||
|
|
|||
|
multipart-body := preamble 1*encapsulation close-delimiter
|
|||
|
epilogue
|
|||
|
|
|||
|
parameter := attribute "=" value
|
|||
|
|
|||
|
preamble := *text ; to be ignored upon
|
|||
|
receipt.
|
|||
|
|
|||
|
subtype := token
|
|||
|
|
|||
|
token := 1*<any CHAR except SPACE, CTLs, or tspecials>
|
|||
|
|
|||
|
tspecials := "(" / ")" / "<" / ">" / "@" ; Must be in
|
|||
|
/ "," / ";" / ":" / "\" / <"> ; quoted-string,
|
|||
|
/ "/" / "[" / "]" / "?" / "." ; to use within
|
|||
|
/ "=" ; parameter values
|
|||
|
|
|||
|
|
|||
|
type := "application" / "audio" ; case-
|
|||
|
insensitive
|
|||
|
/ "image" / "message"
|
|||
|
/ "multipart" / "text"
|
|||
|
/ "video" / x-token
|
|||
|
|
|||
|
value := token / quoted-string
|
|||
|
|
|||
|
x-token := <The two characters "X-" followed, with no
|
|||
|
intervening white space, by any token>
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 67]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix F -- IANA Registration Procedures
|
|||
|
|
|||
|
MIME has been carefully designed to have extensible
|
|||
|
mechanisms, and it is expected that the set of content-
|
|||
|
type/subtype pairs and their associated parameters will grow
|
|||
|
significantly with time. Several other MIME fields, notably
|
|||
|
character set names, access-type parameters for the
|
|||
|
message/external-body type, conversions parameters for the
|
|||
|
application type, and possibly even Content-Transfer-
|
|||
|
Encoding values, are likely to have new values defined over
|
|||
|
time. In order to ensure that the set of such values is
|
|||
|
developed in an orderly, well-specified, and public manner,
|
|||
|
MIME defines a registration process which uses the Internet
|
|||
|
Assigned Numbers Authority (IANA) as a central registry for
|
|||
|
such values.
|
|||
|
|
|||
|
In general, parameters in the content-type header field are
|
|||
|
used to convey supplemental information for various content
|
|||
|
types, and their use is defined when the content-type and
|
|||
|
subtype are defined. New parameters should not be defined
|
|||
|
as a way to introduce new functionality.
|
|||
|
|
|||
|
In order to simplify and standardize the registration
|
|||
|
process, this appendix gives templates for the registration
|
|||
|
of new values with IANA. Each of these is given in the form
|
|||
|
of an email message template, to be filled in by the
|
|||
|
registering party.
|
|||
|
|
|||
|
F.1 Registration of New Content-type/subtype Values
|
|||
|
|
|||
|
Note that MIME is generally expected to be extended by
|
|||
|
subtypes. If a new fundamental top-level type is needed,
|
|||
|
its specification should be published as an RFC or
|
|||
|
submitted in a form suitable to become an RFC, and be
|
|||
|
subject to the Internet standards process.
|
|||
|
|
|||
|
To: IANA@isi.edu
|
|||
|
Subject: Registration of new MIME content-type/subtype
|
|||
|
|
|||
|
MIME type name:
|
|||
|
|
|||
|
(If the above is not an existing top-level MIME type,
|
|||
|
please explain why an existing type cannot be used.)
|
|||
|
|
|||
|
MIME subtype name:
|
|||
|
|
|||
|
Required parameters:
|
|||
|
|
|||
|
Optional parameters:
|
|||
|
|
|||
|
Encoding considerations:
|
|||
|
|
|||
|
Security considerations:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 68]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Published specification:
|
|||
|
|
|||
|
(The published specification must be an Internet RFC or
|
|||
|
RFC-to-be if a new top-level type is being defined, and
|
|||
|
must be a publicly available specification in any
|
|||
|
case.)
|
|||
|
|
|||
|
Person & email address to contact for further
|
|||
|
information:
|
|||
|
F.2 Registration of New Character Set Values
|
|||
|
|
|||
|
To: IANA@isi.edu
|
|||
|
Subject: Registration of new MIME character set value
|
|||
|
|
|||
|
MIME character set name:
|
|||
|
|
|||
|
Published specification:
|
|||
|
|
|||
|
(The published specification must be an Internet RFC or
|
|||
|
RFC-to-be or an international standard.)
|
|||
|
|
|||
|
Person & email address to contact for further
|
|||
|
information:
|
|||
|
|
|||
|
F.3 Registration of New Access-type Values for
|
|||
|
Message/external-body
|
|||
|
|
|||
|
To: IANA@isi.edu
|
|||
|
Subject: Registration of new MIME Access-type for
|
|||
|
Message/external-body content-type
|
|||
|
|
|||
|
MIME access-type name:
|
|||
|
|
|||
|
Required parameters:
|
|||
|
|
|||
|
Optional parameters:
|
|||
|
|
|||
|
Published specification:
|
|||
|
|
|||
|
(The published specification must be an Internet RFC or
|
|||
|
RFC-to-be.)
|
|||
|
|
|||
|
Person & email address to contact for further
|
|||
|
information:
|
|||
|
|
|||
|
|
|||
|
F.4 Registration of New Conversions Values for Application
|
|||
|
|
|||
|
To: IANA@isi.edu
|
|||
|
Subject: Registration of new MIME Conversions value
|
|||
|
for Application content-type
|
|||
|
|
|||
|
MIME Conversions name:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 69]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Published specification:
|
|||
|
|
|||
|
(The published specification must be an Internet RFC or
|
|||
|
RFC-to-be.)
|
|||
|
|
|||
|
Person & email address to contact for further
|
|||
|
information:
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 70]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix G -- Summary of the Seven Content-types
|
|||
|
|
|||
|
Content-type: text
|
|||
|
|
|||
|
Subtypes defined by this document: plain, richtext
|
|||
|
|
|||
|
Important Parameters: charset
|
|||
|
|
|||
|
Encoding notes: quoted-printable generally preferred if an
|
|||
|
encoding is needed and the character set is mostly an
|
|||
|
ASCII superset.
|
|||
|
|
|||
|
Security considerations: Rich text formats such as TeX and
|
|||
|
Troff often contain mechanisms for executing arbitrary
|
|||
|
commands or file system operations, and should not be
|
|||
|
used automatically unless these security problems have
|
|||
|
been addressed. Even plain text may contain control
|
|||
|
characters that can be used to exploit the capabilities
|
|||
|
of "intelligent" terminals and cause security
|
|||
|
violations. User interfaces designed to run on such
|
|||
|
terminals should be aware of and try to prevent such
|
|||
|
problems.
|
|||
|
________________________________________________________________
|
|||
|
|
|||
|
Content-type: multipart
|
|||
|
|
|||
|
Subtypes defined by this document: mixed, alternative,
|
|||
|
digest, parallel.
|
|||
|
|
|||
|
Important Parameters: boundary
|
|||
|
|
|||
|
Encoding notes: No content-transfer-encoding is permitted.
|
|||
|
|
|||
|
________________________________________________________________
|
|||
|
|
|||
|
Content-type: message
|
|||
|
|
|||
|
Subtypes defined by this document: rfc822, partial,
|
|||
|
external-body
|
|||
|
|
|||
|
Important Parameters: id, number, total
|
|||
|
|
|||
|
Encoding notes: No content-transfer-encoding is permitted.
|
|||
|
|
|||
|
________________________________________________________________
|
|||
|
|
|||
|
Content-type: application
|
|||
|
|
|||
|
Subtypes defined by this document: octet-stream,
|
|||
|
postscript, oda
|
|||
|
|
|||
|
Important Parameters: profile
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 71]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred for octet-stream
|
|||
|
or other unreadable subtypes.
|
|||
|
|
|||
|
Security considerations: This type is intended for the
|
|||
|
transmission of data to be interpreted by locally-installed
|
|||
|
programs. If used, for example, to transmit executable
|
|||
|
binary programs or programs in general-purpose interpreted
|
|||
|
languages, such as LISP programs or shell scripts, severe
|
|||
|
security problems could result. In general, authors of
|
|||
|
mail-reading agents are cautioned against giving their
|
|||
|
systems the power to execute mail-based application data
|
|||
|
without carefully considering the security implications.
|
|||
|
While it is certainly possible to define safe application
|
|||
|
formats and even safe interpreters for unsafe formats, each
|
|||
|
interpreter should be evaluated separately for possible
|
|||
|
security problems.
|
|||
|
________________________________________________________________
|
|||
|
|
|||
|
Content-type: image
|
|||
|
|
|||
|
Subtypes defined by this document: jpeg, gif
|
|||
|
|
|||
|
Important Parameters: none
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred
|
|||
|
|
|||
|
________________________________________________________________
|
|||
|
|
|||
|
Content-type: audio
|
|||
|
|
|||
|
Subtypes defined by this document: basic
|
|||
|
|
|||
|
Important Parameters: none
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred
|
|||
|
|
|||
|
________________________________________________________________
|
|||
|
|
|||
|
Content-type: video
|
|||
|
|
|||
|
Subtypes defined by this document: mpeg
|
|||
|
|
|||
|
Important Parameters: none
|
|||
|
|
|||
|
Encoding notes: base64 generally preferred
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 72]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Appendix H -- Canonical Encoding Model
|
|||
|
|
|||
|
|
|||
|
|
|||
|
There was some confusion, in earlier drafts of this memo,
|
|||
|
regarding the model for when email data was to be converted
|
|||
|
to canonical form and encoded, and in particular how this
|
|||
|
process would affect the treatment of CRLFs, given that the
|
|||
|
representation of newlines varies greatly from system to
|
|||
|
system. For this reason, a canonical model for encoding is
|
|||
|
presented below.
|
|||
|
|
|||
|
The process of composing a MIME message part can be modelled
|
|||
|
as being done in a number of steps. Note that these steps
|
|||
|
are roughly similar to those steps used in RFC1113:
|
|||
|
|
|||
|
Step 1. Creation of local form.
|
|||
|
|
|||
|
The body part to be transmitted is created in the system's
|
|||
|
native format. The native character set is used, and where
|
|||
|
appropriate local end of line conventions are used as well.
|
|||
|
The may be a UNIX-style text file, or a Sun raster image, or
|
|||
|
a VMS indexed file, or audio data in a system-dependent
|
|||
|
format stored only in memory, or anything else that
|
|||
|
corresponds to the local model for the representation of
|
|||
|
some form of information.
|
|||
|
|
|||
|
Step 2. Conversion to canonical form.
|
|||
|
|
|||
|
The entire body part, including "out-of-band" information
|
|||
|
such as record lengths and possibly file attribute
|
|||
|
information, is converted to a universal canonical form.
|
|||
|
The specific content type of the body part as well as its
|
|||
|
associated attributes dictate the nature of the canonical
|
|||
|
form that is used. Conversion to the proper canonical form
|
|||
|
may involve character set conversion, transformation of
|
|||
|
audio data, compression, or various other operations
|
|||
|
specific to the various content types.
|
|||
|
|
|||
|
For example, in the case of text/plain data, the text must
|
|||
|
be converted to a supported character set and lines must be
|
|||
|
delimited with CRLF delimiters in accordance with RFC822.
|
|||
|
Note that the restriction on line lengths implied by RFC822
|
|||
|
is eliminated if the next step employs either quoted-
|
|||
|
printable or base64 encoding.
|
|||
|
|
|||
|
Step 3. Apply transfer encoding.
|
|||
|
|
|||
|
A Content-Transfer-Encoding appropriate for this body part
|
|||
|
is applied. Note that there is no fixed relationship
|
|||
|
between the content type and the transfer encoding. In
|
|||
|
particular, it may be appropriate to base the choice of
|
|||
|
base64 or quoted-printable on character frequency counts
|
|||
|
which are specific to a given instance of body part.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 73]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Step 4. Insertion into message.
|
|||
|
|
|||
|
The encoded object is inserted into a MIME message with
|
|||
|
appropriate body part headers and boundary markers.
|
|||
|
|
|||
|
It is vital to note that these steps are only a model; they
|
|||
|
are specifically NOT a blueprint for how an actual system
|
|||
|
would be built. In particular, the model fails to account
|
|||
|
for two common designs:
|
|||
|
|
|||
|
1. In many cases the conversion to a canonical
|
|||
|
form prior to encoding will be subsumed into the
|
|||
|
encoder itself, which understands local formats
|
|||
|
directly. For example, the local newline
|
|||
|
convention for text bodyparts might be carried
|
|||
|
through to the encoder itself along with knowledge
|
|||
|
of what that format is.
|
|||
|
|
|||
|
2. The output of the encoders may have to pass
|
|||
|
through one or more additional steps prior to
|
|||
|
being transmitted as a message. As such, the
|
|||
|
output of the encoder may not be compliant with
|
|||
|
the formats specified by RFC822. In particular,
|
|||
|
once again it may be appropriate for the
|
|||
|
converter's output to be expressed using local
|
|||
|
newline conventions rather than using the standard
|
|||
|
RFC822 CRLF delimiters.
|
|||
|
|
|||
|
Other implementation variations are conceivable as well.
|
|||
|
The only important aspect of this discussion is that the
|
|||
|
resulting messages are consistent with those produced by the
|
|||
|
model described here.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 74]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
References
|
|||
|
|
|||
|
[US-ASCII] Coded Character Set--7-Bit American Standard Code
|
|||
|
for Information Interchange, ANSI X3.4-1986.
|
|||
|
|
|||
|
[ATK] Borenstein, Nathaniel S., Multimedia Applications
|
|||
|
Development with the Andrew Toolkit, Prentice-Hall, 1990.
|
|||
|
|
|||
|
[GIF] Graphics Interchange Format (Version 89a), Compuserve,
|
|||
|
Inc., Columbus, Ohio, 1990.
|
|||
|
|
|||
|
[ISO-2022] International Standard--Information Processing--
|
|||
|
ISO 7-bit and 8-bit coded character sets--Code extension
|
|||
|
techniques, ISO 2022:1986.
|
|||
|
|
|||
|
[ISO-8859] Information Processing -- 8-bit Single-Byte Coded
|
|||
|
Graphic Character Sets -- Part 1: Latin Alphabet No. 1, ISO
|
|||
|
8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2,
|
|||
|
1987. Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. Part
|
|||
|
4: Latin alphabet No. 4, ISO 8859-4, 1988. Part 5:
|
|||
|
Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6:
|
|||
|
Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7:
|
|||
|
Latin/Greek alphabet, ISO 8859-7, 1987. Part 8:
|
|||
|
Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: Latin
|
|||
|
alphabet No. 5, ISO 8859-9, 1990.
|
|||
|
|
|||
|
[ISO-646] International Standard--Information Processing--
|
|||
|
ISO 7-bit coded character set for information interchange,
|
|||
|
ISO 646:1983.
|
|||
|
|
|||
|
[MPEG] Video Coding Draft Standard ISO 11172 CD, ISO
|
|||
|
IEC/TJC1/SC2/WG11 (Motion Picture Experts Group), May, 1991.
|
|||
|
|
|||
|
[ODA] ISO 8613; Information Processing: Text and Office
|
|||
|
System; Office Document Architecture (ODA) and Interchange
|
|||
|
Format (ODIF), Part 1-8, 1989.
|
|||
|
|
|||
|
[PCM] CCITT, Fascicle III.4 - Recommendation G.711, Geneva,
|
|||
|
1972, "Pulse Code Modulation (PCM) of Voice Frequencies".
|
|||
|
|
|||
|
[POSTSCRIPT] Adobe Systems, Inc., PostScript Language
|
|||
|
Reference Manual, Addison-Wesley, 1985.
|
|||
|
|
|||
|
[X400] Schicker, Pietro, "Message Handling Systems, X.400",
|
|||
|
Message Handling Systems and Distributed Applications, E.
|
|||
|
Stefferud, O-j. Jacobsen, and P. Schicker, eds., North-
|
|||
|
Holland, 1989, pp. 3-41.
|
|||
|
|
|||
|
[RFC-783] Sollins, K.R. TFTP Protocol (revision 2). June,
|
|||
|
1981, MIT, RFC-783.
|
|||
|
|
|||
|
[RFC-821] Postel, J.B. Simple Mail Transfer Protocol.
|
|||
|
August, 1982, USC/Information Sciences Institute, RFC-821.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 75]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
[RFC-822] Crocker, D. Standard for the format of ARPA
|
|||
|
Internet text messages. August, 1982, UDEL, RFC-822.
|
|||
|
|
|||
|
[RFC-934] Rose, M.T.; Stefferud, E.A. Proposed standard
|
|||
|
for message encapsulation. January, 1985, Delaware
|
|||
|
and NMA, RFC-934.
|
|||
|
|
|||
|
[RFC-959] Postel, J.B.; Reynolds, J.K. File Transfer
|
|||
|
Protocol. October, 1985, USC/Information Sciences
|
|||
|
Institute, RFC-959.
|
|||
|
|
|||
|
[RFC-1049] Sirbu, M.A. Content-Type header field for
|
|||
|
Internet messages. March, 1988, CMU, RFC-1049.
|
|||
|
|
|||
|
[RFC-1113] Linn, J. Privacy enhancement for Internet
|
|||
|
electronic mail: Part I - message encipherment and
|
|||
|
authentication procedures. August, 1989, IAB Privacy Task
|
|||
|
Force, RFC-1113.
|
|||
|
|
|||
|
[RFC-1154] Robinson, D.; Ullmann, R. Encoding header field
|
|||
|
for Internet messages. April, 1990, Prime Computer,
|
|||
|
Inc., RFC-1154.
|
|||
|
|
|||
|
[RFC-1342] Moore, Keith, Representation of Non-Ascii Text in
|
|||
|
Internet Message Headers. June, 1992, University of
|
|||
|
Tennessee, RFC-1342.
|
|||
|
|
|||
|
Security Considerations
|
|||
|
|
|||
|
Security issues are discussed in Section 7.4.2 and in
|
|||
|
Appendix G. Implementors should pay special attention to
|
|||
|
the security implications of any mail content-types that can
|
|||
|
cause the remote execution of any actions in the recipient's
|
|||
|
environment. In such cases, the discussion of the
|
|||
|
applicaton/postscript content-type in Section 7.4.2 may
|
|||
|
serve as a model for considering other content-types with
|
|||
|
remote execution capabilities.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 76]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
Authors' Addresses
|
|||
|
|
|||
|
For more information, the authors of this document may be
|
|||
|
contacted via Internet mail:
|
|||
|
|
|||
|
Nathaniel S. Borenstein
|
|||
|
MRE 2D-296, Bellcore
|
|||
|
445 South St.
|
|||
|
Morristown, NJ 07962-1910
|
|||
|
|
|||
|
Phone: +1 201 829 4270
|
|||
|
Fax: +1 201 829 7019
|
|||
|
Email: nsb@bellcore.com
|
|||
|
|
|||
|
|
|||
|
Ned Freed
|
|||
|
Innosoft International, Inc.
|
|||
|
250 West First Street
|
|||
|
Suite 240
|
|||
|
Claremont, CA 91711
|
|||
|
|
|||
|
Phone: +1 714 624 7907
|
|||
|
Fax: +1 714 621 5319
|
|||
|
Email: ned@innosoft.com
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page 77]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
RFC 1341MIME: Multipurpose Internet Mail ExtensionsJune 1992
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
THIS PAGE INTENTIONALLY LEFT BLANK.
|
|||
|
|
|||
|
Please discard this page and place the following table of
|
|||
|
contents after the title page.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page i]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Table of Contents
|
|||
|
|
|||
|
|
|||
|
1 Introduction....................................... 1
|
|||
|
2 Notations, Conventions, and Generic BNF Grammar.... 3
|
|||
|
3 The MIME-Version Header Field...................... 5
|
|||
|
4 The Content-Type Header Field...................... 6
|
|||
|
5 The Content-Transfer-Encoding Header Field......... 10
|
|||
|
5.1 Quoted-Printable Content-Transfer-Encoding......... 14
|
|||
|
5.2 Base64 Content-Transfer-Encoding................... 17
|
|||
|
6 Additional Optional Content- Header Fields......... 19
|
|||
|
6.1 Optional Content-ID Header Field................... 19
|
|||
|
6.2 Optional Content-Description Header Field.......... 19
|
|||
|
7 The Predefined Content-Type Values................. 20
|
|||
|
7.1 The Text Content-Type.............................. 20
|
|||
|
7.1.1 The charset parameter.............................. 20
|
|||
|
7.1.2 The Text/plain subtype............................. 23
|
|||
|
7.1.3 The Text/richtext subtype.......................... 23
|
|||
|
7.2 The Multipart Content-Type......................... 29
|
|||
|
7.2.1 Multipart: The common syntax...................... 30
|
|||
|
7.2.2 The Multipart/mixed (primary) subtype.............. 34
|
|||
|
7.2.3 The Multipart/alternative subtype.................. 34
|
|||
|
7.2.4 The Multipart/digest subtype....................... 36
|
|||
|
7.2.5 The Multipart/parallel subtype..................... 36
|
|||
|
7.3 The Message Content-Type........................... 37
|
|||
|
7.3.1 The Message/rfc822 (primary) subtype............... 37
|
|||
|
7.3.2 The Message/Partial subtype........................ 37
|
|||
|
7.3.3 The Message/External-Body subtype.................. 40
|
|||
|
7.4 The Application Content-Type....................... 46
|
|||
|
7.4.1 The Application/Octet-Stream (primary) subtype..... 46
|
|||
|
7.4.2 The Application/PostScript subtype................. 47
|
|||
|
7.4.3 The Application/ODA subtype........................ 50
|
|||
|
7.5 The Image Content-Type............................. 51
|
|||
|
7.6 The Audio Content-Type............................. 51
|
|||
|
7.7 The Video Content-Type............................. 51
|
|||
|
7.8 Experimental Content-Type Values................... 51
|
|||
|
Summary............................................ 53
|
|||
|
Acknowledgements................................... 54
|
|||
|
Appendix A -- Minimal MIME-Conformance............. 56
|
|||
|
Appendix B -- General Guidelines For Sending Email Data59
|
|||
|
Appendix C -- A Complex Multipart Example.......... 62
|
|||
|
Appendix D -- A Simple Richtext-to-Text Translator in C64
|
|||
|
Appendix E -- Collected Grammar.................... 66
|
|||
|
Appendix F -- IANA Registration Procedures......... 68
|
|||
|
F.1 Registration of New Content-type/subtype Values..68
|
|||
|
F.2 Registration of New Character Set Values...... 69
|
|||
|
F.3 Registration of New Access-type Values for Message/external-body69
|
|||
|
F.4 Registration of New Conversions Values for Application69
|
|||
|
Appendix G -- Summary of the Seven Content-types... 71
|
|||
|
Appendix H -- Canonical Encoding Model............. 73
|
|||
|
References......................................... 75
|
|||
|
Security Considerations............................ 76
|
|||
|
Authors' Addresses................................. 77
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page ii]
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Borenstein & Freed [Page iii]
|
|||
|
|