Network Working Group                                       D. Spinellis
Request for Comments: 1947                                     SENA S.A.
Category: Informational                                         May 1996

         Greek Character Encoding for Electronic Mail Messages

Status of This Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Overview and Rationale

   This document describes a standard encoding for electronic mail
   [RFC822] containing Greek text and provides implementation
   guidelines.  The standard is based on MIME [RFC1521] and the ISO
   8859-7 character encoding.  Although the implementation of this
   standard is straightforward, several non-standard alternatives are in
   common use; they are "functional" but unlikely to interoperate.  For
   this reason we highlight common implementation and mail user agent
   setup errors.

Description

   In order to transfer Greek text via electronic mail the text is first
   translated into the ISO 8859-7 character set, and then encoded using
   either the Base64 (preferable for text that is mainly Greek) or the
   Quoted-Printable (justifiable in cases where some Greek words appear
   inside predominantly Latin text) method, as defined in MIME.

   The following table lists the most common Greek encodings (see also
   [RFC1345]):

   0646 37 M7 51 MC 23 69 LG L1 G7 GO GC 28 97 Description
   ---- -- -- -- -- -- -- -- -- -- -- -- -- -- -----------
   0386 ea a2 86 cd 71 86                   b6 Capital alpha with acute
   0388 eb b8 8d ce 72 8d                   b8 Capital epsilon with acute
   0389 ec b9 8f d7 73 8f                   b9 Capital eta with acute
   038a ed ba 90 d8 75 90                   ba Capital iota with acute
   038c ee bc 92 d9 76 92                   bc Capital omicron with acute
   038e ef be 95 da 77 95                   be Capital upsilon with acute
   038f f0 bf 98 df 78 98                   bf Capital omega with acute
   0390    c0 a1 fd    a1                   c0 Small iota with acute and diaeresis
   0391 80 c1 a4 b0 41 a4 61    41 61 41 41 c1 Capital alpha
   0392 81 c2 a5 b5 42 a5 62    42 62 42 42 c2 Capital beta
   0393 82 c3 a6 a1 43 a6 67 23 43 67 43 44 c3 Capital gamma
   0394 83 c4 a7 a2 44 a7 64 40 44 64 44 45 c4 Capital delta
   0395 84 c5 a8 b6 45 a8 65    45 65 45 46 c5 Capital epsilon
   0396 85 c6 a9 b7 46 a9 7a    46 7a 46 49 c6 Capital zeta
   0397 86 c7 aa b8 47 aa 68    47 68 47 4a c7 Capital eta
   0398 87 c8 ac a3 48 ac 75 5c 48 75 48 4b c8 Capital theta
   0399 88 c9 ad b9 49 ad 69    49 69 49 4c c9 Capital iota
   039a 89 ca b5 ba 51 b5 6b    4b 6b 4a 4d ca Capital kappa
   039b 8a cb b6 a4 52 b6 6c 5e 4c 6c 4b 4e cb Capital lamda
   039c 8b cc b8 bb 53 b7 6d    4d 6d 4c 4f cc Capital mu
   039d 8c cd b7 c1 54 b8 6e    4e 6e 4d 50 cd Capital nu
   039e 8d ce bd a5 55 bd 6a 21 4f 6a 4e 51 ce Capital xi
   039f 8e cf be c3 56 be 6f    50 6f 4f 52 cf Capital omicron
   03a0 8f d0 c6 a6 57 c6 70 3f 51 70 50 53 d0 Capital pi
   03a1 90 d1 c7 c4 58 c7 72    52 72 51 55 d1 Capital rho
   03a3 91 d3 cf aa 59 cf 73 5f 53 73 53 56 d3 Capital sigma
   03a4 92 d4 d0 c6 62 d0 74    54 74 54 58 d4 Capital tau
   03a5 93 d5 d1 cb 63 d1 79    55 79 55 59 d5 Capital upsilon
   03a6 94 d6 d2 bc 64 d2 66 5d 56 66 56 5a d6 Capital phi
   03a7 95 d7 d3 cc 65 d3 78    58 78 57 5b d7 Capital chi
   03a8 96 d8 d4 be 66 d4 63 3a 59 63 58 5c d8 Capital psi
   03a9 97 d9 d5 bf 67 d5 76 5b 5a 76 59 5d d9 Capital omega
   03aa    da    ab    91                   da Capital iota with diaeresis
   03ab    db    bd    96                   db Capital upsilon with diaeresis
   03ac e1 dc 9b c0 b1 9b                   dc Small alpha with acute
   03ad e2 dd 9d db b2 9d                   dd Small epsilon with acute
   03ae e3 de 9e dc b3 9e                   de Small eta with acute
   03af e5 df 9f dd b5 9f                   df Small iota with acute
   03b0    e0 fc fe    fc                   e0 Small upsilon with acute and diaeresis
   03b1 98 e1 d6 e1 8a d6       61 41 61 61 e1 Small alpha
   03b2 99 e2 d7 e2 8b d7       62 42 62 62 e2 Small beta
   03b3 9a e3 d8 e7 8c d8       63 47 63 64 e3 Small gamma
   03b4 9b e4 dd e4 8d dd       64 44 64 65 e4 Small delta
   03b5 9c e5 de e5 8e de       65 45 65 66 e5 Small epsilon
   03b6 9d e6 e0 fa 8f e0       66 5a 66 69 e6 Small zeta
   03b7 9e e7 e1 e8 9a e1       67 48 67 6a e7 Small eta
   03b8 9f e8 e2 f5 9b e2       68 55 68 6b e8 Small theta
   03b9 a0 e9 e3 e9 9c e3       69 49 69 6c e9 Small iota
   03ba a1 ea e4 eb 9d e4       6b 4b 6a 6d ea Small kappa
   03bb a2 eb e5 ec 9e e5       6c 4c 6b 6e eb Small lamda
   03bc a3 ec e6 ed 9f e6       6d 4d 6c 6f ec Small mu
   03bd a4 ed e7 ee aa e7       6e 4e 6d 70 ed Small nu
   03be a5 ee e8 ea ab e8       6f 4a 6e 71 ee Small xi
   03bf a6 ef e9 ef ac e9       70 4f 6f 72 ef Small omicron
   03c0 a7 f0 ea f0 ad ea       71 50 70 73 f0 Small pi
   03c1 a8 f1 eb f2 ae eb       72 52 71 75 f1 Small rho
   03c2 aa f2 ed f7 af ed       77 57 72 77 f2 Small final sigma
   03c3 a9 f3 ec f3 ba ec       73 53 73 76 f3 Small sigma
   03c4 ab f4 ee f4 bb ee       74 54 74 78 f4 Small tau
   03c5 ac f5 f2 f9 bc f2       75 59 75 79 f5 Small upsilon
   03c6 ad f6 f3 e6 bd f3       76 46 76 7a f6 Small phi
   03c7 ae f7 f4 f8 be f4       78 58 77 7b f7 Small chi
   03c8 af f8 f6 e3 bf f6       79 43 78 7c f8 Small psi
   03c9 e0 f9 fa f6 db fa       7a 56 79 7d f9 Small omega
   03ca e4 fa a0 fb b4 a0                   fa Small iota with diaeresis
   03cb e8 fb fb fc b8 fb                   fb Small upsilon with diaeresis
   03cc e6 fc a2 de b6 a2                   fc Small omicron with acute
   03cd e7 fd a3 e0 b7 a3                   fd Small upsilon with acute
   03ce e9 fe fd f1 b9 fd                   fe Small omega with acute

   Note: All values are in hexadecimal.

   The column headers refer to the following character sets:

   0646  The ISO 2DIS 10646 code.

   37    PC code page 737 also known as 437G.  Note that some
         implementations of this code page do not include capital
         letters with acute.

   M7    Character set 8859-7 as implemented in Microsoft Windows 3.1,
         Microsoft Windows 3.11, and Microsoft Windows 95.

   51    IBM code page 851.

   MC    The Greek code page implemented on the Apple Macintosh
         computers.

   23    IBM code page 423 (EBCDIC-CP-GR).

   69    IBM code page 869.

   LG    Latin Greek (iso-ir-19).

   L1    Latin Greek 1 (iso-ir-27).  This page only contains the Greek
         capital letters whose glyphs do not exist in the Latin
         alphabet.  The other capital letters are rendered using the
         equivalent Latin letter (e.g. "Greek capital letter alpha" is
         rendered as "Latin capital letter A").  When mapping "Latin
         Greek 1" text to ISO 8859-7 the Latin capital letters should
         only be transcribed to the equivalent Greek ones if a suitable
         heuristic determines that the specific Latin letters are used
         to represent Greek glyphs.

   G7    7 bit Greek (iso-ir-88).

   GO    Old 7 bit Greek (iso-ir-18).

   GC    Greek CCITT (iso-ir-150).

   28    Character set ISO 5428:1980 (iso-ir-55).

   97    The target character set ISO 8859-7:1987 (ELOT-928)
         (iso-ir-126).
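The two steps named in the Description (translate into ISO 8859-7, then apply a MIME transfer encoding) can be sketched in Python as follows. This is a minimal illustration, not part of the memo: the function names and the one-third threshold for "mainly Greek" are assumptions, and only code pages for which the Python standard library ships a codec (for example cp737, cp869, cp1253, mac_greek) can be used as the source.

```python
import base64
import quopri
from typing import Tuple

# Hypothetical threshold: the memo only says "mainly Greek" and gives no
# number, so one third of non-ASCII bytes is an arbitrary choice.
MAINLY_GREEK_RATIO = 1 / 3


def to_iso8859_7(raw: bytes, source_codec: str) -> bytes:
    """Step 1: transcode bytes from a legacy Greek code page to ISO 8859-7."""
    return raw.decode(source_codec).encode("iso8859_7")


def encode_body(iso_bytes: bytes) -> Tuple[str, bytes]:
    """Step 2: pick Base64 or Quoted-Printable and encode the body.

    Returns the Content-Transfer-Encoding value and the encoded bytes.
    """
    high = sum(1 for b in iso_bytes if b >= 0x80)
    if iso_bytes and high / len(iso_bytes) > MAINLY_GREEK_RATIO:
        return "base64", base64.encodebytes(iso_bytes)
    return "quoted-printable", quopri.encodestring(iso_bytes)


# Capital alpha is 80 in code page 737 and c1 in ISO 8859-7 (see the table).
assert to_iso8859_7(b"\x80", "cp737") == b"\xc1"

encoding, body = encode_body("Καλημέρα".encode("iso8859_7"))
print(encoding, body)
```

Text that is mostly Latin with an occasional Greek word falls below the threshold and is sent as Quoted-Printable, which keeps the Latin part readable in transit.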

MIME Headers

   A mail message that contains Greek text must contain at least the
   following MIME headers:

      MIME-Version: 1.0
      Content-type: text/plain; charset=ISO-8859-7
      Content-transfer-encoding: BASE64 | Quoted-Printable

   In the future, when all email systems implement fully transparent
   8-bit e-mail as defined in RFC 1425 and RFC 1426, the message body
   encoding phase described in this standard will no longer be needed.
   In this case the requisite MIME headers are modified as follows:

      MIME-Version: 1.0
      Content-type: text/plain; charset=ISO-8859-7
      Content-transfer-encoding: 8BIT

   Even when RFC 1425 is used, Q or B encoding will continue to apply to
   message headers as detailed in the following section.

Optional

   It is recommended, although not required, to support Greek encoding
   in mail headers as specified in RFC 1522.  Specifically, the
   B-encoding format is to be the default method used for encoding Greek
   text in RFC-822 mail headers, and the Q-encoding format the method to
   use for the exceptional case of encoding a single Greek word or
   letter in an otherwise Latin-character-based header.
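As an illustration of the header rules above, the following Python sketch emits the required MIME headers and builds RFC 1522 encoded-words by hand. The helper names are hypothetical, and the sketch assumes each encoded-word stays within the 75-character limit of RFC 1522; it does not split longer text into several words.

```python
import base64
from email.header import decode_header

CHARSET = "ISO-8859-7"


def required_headers(transfer_encoding: str) -> str:
    """Return the minimum MIME headers listed in this memo."""
    return (
        "MIME-Version: 1.0\r\n"
        f"Content-type: text/plain; charset={CHARSET}\r\n"
        f"Content-transfer-encoding: {transfer_encoding}\r\n"
    )


def b_word(text: str) -> str:
    """B-encode Greek header text (the default method)."""
    payload = base64.b64encode(text.encode("iso8859_7")).decode("ascii")
    return f"=?{CHARSET}?B?{payload}?="


def q_word(text: str) -> str:
    """Q-encode a single Greek word or letter (the exceptional case)."""
    out = []
    for byte in text.encode("iso8859_7"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)             # ASCII letters and digits pass through
        elif ch == " ":
            out.append("_")            # Q encoding writes space as underscore
        else:
            out.append(f"={byte:02X}")  # everything else becomes =XX
    return f"=?{CHARSET}?Q?{''.join(out)}?="


subject = "Subject: " + b_word("Καλημέρα")
print(required_headers("BASE64") + subject)

# Round-trip check with the standard library decoder.
raw, charset = decode_header(q_word("π"))[0]
print(raw.decode(charset))
```

Building the encoded-words manually keeps the sketch independent of any mail library's defaults; a production mail user agent would normally rely on its MIME library for folding and length limits.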


Example

   Below is a short example of Base64 encoded Greek email:

      Date: Wed, 31 Jan 96 20:15:03 EET
      From: Diomidis Spinellis
      Subject: Sample Greek mail
      To: Achilleas Voliotis
      MIME-Version: 1.0
      Content-ID:
      Content-Type: Text/plain; charset=ISO-8859-7
      Content-Transfer-Encoding: Base64

      yuHr5+zd8eEsCgrU7yDl6+vn7enq/CDh6/bc4uf07yDh8O/05evl3/Th6SDh8PwgMjYg4/Hc
      7Ozh9OEuCg==

Discussion

   It is possible [RFC1428] (and unfortunately common practice) to set
   up an arrangement of mail user and transfer agents that allows end
   users to communicate with Greek e-mail messages while violating a
   number of standards.  Such arrangements are unlikely to offer wide
   scale interoperability.

   One common error is to arrange the rendering and composition of Greek
   messages by rigging a mail user agent hosted in an ISO 8859-1
   environment to use a presentation font that contains Greek glyphs and
   a keyboard input method that generates Greek text using those glyphs.
   The resulting messages begin with header items indicating contents in
   the ISO 8859-1 character set and include text in a totally different
   encoding.  Unfortunately this "solution" appears to "work" across
   similar systems and is widely used.

   Another error is to tag Greek text generated on Microsoft Windows
   platforms as ISO 8859-7 without an intermediate translation phase.
   It is important to note that the character set used by the Microsoft
   Windows Greek implementations is NOT the same as the ISO 8859-7
   representation.  First of all, the character set used to represent
   Greek characters differs slightly from the ISO 8859-7 encoding (this
   difference was introduced in order to rectify the appearance of an
   early version of Microsoft Word for Windows in which the
   end-of-section symbol clashed with the "Greek capital alpha with
   acute" glyph).  In addition, a number of 8-bit characters available
   on Greek Windows implementations are not part of the ISO 8859-7
   character set.
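The example body and the Windows tagging error discussed above can both be reproduced with a short Python sketch. It assumes that Python's cp1253 codec stands in for the Microsoft Windows Greek character set (column M7 of the table); the function name windows_to_iso is hypothetical, and replacing unmappable characters with "?" is only one possible policy.

```python
import base64

# Body of the example message, with the line break removed.
BODY = (
    "yuHr5+zd8eEsCgrU7yDl6+vn7enq/CDh6/bc4uf07yDh8O/05evl3/Th6SDh8PwgMjYg4/Hc"
    "7Ozh9OEuCg=="
)
text = base64.b64decode(BODY).decode("iso8859_7")
print(text)

# Capital alpha with acute: a2 on Windows (M7), b6 in ISO 8859-7 (97).
word = "Άλφα"
windows_bytes = word.encode("cp1253")
iso_bytes = word.encode("iso8859_7")

# The error described above: Windows bytes tagged as ISO 8859-7 and
# therefore decoded with the wrong table by the recipient.
mis_tagged = windows_bytes.decode("iso8859_7")
print(repr(mis_tagged), mis_tagged == word)


def windows_to_iso(raw: bytes) -> bytes:
    """Intermediate translation phase: Windows Greek bytes to ISO 8859-7.

    Characters that exist only in the Windows set are replaced with "?".
    Bytes that are undefined in cp1253 raise UnicodeDecodeError.
    """
    return raw.decode("cp1253").encode("iso8859_7", errors="replace")


print(windows_to_iso(windows_bytes) == iso_bytes)
```

Decoding through Unicode rather than patching individual byte values handles both differences at once: the relocated accented alpha and the Windows-only characters.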


   Note that the ISO 8859-7 encoding is equivalent to the Greek
   Standards Organisation ELOT-928 encoding.

References

   [ISO-8859] Information Processing -- 8-bit Single-Byte Coded Graphic
              Character Sets, Part 7: Latin/Greek alphabet, ISO 8859-7,
              1987.

   [RFC822]   Crocker, D., "Standard for the Format of ARPA Internet
              Text Messages", STD 11, RFC 822, UDEL, August 1982.

   [RFC1345]  Simonsen, K., "Character Mnemonics & Character Sets",
              RFC 1345, Rationel Almen Planlaegning, June 1992.

   [RFC1425]  Klensin, J., Freed, N., Rose, M., Stefferud, E., and
              D. Crocker, "SMTP Service Extensions", RFC 1425, United
              Nations University, Innosoft International, Inc., Dover
              Beach Consulting, Inc., Network Management Associates,
              Inc., The Branch Office, February 1993.

   [RFC1426]  Klensin, J., Freed, N., Rose, M., Stefferud, E., and
              D. Crocker, "SMTP Service Extension for 8bit-MIME
              Transport", RFC 1426, United Nations University, Innosoft
              International, Inc., Dover Beach Consulting, Inc., Network
              Management Associates, Inc., The Branch Office, February
              1993.

   [RFC1428]  Vaudreuil, G., "Transition of Internet Mail from
              Just-Send-8 to 8bit-SMTP/MIME", RFC 1428, CNRI, February
              1993.

   [RFC1521]  Borenstein, N., and N. Freed, "MIME (Multipurpose Internet
              Mail Extensions) Part One: Mechanisms for Specifying and
              Describing the Format of Internet Message Bodies",
              RFC 1521, Bellcore, Innosoft, September 1993.

   [RFC1522]  Moore, K., "MIME Part Two: Message Header Extensions for
              Non-ASCII Text", RFC 1522, University of Tennessee,
              September 1993.


Security Considerations

   Security issues are not discussed in this memo.

Author's Address

   Diomidis Spinellis
   SENA S.A.
   Kyprou 27
   GR-152 47 Filothei
   GREECE

   Phone: +30 (1) 6854535
   Fax:   +30 (1) 6840631
   EMail: D.Spinellis@senanet.com
