International email


International email arises from the combined provision of internationalized domain names and email address internationalization. The result is email that contains international characters, encoded as UTF-8, in the email header and in supporting mail transfer protocols. The most significant aspect of this is the allowance of email addresses in most of the world's writing systems, at both interface and transport levels.

Email addresses

Traditional email addresses are limited to characters from the English alphabet and a few other special characters.
The following are valid traditional email addresses:
Abc@example.com
Abc.123@example.com
user+mailbox/department=shipping@example.com
!#$%&'*+-/=?^_`.~@example.com
"Abc@def"@example.com
"Fred Bloggs"@example.com
"Joe.\\Blow"@example.com
A Russian user might wish to use иван.сергеев@пример.рф as their identifier but be forced to use a transcription such as ivan.sergeev@example.ru, or even some completely unrelated identifier. The same is true of Chinese, Japanese and many other users whose languages do not use Latin script, and it also applies to users from non-English-speaking European countries whose desired addresses might contain diacritics. As a result, email users are forced to identify themselves in non-native scripts, or programmers of email systems must compensate by converting identifiers between their native scripts and ASCII at the user interface layer.
International email, by contrast, uses Unicode characters encoded as UTF-8, allowing the text of addresses to be written in most of the world's writing systems.
The following are all valid international email addresses:

अजय@डाटा.भारत
квіточка@пошта.укр
θσερ@εχαμπλε.ψομ
Dörte@Sörensen.example.com
коля@пример.рф
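
The difference between the two kinds of address can be sketched in a few lines of Python. This is only an illustration: the helper names is_traditional and utf8_wire_form are hypothetical, and the ASCII test is a necessary condition for a traditional address, not a full syntax check.

```python
def is_traditional(address: str) -> bool:
    """True if every character is ASCII (necessary, not sufficient, for validity)."""
    return address.isascii()


def utf8_wire_form(address: str) -> bytes:
    """Encode an address as the UTF-8 bytes used in headers and transport."""
    return address.encode("utf-8")


for addr in ["Abc.123@example.com", "коля@пример.рф", "अजय@डाटा.भारत"]:
    wire = utf8_wire_form(addr)
    # An ASCII address has one byte per character; other scripts need more.
    print(addr, is_traditional(addr), len(addr), len(wire))
```

For an ASCII-only address the UTF-8 bytes are identical to the ASCII bytes, which is why traditional addresses remain valid in international email.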

UTF-8 headers

Although the traditional format of the email header section allows non-ASCII characters in the value portion of some header fields using MIME encoded-words, MIME encoding must not be used to encode other information in a header, such as an email address, or header fields such as Message-ID or Received. Moreover, MIME encoding requires extra processing to convert the data to and from its encoded-word representation, and it harms the readability of the header section.
The 2012 standards RFC 6532 and RFC 6531 allow Unicode characters to be included in header content using UTF-8 encoding and to be transmitted via SMTP, but in practice support is only slowly rolling out.
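
The contrast between the two header styles can be illustrated with Python's standard email package, whose policy.SMTP serializes non-ASCII header values as MIME encoded-words while policy.SMTPUTF8 writes them as raw UTF-8. The function name serialize and the sample subject and body are assumptions made for this sketch.

```python
from email import policy
from email.message import EmailMessage


def serialize(subject: str, use_utf8: bool) -> bytes:
    """Serialize a small message with either legacy or UTF-8 headers."""
    chosen = policy.SMTPUTF8 if use_utf8 else policy.SMTP
    msg = EmailMessage(policy=chosen)
    msg["Subject"] = subject
    msg.set_content("body")  # ASCII body, so only the header differs
    return msg.as_bytes()


legacy = serialize("Привет", use_utf8=False)  # Subject as an encoded-word
modern = serialize("Привет", use_utf8=True)   # Subject as raw UTF-8 bytes

print(legacy.decode("ascii"))
print(modern.decode("utf-8"))
```

The legacy form stays within ASCII at the cost of readability; the UTF-8 form is readable as-is but should only be sent to servers that accept it.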

Interoperability via downgrading

Domain internationalization works by downgrading. UTF-8 parts, known as U-Labels, are transformed into ASCII-only A-Labels via an ad-hoc method called IDNA. For example, Sörensen.example.com is encoded as xn--srensen-90a.example.com. In 2003, when the need was addressed, this seemed easier than verifying that all DNS software could handle UTF-8 strings, although in theory DNS can transport binary data. The conversion must be applied before issuing DNS queries.
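
As a minimal sketch, the conversion can be reproduced with Python's built-in "idna" codec. Note that this codec implements the 2003 version of IDNA and lowercases labels as part of the conversion; the function names to_a_label and to_u_label are hypothetical.

```python
def to_a_label(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (A-Label) form."""
    return domain.encode("idna").decode("ascii")


def to_u_label(domain: str) -> str:
    """Convert an ASCII (A-Label) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


ascii_form = to_a_label("Sörensen.example.com")
print(ascii_form)              # xn--srensen-90a.example.com
print(to_u_label(ascii_form))  # sörensen.example.com
```

The round trip does not restore the capital S, because case is folded before the label is encoded.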
Note that domain names are also, if not primarily, used for web navigation. Email address internationalization (EAI) differs, as it must also handle the local part of the address.
Since traditional email standards constrain all email header values to ASCII characters, the presence of UTF-8 characters in email headers can decrease the stability and reliability of transporting such email, because some email servers do not support these characters. Compliance with UTF-8 strings must be checked software package by software package. The IETF proposed an experimental method by which email could be downgraded into the legacy all-ASCII format that all standard email servers support. This proposal proved too cumbersome, because the meaning of the left-hand side of an email address is local to the target server: there is no way to check that xn--something is not a valid user name in some domain. The experiment was therefore obsoleted in 2012 by RFC 6530.
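
The limitation can be seen by downgrading only the part of an address that has a defined ASCII form. The sketch below assumes Python's built-in "idna" codec (IDNA 2003) for the domain; the function name downgrade_domain_only is hypothetical, and the local part is deliberately left untouched because only the receiving server knows how to interpret it.

```python
def downgrade_domain_only(address: str) -> str:
    """Convert the domain to A-Labels; leave the local part as it is."""
    # Split at the last "@": a quoted local part may itself contain "@".
    local_part, domain = address.rsplit("@", 1)
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ascii_domain}"


result = downgrade_domain_only("Dörte@Sörensen.example.com")
print(result)            # Dörte@xn--srensen-90a.example.com
print(result.isascii())  # False: the local part still needs UTF-8 support
```

Even after the domain has been converted, the address as a whole is not ASCII, so a server without UTF-8 support still cannot carry it.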

Standards framework

The set of Internet RFC documents RFC 6530, RFC 6531, RFC 6532, and RFC 6533, all published in February 2012, defines the mechanisms and protocol extensions needed to fully support internationalized email addresses. These changes include an SMTP extension and an extension of the email header syntax to accommodate UTF-8 data. The document set also discusses key assumptions and issues in deploying fully internationalized email.
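
The SMTP extension defined in RFC 6531 is advertised by a server under the keyword SMTPUTF8 in its EHLO response, and a client requests it by adding the same keyword as a parameter to the MAIL command. A rough sketch of that negotiation follows; the function names and the sample server response are hypothetical, and no network connection is made.

```python
def advertised_extensions(ehlo_response: str) -> set:
    """Collect extension keywords from a multi-line EHLO response."""
    keywords = set()
    # The first line carries the server greeting, not an extension.
    for line in ehlo_response.splitlines()[1:]:
        text = line[4:].strip()  # drop the "250-" or "250 " prefix
        if text:
            keywords.add(text.split()[0].upper())
    return keywords


def mail_from_command(sender: str, extensions: set) -> str:
    """Build a MAIL command, requesting SMTPUTF8 only when it is needed."""
    if sender.isascii():
        return f"MAIL FROM:<{sender}>"
    if "SMTPUTF8" not in extensions:
        raise ValueError("server does not advertise SMTPUTF8")
    return f"MAIL FROM:<{sender}> SMTPUTF8"


# Hypothetical server response, for illustration only.
sample = "250-mx.example.com\n250-8BITMIME\n250-SMTPUTF8\n250 SIZE 35882577"
extensions = advertised_extensions(sample)
print(mail_from_command("коля@пример.рф", extensions))
```

Raising an error when the keyword is missing reflects the fact that there is no general ASCII fallback for an internationalized local part.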

Adoption