VISCII


VISCII is an unofficially-defined modified ASCII character encoding for using the Vietnamese language with computers. It should not be confused with the similarly-named officially registered VSCII encoding. VISCII keeps the 95 printable characters of ASCII unmodified, but it replaces 6 of the 33 control characters with printable characters. It adds 128 precomposed characters. Unicode and the Windows-1258 code page are now used for virtually all Vietnamese computer data, but legacy VSCII and VISCII files may need conversion.
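As a rough illustration of what converting the ASCII-range half of a VISCII file involves, the following Python sketch decodes bytes 0x00 to 0x7F only. The six byte-to-letter assignments are given as recalled from the RFC 1456 table and should be checked against the RFC before use; the names `VISCII_C0_LETTERS` and `decode_low_half` are hypothetical, and the 128 high-half characters are deliberately left out.

```python
# Assumed C0 assignments (verify against RFC 1456 before relying on them).
VISCII_C0_LETTERS = {
    0x02: "\u1eb2",  # A with breve and hook above (replaces STX)
    0x05: "\u1eb4",  # A with breve and tilde (replaces ENQ)
    0x06: "\u1eaa",  # A with circumflex and tilde (replaces ACK)
    0x14: "\u1ef6",  # Y with hook above (replaces DC4)
    0x19: "\u1ef8",  # Y with tilde (replaces EM)
    0x1E: "\u1ef4",  # Y with dot below (replaces RS)
}


def decode_low_half(data: bytes) -> str:
    """Decode VISCII bytes below 0x80; the high half is not handled here."""
    out = []
    for byte in data:
        if byte in VISCII_C0_LETTERS:
            # One of the six control codes that VISCII reuses for a letter.
            out.append(VISCII_C0_LETTERS[byte])
        elif byte < 0x80:
            # Unchanged ASCII: printable characters and remaining controls.
            out.append(chr(byte))
        else:
            raise ValueError(f"byte 0x{byte:02X} needs the full VISCII table")
    return "".join(out)


if __name__ == "__main__":
    print(decode_low_half(b"\x02 abc"))
```

A complete converter would also need the 128 high-half assignments, which this sketch omits.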

History and naming

VISCII was designed in 1992 by the Vietnamese Standardization Working Group, based in Silicon Valley, California, while the group was working with the Unicode Consortium to include precomposed Vietnamese characters in the Unicode standard. VISCII, along with VIQR, was first published in a bilingual report in September 1992, in which it was dubbed the "Vietnamese Standard Code for Information Interchange". The report noted that computer usage in Vietnam was proliferating, that existing applications used vendor-specific encodings which could not interoperate with one another, and that standardisation between vendors was therefore necessary.
The next year, in 1993, Vietnam adopted TCVN 5712, its first national standard in the information technology domain. This defined a character encoding named VSCII, which had been developed by the TCVN Technical Committee on Information Technology and whose name also stands for "Vietnamese Standard Code for Information Interchange". VSCII is incompatible with, and otherwise unrelated to, the earlier-published VISCII. Unlike VISCII, VSCII is a "Vietnamese Standard" in the sense of a national standard.
VISCII and VIQR were approved as the informational-status RFC 1456, attributed to the Viet-Std group and dated May 1993. This RFC describes them as "conventions" used by overseas Vietnamese speakers on Usenet, and states that it "specifies no level of standard". In spite of this, it continues to call VISCII the "VIetnamese Standard Code for Information Interchange". The labels VISCII and csVISCII are registered with the IANA for VISCII, with reference to RFC 1456.

Design

A traditional extended ASCII character set consists of the ASCII set plus up to 128 characters. Vietnamese requires 134 additional letter-diacritic combinations, which is six too many. There are essentially four different ways to handle this problem:
  1. Use variable-width encoding
  2. Include combining diacritical marks for tone marks or for diacritics in general
  3. Replace some ASCII punctuation, preferably punctuation which is not invariant in ISO 646
  4. Replace at least six of the basic ASCII control characters
VISCII took the last option, replacing six of the least problematic C0 control codes with six of the least-used uppercase letter-diacritic combinations. While this may cause programs that use those control codes to malfunction when handling VISCII text, it creates fewer complications than the other three options. Nonetheless, the positions of both C0 and C1 control characters, and the codes used for the non-breaking space in ISO-8859-1, Mac OS Roman and OEM-US, were deliberately assigned to uppercase letters, so that using the corresponding lowercase code points with an all-capital font would be a serviceable workaround if graphical characters could not be displayed for those codes.
However, using up all the extended code points for accented letters left no room for the useful symbols, superscript numbers, curved quotes, proper dashes and similar characters that most other extended ASCII character sets include.
Where the two code pages have characters in common, VISCII deliberately places most of them at the same code points as ISO-8859-1, for the sake of user friendliness.
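The figure of 134 additional letter-diacritic combinations cited above can be checked with a short Python sketch. It assumes the usual inventory of twelve Vietnamese vowel letters, six tone states (including the unmarked tone) and the letter đ, and it relies on Unicode NFC normalization to produce each precomposed letter. The names `BASES`, `TONES`, `vietnamese_letters` and `non_ascii_letters` are illustrative only.

```python
import unicodedata

# Twelve vowel letters, written as an ASCII base plus an optional
# combining breve (U+0306), circumflex (U+0302) or horn (U+031B).
BASES = [
    "a", "a\u0306", "a\u0302",
    "e", "e\u0302",
    "i",
    "o", "o\u0302", "o\u031b",
    "u", "u\u031b",
    "y",
]

# Six tone states: unmarked, grave, acute, hook above, tilde, dot below.
TONES = ["", "\u0300", "\u0301", "\u0309", "\u0303", "\u0323"]


def vietnamese_letters() -> set:
    """Return every precomposed vowel/tone letter in both cases, plus d with stroke."""
    letters = set()
    for base in BASES:
        for tone in TONES:
            lower = unicodedata.normalize("NFC", base + tone)
            letters.add(lower)
            letters.add(lower.upper())
    letters.update("\u0111\u0110")  # lowercase and uppercase d with stroke
    return letters


def non_ascii_letters() -> set:
    """Keep only the letters that plain ASCII cannot represent."""
    return {ch for ch in vietnamese_letters() if ord(ch) > 0x7F}


if __name__ == "__main__":
    extra = non_ascii_letters()
    print(len(extra))        # letters needed beyond ASCII
    print(len(extra) - 128)  # overflow beyond the 128 extended code points
```

Under these assumptions the count is 12 × 6 × 2 = 144 vowel forms, less the 12 plain ASCII vowels, plus đ and Đ, which gives 134, six more than the 128 extended code points can hold.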

Support

VISCII is partially supported by a software group in California, which has released various VISCII-compliant software packages, libraries, and fonts for MS-DOS and Windows, Unix, and Macintosh.
VISCII was historically offered as an encoding for outgoing email by Mozilla Thunderbird.
VISCII was mostly used by overseas Vietnamese speakers, with VSCII being more popular in northern Vietnam and VNI being more popular in southern Vietnam.

Character set

Differences from ISO-8859-1 are shown shaded.