Character encodings, EUC (OMFFEUC)

This library contains one OmniMark external source function and one OmniMark external output function for converting from and to the "Extended Unix Code" (EUC) encoding of Japanese text, as follows:

reader is an external source function that reads a value source, its argument, and returns the text of that source converted from an EUC encoding to a UTF-8 encoding. That is, the provided source is in EUC, but the program sees UTF-8.
writer is an external output function that accepts UTF-8 encoded data and writes that data to a value output, its argument, converted from a UTF-8 encoding to an EUC encoding. That is, the program writes UTF-8, but the provided output receives EUC.
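
The direction of each conversion can be pictured outside OmniMark. The following Python sketch is not the omffeuc implementation: it relies on Python's built-in euc_jp codec, whose mapping tables may differ in detail from this library's, and the helper names euc_to_utf8 and utf8_to_euc are hypothetical. It shows only what reader does to incoming bytes and what writer does to outgoing bytes.

```python
def euc_to_utf8(euc_bytes: bytes) -> bytes:
    """Reader direction: EUC-encoded input is seen as UTF-8."""
    # Assumption: Python's "euc_jp" codec stands in for the library's tables.
    return euc_bytes.decode("euc_jp").encode("utf-8")


def utf8_to_euc(utf8_bytes: bytes) -> bytes:
    """Writer direction: UTF-8 data is delivered as EUC."""
    return utf8_bytes.decode("utf-8").encode("euc_jp")


# Hiragana "a" (U+3042): two bytes in EUC, three bytes in UTF-8.
euc_sample = b"\xa4\xa2"
utf8_sample = euc_to_utf8(euc_sample)
round_trip = utf8_to_euc(utf8_sample)
```

Converting in one direction and then the other returns the original bytes for any character that exists in both character sets.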

The data formats are interpreted and produced according to the Japanese Industrial Standards JIS X 0201, JIS X 0208 and JIS X 0212. The EUC data format is transformed using the JIS<->EUC conversion algorithms.
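
This page does not spell out the JIS<->EUC algorithms. As EUC for Japanese is commonly described, a two-byte JIS X 0208 code becomes EUC by setting the high bit of each byte, a JIS X 0201 half-width katakana byte is prefixed with the single-shift byte 0x8E, and a JIS X 0212 code is prefixed with 0x8F and has its high bits set. The Python sketch below illustrates that arithmetic under those assumptions; the function names are hypothetical and nothing here is taken from the omffeuc source.

```python
SS2 = 0x8E  # single shift 2: prefixes a JIS X 0201 half-width katakana byte
SS3 = 0x8F  # single shift 3: prefixes a JIS X 0212 two-byte code


def _set_high_bits(jis_pair: bytes) -> bytes:
    """Set bit 7 of both bytes of a 7-bit JIS row/cell pair."""
    if len(jis_pair) != 2 or not all(0x21 <= b <= 0x7E for b in jis_pair):
        raise ValueError("expected two bytes, each in 0x21..0x7E")
    return bytes(b | 0x80 for b in jis_pair)


def jis_x_0208_to_euc(jis_pair: bytes) -> bytes:
    """JIS X 0208: both bytes move from 0x21..0x7E to 0xA1..0xFE."""
    return _set_high_bits(jis_pair)


def jis_x_0212_to_euc(jis_pair: bytes) -> bytes:
    """JIS X 0212: same shift, preceded by the SS3 byte."""
    return bytes([SS3]) + _set_high_bits(jis_pair)


def jis_x_0201_kana_to_euc(kana: int) -> bytes:
    """JIS X 0201 katakana in its 8-bit form (0xA1..0xDF), preceded by SS2."""
    if not 0xA1 <= kana <= 0xDF:
        raise ValueError("expected a byte in 0xA1..0xDF")
    return bytes([SS2, kana])


def euc_to_jis_x_0208(euc_pair: bytes) -> bytes:
    """Reverse direction: clear bit 7 of both bytes."""
    if len(euc_pair) != 2 or not all(0xA1 <= b <= 0xFE for b in euc_pair):
        raise ValueError("expected two bytes, each in 0xA1..0xFE")
    return bytes(b & 0x7F for b in euc_pair)
```

For example, the JIS X 0208 code 0x2422 (hiragana "a") maps to the EUC bytes 0xA4 0xA2, and clearing the high bits maps it back.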

The only errors that can occur are conversion errors: finding a character that has no equivalent in the other character set. In this case, the converted value used is DEL (0x7F) in the JIS encoding, and the Unicode replacement character U+FFFD (0xFFFD) in the Unicode (UTF-8) encoding.
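
The substitution behaviour can be imitated in Python as a rough illustration. This is a sketch under stated assumptions, not the library's code: it uses Python's euc_jp codec, which may reject a different set of characters than omffeuc does, and the names euc_to_utf8_lenient and utf8_to_euc_lenient are hypothetical.

```python
DEL = 0x7F  # substitute written on the EUC side, as described above


def euc_to_utf8_lenient(euc_bytes: bytes) -> bytes:
    """EUC to UTF-8; undecodable input becomes U+FFFD."""
    # errors="replace" makes Python's decoder emit U+FFFD for bad bytes.
    text = euc_bytes.decode("euc_jp", errors="replace")
    return text.encode("utf-8")


def utf8_to_euc_lenient(utf8_bytes: bytes) -> bytes:
    """UTF-8 to EUC; characters with no EUC form become DEL."""
    out = bytearray()
    for ch in utf8_bytes.decode("utf-8"):
        try:
            out += ch.encode("euc_jp")
        except UnicodeEncodeError:
            # No conversion exists in the target character set.
            out.append(DEL)
    return bytes(out)


# U+1F600 has no EUC form, so it is replaced by a single DEL byte.
substituted = utf8_to_euc_lenient("a\U0001F600b".encode("utf-8"))
```

In UTF-8, U+FFFD is written as the three bytes 0xEF 0xBF 0xBD, so a program that needs to detect failed conversions on the UTF-8 side can search for that sequence.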

These functions are based on the book "Understanding Japanese Information Processing" by Ken Lunde, O'Reilly 1993, ISBN 1-56592-043-0.

Usage Note

To use omffeuc, you must import it into your program using a statement like this:

  import "omffeuc.xmd" prefixed by euc.

(Please see the import topic for more on importing.)

Functions
   euc.reader
   euc.writer
Platforms
   HP-UX
   IBM AIX
   Linux (Intel)
   MS Windows 98/ME
   MS Windows NT/2000/XP
   Sun Solaris


OmniMark 7.1.2 Documentation Generated: June 28, 2005 at 5:46:14 pm