Declaration
define external string source function euc-input-file
      value string filename
   exceptions-to value io-exception exceptions-to optional
Purpose
This external string source function reads the file named by the "filename" argument and returns the text of that file converted from an EUC encoding to a UTF-8 encoding. The file itself remains in EUC; the program sees only the UTF-8 conversion.
Arguments:
- "filename". The name of the EUC-encoded file you want to read and convert to UTF-8. If "filename" is a zero-length string (that is, ""), euc-input-file does not open a file but reads from standard input instead. This allows the conversion to be used in an OmniMark program that acts as a filter.
- "exceptions-to". This optional argument indicates that errors are to be recorded in the passed "io-exception" object, and that the OmniMark program is not to be terminated immediately. There are three types of errors, categorized according to how they are handled:
  - Whenever an invalid or out-of-range encoding is found, it is converted to the UTF-8 encoding of the Unicode "REPLACEMENT CHARACTER" (U+FFFD). If "exceptions-to" is specified, the "io-exception" object is marked for a data encoding error, and the function continues processing.
  - If the external string source function cannot be created, either because the declaration does not match what is expected or because there is not enough memory to create the source object, an error is signaled to OmniMark and your program is terminated, whether or not "exceptions-to" is specified.
  - For any other type of error that occurs during memory allocation, file opening or closing, or reading or writing: if "exceptions-to" is specified, the "io-exception" object is marked for the error found and processing continues. If "exceptions-to" is not specified, an error is signaled to OmniMark and your program is terminated.
The file format is interpreted according to the Japanese Industrial Standards JIS X 0201, JIS X 0208, and JIS X 0212, with the addition of the EUC encoding of the single bytes and of the byte pairs. EUC does not use escape sequences.
Example:
; Setting a local buffer to the contents of a file which contains EUC-encoded
; Japanese text.
set my-buffer to euc-input-file "myfile.euc"
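
The zero-length "filename" case described under Arguments can be sketched as a small filter program that reads EUC text from standard input and writes the UTF-8 conversion to the main output. This is a minimal sketch, not a definitive program: the include file name "omutf8.xin" is an assumption, so substitute whichever library file supplies the euc-input-file declaration in your installation.

```
; Assumption: the declaration of euc-input-file is supplied by a library
; include file; the name "omutf8.xin" is assumed here.
include "omutf8.xin"

process
   ; A zero-length filename makes euc-input-file read from standard input
   ; instead of opening a file. The converted UTF-8 text is written to the
   ; program's main output.
   output euc-input-file ""
```

Because "exceptions-to" is not passed in this sketch, a file or read error is signaled to OmniMark and terminates the program, as described under Arguments.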
OmniMark 9.1.0 Documentation Generated: September 2, 2010 at 1:38:10 pm
Copyright © Stilo International plc, 1988-2010.