function
Library: ISO/IEC 8859 (OMFF8859)
Import: omff8859.xmd
Returns: a scannable input source for streaming data
export string source function
   reader in value encoding-type encoding optional
          from value string source input-data
Use iso8859.reader to read a string source and return its text converted from an
ISO/IEC 8859 encoding to UTF-8. Although the provided source is in one of the ISO/IEC 8859
encodings, the program sees UTF-8. Use the optional encoding argument to specify which
ISO/IEC 8859 encoding the source is in.
If the string source input-data contains a byte that is an unused code point in the selected
encoding, that byte is suppressed. The unused code points depend on the encoding:
0x00 through 0x1f, and 0x7f through 0x9f.
0x00 through 0x1f, and 0x7f through 0x9f.
0x00 through 0x1f, 0x7f through 0xc0 except for 0xa0, 0xa4, 0xac, 0xad, 0xbb, and 0xbf, as well as 0xdb
through 0xdf and 0xf3 through 0xff.
0x00 through 0x1f, 0x7f through 0x9f, 0xae,
0xd2, and 0xff.
0x00 through 0x1f, 0x7f through 0x9f, 0xa1,
0xbf through 0xde, 0xfb, 0xfc, and 0xff.
0x00 through 0x1f, 0x7f through 0x9f, 0xdc
through 0xde, and 0xfc through 0xff.
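The suppression rule can be illustrated with a small model written in Python rather than OmniMark. This is only a sketch of the behaviour as described above, not the OMFF8859 implementation: it assumes the source is ISO/IEC 8859-1 and that the simplest range listed above (0x00 through 0x1f and 0x7f through 0x9f) applies to it. The names UNUSED and to_utf8 are hypothetical.

```python
# Hypothetical model of the documented behaviour; not the OMFF8859 code.
# Assumption: the source is ISO/IEC 8859-1 and its unused code points are
# 0x00 through 0x1f and 0x7f through 0x9f.
UNUSED = set(range(0x00, 0x20)) | set(range(0x7F, 0xA0))


def to_utf8(data: bytes) -> bytes:
    """Drop unused code points, then re-encode the remaining text as UTF-8."""
    kept = bytes(b for b in data if b not in UNUSED)
    return kept.decode("iso8859-1").encode("utf-8")


if __name__ == "__main__":
    # 0xe9 is e-acute in ISO/IEC 8859-1; 0x07 falls in the unused range.
    print(to_utf8(b"caf\xe9\x07"))
```

Taken literally, the stated range also covers tab (0x09) and line feed (0x0a), so this model drops them as well.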
The following example converts a file from ISO/IEC 8859-5 to UTF-8 for further processing by find
rules:
   import "omff8859.xmd" prefixed by iso8859.

   process
      using group "process input"
         submit iso8859.reader in iso8859.encoding-8859-5 from file #args[1]

   group "process input"
      ; ...
To use iso8859.reader, you must import OMFF8859 into your program using an import
declaration such as:
import "omff8859.xmd" prefixed by iso8859.