Unicode (OMUNICODE)

The OMUNICODE library is based on the Unicode Character Database. It provides functions that examine the properties of Unicode characters.

The functions in the OMUNICODE library operate on individual character code points, which are independent of any character encoding. In most cases you will need to convert a character from its input encoding to its code point before you can examine its properties. If your input is UTF-8 encoded, you can use utf8.code-point for that purpose. The following example outputs the name of the Unicode block that each input character belongs to, assuming that the input to the program is UTF-8 encoded:

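  import "omunicode.xmd" prefixed by unicode.
  import "omutf8.xmd" prefixed by utf8.

  process
     ; Consume the input one UTF-8 encoded character at a time.
     repeat scan #main-input
     match utf8.char => character
        ; Convert the encoded character to its code point, then look up its block.
        output "The Unicode block of '%g(character)' is "
            || unicode.block-name of utf8.code-point of character || "%n"
     again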