String sink data type

You can use the string sink data type to create a destination for string data. OmniMark has several types of data destinations (or sinks):

  • stream attachment (buffer, referent, file, etc.)
  • string sink function
  • value string sink function argument

In some cases, these sinks simply direct output to a file, a network, or some other destination. In other cases, they transform the data in some way as it is output. For instance, the external filter function jis.writer transforms its output from UTF-8 encoding to JIS encoding. The addition of the string sink type makes it possible to write such filters as internal OmniMark functions.

The string sink data type allows you to write string sink functions that can transform data as it is sent to a destination.

In the following example, the string sink data type is used both to declare a string sink function and to declare a string sink parameter for that function, so that the function can send the data it processes to another sink.

  define string sink function
     uppercase value string sink destination
  as
     using output as destination
        repeat scan #current-input
        match letter+ => chars
           output "ug" % chars
        match [\letter]+ => stuff
           output stuff
        again
  
  process
     using output as uppercase #main-output
        output "Hello World.%n"

In most places in OmniMark the current output scope is active, meaning that you can execute an output action almost anywhere in your program, and the data will stream to the sink, or sinks, that are attached to the current output scope. Because a string sink function is itself a destination for data, however, the current output scope is not active inside a string sink function unless you explicitly establish it.

In the example above the sink #main-output is passed to the function as the string sink parameter destination. The using output as qualifier is then used inside the function to establish the current output scope for the function.

You can write a string sink function that consumes its input and does not send it to another sink, but if you want to send the data on to another sink, you must establish a current output scope in the function before you perform an output action. It is an error to perform an output when the current output scope is not active.
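
As an illustration of the first case, the following sketch defines a string sink function that consumes its input without forwarding it anywhere. The function name swallow-input is hypothetical, and the sketch is untested; it is meant only to show that such a function needs neither a string sink parameter nor an output scope of its own.

  define string sink function
     swallow-input
  as
     ; Read and discard everything written to this sink.
     repeat scan #current-input
     match any
     again

  process
     using output as swallow-input
        output "This text is consumed and goes nowhere.%n"
     output "This text goes to the default output.%n"

Because swallow-input never performs an output action, it never needs a current output scope.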

String sink functions operate in a streaming fashion. This means that the string sink function runs as a coroutine with the process that produces the data it consumes. For instance, in this program,

  process
     using output as uppercase #main-output
        do xml-parse scan file "war-and-peace.xml"
           output "%c"
        done

the output of the parse is streamed incrementally to the uppercase function (just as it would have been streamed incrementally to #main-output had that been used as the output destination directly). The uppercase function, in turn, streams its output incrementally to its current output scope, which is #main-output.

Notice that a string sink function does not receive the data that it consumes from a parameter of the function. Because a string sink function instantiates a sink, it can only be called where a sink expression is expected, that is, anywhere that you could use a regular OmniMark sink, such as an open stream or #main-output. In the example above, the uppercase function is used with using output as to establish it as the current output scope of the program. All data output within the scope of the using output as statement will therefore be streamed to the uppercase function.

Because a sink is an active consumer of data, you cannot declare a variable of type string sink. You can use string sink only to declare a string sink function or a string sink parameter for a function. A string sink parameter must be a value parameter.
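
A minimal sketch of the second use, assuming a hypothetical function named write-greeting: an ordinary function declares a value string sink parameter and establishes it as its current output scope before writing to it. The sketch is untested.

  define function
     write-greeting (value string sink destination)
  as
     ; The parameter is a sink, so it can be made the current output scope.
     using output as destination
        output "Hello World.%n"

  process
     write-greeting (#main-output)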

An external string sink function can be used anywhere a built-in OmniMark sink, such as #main-output or file <filename>, can be used. However, an internal string sink function cannot be used with the output-to statement or with the open...as statement. Internal sinks are always bound to a scope that feeds data to them.

Modifiers can be specified when using an instance of the string sink data type. However, if the string sink is defined by an internal function, the referents-allowed and referents-displayed modifiers cannot be specified.

Once a string sink is established, the following actions can be applied to it:

  • put and output, which feed textual data into the sink
  • signal, which sends control signals to the sink coroutine

Ending a string sink

You can use a return action, without a value, to end a string sink function, or you can simply allow the function to end. There is no operational difference between the two, except that no part of the function body will be executed after the return is executed. return is therefore useful if you want to end the function within a conditional construct. Alternatively, you can throw an exception from a string sink function. If the exception is not caught within the function itself, it will propagate to the scope from which the function was called.
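
For example, the following sketch uses return inside a match alternative to end a string sink function as soon as a marker is seen. The function name copy-until-marker and the marker "###" are hypothetical, and the sketch is untested.

  define string sink function
     copy-until-marker value string sink destination
  as
     using output as destination
        repeat scan #current-input
        match "###"
           ; End the function; nothing after this point is executed.
           return
        match any => c
           output c
        again

  process
     using output as copy-until-marker #main-output
        output "kept###dropped"

Under these assumptions, only the text before the marker should reach #main-output.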

When a string sink function ends, the execution of its producer will be halted. Only the always clauses in the producer's scope will be run after that. Similarly, when the body of a scope consuming a string source ends or throws before consuming the entire source, the source will be halted. In either case, program execution then proceeds after the scope where the string sink or string source function was called. The following program, for example,

  define string sink function
     consume-first-line
  as
     do scan #current-input
     match (any-text ** "%n") => line
        put #main-output line
     done
  
  process
     using output as consume-first-line
     do
        output "My mistress' eyes are nothing like the sun;%n"
        output "Coral is far more red than her lips' red;%n"
     done
     output "If snow be white, why then her breasts are dun;%n"

outputs only these two lines:

  My mistress' eyes are nothing like the sun;
  If snow be white, why then her breasts are dun;

Superseded functionality

The string sink data type replaces the output type, which was used for declaring external output functions, and is now deprecated.

  • Output function type

The external output function declaration:

  define external output function ...

is deprecated in favor of the external string sink function declaration:

  define external string sink function ...

  • Output parameter type

The value output parameter declaration:

  define external function foo
     value output destination

is deprecated in favor of the value string sink parameter declaration:

  define external function foo
     value string sink destination