Managing coroutines

In certain cases, OmniMark creates coroutines in order to stream data efficiently within an OmniMark program. Most of the time this is handled completely automatically and the programmer does not need to worry about it. On occasion, however, coroutines may cause unexpected behavior in your program. This topic will help you understand when coroutines are active and what consequences they have for your program.

Coroutining occurs when you have one routine in your program that provides a stream of data and a second routine that consumes that data. Without coroutining, the first routine runs to completion, its output is buffered, and only when it has finished does the second routine start and consume the data from the buffer. This is what happens, for instance, when you call a string-returning function in OmniMark:

  define string function get-some-text as
     local stream temp-buffer
     open temp-buffer as buffer
     using output as temp-buffer
        repeat for integer i from 1 to 1000
           output "%d(i): abcdefghijklmnopqrstuvwxyz%n"
        again
     close temp-buffer
     return temp-buffer

  process
     submit get-some-text

  find lc+ => letters
     output "%ug(letters)"

In this program, the entire output of the function, one thousand numbered copies of the alphabet, is buffered in the stream temp-buffer and returned to the submit statement in a single block. However, if you replace the string function with a string source function, you will get coroutining behavior:

  define string source function make-some-text as
     repeat for integer i to 1000
        output "%d(i): abcdefghijklmnopqrstuvwxyz%n"
        log-message "input"
     again

  process
     using output as file "out.txt"
        submit make-some-text

  find lc+ => l
     output "%ug(l)"
     log-message "main"

In this case, the string source function runs as one coroutine and the calling process rule runs as another. Every time the output statement is executed in the string source function, control passes to the calling rule, where the submit statement processes the data returned and then passes control back to the string source function. Control passes back and forth between the two routines until the string source function completes. You can see this behavior very clearly by tracing the sample program in the debugger, or by running it at the command line and observing how the log-message actions alternately print "input" and "main" on the console.

You can use a string source function to generate or pre-process markup that is to be sent to a parser. When a string source function is called in a parse context, the string source function and the parser run as coroutines. That is, the string source function is executed incrementally and streams its output to the parser incrementally. Execution passes back and forth between the parser and the string source function until the entire input has been processed. Because the output of the string source function is never buffered in full before the parser starts, this saves memory and improves performance, and you can process very large amounts of data without running into resource problems.

In the following example, the string source function parser-feeder generates an XML document by submitting a source. The markup is generated partly in the function itself and partly by find rules fired as a result of the submit statement. The output of the function and the find rules becomes the input to the parser. Element rules then transform the XML into HTML.

  define string source function parser-feeder as
     output "<greeting>"
     submit "Hello world."
     output "</greeting>"
  
  find "world" => planet-name
     output "<planet>"
         || planet-name
         || "</planet>"
  
  process
     do xml-parse
      scan parser-feeder
        output "<HTML><BODY>%c"
            || "</BODY></HTML>"
     done
  
  element "greeting"
     output "<P>%c</P>"
  
  element "planet"
     output "<B>%c</B>"

Other advantages of using string source functions include:

  • The parser can report errors in generated markup as that markup is generated, so you can use the parser to simplify error detection and correction in programs that generate markup.
  • You can use the current state of the parse tree (as expressed in the execution state of the markup rules) to guide markup generation, which makes context-sensitive markup building possible.

Debugging coroutines

In rare cases you may experience problems with the use of string source functions or the cross-translate or up-translate aided translation types because of the way OmniMark coordinates the activities of coroutines.

When you use coroutines, you have two processes running cooperatively within a single program. OmniMark runs each process in a separate processing domain. Some resources are owned by one domain or the other, while others are shared between the two domains.

If you experience an error in a program that uses coroutines, the answer may be found in the following:

  • When a new domain is created, its active groups are shared with the parent domain. Changing groups with the next group is action in one domain affects the active groups in the other domain. A scope protected with save groups or using group, however, isolates the active groups from changes made to active groups in other domains. Changing groups in the isolated scope does not affect the active groups in any other domain, nor vice versa. Parsing constructs like do sgml-parse, as well as the aided translation types, save groups in their input domain automatically. Each domain of a parsing construct therefore has its own active groups.
  • Global variables are shared among all active domains. If one domain modifies a global variable, the new value is immediately visible to all other domains, even if the variable is saved by the modifying domain. For this reason, save and save-clear operations must nest properly when they are applied to a plain global variable. Because domains are parallel, and not nested within each other, a save in one domain may not be properly nested with respect to a save of the same variable in the other domain. With a plain global variable, this causes a run-time error. If modifications to the variable do not need to be visible to other domains, declare the variable as a domain-bound global. The save and save-clear actions can be applied to a domain-bound global variable without the nesting restriction.
  • You can use using nested-referents in only one domain at a time. Under no circumstances can you output referents to the parser itself.
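
As a minimal sketch of the domain-bound global case described above, the following program saves the same variable in both the string source function and an element rule. The variable name depth and the values assigned to it are hypothetical and chosen only for illustration. The declaration form shown (domain-bound placed before global) is an assumption that should be checked against the language reference for your OmniMark version.

```omnimark
; Sketch only: "depth" is a hypothetical variable used for illustration.
; It is declared domain-bound so that each domain can save it independently.
domain-bound global integer depth initial {0}

define string source function parser-feeder as
   save depth                  ; saved in the domain of the function
   set depth to 1
   output "<greeting>Hello world.</greeting>"

process
   do xml-parse
    scan parser-feeder
      output "%c"
   done

element "greeting"
   save depth                  ; saved in the domain of the parser
   set depth to 2
   output "<P>%c</P>"
```

If depth were a plain global variable, the two save declarations would belong to parallel domains and might not nest properly, which is the situation the second item in the list warns about.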

Note that it may not always be obvious which coroutine a certain piece of code is running in. For instance, a find rule could be fired either by a submit in a string source function or by a submit in an element rule. That rule would be running in one domain in the first case and in the other domain in the second case.
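
One way to make it easier to see which rules a given submit can fire is to place the find rules meant for the string source function in their own group and to activate that group only around the submit in the function. The following is a sketch of that arrangement, based on the earlier parser-feeder example. The group name feeder-rules is hypothetical, and the use of group #implied to return to the default group is an assumption to verify against the language reference.

```omnimark
; Sketch only: "feeder-rules" is a hypothetical group name.
group feeder-rules

find "world" => planet-name
   output "<planet>" || planet-name || "</planet>"

; Assumption: this returns the following rules to the default group.
group #implied

define string source function parser-feeder as
   output "<greeting>"
   ; using group isolates this scope's active groups from other domains.
   using group feeder-rules
      submit "Hello world."
   output "</greeting>"

process
   do xml-parse
    scan parser-feeder
      output "<HTML><BODY>%c</BODY></HTML>"
   done

element "greeting"
   output "<P>%c</P>"

element "planet"
   output "<B>%c</B>"
```

Because the submit is wrapped in using group, changes to the active groups made in the parser's domain should not alter which find rules this submit sees.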

If you write code that depends on the timing of the switching between the string source function and the parser, you may need to be aware of the rules OmniMark uses when switching domains.

  • In general, OmniMark switches domains at its own discretion. You cannot control the switch, but you may be able to write a test in one domain to see how far the other domain has progressed. For instance, in the string source function, you can use the element is test to determine what the current element is in the domain of the parser.
  • There is guaranteed to be a domain switch from the string source function at the end of every output statement that outputs to the #current-output of the string source function.

The old form of input function

In versions of OmniMark prior to version 7, an input function was an ordinary action function that was used as an input function by adding the keyword input to the parser invocation:

  define function parser-feeder as
     output "<greeting>"
     submit "Hello world."
     output "</greeting>"
  
  find "world" => planet-name
     output "<planet>"
         || planet-name
         || "</planet>"
  
  process
     do xml-parse
      scan input parser-feeder
        output "<HTML><BODY>%c"
            || "</BODY></HTML>"
     done
  
  element "greeting"
     output "<P>%c</P>"
  
  element "planet"
     output "<B>%c</B>"

This form is deprecated but is supported for backward compatibility.
