Input scopes
In OmniMark, the consumption and processing of input data are separate from the selection of the input source,
so the input variable does not have to be lexically in scope for you to consume it. Instead, any string source
or markup source can be placed in an input execution scope. While an input source is in the current input scope,
all pattern-matching and parsing statements and rules consume it, regardless of the lexical scope in which they occur.
A new input scope is created by every submit, every variant of scan, and every matches test. Each of these
establishes the current input for the execution scopes contained within it and also initiates scanning of that
source. You can change the current input scope without initiating scanning by means of using input as.
Once an input scope is established, it is in effect for the execution scope of the submit, scan, or
using input as that established it. Within that scope, you can initiate a new scan of the current source
using #current-input. This allows you to perform efficient one-pass scanning of nested structures by
initiating a new scan for each level of nesting, without the need to capture the whole structure and re-scan it.
The following code demonstrates this with the function "sum-of-csv", which calculates the sum of a series of
comma-separated values found in the current input. This function could be called anywhere there is a current input
scope, and it will consume a series of comma-separated numeric values from the current input scope and return the
sum. It will exit as soon as it encounters data that does not fit the pattern it is looking for, leaving the
current input scope intact, but with the comma-separated-value data consumed.

define integer function sum-of-csv as
   local integer sum initial {0}
   repeat scan #current-input
      match white-space* digit+ => number white-space* ","?
         set sum to sum + number
   again
   return sum

process
   repeat scan "Results: (12,34,65, 92 , 75 )"
      match "Results:" white-space* "("
         output "Total: " || "d" % sum-of-csv
      match ")"
   again

Note the difference between this code and the more common programming practice represented by the following
program:

define integer function sum read-only integer numbers as
   local integer total initial {0}
   repeat over numbers
      set total to total + numbers
   again
   return total

process
   local integer numbers variable
   repeat scan "Results: (12,34,65, 92 , 75 )"
      match "Results:" white-space* "(" [digit or space or ","]* => csv ")"
         repeat scan csv
            match digit+ => num
               set new numbers to num
            match any
         again
         output "Total: " || "d" % sum numbers
   again

The differences between these two pieces of code are twofold. First, in the second, more conventional, code the
outer level of code is responsible for identifying the whole nested structure. This has a kind of symmetry about
it, but it is misleading symmetry. The task of recognizing the beginning of a nested structure takes place outside
the nested structure. (You find the door marked "IN" when you are outside; you find the door marked "OUT" when you
are inside.) The task of recognizing the end of a nested structure should take place inside the nested structure.
In our first example, the function that handles the comma-separated values is responsible for figuring out when
the comma-separated values end. It does this simply by exiting the repeat scan as soon as it
sees a character that does not fit the pattern it is looking for.
The second difference between the two programs is that the second program has to scan the CSV data twice: once
when the outer pattern finds it in the data stream, and again when the inner repeat scan analyzes it. The first
program processes the CSV data and finds the end of the structure in one pass.
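As a further illustration of the point made earlier about using input as, the following minimal sketch calls the same "sum-of-csv" function from inside a using input as scope. It is an assumption-laden sketch rather than a definitive program: the literal input string is made up for the example, and the do ... done grouping is used only to make the extent of the scope explicit. The using input as prefix establishes the current input but does not itself start a scan; scanning begins only when the function reaches repeat scan #current-input.

```omnimark
; The function from the first example, repeated so the sketch stands alone.
define integer function sum-of-csv as
   local integer sum initial {0}
   repeat scan #current-input
      match white-space* digit+ => number white-space* ","?
         set sum to sum + number
   again
   return sum

process
   ; Establish the current input without initiating a scan.
   ; The string below is a hypothetical input chosen for illustration.
   using input as "12,34,65, 92 , 75"
   do
      ; The scan is started inside sum-of-csv, which does not need
      ; the input source to be lexically visible.
      output "Total: " || "d" % sum-of-csv || "%n"
   done
```

The function body is unchanged from the first example, which is the point of the design: the code that consumes the input does not depend on how the current input scope was established.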
Copyright © Stilo International plc, 1988-2010.