Input scopes
In OmniMark, consuming and processing input data is separated from selecting the source of that input,
meaning that the input variable does not have to be lexically in scope for you to
consume it. Instead, any string source or markup source can be placed in an input execution scope. While an input
source is in the current input scope, all pattern-matching and parsing statements and rules consume it,
regardless of the lexical scope in which they occur.
A new input scope is created by every submit, every variant of scan, and every
matches test. Each of these establishes the current input for the execution scopes contained within it
and also initiates scanning of that source. To change the current input scope without initiating scanning, use
the using input as construct.
Once an input scope is established, it is in effect for the execution scope of the submit,
scan, or using input as that established it. Within that scope, you can initiate a new
scan of the current source using #current-input. This allows you to perform efficient one-pass
scanning of nested structures by initiating a new scan for each level of nesting, without the need to capture the
whole structure and re-scan it.
The following code demonstrates this with the function "sum-of-csv", which calculates the sum of a series of
comma-separated values found in the current input. The function can be called anywhere there is a current input
scope: it consumes a series of comma-separated numeric values from the current input and returns their
sum. It exits as soon as it encounters data that does not fit the pattern it is looking for, leaving the
current input scope intact but with the comma-separated values consumed.
  define integer function sum-of-csv
  as
     local integer sum initial {0}
     repeat scan #current-input
        match white-space*
              digit+ => number
              white-space*
              ","?
           set sum to sum + number
     again
     return sum

  process
     repeat scan "Results: (12,34,65, 92 , 75 )"
        match "Results:" white-space* "("
           output "Total: " || "d" % sum-of-csv
        match ")"
     again
Note the difference between this code and the more common programming practice represented by the following
program:
  define integer function sum
     read-only integer numbers
  as
     local integer total initial {0}
     repeat over numbers
        set total to total + numbers
     again
     return total

  process
     local integer numbers variable
     repeat scan "Results: (12,34,65, 92 , 75 )"
        match "Results:" white-space* "(" [digit or space or ","]* => csv ")"
           repeat scan csv
              match digit+ => num
                 set new numbers to num
              match any
           again
           output "Total: " || "d" % sum numbers
     again
These two pieces of code differ in two ways. First, in the second, more conventional program, the
outer level of code is responsible for identifying the whole nested structure. This has a kind of symmetry about
it, but the symmetry is misleading. The task of recognizing the beginning of a nested structure takes place outside
the nested structure. (You find the door marked "IN" when you are outside; you find the door marked "OUT" when you
are inside.) The task of recognizing the end of a nested structure should take place inside the nested structure.
In the first example, the function that handles the comma-separated values is responsible for figuring out when
the comma-separated values end. It does this very easily by exiting the repeat scan as soon as it
sees a character that does not fit the pattern it is looking for.
The second difference is that the second program has to scan the CSV data twice: once
when it is trying to find it in the data stream, and again when it is analyzed in the second repeat
scan. The first program processes the CSV data and finds the end of the structure in one pass.
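The contrast between the two programs can also be sketched outside OmniMark. The Python below is an analogy,
not a translation: one_pass lets the inner function consume values from a shared cursor and stop at the first
character that does not fit, while two_pass first captures the whole bracketed span and then scans it again.
The names Cursor, sum_of_csv, one_pass, and two_pass are hypothetical.

```python
import re


class Cursor:
    """A source string plus the position consumed so far (hypothetical helper)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0


# One value: optional white space, digits, optional white space, optional comma.
ITEM = re.compile(r"\s*(\d+)\s*,?")
OPENING = re.compile(r"Results:\s*\(")


def sum_of_csv(cur):
    """Consume values from the cursor; stop at the first non-matching data."""
    total = 0
    while True:
        m = ITEM.match(cur.text, cur.pos)
        if m is None:
            return total  # the inner code decides where the structure ends
        total += int(m.group(1))
        cur.pos = m.end()


def one_pass(text):
    """Outer code finds the start; inner code consumes to the end."""
    cur = Cursor(text)
    m = OPENING.match(cur.text, cur.pos)
    if m is None:
        raise ValueError("no opening 'Results: ('")
    cur.pos = m.end()
    total = sum_of_csv(cur)
    if cur.text[cur.pos:cur.pos + 1] != ")":
        raise ValueError("missing closing ')'")
    return total


def two_pass(text):
    """Outer code captures the whole span, which is then scanned a second time."""
    m = re.match(r"Results:\s*\(([\d ,]*)\)", text)
    if m is None:
        raise ValueError("no 'Results: (...)' structure")
    csv = m.group(1)                                     # first pass: locate the span
    return sum(int(n) for n in re.findall(r"\d+", csv))  # second pass: analyze it


DATA = "Results: (12,34,65, 92 , 75 )"
print(one_pass(DATA))  # 278
print(two_pass(DATA))  # 278
```

Both functions return the same total; the difference is that two_pass reads the comma-separated span once to
delimit it and again to extract the numbers, whereas one_pass touches each character of that span only once.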
Copyright © Stilo International plc, 1988-2010.