String source data type
You can use the string source data type to represent a source of string data. Because a string source must actively
generate data, you cannot declare a variable of type string source; you can only create a function of that type.
Such a function streams the data it produces to the calling environment. For example, the following function
produces a stream of numbers:
  define string source function
     numbers from value integer first
               to value integer last
  as
     repeat for integer i from first to last
        output "%d(i) "
     again

  process
     output numbers from 1 to 100
A string source function streams the data that it creates to the calling context. The calling context becomes the
current output for the duration of the function (unless it is explicitly changed during the execution of the
function). All data output during the execution of the function, including output created by other functions or
rules invoked during the execution of the function, is streamed to the calling context. A string source function
does not buffer the data it creates, but streams it incrementally to its calling context. To accomplish this, a
string source function runs as a coroutine with the calling environment. When two functions run as coroutines,
control is handed back and forth between them until the data is completely processed, ensuring that the data is not
buffered as it passes from one routine to the next. In OmniMark 8 you can have as many connected coroutines as you
need. In the example above, the numbers function and the output action run as coroutines. In the following example,
the roman and numbers functions and the output action all run as coroutines with each other:
  define string source function
     numbers from value integer first
               to value integer last
  as
     repeat for integer i from first to last
        output "%d(i) "
     again

  define string source function
     roman value string source numbers
  as
     repeat scan numbers
     match digit+ => num
        output "i" % num
     match [\digit]+ => chars
        output chars
     again

  process
     output roman numbers from 1 to 100
The roman function in the example above uses string source as the type of a function argument. You can declare an
argument of type string source. Naturally the object passed to such an argument must be a string source: either a
function of type string source, or a built-in OmniMark source such as #main-input.

A string source function (like any source) can only operate in a streaming fashion if it is called in a streaming
context. If a string source function is called in a non-streaming context, such as the set action, it will not
operate in a streaming fashion and will buffer its data completely before it returns, as in the following example:
  process
     local string number-string
     set number-string to numbers from 1 to 100
     output number-string
Here the string source function numbers is called by the set action for the string variable number-string. The
function runs to completion and returns its entire value to the set action, just as if it had been a
string-returning function.
There is nothing inherently incorrect about calling a string source function in a non-streaming context. In fact,
it may be a useful habit to write string source functions rather than string functions, since a string source
function is never less efficient than a string function, and is often more efficient. Just recognize that data is
being buffered when a string source function (or any other kind of source) is called in a non-streaming context,
and that this may have negative consequences for the performance or resource requirements of your program if the
amount of data being buffered is large.
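The difference between a streaming and a non-streaming call can be sketched with a rough analogy in Python, using a generator as a stand-in for a string source function. This is not OmniMark code, and the names `consume_streaming`, `consume_buffered`, and `events` are hypothetical, chosen only for illustration. The recorded events show that a streaming consumer alternates with the producer, while a buffering consumer lets the producer run to completion first.

```python
from typing import Iterator, List

events: List[str] = []  # records the order in which producer and consumer run


def numbers(first: int, last: int) -> Iterator[str]:
    """Produce "1 ", "2 ", ... one chunk at a time (stand-in for a string source)."""
    for i in range(first, last + 1):
        events.append(f"produce {i}")
        yield f"{i} "


def consume_streaming(source: Iterator[str]) -> None:
    """Handle each chunk as soon as it is produced (streaming context)."""
    for chunk in source:
        events.append(f"consume {chunk.strip()}")


def consume_buffered(source: Iterator[str]) -> str:
    """Drain the whole source into one string first (non-streaming context)."""
    return "".join(source)


consume_streaming(numbers(1, 3))
streaming_events = list(events)  # producer and consumer steps interleave

events.clear()
buffered = consume_buffered(numbers(1, 3))
buffered_events = list(events)   # only producer steps; all data was buffered
```

In the buffered case the entire result exists in memory before the consumer sees any of it, which is the cost the paragraph above warns about for large data.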
The string source argument must always be a value argument. You can use a value string source argument for both
internal and external functions.
A string source function can be either an internal function or an external function.
The difference between a string source and a string is that the string is static: it exists in a particular place
and can be referenced at will. Therefore if you pass a string to a function, you can reference that string as often
as you like:
  define string function
     duplicate value string to-be-duplicated
  as
     return to-be-duplicated || to-be-duplicated

  process
     output duplicate "Hip " || "Hooray%n"
A string source, by contrast, is a dynamic supply of characters, and once that supply is exhausted, you can't get
the same characters again:
  define string function
     duplicate value string source to-be-duplicated
  as
     return to-be-duplicated || to-be-duplicated

  process
     output duplicate "Hip " || "Hooray%n"
Unlike the first program, which outputs "Hip Hip Hooray", this version outputs only "Hip Hooray", since the string
source to-be-duplicated is fully drained the first time it is referenced.
If you needed to output the value of a string source twice, you would need to capture the output of the source in a
variable:
  define string function
     duplicate value string source to-be-duplicated
  as
     local string temp
     set temp to to-be-duplicated
     return temp || temp

  process
     output duplicate "Hip " || "Hooray%n"
It never makes sense to write a function this way, however, since writing the function with a string argument,
rather than a string source argument, would achieve exactly the same effect: draining the stream into a local string
variable in the function.
Notice that this restriction only applies to an instantiated source, which, in practice, means OmniMark sources and string source parameters within functions. String source functions can be called as many times as you like, since they instantiate a new source each time they are called.
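The drain-once behaviour has a close analogue in Python generators, which can serve as a rough illustration. This is a sketch in Python rather than OmniMark, and the names `greeting`, `duplicate`, and `duplicate_buffered` are hypothetical. An instantiated generator yields nothing on a second read, whereas each new call to the generator function creates a fresh source.

```python
from typing import Iterator


def greeting() -> Iterator[str]:
    """Each call builds a fresh source, like calling a string source function."""
    yield "Hip "


def duplicate(source: Iterator[str]) -> str:
    """Read the same instantiated source twice; the second read finds it empty."""
    first = "".join(source)
    second = "".join(source)  # the source is already drained, so this is ""
    return first + second


def duplicate_buffered(source: Iterator[str]) -> str:
    """Capture the source once in a variable, then reuse the captured string."""
    temp = "".join(source)
    return temp + temp


drained = duplicate(greeting()) + "Hooray"            # one instantiated source, read twice
captured = duplicate_buffered(greeting()) + "Hooray"  # source captured, then duplicated
fresh = "".join(greeting()) + "".join(greeting()) + "Hooray"  # two separate instantiations
```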
A string source can be used wherever a value of type string is expected. The source will be drained into the
string. With one exception, a string can be used wherever a string source is expected: a new source will be
instantiated to provide the contents of the string to the calling environment. In this case, the string is used to
initialize the string source, but the string is not affected when the string source is drained. Its value remains
unchanged.
The one place where a string cannot be used instead of a string source is as a destination for the signal action:
signal to must be followed by a string sink or string source name.
There is one source that is normally present everywhere in an OmniMark program: #current-input. During the
execution of a normal function, the current input scope of the calling environment is available to the function as
#current-input, as illustrated in the following program:
  define string function parse
  as
     do xml-parse scan #current-input
        return "%c"
     done

  element "greeting"
     output "%c"

  process
     using input as "<greeting>Hello World</greeting>"
        output parse
Because a string source function is itself a generator of data, however, #current-input is not attached in a string
source function. Thus if the above program were rewritten as follows, OmniMark would report an error when the
program tried to read the unattached source #current-input.
  define string source function parse
  as
     do xml-parse scan #current-input
        output "%c"
     done

  element "greeting"
     output "%c"

  process
     using input as "<greeting>Hello World</greeting>"
        output parse
To make #current-input available to a string source function, it must be passed to the function explicitly as a
string source argument:
  define string source function
     parse value string source to-be-parsed
  as
     do xml-parse scan to-be-parsed
        output "%c"
     done

  element "greeting"
     output "%c"

  process
     using input as "<greeting>Hello World</greeting>"
        output parse #current-input
The syntax of a string source function definition is:
define string source function <function name> <function argument list> (as <function body> | elsewhere)
or, in the case of an external function:
define external string source function <function name> <function argument list> as <external name> (in function-library <library name>)?
You can use a return action with no value to end a string source function, or you can simply allow the function to
end. There is no operational difference between the two, except that no part of the function body will be executed
after the return is executed. return is therefore useful if you want to end the function within a conditional
construct. Alternatively, you can throw an exception from a string source function. If the exception is not caught
within the function itself, it will propagate to the scope from which the function was called.

Whether a string source function ends by throwing, returning, or by reaching the end of the function body, its
consumer will continue execution normally, but the string source referring to the function will be at value-end. On
the other hand, if the body of a scope consuming a string source function ends or throws before consuming the
entire source, the function will be halted and only its always rules will run. In either case, the program
execution then proceeds after the scope where the string source function was called.
In the following example, the function root-element-of consumes only the name of the first element and
discards the rest of the input. The string source function normalize is halted at that point.
  define string source function
     normalize value string source document
  as
     do xml-parse scan document
        output "%c"
     done

  define string function
     root-element-of value string source document
  as
     do scan normalize document
     match "<" [letter \ digit | "-_.:"]+ => element-name
        return element-name
     done

  process
     output root-element-of #main-input

  element #implied
     output "<%q>%c</%q>"
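The halting behaviour can be approximated in Python, again only as an analogy and not as OmniMark code. In this sketch the generator `normalize` stands in for the string source function, its `finally` block stands in for cleanup code that still runs when the function is halted, and `first_chunk` and `log` are hypothetical names. `contextlib.closing` is used so that the producer is shut down deterministically as soon as the consumer stops reading.

```python
from contextlib import closing
from typing import Iterator, List

log: List[str] = []  # records what the producer did


def normalize(chunks: List[str]) -> Iterator[str]:
    """Stand-in producer; the finally block runs even when it is halted early."""
    try:
        for chunk in chunks:
            log.append(f"produced {chunk}")
            yield chunk
    finally:
        log.append("producer halted")


def first_chunk(chunks: List[str]) -> str:
    """Consume only the first chunk, then stop reading the source."""
    with closing(normalize(chunks)) as source:
        for chunk in source:
            return chunk  # leaving the with block closes the generator
    return ""


result = first_chunk(["<doc>", "<p>", "text"])
```

Only the first chunk is ever produced; the remaining input is never processed because the producer is halted once its consumer returns.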
The string source data type replaces the input type and the external source type, which are deprecated.
The input function type declaration:
define input function ...
is deprecated in favor of the string source function type declaration:
define string source function ...
The external source function type declaration
define external source function ...
is deprecated in favor of the external string source function declaration:
define external string source function ...
The value source parameter declaration:
define external function foo value source origin
is deprecated in favor of the value string source parameter declaration:
define external function foo value string source origin
Prerequisite Concepts: Data types, Filter Functions, String sink data type
Related Syntax: define string source function, using input as
Copyright © Stilo International plc, 1988-2008.