Pattern matching functions
A pattern matching function is a switch function that is used in a pattern and participates in the pattern matching process by scanning #current-input. The function's return value tells the calling pattern whether the part of the match handled by the function succeeded or failed.
Here is a very simple pattern matching function that matches text up to and including a specified string:
    define switch function upto-and-including
       (value string pat)
    as
       return #current-input matches any** pat

    process
       submit "Mary had a little lamb."

    find "Mary" upto-and-including("little") => stuff
       output stuff

    find any
Here the function "upto-and-including" uses the matches operator to determine whether the input data, represented by #current-input, contains the terminating string. If it does, matches consumes the data up to and including that string and the function returns true, allowing the pattern that called the function to continue.
If the input data does not contain the terminating string, matches returns false, the function returns false, and the pattern that called the function fails.
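The success and failure behavior described above can be modelled outside OmniMark. The following Python sketch is an analogy only, not OmniMark code: the names upto_and_including and find_mary, and the use of a string position in place of #current-input, are assumptions made for illustration.

```python
from typing import Optional


def upto_and_including(text: str, pos: int, pat: str) -> Optional[int]:
    """Return the position just past the first `pat` at or after `pos`.

    Returning None stands in for the function returning false.
    """
    hit = text.find(pat, pos)
    if hit == -1:
        return None            # terminating string absent: calling pattern fails
    return hit + len(pat)      # input up to and including `pat` is consumed


def find_mary(text: str) -> Optional[str]:
    """Hypothetical model of: find "Mary" upto-and-including("little") => stuff"""
    if not text.startswith("Mary"):
        return None
    start = len("Mary")
    end = upto_and_including(text, start, "little")
    if end is None:
        return None
    return text[start:end]     # the data captured in the pattern variable


print(repr(find_mary("Mary had a little lamb.")))
print(repr(find_mary("Mary had a lamb.")))
```

In this model the first call captures the text between "Mary" and the end of "little", and the second call yields nothing because the terminating string never appears.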
As the code above shows, data matched by a pattern matching function can be captured in a pattern variable in the usual way. One of the limits of conventional pattern variables is that they cannot be used to build a shelf of values from a repeated pattern. Pattern matching functions offer a way around this limitation:
    global stream foo variable

    define switch function digit-catcher
       (modifiable stream the-digits)
    as
       do scan #current-input
          match digit+ => digits
             set new the-digits to digits
             return true
          else
             return false
       done

    process
       submit "(1)(2)(3)(4)"

    find ("(" digit-catcher(foo) ")")+
       repeat over foo
          output foo || "%n"
       again
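As a rough illustration of how a function called inside a repeated pattern can accumulate one value per repetition, here is a Python sketch. It is an analogy rather than OmniMark code: digit_catcher and collect are hypothetical names, and a Python list stands in for the shelf foo.

```python
import re
from typing import List, Optional

DIGITS = re.compile(r"[0-9]+")


def digit_catcher(text: str, pos: int, shelf: List[str]) -> Optional[int]:
    """Match one run of digits at `pos` and append it to `shelf`.

    Returns the position after the digits, or None for "return false".
    """
    m = DIGITS.match(text, pos)
    if m is None:
        return None
    shelf.append(m.group())    # models: set new the-digits to digits
    return m.end()


def collect(text: str) -> List[str]:
    """Hypothetical model of: find ("(" digit-catcher(foo) ")")+"""
    shelf: List[str] = []
    pos = 0
    while pos < len(text) and text[pos] == "(":
        after = digit_catcher(text, pos + 1, shelf)
        if after is None or not text.startswith(")", after):
            break              # this repetition failed; stop repeating
        pos = after + 1
    return shelf


print(collect("(1)(2)(3)(4)"))
```

Each pass through the loop corresponds to one repetition of the parenthesized pattern, and each successful call adds one item to the list.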
Pattern matching functions are particularly useful in nested pattern matching. The following code uses a pattern matching function to handle nested parentheses:
    define switch function between-parens
    as
       repeat scan #current-input
          match [\"()"]+
          match "(" between-parens ")"
          match value-end
             return false
       again
       return true

    process
       submit "(1((2)(3))478(954)"

    find ("(" between-parens => stuff ")")
       output stuff || "%n"

    find any
The function "between-parens" matches material between parentheses. If it encounters an opening parenthesis character, it calls itself recursively so that any level of parenthetical matter will be matched. If it encounters a closing parenthesis that is not balanced by a preceding opening parenthesis, the character will not match, the repeat scan will exit, and the function will return true.
Note that we do not actively match the closing parenthesis. Rather, the closing parenthesis is the only thing we don't match. This is a common and useful technique in many kinds of balancing operations. Find everything but the closing delimiter, and allow the repeat scan to exit. This allows the closing delimiter to be matched in the outer pattern, which is good for two reasons. First, it makes the pattern easier to read. Second, it allows you to capture the content of the structure without its delimiters (as we do here).
If the function matches the end of the input without seeing the closing parenthesis, it returns false. If this occurs in a recursive call, value-end will then be matched by each instance of the function as it unwinds.
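The balancing technique can be traced with a small Python model. This is a sketch of the observable outcome only, under the assumption that the find rules are tried in order at each input position and that "find any" consumes one character when the parenthesis rule fails; it does not reproduce OmniMark's internal matching. The names between_parens and scan are hypothetical.

```python
from typing import List, Optional


def between_parens(text: str, pos: int) -> Optional[int]:
    """Return the position of the unbalanced ")" that ends the content.

    Returns None (the model's "return false") if the input ends first.
    """
    while pos < len(text):
        ch = text[pos]
        if ch == ")":
            return pos                 # not consumed: left for the outer pattern
        if ch == "(":
            inner = between_parens(text, pos + 1)
            if inner is None:
                return None            # end of input reached inside a nested call
            pos = inner + 1            # step over the ")" that balances this "("
        else:
            pos += 1                   # any character other than a parenthesis
    return None                        # end of input with no closing ")"


def scan(text: str) -> List[str]:
    """Model of the two find rules: the parenthesis rule, then find any."""
    found: List[str] = []
    pos = 0
    while pos < len(text):
        if text[pos] == "(":
            close = between_parens(text, pos + 1)
            if close is not None:
                found.append(text[pos + 1:close])   # content without delimiters
                pos = close + 1
                continue
        pos += 1                       # find any
    return found


print(scan("(1((2)(3))478(954)"))
```

In this model the first opening parenthesis of the sample input is never balanced, so the rule fails there and the scan moves on; the two groups it then reports are the balanced ones that follow.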
Interestingly enough, this function can be written in a slightly more compact fashion:
    define switch function between-parens
    as
       repeat scan #current-input
          match [\"()"]+
          match "(" between-parens ")"
       again
       return true

    process
       submit "(1((2)(3))478(954)"

    find ("(" between-parens => stuff ")")
       output stuff || "%n"

    find any
This form never returns false. It does, however, work almost identically to the original function. Unless a balancing closing parenthesis is encountered, the function will read to the end of the data, just like the previous version. It then returns true, rather than false, just as if it had ended with the closing delimiter. But the pattern that called the function will now fail because it will not be able to match the closing ")".
You can also use pattern matching functions to do some of the processing of the matched data, though it is important to remember that the code in a pattern matching function is called and executed before the pattern as a whole is complete. This means the function could execute even though the pattern as a whole fails. Thus the function could be called and executed again in a subsequent attempt to match the same data.
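The consequence of doing work inside a pattern matching function can be illustrated with a Python sketch. It is an analogy, not OmniMark code: the two rules, the names logged_digits, rule_a, rule_b and first_match, and the calls list that records the side effect are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Optional

DIGITS = re.compile(r"[0-9]+")
calls: List[str] = []                  # records every time the function does work


def logged_digits(text: str, pos: int) -> Optional[int]:
    """Match digits at `pos`, recording the match as a side effect."""
    m = DIGITS.match(text, pos)
    if m is None:
        return None
    calls.append(m.group())            # happens before the whole pattern is known to match
    return m.end()


def rule_a(text: str, pos: int) -> Optional[int]:
    """Hypothetical rule: digits followed by "!"."""
    end = logged_digits(text, pos)
    if end is not None and text.startswith("!", end):
        return end + 1
    return None


def rule_b(text: str, pos: int) -> Optional[int]:
    """Hypothetical rule: digits followed by "?"."""
    end = logged_digits(text, pos)
    if end is not None and text.startswith("?", end):
        return end + 1
    return None


def first_match(text: str) -> Optional[str]:
    """Try each rule in order at the start of the text."""
    rules: List[tuple[str, Callable[[str, int], Optional[int]]]] = [
        ("a", rule_a),
        ("b", rule_b),
    ]
    for name, rule in rules:
        if rule(text, 0) is not None:
            return name
    return None


print(first_match("42?"), calls)
```

Here the first rule fails after the function has already run, and the second rule runs the function again on the same data, so the side effect is recorded twice for a single successful match. Work that must happen exactly once is safer in the body of the rule that fires after the whole pattern has matched.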
Copyright © Stilo International plc, 1988-2010.