take

operator

Return type: String

Returns: A string that is the part of the source matched by the pattern.

Syntax

  source take pattern


Purpose

The take operator returns a string containing the data from a source that is matched by a specified pattern.

You can use take to perform substring operations. The following code uses take to extract the first four characters of a string:

  process
     output "ABCDEFGHIJKLM" take any{4}

This program will output "ABCD".

You can use take with any OmniMark source and as part of a nested pattern matching operation. The following code uses take to select a string from #current-input:

  process
     local stream area-code
     using input as "My telephone number is (123) 555-7890"
     repeat scan #current-input
        match any** lookahead "("
           set area-code to #current-input take ("(" digit{3} ")")
     again
     output area-code

Notice that the pattern used in a take must be a simple pattern. If you need more than one pattern element in a take, you must wrap the whole pattern in parentheses. Thus the take pattern in the code above is written ("(" digit{3} ")").
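
As a further illustration of the parenthesization rule, here is a minimal sketch that applies a multi-element pattern to a string literal. The date string is invented for this example and is not drawn from the programs above:

  process
     ; The parentheses group three pattern elements into one simple pattern.
     output "2024-05-17" take (digit{4} "-" digit{2})

Assuming take behaves as described above, this program should output "2024-05".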

It is an error if the pattern in a take does not match. In the following program, the input data contains a four-digit area code, so the take pattern does not match. This results in a program error, which the program catches and handles:

  process
     local stream area-code
     using input as "My telephone number is (1523) 555-7890"
     repeat scan #current-input
        match any** lookahead "("
           set area-code to #current-input take ("(" digit{3} ")")
     again
     output area-code
     catch #program-error code c
        do when c = 6132
           output "Area code not found"
        else
           rethrow
        done

If you want to ensure that a take operation always succeeds, write your pattern so that it can match zero characters. In the following program, this is achieved by adding a "?" to the end of the take pattern. Since any pattern with a ? or * occurrence indicator can match zero times, such a pattern will always match (though it may match zero characters):

  process
     local stream area-code
     using input as "My telephone number is (1523) 555-7890"
     repeat scan #current-input
        match any** lookahead "("
           set area-code to #current-input take ("(" digit{3} ")")?
     again
     output area-code

Since the data error still exists, the take pattern will match zero characters and the take operator will return a zero-length string. The program will therefore output nothing.

Related Syntax