Pattern variables

When you use patterns to match sections of input data, you must capture the matched data in pattern variables if you want to use it later. A pattern variable is assigned with the => symbol and can be referenced after the assignment. For example, in the first find rule of the following program, the matched input data is assigned to the "found-text" pattern variable.

  process
     submit "Mary had a little [white] lamb"
  
  find ("[" letter+ "]") => found-text
     output found-text
  
  find any

This program outputs "[white]".

What if you want to output only the word in the square brackets, but not the brackets themselves? Try this:

  process
     submit "Mary had a little [white] lamb"
  
  find "[" letter+ => found-text "]"
     output found-text
  
  find any

This program outputs "white". Here, the pattern variable is attached only to the part of the pattern that immediately precedes the pattern variable assignment. This is the default behavior of pattern variables. That is why, in the previous example, the three elements of the pattern had to be enclosed in parentheses: the parentheses ensure that the text matched by the whole pattern is captured.

You can have more than one pattern variable in a pattern. You can even nest them. For example:

  process
     submit "Mary had a little [white] lamb"
  
  find ("[" => first-bracket 
     letter+ => found-word 
     "]" => second-bracket) => found-text
  
     output first-bracket
     output found-word
     output second-bracket
     output found-text
  
  find any

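This program outputs "[white][white]". The first "[white]" is produced by the first three output actions, which output "[", "white", and "]" in turn. The second "[white]" is produced by the fourth output action, which outputs the text matched by the whole parenthesized pattern.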

Scope of pattern variables

Pattern variables are "declared" slightly differently from other names: they are declared by a pattern assignment embedded in a pattern (using =>), rather than by a variable declaration.

Because pattern variables are declared, and can be referenced, before the local scope begins, they are treated differently from variables declared within the local scope. In effect, two local scopes are defined, one inside the other. The outer scope defines the pattern variables, and the inner scope can contain local declarations for other kinds of shelves.

This means:

  • Pattern variables that are defined in a rule header or match pattern can also be referenced in the condition that follows it. (Versions of OmniMark prior to V3 did not permit this.)
  • It is not an error to declare a local variable in the inner local scope with the same name as a pattern variable defined in the rule header or match pattern. The declared local variable hides the pattern variable.
  • The name of a pattern variable cannot be used as a keyword anywhere within its scope: in the pattern, in the condition that follows it (if any), or in the local scope that follows.
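
The first two points can be illustrated with a short sketch. It is not taken from the original program set: the input text, the parenthesized second pattern, and the "hidden" value are assumptions chosen for illustration, and the code has not been run against a particular OmniMark version.

  process
     submit "Mary had a [little] (white) lamb"
  
  ; The condition references found-text, which is assigned in the pattern.
  find "[" letter+ => found-text "]" when found-text = "little"
     output found-text
  
  ; The local declaration hides the pattern variable of the same name.
  find "(" letter+ => found-text ")"
     local stream found-text initial {"hidden"}
     output found-text
  
  find any

Following the rules above, the first rule outputs the captured word, while the second rule outputs the value of the local stream rather than the matched word.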

Patterns, and therefore pattern assignments, can also occur in the second argument of the matches operator. This argument defines an entire local scope of its own. This means that:

  • A pattern variable can have the same name as a pattern variable (or a variable of any other kind) defined in an outer scope. The new pattern variable hides the one defined in the outer scope.
  • A name assigned in such a second argument pattern cannot be used as a keyword within the same second argument pattern. The name can still be used as a keyword outside the second argument pattern, even within the same enclosing expression.
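
As a minimal sketch of this scoping, the following program assigns a pattern variable inside the second argument of matches and references it later in the same pattern. The names "phrase" and "quote-char" and the sample text are hypothetical, and the sketch assumes that a pattern variable can be referenced later in the pattern that assigns it.

  process
     local stream phrase initial {"'white'"}
  
     ; quote-char captures the opening quotation mark and is referenced
     ; again to require the same mark after the word.
     do when phrase matches (["'%""] => quote-char letter+ quote-char)
        ; quote-char is not in scope here: it belongs to the scope
        ; defined by the second argument of matches.
        output "quoted word%n"
     done

Because the pattern variable is visible only inside the parenthesized pattern, the body of the do when block uses a literal string rather than the captured text.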

Prerequisite Concepts