Guide to OmniMark 9

Formatting parsed data

When parsing SGML or XML data, the parsed data can be formatted in one of two ways. The first is to capture the parsed data in an OmniMark string and then format it using regular OmniMark programming techniques. The second is to give OmniMark instructions on how the data should be formatted as it is output. The second approach is often faster and cleaner.

To instruct OmniMark how to format data, place the appropriate format modifiers on the format items %c, %v, and %q.

Formatting data content

Unless intercepted in a data-content rule, the parser outputs data content to the current output scope. Format modifiers can be added to the parse continuation operator, %c, to change how the data content is formatted.

The following modifiers are supported.

It is possible to override the subcomponents, even those going into the same stream, by removing modifiers with the following syntax:

  put my-stream with "" "%c"

Formatting element names and external entities

The %q format item refers to the name of the currently opened element everywhere except in external-text-entity and external-data-entity rules. In functions, even if the function is called from an external-text-entity or an external-data-entity rule, %q still refers to the currently opened element. This is to ensure that a function always behaves in the same way, regardless of what rule it is called from.

When referring to an element, a number of modifiers can be used to change the default behaviour of %q.
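
As a minimal sketch of the default behaviour and of one modifier, the two rules below re-emit elements as tags built from %q. The element name "title" is hypothetical, and the second rule assumes that the l modifier forces the element name to lowercase.

  ; Catch-all rule: wrap the content of each element in tags named after it.
  element #implied
     output "<%q>%c</%q>"

  ; Hypothetical "title" element: same idea, with the name forced to lowercase
  ; (assumes the l modifier lowercases the name).
  element "title"
     output "<%lq>%c</%lq>"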

In the body of an external-data-entity or an external-text-entity rule, %q refers to the name of the current entity rather than the current element.

When referring to an entity, a number of modifiers can be used to change the default behaviour of %q.

These modifiers can also be used in combinations.

If an entity does not have a system identifier, then the e format modifier acts as ep does.

If an entity does not have a public identifier, or if the #library shelf does not have an entry whose key is the public identifier, then it is an error to use the ep format modifier combination. If such an entity also does not declare a system identifier in the entity declaration, then it is also an error to use the e format modifier. The same observation applies to the system identifier of the entity's notation when using the above format modifiers in combination with the o format modifier.

All of the combinations above may be further combined with the l or u format modifiers. Additionally, the o format modifier can also be combined with the f and k format modifiers, provided that it is not also combined with the e or p modifiers.

The f and k format modifiers can only be used with entity names and notation names.
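
The following sketch shows %q used inside an entity rule. It assumes that every external data entity reaching the rule has a system identifier that can be obtained, either from its declaration or through the #library shelf; the wording of the output line is illustrative only.

  ; Inside this rule %q is the name of the entity, not of an element.
  ; %eq is assumed to yield the entity's system identifier.
  external-data-entity #implied
     output "Entity %q is stored in %eq%n"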

Formatting attributes and external data entities

The %v format item is used to output the value of an attribute of an element or of a data attribute of an external data entity.

The following example outputs the section identifier (the attribute named ID) when processing an SGML document:

  element "section"
     output "Section: %v(ID) %c"
          

The DTD for the above example must contain lines similar to the following:

  <!element section - o (#pcdata)>
  <!attlist section id number #required>
          

In element rules, the named attribute must be an attribute of the element; in external-data-entity rules, it must be a data attribute of the entity being processed. When %v is used in any other context, the named attribute must be an attribute of the containing element.

A number of modifiers can be used to change the default behaviour of %v.
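
As a sketch of a modified %v, the rule below assumes that the u modifier forces the attribute value to uppercase. The attribute name label is hypothetical and would have to be declared for the section element in the DTD.

  ; Hypothetical "label" attribute, output in uppercase ahead of the content.
  element "section"
     output "Section: %uv(label) %c"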

In addition to the above modifiers, the modifiers below can be used if the attribute is declared as cdata.

If the attribute's declared type is entity or entities, and the entity name refers to an external entity, you can use the following modifiers (but not with the f, k, l, and u modifiers):

These modifiers can be combined.

If an entity has no system identifier, then e acts as ep does. It is an error if either e or ep is used and a system identifier cannot be obtained using the #library shelf.

These formats return the letters in system and public identifiers in uppercase or lowercase exactly as they appear in the entity declaration. Letters in element, entity, or notation names appear in uppercase or lowercase as they appear in the processed document, unless the SGML declaration specifies uppercase substitution for that class of name, in which case the name is returned with its letters forced to uppercase. Thus, in the Reference Concrete Syntax, by default, element and notation names appear in uppercase while entity names appear as entered in the document.

For an entities attribute, if the attribute value contains more than one entity name, the using prefix must be used to select one entity whose system or public identifier is to be manipulated or displayed.

If the value of an entity or entities attribute is the name of an internal cdata or sdata entity, the %ev format can be used to determine the replacement text of the internal entity.

The e, p, and ep formats can also be used with notation, under the same conditions as entity or entities.

This example illustrates how the %ev format handles internal and external entities differently.

The element as-is has a single required ENTITY attribute named text. The entity named by the attribute value provides the text that is to replace the element, wherever it occurs in a document.

  <!element as-is - o empty>
  <!attlist as-is text entity #required>
          

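The element rule for processing the as-is element tests whether the entity named by the text attribute is external. If it is, the rule outputs the contents of the file identified by %ev(text); otherwise, it outputs %ev(text) directly. In either case, the content of the element is then suppressed.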

Note that %ev returns one of two things, depending on whether the entity named by the attribute to which it is applied is internal or external.


  element "as-is"
     do when attribute text is external
        output file "%ev(text)"
     else
        output "%ev(text)"
     done
  
     suppress
          

OmniMark 9.1.0 Documentation Generated: September 2, 2010 at 1:35:14 pm

Copyright © Stilo International plc, 1988-2010.