When OmniMark's built-in SGML or XML parser is invoked, it will normally expect to parse and compile the document type definition (DTD) before it begins parsing and validating the document instance markup. This will happen automatically, unless well-formed XML parsing is requested.
In some situations this default behaviour is not desirable. When parsing many markup documents that refer to the same DTD, for example, it is unnecessary to re-parse the DTD for every document. To help in these situations, OmniMark provides access to the DTD in its compiled form through the built-in shelf #current-dtd.
In order to compile a DTD for reuse, one can capture the value of #current-dtd during one parse and store it in any shelf of type dtd. The built-in shelves sgml-dtds and xml-dtds can serve this purpose.
  do sgml-parse document scan file "input-with-dtd-1.sgml"
     put #suppress #content
     set new sgml-dtds{"dtd-1"} to #current-dtd
  done
The same effect can be achieved by specifying the creating clause of the do sgml-parse or do xml-parse action. The parser will terminate at the end of the SGML document prolog, create a compiled DTD, and store it in the specified keyed item of the sgml-dtds or xml-dtds shelf.
  do sgml-parse document creating sgml-dtds{"dtd-1"} scan file "input-with-dtd-1.sgml"
     put #suppress #content
  done
Once a compiled DTD is obtained, it can be assigned to #current-dtd in another parse, where it will be used to parse and validate the input. The input document does not need to contain a DTD itself; if it does, that DTD will be parsed but not used for instance validation.
  do sgml-parse document scan file "instance-of-dtd-1.sgml"
     set #current-dtd to sgml-dtds{"dtd-1"}
     output "%c"
  done
The clauses with sgml-dtds and with xml-dtds of the do sgml-parse and do xml-parse actions, respectively, can be used for the same purpose.
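As a rough sketch of that clause form, the fragment below reuses the DTD stored earlier under the key "dtd-1". It assumes that the with sgml-dtds clause belongs to the instance form of do sgml-parse and accepts the same keyed-item syntax as the creating clause; both points are assumptions to check against the do sgml-parse reference.

```omnimark
; Sketch only: assumes the "instance" form and the {"..."} key syntax.
; Parse a document instance against the DTD compiled earlier as "dtd-1".
do sgml-parse instance with sgml-dtds{"dtd-1"} scan file "instance-of-dtd-1.sgml"
   output "%c"
done
```

Unlike assigning to #current-dtd inside the parse body, this form names the DTD before parsing begins, so it fits cases where the document type is known in advance.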
The DTD to which a document instance conforms is not always known in advance. When processing a collection of documents of diverse types, it may be necessary to set #current-dtd for each document after its parsing has started. This is only allowed up to the point when the dtd-end is reached. The following example rule selects the appropriate DTD for each input document at the moment it is referenced, based on its system identifier. The first time a DTD is encountered, it will be compiled and stored. On each subsequent reference to the same DTD, it will be reused.
  external-text-entity #dtd when entity is system
     do when sgml-dtds has key "%eq"
        set #current-dtd to sgml-dtds{"%eq"}
        output "<!element %g(#doctype) - o empty>" ; a dummy DTD that will be ignored
     else
        set new sgml-dtds{"%eq"} to sgml-dtd cast #current-dtd
        output file "%eq"
     done
The rule above emits the sole <!element %g(#doctype) - o empty> declaration in order to satisfy the SGML parser, which requires a valid DTD. Since this DTD will be replaced by the one assigned to #current-dtd, its content doesn't matter as long as it's valid and declares an element whose name matches #doctype.