Internal SDATA entities are often used to represent characters that are not directly available in the character
set being used, either at a particular location, or in a "lowest-common-denominator" interchange file. SDATA and
CDATA entities can be matched as part of a larger pattern, as in the following example:
translate "AT" sdata named "amp" "T" output "\ITALIC(AT&T)"
A multitude of SDATA entities that represent individual characters are defined in Annex D of ISO 8879. Combining
entity and other matches in a translate rule allows an entity to be treated as just another character.
Care must be taken when composing patterns that include entity matching. In the preceding example, the letter T
is matched following the SDATA entity; the T is not part of what is matched as the entity's name. Parentheses
can be used to modify this behavior. If the pattern were the following, the entity name would have to be ampT:
translate "AT" sdata named ("amp" "T") output "\ITALIC(AT&T)"
Any form of entity match can be combined with other text matching. If, for example, the ampersand character were
matched based on its replacement text rather than its name, the following translate rule could be used
instead of that in the previous example:
translate "AT" sdata "[amp ]" "T" output "\ITALIC(AT&T)"
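The difference between the three patterns above can be illustrated with a small toy model. This is a sketch in Python, not OmniMark, and it is not how OmniMark is implemented: the `Sdata` class, the `match` function, and the tuple-based pattern notation are all hypothetical names invented for this illustration. The model assumes the input is a sequence of ordinary characters and SDATA entities, and that each pattern item consumes either literal text or exactly one entity. The parenthesized form `("amp" "T")` is simplified here to a single required name, ampT.

```python
from dataclasses import dataclass

# Toy model only: none of these names are OmniMark APIs.

@dataclass(frozen=True)
class Sdata:
    name: str         # entity name, e.g. "amp"
    replacement: str  # replacement text, e.g. "[amp ]"

def match(pattern, stream):
    """Return True if `pattern` matches a prefix of `stream`.

    stream:  list of single characters (str) and Sdata entities.
    pattern: list of items, each one of:
      str                 -> literal text, matched character by character
      ("named", s)        -> one SDATA entity whose name equals s
      ("replacement", s)  -> one SDATA entity whose replacement text equals s
    """
    pos = 0
    for item in pattern:
        if isinstance(item, str):
            # Literal text never matches across an entity.
            if stream[pos:pos + len(item)] != list(item):
                return False
            pos += len(item)
        else:
            kind, value = item
            if pos >= len(stream) or not isinstance(stream[pos], Sdata):
                return False
            entity = stream[pos]
            actual = entity.name if kind == "named" else entity.replacement
            if actual != value:
                return False
            pos += 1
    return True

# Input modelled on "AT", the entity amp, then "T".
stream = list("AT") + [Sdata("amp", "[amp ]")] + list("T")

by_name = match(["AT", ("named", "amp"), "T"], stream)
by_long_name = match(["AT", ("named", "ampT")], stream)
by_replacement = match(["AT", ("replacement", "[amp ]"), "T"], stream)
```

In this model, `by_name` and `by_replacement` succeed because the trailing T is matched as ordinary text after the entity, while `by_long_name` fails because the entity in the input is named amp, not ampT.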