Chapter 13. Support file formats

1. stdfonts.config
1.1. Properties and Values
1.2. Options
1.3. File structure
1.4. Matching Algorithm
2. Custom Encodings
2.1. How it works
2.2. Associating a Font with an Encoding
2.3. File format
3. base.css
4. XML Catalog
4.1. downCast Catalog support
4.2. How-To: Adding new local DTDs

1. stdfonts.config

RTF files must specify, for each font used, its encoding and its general properties. A rendering application uses this information to determine the best matching font on a platform where the exact font specified is not available. It also uses the font's encoding to correctly interpret the characters found in the RTF file. Since CSS2 currently only lets you specify the font family name, downCast needs a table that maps font names to their respective encoding and properties.

This is what the stdfonts.config file is for. downCast comes with a default file embedded in the application JAR. You may extend or override it by providing a custom stdfonts.config file, which must be located in the Encodings directory inside the application's support directory. There, you can specify standard font properties based on the font name.

downCast currently does not support the CSS3 Module: Web Fonts, nor does our companion application upCast, which is a possible source of input data.

stdfonts.config can be found at the following location in the package hierarchy in the downcast.jar JAR file:

{JAR Resource Location Path}/config/stdfonts.config

You can modify this file to suit your requirements, or supply an override file in the location mentioned above. The following sections give an informal description of the file format and its properties, followed by the search algorithm downCast uses to find the properties for a given font in a CSS rule.

1.1. Properties and Values

The following special properties are used in the stdfonts.config file:

1.1.1. -ilx-rtf-font-family

Determines the general RTF font family a font belongs to based on its design. An RTF rendering application will use this information to find a font with similar appearance when an exact match cannot be found.

Supported values: roman, swiss, symbol, modern, script, decor, tech, bidi

1.1.2. -ilx-codepage

This indicates the Windows codepage the font uses for its encoding.

Supported values: codepageAsInteger, -1, 10000, -1000, -1001, -1002, -1004

The special values have the following meaning:

-1

Uses the default encoding. This is the best choice for normal fonts.

10000

Identifies the Mac Roman encoding.

-1000

Identifies the Private Use Area mapper.

-1001

Identifies the standard encoding of the Symbol font.

-1002

Identifies the encoding of the Wingdings font.

-1004

Identifies the encoding of the Zapf Dingbats font.

1.1.3. -ilx-unicode-offset

Specifies the Unicode codepoint offset for this particular font. On platforms like Macintosh and Windows, fonts that have no Unicode mapping defined, like Webdings or Hoefler Text Ornaments, are mapped 1:1 into the Private Use Area (PUA). Normally, this is the range U+F000…U+F0FF, but by using the U-xxxx notation below, you can set the offset wherever you require.

Supported values: normal | private | U-xxxx

Here, private is equivalent to U-F000, the Unicode codepoint offset at which this mapping starts (it should lie in the PUA), and normal is equivalent to U-0000, which is also the default if the property is not specified.
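As a rough illustration of how the three value forms relate, the following Python sketch resolves a -ilx-unicode-offset value to a number and applies it to a codepoint in the font. The function names are hypothetical and not part of downCast, and the simple additive mapping is an assumption derived from the 1:1 mapping described above.

```python
import re

def parse_unicode_offset(value: str) -> int:
    """Resolve 'normal', 'private' or 'U-xxxx' to an integer offset."""
    if value == "normal":
        return 0x0000
    if value == "private":
        return 0xF000
    match = re.fullmatch(r"U-([0-9A-F]{4})", value)
    if match is None:
        raise ValueError(f"unsupported offset value: {value!r}")
    return int(match.group(1), 16)

def to_unicode(font_code: int, offset_value: str = "normal") -> str:
    # Assumption: the offset is simply added to the codepoint in the font.
    return chr(font_code + parse_unicode_offset(offset_value))
```

Under these assumptions, to_unicode(0x41, "private") yields U+F041, while to_unicode(0x41) leaves the character at U+0041.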

1.1.4. -ilx-renderhint-fontswitch

When downCast encounters a Unicode character to render to RTF, it first checks whether this character is part of the encoding of the current font. If it is, the character is written according to the RTF specification. If it is not, downCast searches the fonts named in the @font-search-list option, in order, and uses the first one that contains the character to write it to RTF. For a subsequent RTF reader to pick this up correctly, downCast must write a font switch for this specific character. This property specifies which method downCast should use for this, if possible:

font

downCast will write a simple RTF font switch {\fx c}.

field

downCast will write the character using a SYMBOL field. This is only possible for single-byte-fonts.

auto

downCast decides how to best write the character.

1.1.5. -ilx-renderhint-unicode

When downCast needs to write a character, it can do so in two ways: either as just the character in the current font's encoding, or additionally as the original Unicode codepoint. By specifying one of the following values for a font, you can tell downCast which method it should use (if possible):

always

downCast will always write the character in the current encoding and its Unicode equivalent.

never

downCast will not write the Unicode equivalent.

auto

downCast decides how to best write the character.

1.2. Options

The following general options are available:

@font-search-list

Lets you specify a comma-separated list of font names that downCast searches, in order, for an incoming Unicode character that is not part of the current encoding. This lets you set precedence; for example, you may want to list the fonts actually installed on your particular system first. If downCast does not find a match in the listed fonts, it tries all fonts defined in stdfonts.config. If it still does not find a match, it uses Unicode notation with an underscore '_' as replacement character.
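For orientation, the final fallback can be pictured with the following Python sketch. It assumes the standard RTF \uN convention (a signed 16-bit decimal value followed by one replacement character); the function name is hypothetical, and whether downCast emits exactly this form is an assumption.

```python
def rtf_unicode_escape(char: str, replacement: str = "_") -> str:
    """Return the RTF Unicode notation for one character."""
    units = char.encode("utf-16-be")
    parts = []
    # Walk the UTF-16 code units; characters outside the BMP yield two.
    for i in range(0, len(units), 2):
        code = int.from_bytes(units[i:i + 2], "big")
        if code > 0x7FFF:
            code -= 0x10000  # RTF stores the value as a signed 16-bit number
        parts.append(f"\\u{code}{replacement}")
    return "".join(parts)
```

With this sketch, the euro sign (U+20AC) becomes \u8364_ and a Private Use Area character such as U+F041 becomes \u-4031_.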

1.3. File structure

The file structure is line based. Each line identifies a set of font names with a set of properties:

fontlist ::= propertyset
fontlist ::= font ( ', ' font)*
font ::= fontname | '"' fontname '"'
propertyset ::= '\-ilx-rtf-font-family: ' ffval '; \-ilx-codepage: ' '-'? [0-9]+ ';'
                ' \-ilx-unicode-offset: ' uoval '; \-ilx-renderhint-fontswitch: ' rhfs '; \-ilx-renderhint-unicode: ' rhuc ';' 
ffval ::= 'roman' | 'swiss' | 'symbol' | 'modern' | 'script' | 'decor' | 'tech' | 'bidi'
uoval ::= 'normal' | 'private' | 'U-' [0-9A-F]{4}
rhfs ::= 'font' | 'field' | 'auto'
rhuc ::= 'always' | 'never' | 'auto'
fontname ::= name of font

Note

To specify font names that use characters outside the ASCII range, you must use CSS style escapes (or numerical character entities of the form &#...;) to generate the Unicode characters. Examples of this can be seen in the packaged default stdfonts.config file.

Lines starting with // are comment lines. Empty lines are treated the same way.
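The property part of a line can be read with very little code. The following Python sketch is a simplified reader, not downCast's parser: it skips comment and empty lines and splits a property set into name/value pairs, checking the keyword values listed in the grammar above. All names are hypothetical, and tolerating leading whitespace before // is an assumption.

```python
ALLOWED = {
    "-ilx-rtf-font-family": {"roman", "swiss", "symbol", "modern",
                             "script", "decor", "tech", "bidi"},
    "-ilx-renderhint-fontswitch": {"font", "field", "auto"},
    "-ilx-renderhint-unicode": {"always", "never", "auto"},
}

def is_ignored(line: str) -> bool:
    """True for comment lines (//) and empty lines."""
    stripped = line.strip()
    return not stripped or stripped.startswith("//")

def parse_propertyset(text: str) -> dict:
    """Split 'name: value; name: value;' into a dictionary."""
    props = {}
    for declaration in text.split(";"):
        if not declaration.strip():
            continue
        name, _, value = declaration.partition(":")
        name = name.strip().replace("\\-", "-")  # undo the CSS style escape
        value = value.strip()
        if name in ALLOWED and value not in ALLOWED[name]:
            raise ValueError(f"invalid value {value!r} for {name}")
        props[name] = value
    return props
```

The sketch deliberately leaves the font list out, since it only illustrates how the property declarations decompose.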

1.4. Matching Algorithm

To avoid having to explicitly define every font in the stdfonts.config file which might ever occur in a stylesheet, downCast employs a multi-stage search algorithm for a matching property definition entry as follows:

First, a user-supplied stdfonts.config, if present, is prepended to the default one supplied by downCast. The following search is then performed on the concatenated file:

  1. A search for the exact name (considering case) is performed. The first matching entry is used if it exists.

  2. A search for the exact name, but ignoring case, is performed. The first matching entry is used if it exists.

  3. A search for a font name is performed that matches the start of the actual name. So if the characteristics for "Univers Bold" are requested, and there is an entry "Univers" in stdfonts.config, then its properties are used. Case is ignored.

  4. A search for a font name is performed that is contained in the actual name. So if the characteristics for "L Univers 44" are requested, and there is an entry "Univers" in stdfonts.config, then its properties are used because the string "Univers" is contained in the actual font name. Case is ignored.
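The four stages above can be sketched in Python as follows. This is an illustration of the described order, not downCast's actual code; the function name and the representation of entries as (name, properties) pairs in file order, with the user-supplied file first, are assumptions.

```python
def find_font_entry(requested, entries):
    """Return the properties of the first entry matching in the earliest stage.

    entries: list of (font_name, properties) pairs in file order.
    """
    lowered = requested.lower()
    stages = (
        lambda name: name == requested,                  # 1. exact, case-sensitive
        lambda name: name.lower() == lowered,            # 2. exact, ignoring case
        lambda name: lowered.startswith(name.lower()),   # 3. entry is a prefix
        lambda name: name.lower() in lowered,            # 4. entry is contained
    )
    for matches in stages:
        for name, properties in entries:
            if matches(name):
                return properties
    return None
```

Each stage scans the whole list before the next stage is tried, so an exact match further down the file still wins over a prefix match near the top.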

2. Custom Encodings

2.1. How it works

downCast comes complete with virtually all default encodings you can use in RTF or Word, including many two-byte ones. This means that normally, you do not need to provide a custom encoding.

The default encodings are hard-coded, with optimizations for each specific encoding to provide efficient access, since the mapping functions are called for each character that passes through downCast. These default encodings are therefore not directly user-accessible. However, there are occasions where you will need a custom encoding, especially when you are using custom fonts.

downCast provides a custom encoding loader and handler which lets you specify your own mappings from character codepoint in the font to Unicode by means of a simple text file. Both one-byte and two-byte encodings can be specified in this way.

To create a custom encoding, you need to create an ASCII text file with the extension .encoding which specifies both the mapping of the individual codepoints to Unicode and also states which codepage it implements. You can also give it a name for easily spotting it in the UI portions of the application. downCast looks for custom encoding files in the {Encodings Folder} at startup. All encodings it finds are added to the internal set of default encodings. By specifying a codepage in a custom encoding that has a default equivalent, you may override any of the factory-supplied encodings.

Since the mapping is built on the fly, specific optimizations cannot be performed and the use of custom encodings may slow down processing slightly.

Note

downCast automatically inverts the supplied encoding for mapping the incoming Unicode characters to the font's encoding. Please note that this can only work if the mapping is unambiguous.
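To see why ambiguity matters, consider this minimal Python sketch of inverting a font-to-Unicode table. The function name is hypothetical, and raising an error on a conflict is an assumption made for illustration; how downCast itself reacts to an ambiguous mapping is not specified here.

```python
def invert_encoding(font_to_unicode: dict) -> dict:
    """Build the Unicode-to-font table from a font-to-Unicode table."""
    unicode_to_font = {}
    for font_code, unicode_code in font_to_unicode.items():
        if unicode_code in unicode_to_font:
            # Two font codepoints map to the same Unicode character,
            # so the reverse direction has no single answer.
            raise ValueError(f"ambiguous mapping for U+{unicode_code:04X}")
        unicode_to_font[unicode_code] = font_code
    return unicode_to_font
```

For example, a table that maps both 0x20 and 0xA0 to U+0020 cannot be inverted, because U+0020 would have two candidate codepoints in the font.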

2.2. Associating a Font with an Encoding

A custom encoding per se is not tied to anything but the codepage it implements. To tie a codepage to a specific font, you need to extend or override the stdfonts.config file. In this mapping file, you simply list the font's name and associate it with a codepage using the keywords as described.

It is recommended to use codepage values greater than 40000 for custom encodings, as downCast will not use these codepages internally. Which you use for custom encodings is up to you. downCast reserves the range from 32000 to 35000 for internal use, so you should not use these. Also note that when you override a default encoding, every font that is specified to use that encoding will use the custom one.

2.3. File format

File names can be arbitrary, but must have the extension .encoding. Files should be placed into the {Encodings Folder} for downCast to scan and load them automatically at startup.

The file structure is simple: one mapping entry per line, and all lines starting with #, // or ; are treated as comments. To create a two-byte encoding, separate the two bytes by a comma.

A mapping entry has the form (notation similar to BNF):

mapping ::= <srcbyte> [',' <srcbyte>]? '=' <unicodechar>

with:

srcbyte ::= hexNumber | decimalNumber
unicodechar ::= hexNumber | decimalNumber
hexNumber ::= ('0x' | '0X' | '$')[0-9A-Fa-f]+
decimalNumber ::= [0-9]+

The following rather silly example maps what is a space in codepage 1252 fonts to the at-sign:

@codepage 42001
@encodingname Silly Encoding
$20=$40

Options. Two special options are supported:

@codepage decimalNumber

This specifies the codepage this encoding represents.

You can either specify an existing encoding to override its definition, or create custom codepages for specific fonts, in which case you should choose a codepage number higher than 40000.

@encodingname asciistring

This is a descriptive name for the encoding so you can easily spot it in downCast's UI.

Note

It is recommended to use codepage values greater than 40000 for custom encodings, as downCast will not use these codepages internally. Which you use for custom encodings is up to you.

downCast reserves the range from 32000 to 35000 for internal use, so you should not use these. Also note that when you override a default encoding, every font that is specified to use that encoding will use the custom one.
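The file format described in this section is simple enough to read with a few lines of code. The following Python sketch is one possible reader, not downCast's loader: the function names are hypothetical, and skipping empty lines and tolerating spaces around the separators are assumptions that go beyond the grammar given above.

```python
import re

NUMBER = r"(?:0[xX]|\$)[0-9A-Fa-f]+|[0-9]+"
MAPPING = re.compile(rf"({NUMBER})(?:,({NUMBER}))?=({NUMBER})")

def parse_number(token: str) -> int:
    """Accept $hex, 0xhex and decimal notation."""
    if token.startswith("$"):
        return int(token[1:], 16)
    if token[:2].lower() == "0x":
        return int(token[2:], 16)
    return int(token)

def parse_encoding(text: str):
    """Return (codepage, name, mapping) for the text of an .encoding file."""
    codepage, name, mapping = None, None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "//", ";")):
            continue  # comment line (or empty line, by assumption)
        if line.startswith("@codepage"):
            codepage = int(line.split(None, 1)[1])
        elif line.startswith("@encodingname"):
            name = line.split(None, 1)[1]
        else:
            match = MAPPING.fullmatch(line.replace(" ", ""))
            if match is None:
                raise ValueError(f"malformed mapping entry: {raw!r}")
            first, second, target = match.groups()
            # One-byte sources become 1-tuples, two-byte sources 2-tuples.
            source = tuple(parse_number(t) for t in (first, second) if t)
            mapping[source] = parse_number(target)
    return codepage, name, mapping
```

Applied to the silly example above, the sketch returns codepage 42001, the name "Silly Encoding" and a single mapping from byte 0x20 to U+0040.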

3. base.css

downCast provides a default stylesheet which specifies some common properties which are tedious to specify for each element. This stylesheet is bundled in the application JAR at the following package location:

{JAR Resource Location Path}/config/base.css

If required, you may change it as desired. It is the first stylesheet read for every document conversion. A typical sample base stylesheet is the following:

/*
 * The downCast Base (=User-) Stylesheet
 * Version: 1.1
 * Date: 2004-03-03
 */


/*
 * Document page size and margins 
 */
@page { 
  size: 210mm 297mm; /* A4 */
  margin-top: 1.0in; 
  margin-bottom: 0.7in; 
  margin-left: 1.0in; 
  margin-right: 1.0in; 
}

/** General element properties **/
item { display: list-item; }
inline { display: inline; }
table { display: table; }
tr { display: table-row; }
tbody { display: table-row-group; }
thead { display: table-header-group; }
td { display: table-cell; }

document, par, heading, block, section, part {
    display: block;
}

document {
  page: "";
  orphans: 2;
  widows: 2;
  font-size: 12pt;
  \-ilx-footnote-style-type: decimal;
  \-ilx-footnote-numbering-policy: continuous;
  \-ilx-footnote-position: pagebottom;
  \-ilx-font-family-default: Times;
}

part {
  page: "";
  page-break-before: auto;
}

par {
  \-ilx-tab-stops: left blank 0tw;
}



/* heading elements get the corresponding class as below automatically
 * set upon insertion into the document, which is then (possibly)
 * overridden by the externally specified style.
 */
.heading\a0 9 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: auto;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 8;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: normal;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Helvetica, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: normal;
      color: #000000;
      vertical-align: baseline;
      font-size: 11.0pt;
      text-transform: none;
} /* was original style #9 */ 



.heading\a0 8 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: auto;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 7;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: italic;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Times, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: normal;
      color: #000000;
      vertical-align: baseline;
      font-size: 12.0pt;
      text-transform: none;
} /* was original style #8 */ 



.heading\a0 7 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: auto;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 6;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: normal;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Times, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: normal;
      color: #000000;
      vertical-align: baseline;
      font-size: 12.0pt;
      text-transform: none;
} /* was original style #7 */ 



.heading\a0 6 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: auto;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 5;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: normal;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Times, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: bold;
      color: #000000;
      vertical-align: baseline;
      font-size: 11.0pt;
      text-transform: none;
} /* was original style #6 */ 



.heading\a0 5 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: auto;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 4;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: italic;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Times, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: bold;
      color: #000000;
      vertical-align: baseline;
      font-size: 13.0pt;
      text-transform: none;
} /* was original style #5 */ 



.heading\a0 4 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: avoid;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 3;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: normal;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Times, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: bold;
      color: #000000;
      vertical-align: baseline;
      font-size: 14.0pt;
      text-transform: none;
} /* was original style #4 */ 



.heading\a0 3 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: avoid;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 2;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: normal;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Helvetica, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: bold;
      color: #000000;
      vertical-align: baseline;
      font-size: 13.0pt;
      text-transform: none;
} /* was original style #3 */ 



.heading\a0 2 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: avoid;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 1;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: italic;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Helvetica, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: bold;
      color: #000000;
      vertical-align: baseline;
      font-size: 14.0pt;
      text-transform: none;
} /* was original style #2 */ 



.heading\a0 1 {
      display: block;
          /* Paragraph Properties: */
      text-align: left;
      margin-left: 0.0mm;
      float: none;
      page-break-after: avoid;
      margin-bottom: 1.1mm;
      word-break-inside: normal;
      line-height: normal;
      margin-top: 4.2mm;
      text-indent: 0.0mm;
      margin-right: 0.0mm;
      \-ilx-paragraph-outline-level: 0;
      widows: 2; orphans: 2;
          /* Character Properties: */
      letter-spacing: normal;
      font-style: normal;
      vertical-align: baseline;
      font-variant: normal;
      font-family: Helvetica, serif;
      text-decoration: none; text-underline-style: none; text-underline-mode: continuous;
      font-weight: bold;
      color: #000000;
      vertical-align: baseline;
      font-size: 16.0pt;
      text-transform: none;
} /* was original style #1 */ 
 


list {
  list-style-type: decimal;
}

textbox {
  display: inline-block;
  float: left;
  width: 4cm;
  /*height: 2cm;*/
  position: static;
  float: none;
  top: 0cm;
  left: 0cm;
  right: 0cm;
  bottom: 0cm;
  z-index: 0;
  border-style: none;
}

4. XML Catalog

4.1. downCast Catalog support

downCast supports the use of a Catalog file. In its simplest form, a Catalog file is a mapping definition between PUBLIC DTD identifiers and the location of a physical copy of that specific DTD (or, more generally, Entity). The downCast application supports the Catalog file format as defined in http://www.oasis-open.org/specs/tr9401.html as well as XML Catalogs.

Note

downCast will ask you at first launch whether you want to install a default catalog file (if there isn't already one installed). It is highly recommended to have this default installed, as it places a copy of the upCast DTD locally on your machine and lets you validate the XML (upCast DTD) filter output without requiring an active connection to the internet.

During this initial procedure, you can also choose a different default XML Catalog for downCast to use. Choose the respective option in the presented dialog and pick the catalog using a standard file chooser.

You can also change the XML Catalog used at any time while downCast is running by selecting Extras > Choose XML Catalog...

Tip

You may wish to add an entry to the catalog for the XHTML 1.0 strict DTD as well, and place a copy on your local machine, so that you can also validate XHTML files without requiring an internet connection. Such an entry might look like this:

PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "file:///localpath/to/xhtml1-strict.dtd"
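To make the mapping idea concrete, here is a deliberately simplified Python sketch that reads PUBLIC entries of this form and looks up a local location for a public identifier. It is not a complete TR9401 parser (it ignores comments, single-quoted literals and all other keywords), the function name is hypothetical, and keeping the first entry for a repeated identifier is an assumption.

```python
import re

# Matches: PUBLIC "public identifier" "system identifier"
PUBLIC_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')

def load_public_entries(catalog_text: str) -> dict:
    """Map each public identifier to the location given in the catalog."""
    entries = {}
    for public_id, location in PUBLIC_ENTRY.findall(catalog_text):
        # Assumption: the first entry for an identifier takes precedence.
        entries.setdefault(public_id, location)
    return entries
```

Looking up "-//W3C//DTD XHTML 1.0 Strict//EN" in the dictionary built from the entry above returns the file: URL of the local copy.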

4.2. How-To: Adding new local DTDs

The recommended procedure to add new DTDs to downCast's catalog system is to create a new catalog file and include downCast's default catalog file in it by reference. This way, you can call Setup default XML Catalog... repeatedly without having to worry about, or re-edit, a manually modified default catalog file when it is created anew.

So the basic pattern to use a customized catalog is as follows:

  1. Create a new catalog file anywhere outside downCast's {Application Support Folder} folder structure. This may be either an OASIS catalog or an XML Catalog.

  2. Include downCast's default catalog file in this new catalog, using the CATALOG keyword (OASIS catalog).

  3. Add references to other catalogs or create mappings as desired.

  4. Tell downCast to use the modified catalog using the Choose XML Catalog file... command.

    Note

    You'll have to pick that custom catalog file again each time you delete downCast's preferences (which you normally should not do) or call downCast's Setup default XML Catalog... command.