Chapter 8. BCF Reference

1. Using the built-in batch processor
2. General BCF file structure
3. Element Reference
4. Examples

1. Using the built-in batch processor

Batch job processing in upCast Enterprise is controlled by so-called Batch Configuration Files (BCFs) that describe the batch job operations to be executed on source documents. They are executed by the Batch Processor import filter.

In order to access the batch job functionality you have to select the Batch Processor import filter. Do this via the Import Filter popup menu.

After you have selected the Batch Processor import filter, you can choose a batch configuration file either via the Browse… button next to the Source File input field or by typing its name and access path directly. To separate folder names, use the forward slash ('/') on Unix and Macintosh systems and the backslash ('\') character on Windows machines. The slashes will be converted to the appropriate, system-specific separator character during file access. Batch job file names must end in .bcf.

In order to initiate a batch job, click Start Conversion. This initiates a detailed check of the chosen batch configuration file.

This check comprises the following:

  • a test for the existence of all specified documents to be converted. You will get a warning for each file that was not found.

  • a test for files that are already converted and therefore may be ignored (see skipexisting). You will be informed about all the files that will be skipped.

At the end of the check you get a summary of the number of documents to be converted and their accumulated size.

If you want to proceed with the execution of the batch job, click Execute Batch, otherwise click Cancel Batch.

A batch job consists of several phases, as determined in the batch configuration file. While these phases are being processed, a progress bar with a Cancel button is displayed. You can abort the batch job at any time by clicking this button. Files that have already been written successfully remain untouched.

Tip

You can quickly generate a BCF template from your current upCast configuration by using the File > Save Configuration As BCF... command.

2. General BCF file structure

A Batch Configuration File (BCF) is actually an instance of an XML document written in the BCF language. It consists of one or more blocks describing the filter configurations and the source files to be converted using these filter configurations.

A BCF is considered an input file for the Batch Processor import filter. In other words, the BCF takes the place that the RTF file to be converted normally has for the RTF 1.6 import filter.

Internally, a BCF sets up the conversion core of upCast and calls either the (none) or the RTF 1.6 import filter for each file according to the rules specified in the BCF.

A batch run as described in a BCF consists of one or more conversion elements.

A conversion specifies the complete filter configuration and one or more source elements, which in turn name the documents to be converted by means of file elements.

A configuration consists of the specification of the export filter elements to be used, where each can have its own param element set.

A param element has a name, value attribute pair.

The source element specifies the sourcepath, i.e. the path where the file elements to be converted will be found, and the destpath, where the converted files will be written to. In conjunction with the (optional) use of so-called Post Actions, you may also specify successfolder and failurefolder attributes.

A file element can either have a specific name attribute and thus specify a single source document, or it can be selective and collect several files based on certain criteria, such as all files starting or ending with a certain character combination, or even all RTF files in the sourcepath. If the file element is empty and has no selection attributes, it indicates that the filters set up should be executed exactly once.

A sample BCF. Let's have a look at the following BCF:

<run> 
 <conversion>
   <configuration>
     <import type="rtf" />
     <filter type="XML">
       <param name="Extension" value=".xml" />
     </filter>
     <filter type="XHTML">
       <param name="Extension" value=".html" />
     </filter>
     <filter type="CSS">
       <param name="Extension" value=".css" />
     </filter>
   </configuration>
   <source sourcepath="file:///C:/upCast/rtf"
           destpath="file:///C:/upCast/rtf/converted">
     <file endswith=".rtf" skipexisting="yes" />
   </source>
 </conversion>
</run>

run is the root element of the BCF. It encapsulates all other BCF elements.

Then, a conversion element is specified. Within one run, you can specify several conversion elements, where each has its own filter settings and source file selections. In our sample, only one conversion element is specified.

A conversion consists of two parts: The configuration and the source, also in this order. The configuration sets up import- and export filter elements together with their param elements, then within source, the file elements for the source documents to be converted are specified.

The configuration element in our sample specifies the import filter to use by way of an import element with a type value of rtf - selecting the RTF 1.6 import filter, and three export filters that should be applied to all imported RTF documents, namely XML, XHTML and CSS. The type of the export filter is selected by the type attribute of the filter element. For a list of all available type attribute values see the reference section.

Any parameters you want to set for a filter go within the filter element as a name-value pair in form of a param element. The available parameter names and value ranges are the ones listed in the functional and parameter descriptions of the various filters.

Now, let's see which files (and therefore documents) are specified within the source section of our sample: First, the files need to reside in the folder C:\upCast\rtf. But we do not want to process all files there. Using the file element, we specify the desired ones in more detail: Only files with names ending in .rtf should be converted (endswith=".rtf"), and only if they haven't already been converted (skipexisting="yes"). Converted files should be written to C:\upCast\rtf\converted.

3. Element Reference

In the following we describe the actual DTD for BCFs in human language for each element. For those who prefer the rather concise DTD notation, here it is (though a bit sloppy when it comes to attribute values):

<!ELEMENT run (setvar*,conversion*)>
<!ATTLIST run
         mode (interactive|silent) 'interactive'
         callback CDATA #IMPLIED>
<!ELEMENT setvar EMPTY>
<!ATTLIST setvar
         name CDATA #REQUIRED
         value CDATA #REQUIRED>
<!ELEMENT conversion (configuration,source*)>
<!ATTLIST conversion
         maxdocs CDATA #IMPLIED
         callback CDATA #IMPLIED>
<!ELEMENT configuration (import,filter+)>
<!ELEMENT import (param)*>
<!ATTLIST import
         type (none|rtf) #REQUIRED>
<!ELEMENT filter (param)*>
<!ATTLIST filter
         type CDATA #REQUIRED>
<!ELEMENT param EMPTY>
<!ATTLIST param
         name CDATA #REQUIRED
         value CDATA #IMPLIED>
<!ELEMENT source (file)*>
<!ATTLIST source
         sourcepath CDATA #IMPLIED
         destpath CDATA #IMPLIED
         imagedestpath CDATA #IMPLIED
         postaction CDATA #IMPLIED
         successfolder CDATA #IMPLIED
         failurefolder CDATA #IMPLIED>
<!ELEMENT file EMPTY>
<!ATTLIST file
         endswith CDATA #IMPLIED
         startswith CDATA #IMPLIED
         skipexisting (yes|no) 'no'
         name CDATA #IMPLIED
         anyfile (yes|no) 'no'>

run
Description

The root element of a BCF.

Attributes
mode

Determines the run mode of this batch. The setting of this parameter overrides the Silent operation setting found in the Timed Execution… dialog under Extras > Settings.

interactive

A pre-flight check is performed and then the user is asked whether the batch should actually be run. This is helpful during testing or for additional safety, e.g. when the batch includes delete operations.

silent

The batch is executed silently without pausing after the pre-flight step.

callback

Specify a callback class implementing de.infinityloop.upcast.BCFCallback. For more information, see the upCast API documentation.
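
Running a batch unattended. As a minimal sketch, a batch that should not pause after the pre-flight step could set the mode attribute as follows. The paths and the filter selection are placeholders borrowed from the sample BCF in section 2, not recommended settings:

<run mode="silent">
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <!-- placeholder paths; adjust to your own folders -->
    <source sourcepath="file:///C:/upCast/rtf"
            destpath="file:///C:/upCast/rtf/converted">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>

Since silent mode executes the batch without asking for confirmation, it is probably wise to test a new BCF in interactive mode first.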

setvar
Description

Lets you define variables for use in attribute values later in the BCF. Variables are referenced by the character sequence ${varname}. The characters \, $, { and } need to be escaped with the backslash character \ if used literally.

Note

The above requires you to quote the Windows file separator character when specifying local paths in attribute values. We therefore recommend you use URLs for specifying files and paths in BCFs.

Attributes
name

The variable name.

value

The variable value. You may already use earlier defined variables for constructing the value.
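
Defining and reusing variables. The following sketch shows how setvar might be used to avoid repeating a base path. The variable names base and rtfdir and all paths are hypothetical and chosen for illustration only:

<run>
  <!-- hypothetical variable holding a common base URL -->
  <setvar name="base" value="file:///C:/upCast" />
  <!-- a later variable may build on an earlier one -->
  <setvar name="rtfdir" value="${base}/rtf" />
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <source sourcepath="${rtfdir}"
            destpath="${rtfdir}/converted">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>

URLs are used here so that no escaping is needed. Following the rule above, a literal Windows path would have to be written with doubled backslashes, as in C:\\upCast\\rtf.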

conversion
Description

Encapsulates one conversion batch. There may be more than one conversion batch within a single BCF. A conversion consists of first a configuration of the import- and export filter(s) and then specifies the source files to be fed into that configuration.

Attributes
maxdocs

Sets the maximum number of documents that should be converted in this conversion as a decimal integer. This can be useful in conjunction with the Timed Execution mode to split a large number of files into several subsequent runs, or for testing and debugging the general working of the BCF.

If the attribute is not specified or the value is -1, then no limit on the number of documents to be converted is set.

callback

Specify a callback class implementing de.infinityloop.upcast.BCFCallback. For more information, see the upCast API documentation.
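
Limiting the number of documents. As an illustration of maxdocs, the sketch below would convert at most 50 documents per run. The value 50 and the paths are arbitrary placeholders; skipexisting is added on the assumption that repeated runs should continue with files not yet converted:

<run>
  <!-- convert no more than 50 documents in this conversion -->
  <conversion maxdocs="50">
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <source sourcepath="file:///C:/upCast/rtf"
            destpath="file:///C:/upCast/rtf/converted">
      <file endswith=".rtf" skipexisting="yes" />
    </source>
  </conversion>
</run>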

configuration
Description

Encapsulates import and export filter configuration and parameter settings. Contains an import element and one or more filter elements for each export filter the document should be processed by. Export filters will be executed in the order they are defined in the BCF.

Attributes

(none)

import
Description

Selects the import filter to be used for this conversion and (optionally) specifies its parameter setting by including param elements as children, thus overriding the settings specified in the graphical UI.

Attributes
type

Selects the import filter to use:

none

The document is passed to the export filter(s) without any processing.

rtf

Documents are imported using the RTF 1.6 import filter.

filter
Description

Sets an export filter. As contents, it defines export filter parameters with the help of param elements.

Attributes
type

Specifies the export filter type.

type can have one of the following values (names are case sensitive):

XML

the XML (upCast DTD) export filter conforming to the generic upCast DTD

XHTML

the XHTML 1.0 (strict) export filter

CSS

the External CSS2 export filter

Commandline

the Commandline processor for running an external shell command

XMLValidator

the XML Validator post-processing filter

XSLTProcessor

the XSLT processor

DocBook42

the DocBook 4.2 export filter

XMLRaw

the XML (Raw) export filter

RawTreeDumper

the Raw Tree Dumper export filter

UnicodeTranslator

the Unicode Translator export filter

param
Description

Sets import or export filter parameters.

Attributes
name

The name of an import or export filter parameter.

For details on what parameters are available, what they do and by which export filters they are supported, see the descriptions of the respective import or export filters.

value

Value for the named parameter.

For available values, see the descriptions of the respective import or export filters.

source
Description

Specifies the sourcepath and the destpath for all file elements within this source element. Also, a postaction can be specified which acts on the source file after conversion has been performed. Depending on the selected postaction, a successfolder and/or failurefolder attribute may also be required.

If the specified destpath, successfolder or failurefolder does not exist, it will be created.

sourcepath specifies where source files will be searched.

destpath is where the files, modified by the filter's extension replacement setting, will be stored after conversion.

If specified, imagedestpath designates a folder where extracted images will be stored.

Attributes
sourcepath

Specifies a basic source path for all files contained in source.

destpath

Specifies the destination path for files generated by the export filters.

imagedestpath

Specifies the destination path for images extracted from the source file.

postaction

The following actions are available:

none

no action is performed on the source file (this is the default action)

deleteAlways

source file is deleted after conversion

deleteOnSuccess

source file is deleted after conversion only if no errors occurred; otherwise it is left untouched

moveAlways

source file is moved to successfolder after conversion

moveFailure

source file is moved to failurefolder after conversion only if an error occurred

moveFailureDelete

source file is moved to failurefolder after conversion only if an error occurred; otherwise, it is deleted

moveStatus

source file is moved to successfolder if there were no errors during conversion, otherwise it is moved to failurefolder

successfolder

Specifies the folder/directory where the source files will be moved after processing them successfully (depending on the selected postaction).

failurefolder

Specifies the folder/directory where the source files will be moved after failing to process them successfully (depending on the selected postaction).
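
Sorting source files by conversion result. The following sketch combines the attributes described above: converted files go to destpath, extracted images to imagedestpath, and each source file is moved to successfolder or failurefolder depending on the outcome. All folder names are hypothetical:

<run>
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <!-- hypothetical folders; missing destination folders are created -->
    <source sourcepath="file:///C:/upCast/inbox"
            destpath="file:///C:/upCast/converted"
            imagedestpath="file:///C:/upCast/converted/images"
            postaction="moveStatus"
            successfolder="file:///C:/upCast/done"
            failurefolder="file:///C:/upCast/failed">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>

With moveStatus both folder attributes are needed, whereas an action such as moveFailure would presumably only require failurefolder.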

file
Description

This element (which must not have any content) selects the files to be converted, based on various criteria:

anyfile="yes"

Any file found in the folder specified by the enclosing source element is converted.

name="somefilename"

The file somefilename is converted (and only that).

startswith="prefixstring"

All files in the folder specified by the source element's sourcepath attribute starting with the string prefixstring will be processed.

You may use this e.g. to separate the generated files by the first character of the filename into several destination folders, with a construction like:

...
<source sourcepath="srcfiles"
        destpath="dest/A">
  <file startswith="a" />
  <file startswith="A" />
</source>
<source sourcepath="srcfiles"
        destpath="dest/B">
  <file startswith="b" />
  <file startswith="B" />
</source>
...

endswith="someextension"

Selects files ending with the specified string someextension.

skipexisting="yes"

If all destination files that would result from processing the current source file with the specified export filters already exist in the location where they would be created, the processing of this file is skipped. This is useful if a previous batch run was aborted abnormally and only some of the files had been processed. In this case you can simply restart the very same batch; the files that were already converted earlier will not be converted again.

If you do not specify any selection attribute, the configuration is executed exactly once with a virtual input file of the name sourcepath/foo.bar.

Warning

skipexisting does only what its name suggests: it checks for the existence of the result files, not whether their modification dates are newer than those of the source files. It is therefore not a real make functionality!

Attributes
anyfile

If specified with value yes, converts every file found in the current folder.

name

If specified, selects only the file with the name specified as this attribute's value.

startswith

If specified, selects all files which start with the prefix string specified as this attribute's value.

endswith

If specified, selects all files which end in the string specified as this attribute's value.

skipexisting

If specified with value yes, skips processing of all source files where all files that would be generated by the specified export filters already exist.
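
Mixing selection criteria. As a sketch, several file elements can be placed in one source element, each using a different selection attribute. The file name report.rtf, the prefix chapter and the paths are hypothetical:

<run>
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <source sourcepath="file:///C:/upCast/rtf"
            destpath="file:///C:/upCast/rtf/converted">
      <!-- exactly one named document -->
      <file name="report.rtf" />
      <!-- every file whose name starts with "chapter",
           unless its result files already exist -->
      <file startswith="chapter" skipexisting="yes" />
    </source>
  </conversion>
</run>

Each selection attribute is used on its own file element here, because this chapter does not state how startswith and endswith behave when combined on a single element.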

4. Examples

Deleting results of previous run. Suppose you wish to delete generated files of a previous run before performing a conversion job. This can be done as follows:

<run mode="interactive">
  <conversion >
    <configuration>
      <import type="none" />
      <filter type="Commandline">
        <param name="Commandline" value="rm &quot;${il:srcfilename}&quot;" />
      </filter>
    </configuration>
    <source sourcepath="/Users/chris/conv/" postaction="none" >
      <file endswith=".css" />
      <file endswith=".xml" />
    </source>
  </conversion>
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
        <!-- further parameters omitted... -->
      </filter>
      <filter type="CSS">
        <param name="Extension" value=".css"/>
      </filter>
    </configuration>
    <source sourcepath="/Users/chris/orig/" 
            destpath="/Users/chris/conv/"
            postaction="none">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>

The above example deletes all *.css and *.xml files from the output directory of the second conversion by using the Unix command rm in conjunction with the upCast variable ${il:srcfilename}. Note that the argument to rm must be quoted to allow file names containing spaces or special shell characters, and that the quotes themselves must be written as &quot; since they occur within an attribute value.

Validating a folder of XML files. Suppose you wish to use upCast for quickly validating all XML files in the folder /Users/chris/xml/. This can be done as follows:

<run mode="interactive">
  <conversion >
    <configuration>
      <import type="none" />
      <filter type="XMLValidator">
        <param name="InputFile" value="${il:srcfilename}" />
      </filter>
    </configuration>
    <source sourcepath="/Users/chris/xml/" postaction="none" >
      <file endswith=".xml" />
    </source>
  </conversion>
</run>

The above example validates all *.xml files in the /Users/chris/xml/ directory.