Batch job processing in upCast Enterprise is controlled by so-called Batch Configuration Files (BCFs) that describe the batch job operations to be executed on source documents. They are executed by the Batch Processor import filter.
In order to access the batch job functionality you have to select the Batch Processor import filter. Do this via the Import Filter popup menu.
After you have selected the Batch Processor import filter, you can choose a batch configuration file either via the button next to the Source File input field, or by typing the name and access path of the batch configuration file directly into that field. To separate folder names, use the forward slash ('/') on Unix and Macintosh systems and the backslash ('\') on Windows machines. The slashes will be converted to the appropriate, system-specific separator character during file access. Batch job file names must end in .bcf.
To initiate a batch job, click the button that starts the conversion. This triggers a detailed check of the chosen batch configuration file. The check comprises the following:
a test for the existence of all specified documents to be converted. You will get a warning for each file that was not found.
a test for files that have already been converted and may therefore be skipped (see skipexisting). You will be informed about all files that will be skipped.
At the end of the check, you get a summary of the number of documents to be converted and their accumulated size.
If you want to proceed with the execution of the batch job, click the button that confirms the run; otherwise, click the one that cancels it.
A batch job consists of several phases, as determined in the batch configuration file. While these phases are processed, a progress bar is displayed together with a button for aborting the job. You can abort the batch job at any time by clicking this button. All files that have already been written successfully will remain untouched.
A Batch Configuration File (BCF) is an instance of an XML document written in the BCF language. It consists of one or more blocks describing the filter configurations and the source files to be converted using these filter configurations.
A BCF is considered an input file for the Batch Processor import filter. In other words, the BCF takes the place that the RTF file to be converted by the RTF 1.6 import filter normally has in upCast.
Internally, a BCF sets up the conversion core of upCast and calls either the (none) or the RTF 1.6 import filter for each file according to the rules specified in the BCF.
A batch run as described in a BCF consists of one or more conversion elements.
A conversion specifies the complete filter configuration and one or more source elements, which specify what documents (specified by file elements) have to be converted.
A configuration consists of the specification of the export filter elements to be used, where each can have its own param element set.
A param element has a name/value attribute pair.
The source element specifies the sourcepath, i.e. the path where the file elements to be converted will be found, and the destpath, where the converted files will be written to. In conjunction with the (optional) use of so-called Post Actions, you may also specify successfolder and failurefolder attributes.
A file element can either have a specific name attribute and thus specify a single source document, or it can be selective and collect several files based on certain criteria, such as all files starting or ending with a certain character combination, or even all RTF files in the sourcepath. If the file element is empty and has no selection attributes, it indicates that the filters set up should be executed exactly once.
A sample BCF. Let's have a look at the following BCF:
<run>
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
      <filter type="XHTML">
        <param name="Extension" value=".html" />
      </filter>
      <filter type="CSS">
        <param name="Extension" value=".css" />
      </filter>
    </configuration>
    <source sourcepath="file:///C:/upCast/rtf" destpath="file:///C:/upCast/rtf/converted">
      <file endswith=".rtf" skipexisting="yes" />
    </source>
  </conversion>
</run>
run is the root element of the BCF. It encapsulates all other BCF elements.
Then, a conversion element is specified. Within one run, you can specify several conversion elements, where each has its own filter settings and source file selections. In our sample, only one conversion element is specified.
A conversion consists of two parts: The configuration and the source, also in this order. The configuration sets up import- and export filter elements together with their param elements, then within source, the file elements for the source documents to be converted are specified.
The configuration element in our sample specifies the import filter to use by way of an import element with a type value of rtf - selecting the RTF 1.6 import filter, and three export filters that should be applied to all imported RTF documents, namely XML, XHTML and CSS. The type of the export filter is selected by the type attribute of the filter element. For a list of all available type attribute values see the reference section.
Any parameters you want to set for a filter go within the filter element as a name-value pair in form of a param element. The available parameter names and value ranges are the ones listed in the functional and parameter descriptions of the various filters.
Now, let's see which files (and therefore documents) are specified within the source section of our sample: First, the files need to reside in the folder C:\upCast\rtf. But we do not want to process all files there. Using the file element, we specify the desired ones in more detail: Only files with names ending in .rtf should be converted (endswith=".rtf"), and only if they haven't already been converted (skipexisting="yes"). Converted files should be written to C:\upCast\rtf\converted.
In the following, we describe the actual DTD for BCFs in plain language for each element. For those who prefer the rather concise DTD notation, here it is (though a bit sloppy when it comes to attribute values):
<!ELEMENT run (setvar*,conversion*)>
<!ATTLIST run
    mode (interactive|silent) 'interactive'>
<!ELEMENT setvar EMPTY>
<!ATTLIST setvar
    name CDATA #REQUIRED
    value CDATA #REQUIRED>
<!ELEMENT conversion (configuration,source*)>
<!ATTLIST conversion
    maxdocs CDATA #IMPLIED>
<!ELEMENT configuration (import,filter+)>
<!ELEMENT import (param)*>
<!ATTLIST import
    type (none|rtf) #REQUIRED>
<!ELEMENT filter (param)*>
<!ATTLIST filter
    type CDATA #REQUIRED>
<!ELEMENT param EMPTY>
<!ATTLIST param
    name CDATA #REQUIRED
    value CDATA #IMPLIED>
<!ELEMENT source (file)*>
<!ATTLIST source
    sourcepath CDATA #IMPLIED
    destpath CDATA #IMPLIED
    imagedestpath CDATA #IMPLIED
    postaction CDATA #IMPLIED
    successfolder CDATA #IMPLIED
    failurefolder CDATA #IMPLIED>
<!ELEMENT file EMPTY>
<!ATTLIST file
    endswith CDATA #IMPLIED
    startswith CDATA #IMPLIED
    skipexisting (yes|no) 'no'
    name CDATA #IMPLIED
    anyfile (yes|no) 'no'>
The root element of a BCF.
Determines the run mode of this batch. The setting of this attribute overrides the Silent operation setting found in the Timed Execution… dialog.
mode="interactive"
A pre-flight check is performed and then the user is asked whether to actually run the batch. This is helpful during testing or for additional safety, e.g. when delete operations are part of the batch. This is the default.
mode="silent"
The batch is executed silently without pausing after the pre-flight step.
Specify a callback class implementing de.infinityloop.upcast.BCFCallback. For more information, see the upCast API documentation.
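As a minimal sketch, a batch intended for unattended execution could set the run mode on the root element as shown here. The filter setup and paths are simply borrowed from the sample BCF and serve only as placeholders.

```xml
<!-- Sketch only: paths and filter parameters are illustrative placeholders -->
<run mode="silent">
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <source sourcepath="file:///C:/upCast/rtf" destpath="file:///C:/upCast/rtf/converted">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>
```

Omitting the mode attribute, or setting it to interactive, keeps the confirmation step after the pre-flight check.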
Lets you define variables for use in attribute values later in the BCF. Variables are referenced by the character sequence ${varname}. The characters \, $, { and } need to be escaped with the backslash character \ if used literally.
The above requires you to quote the Windows file separator character when specifying local paths in attribute values. We therefore recommend you use URLs for specifying files and paths in BCFs.
The variable name.
The variable value. You may already use earlier defined variables for constructing the value.
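The following sketch illustrates how variables might be defined and reused. The variable names base and rtfdir and all paths are hypothetical; the example assumes that variables can be referenced in the sourcepath and destpath attributes like in any other attribute value.

```xml
<run>
  <!-- Hypothetical variable names; URLs avoid having to escape backslashes -->
  <setvar name="base" value="file:///C:/upCast" />
  <!-- A later setvar may build on an earlier one -->
  <setvar name="rtfdir" value="${base}/rtf" />
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
      </filter>
    </configuration>
    <source sourcepath="${rtfdir}" destpath="${rtfdir}/converted">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>
```

Keeping the base location in a single variable means that only one line has to be changed when the folder layout moves.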
Encapsulates one conversion batch. There may be more than one conversion batch within a single BCF. A conversion consists of first a configuration of the import- and export filter(s) and then specifies the source files to be fed into that configuration.
Sets the maximum number of documents that should be converted in this conversion, as a decimal integer. This can be useful in conjunction with the Timed Execution mode to split a large number of files across several subsequent runs, or for testing and debugging the general operation of the BCF.
If the attribute is not specified or the value is -1, then no limit on the number of documents to be converted is set.
Specify a callback class implementing de.infinityloop.upcast.BCFCallback. For more information, see the upCast API documentation.
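A conversion that limits itself to a fixed number of documents per run, as described for maxdocs, could be sketched as follows. The limit of 50 and the paths are arbitrary illustrative values; combining maxdocs with skipexisting here is an assumption about how subsequent runs would pick up the remaining files.

```xml
<!-- Sketch only: 50 is an arbitrary limit, paths are placeholders -->
<conversion maxdocs="50">
  <configuration>
    <import type="rtf" />
    <filter type="XML">
      <param name="Extension" value=".xml" />
    </filter>
  </configuration>
  <source sourcepath="file:///C:/upCast/rtf" destpath="file:///C:/upCast/rtf/converted">
    <!-- skipexisting lets a later run continue with files not yet converted -->
    <file endswith=".rtf" skipexisting="yes" />
  </source>
</conversion>
```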
Encapsulates import and export filter configuration and parameter settings. Contains an import element and one or more filter elements for each export filter the document should be processed by. Export filters will be executed in the order they are defined in the BCF.
The configuration element has no attributes.
Selects the import filter to be used for this conversion and (optionally) specifies its parameter setting by including param elements as children, thus overriding the settings specified in the graphical UI.
Selects the import filter to use:
type="none"
The document is passed to the output filter without any processing.
type="rtf"
Documents are imported using the RTF 1.6 import filter.
Sets an export filter. As contents, it defines export filter parameters with the help of param elements.
Specifies the export filter type.
type can have one of the following values (names are case sensitive):
the XML (upCast DTD) export filter conforming to the generic upCast DTD
the XHTML 1.0 (strict) export filter
the External CSS2 export filter
the Commandline processor for running an external shell command
the XML Validator post-processing filter
the XSLT processor
the DocBook 4.2 export filter
the XML (Raw) export filter
the XSLT Processor
the Raw Tree Dumper export filter
the Unicode Translator export filter
Sets import or export filter parameters.
The name of an import or export filter's parameter.
For details on what parameters are available, what they do and by which filters they are supported, see the descriptions of the respective import or export filters.
Value for the named parameter.
For available values, see the descriptions of the respective import or export filters.
Specifies the sourcepath and the destpath for all file elements within this source element. Also, a postaction can be specified which acts on the source file after conversion has been performed. Depending on the selected postaction, a successfolder and/or failurefolder attribute may also be required.
If the specified destpath, successfolder or failurefolder does not exist, it will be created.
sourcepath specifies where source files will be searched.
destpath is where the files, modified by the filter's extension replacement setting, will be stored after conversion.
If specified, imagedestpath designates a folder where extracted images will be stored.
Specifies a basic source path for all files contained in source.
Specifies the destination path for files generated by the export filters.
Specifies the destination path for images extracted from the source file.
The following actions are available:
no action is performed on the source file (this is the default action)
source file is deleted after conversion
source file is deleted after conversion only if no errors occurred; otherwise it is left untouched
source file is moved to successfolder after conversion
source file is moved to failurefolder after conversion only if an error occurred
source file is moved to failurefolder after conversion only if an error occurred; otherwise, it is deleted
source file is moved to successfolder if there were no errors during conversion, otherwise it is moved to failurefolder
Specifies the folder/directory where the source files will be moved after processing them successfully (depending on the selected postaction).
Specifies the folder/directory where the source files will be moved after failing to process them successfully (depending on the selected postaction).
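To illustrate the path attributes of the source element, here is a minimal sketch that writes converted files and extracted images to separate folders. All folder names are hypothetical, and postaction="none" is used because it leaves the source files untouched.

```xml
<!-- Sketch only: all folder names are hypothetical -->
<source sourcepath="file:///C:/upCast/rtf"
        destpath="file:///C:/upCast/out"
        imagedestpath="file:///C:/upCast/out/images"
        postaction="none">
  <file endswith=".rtf" />
</source>
```

If the destination folder does not exist yet, it is created as described above.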
This element (which must not have any content) selects the files to be converted, based on various criteria:
anyfile="yes"
Any file found in the folder specified by the enclosing source element is converted.
name="somefilename"
The file somefilename is converted (and only that).
startswith="prefixstring"
All files in the folder specified by the source element's sourcepath attribute starting with the string prefixstring will be processed.
You may use this e.g. to separate the generated files by the first character of the filename into several destination folders, with a construction like:
...
<source sourcepath="srcfiles" destpath="dest/A">
  <file startswith="a" />
  <file startswith="A" />
</source>
<source sourcepath="srcfiles" destpath="dest/B">
  <file startswith="b" />
  <file startswith="B" />
</source>
...
endswith="someextension"
Selects files ending with the specified string someextension.
skipexisting="yes"
If all destination files that would result from processing the current source file with the specified export filters already exist in the location where they would be created, processing of this file is skipped. This is useful if a previous batch run was aborted abnormally and only part of the files had been processed. In this case, you can simply restart the very same batch; files that have already been converted at some earlier time will not be converted again.
If you do not specify any selection attribute, the configuration is executed exactly once with a virtual input file of the name sourcepath/foo.bar .
skipexisting does only what its name suggests: it checks for the existence of the result files, not whether their modification dates are newer than those of the source files. It is therefore not a real make functionality!
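The case of a file element without any selection attribute, where the configuration is executed exactly once, might be used to run a single command as part of a batch. The following is only a sketch: the echo command is a hypothetical placeholder, and pairing the none import filter with the Commandline export filter follows the pattern of the other Commandline examples in this document.

```xml
<!-- Sketch only: the command is a hypothetical placeholder -->
<conversion>
  <configuration>
    <import type="none" />
    <filter type="Commandline">
      <param name="Commandline" value="echo batch started" />
    </filter>
  </configuration>
  <source sourcepath="/Users/chris/conv/">
    <!-- No selection attributes: the configuration runs exactly once -->
    <file />
  </source>
</conversion>
```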
If specified with value yes, converts every file found in the current folder.
If specified, selects only the file with the name specified as this attribute's value.
If specified, selects all files which start with the prefix string specified as this attribute's value.
If specified, selects all files which end in the string specified as this attribute's value.
If specified with value yes, skips processing of all source files where all files that would be generated by the specified export filters already exist.
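The selection attributes listed above can be sketched in a fragment of a conversion element like the following. The file name report.rtf and the folder names are hypothetical; combining anyfile with skipexisting is assumed to work in the same way as the endswith and skipexisting combination in the sample BCF.

```xml
<!-- Fragment of a conversion element; names are hypothetical -->
<source sourcepath="srcfiles" destpath="dest/single">
  <!-- Exactly one document, selected by its name -->
  <file name="report.rtf" />
</source>
<source sourcepath="srcfiles" destpath="dest/all">
  <!-- Every file in sourcepath, except those whose results already exist -->
  <file anyfile="yes" skipexisting="yes" />
</source>
```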
Deleting results of previous run. Suppose you wish to delete generated files of a previous run before performing a conversion job. This can be done as follows:
<run mode="interactive">
  <conversion>
    <configuration>
      <import type="none" />
      <filter type="Commandline">
        <param name="Commandline" value="rm &quot;${il:srcfilename}&quot;" />
      </filter>
    </configuration>
    <source sourcepath="/Users/chris/conv/" postaction="none">
      <file endswith=".css" />
      <file endswith=".xml" />
    </source>
  </conversion>
  <conversion>
    <configuration>
      <import type="rtf" />
      <filter type="XML">
        <param name="Extension" value=".xml" />
        <!-- further parameters omitted... -->
      </filter>
      <filter type="CSS">
        <param name="Extension" value=".css" />
      </filter>
    </configuration>
    <source sourcepath="/Users/chris/orig/" destpath="/Users/chris/conv/" postaction="none">
      <file endswith=".rtf" />
    </source>
  </conversion>
</run>
The above example deletes all *.css and *.xml files from the output directory of the second conversion by utilizing the Unix command rm in conjunction with the upCast variable ${il:srcfilename}. Note that the argument to rm must be quoted to allow file names containing spaces or special shell characters, and that the quotes themselves need to be written as &quot; since they are within an attribute value.
Validating a folder of XML files. Suppose you wish to use upCast for quickly validating all XML files in the folder /Users/chris/xml/, then this should do it:
<run mode="interactive">
  <conversion>
    <configuration>
      <import type="none" />
      <filter type="XMLValidator">
        <param name="InputFile" value="${il:srcfilename}" />
      </filter>
    </configuration>
    <source sourcepath="/Users/chris/xml/" postaction="none">
      <file endswith=".xml" />
    </source>
  </conversion>
</run>
The above example validates all *.xml files in the /Users/chris/xml/ directory.