Legacy approaches to document automation

There are three common existing approaches to document automation.

Approach 1: string replacement

In this approach, the document contains special marker strings (placeholders), which the document automation program recognises and replaces with the required content.
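As a rough illustration, string replacement over plain text can be sketched as below. The `${name}` marker syntax and the `fill_placeholders` helper are assumptions made for this example; each product defines its own marker convention.

```python
import re

def fill_placeholders(text, values):
    """Replace each ${name} marker with its value.

    The ${name} syntax is hypothetical; unknown markers are left untouched.
    """
    def substitute(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))
    return re.sub(r"\$\{(\w+)\}", substitute, text)

template = "Dear ${client_name}, your invoice total is ${total}."
print(fill_placeholders(template, {"client_name": "Ms Lee", "total": "$120.00"}))
# prints: Dear Ms Lee, your invoice total is $120.00.
```

A function is used as the replacement argument so that values containing characters such as `$` or `\` are inserted literally.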

Approach 2: fields

In this approach, the fields feature of Microsoft Word or OpenOffice is used.

Approach 3: proprietary vendor file format

Unlike the above approaches, this approach does not use the Word or OpenOffice file format. Instead, the document is represented in a file format designed for the purpose, usually XML.

Limitations of existing approaches

The existing approaches share a common limitation: the markup used is proprietary to the vendor of the document automation system.

Once you've marked up your documents to suit your vendor's software, you are "locked in": it is difficult to migrate your documents to a different vendor's system.

Imagine a world where web pages created using tools from vendor A could only be read using vendor A's browser. And that browser couldn't open pages created by vendor B's tools. That's the world of document automation today!

Each of the existing approaches also has limitations of its own.

The string replacement approach is brittle, because what looks like a single variable name in the word processor's user interface might be broken into separate strings in the underlying file format (for example, by markup that the spelling or grammar checker inserts).
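The brittleness can be seen in a small sketch. The XML below is a simplified, hand-written imitation of the WordprocessingML found inside a docx file, and the `${client_name}` marker and helper names are hypothetical. Both paragraphs display the same text, but in the second the marker is spread over three runs separated by proofing markers, so a naive search never finds it.

```python
import re

# Simplified WordprocessingML: the marker sits inside a single run.
intact = "<w:p><w:r><w:t>Dear ${client_name},</w:t></w:r></w:p>"

# Same visible text, but the marker is split across three runs, here by
# spell-check markers of the kind a word processor may insert.
split = (
    "<w:p><w:r><w:t>Dear ${</w:t></w:r>"
    '<w:proofErr w:type="spellStart"/>'
    "<w:r><w:t>client_name</w:t></w:r>"
    '<w:proofErr w:type="spellEnd"/>'
    "<w:r><w:t>},</w:t></w:r></w:p>"
)

def visible_text(xml):
    """Concatenate the text of all <w:t> elements, as a reader would see it."""
    return "".join(re.findall(r"<w:t(?:\s[^>]*)?>(.*?)</w:t>", xml))

def naive_fill(xml, values):
    """Plain string replacement over the raw XML (hypothetical ${name} markers)."""
    for key, value in values.items():
        xml = xml.replace("${" + key + "}", value)
    return xml

values = {"client_name": "Ms Lee"}
print(visible_text(intact) == visible_text(split))  # the two look identical on screen
print(visible_text(naive_fill(intact, values)))     # marker replaced
print(visible_text(naive_fill(split, values)))      # marker silently left in place
```

Working around this requires merging or normalising runs before searching, which is extra processing the string replacement approach has to carry.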

The proprietary (XML) file format approach, found in higher-end systems designed 10 years or so ago, has proven to be particularly costly. Documents (typically in Word format) have to be converted to the proprietary format and then, when delivered, converted again (to PDF, or back to Word). Costs include:

document conversion costs

training costs (learning to use the authoring environment)

software costs (vendor costs in maintaining conversion and authoring software)

output document fidelity issues

These costs can be minimised, or avoided entirely, by choosing a system whose underlying template document format is Word's docx (or ODF, if your organisation uses OpenOffice or LibreOffice).