Blog

ColdFusion Word Mail Merge using OOXML

July 5, 2023

If you’ve ever needed to use ColdFusion to manipulate Word documents, you might have tried a 3rd party library like Doc4J or Apache POI.

While these libraries are very robust I found them to be limiting in different ways. Apache POI lacks built-in mail merge capability (which to me seems very odd given the information below) and Doc4j threw an odd Jakarta error that I was not able to fix.

I also looked at paid for libraries like Apose, but the licensing costs were just too prohibitive

Finally I decided on direct OOXML manipulation.

This turned out to be much easier than I anticipated, once I learned that Office files like Word docx files are actually Zip archives!

Who knew!?!?!

So it’s possible to use the native cfzip tag to “open” the file.

<cfzip action="unzip" file="{Your Office Document File Path}" destination="{A temporary folder}" recurse="yes"/>

Once unzipped to a folder you will have a directory structure like this:
Unzipped Word document root

Within the Word subfolder there are several XML files. The document I am manipulating for the mail merge is document.xml

This file can be read into ColdFusion using xmlParse() and from there can be manipulated like any other XML object using ColdFusion’s native tags and commands.

I couldn’t find any “standard” way for altering the XML to merge data into the various template fields.

To get an idea of how Word does it I created a test Word document with a simple and complex mail merge field: Word Merge Fields.docx

Running this through a Word mail merge and then unzipping the resulting file and reviewing the document.xml for the “merged” document I found that simple merge fields (fldSimple) are completely replaced with the merged value while complex fields have a begin and end XML node delimiter that must be parsed and manipulated in specific ways.

In order to properly merge complex fields it’s necessary to determine if there is a separator XML node within the field. If so, the the node between the separator node and the end node are used as the value. If no separator node is found then the entire field is replaced with the merged value just as simple merge fields.

After updating the parsed XML it must be written back to the document.xml file:

<cffile action="write" file="{Path to document.xml}" output="#toString(CF XML object)#" charset="utf-8">

Once the document.xml file has been saved we need to rezip everything back into a Word docx file:

<cfzip action="zip" file="{Full path to resulting docx file}" source="{Folder containing unzipped contents}" recurse="yes"/>

Fully merged document:
Word Merge Fields – Merged.docx

So far I’ve only used this for mail merges, but I’m sure that is only scratching the surface.

0 Comments

Leave Your Comment

Your email address will not be published. Required fields are marked *


about me

An information technology professional with twenty four years experience in systems administration, computer programming, requirements gathering, customer service, and technical support.