About the author:
Daniel is CTO at rhome GmbH, and Co-Founder at Aqarios GmbH. He holds a M.Sc. in Computer Science from LMU Munich, and has published papers in reinforcement learning and quantum computing. He writes about technical topics in quantum computing and startups.
jrnl · home about list ventures publications LinkedIn Join my Slack

# Filling a docx template with Python while preserving style

I am actually amazed how easy this is - check the example below. There are a few gotchas to keep track of, but all in all - very easy.

A common issue arises with placeholders in a Word document – they might be split into multiple runs. A run in Word is a region of text with the same formatting. When editing a document, Word may split text into multiple runs even if the text appears continuous. This behavior can lead to difficulties when replacing placeholders programmatically, as part of the placeholder might be in one run, and the rest in another. To avoid this, it's crucial to ensure that each placeholder is written out in one go in the Word document before saving it. This step ensures that Word treats each placeholder as a single unit, making it easier to replace it programmatically without disturbing the rest of the formatting.

Here’s how you can automate the replacement of placeholders in a Word document while preserving the original style and formatting. This approach is particularly useful for generating documents like confirmation letters, where maintaining the professional appearance of the document is crucial.

from docx import Document

# Function to replace text without changing style
def replace_text_preserve_style(paragraph, key, value):
    if key in paragraph.text:
        inline = paragraph.runs
        for i in range(len(inline)):
            if key in inline[i].text:
                text = inline[i].text.replace(key, value)
                inline[i].text = text

# Load your document
doc = Document('path_to_your_document.docx')

# Mock data for placeholders
mock_data = {
    "PLACEHOLDER1": "replacement text",
    "PLACEHOLDER2": "another piece of text"
    # Add as many placeholders as needed
}

# Replace placeholders with mock data
for paragraph in doc.paragraphs:
    for key in mock_data:
        replace_text_preserve_style(paragraph, key, mock_data[key])

# Save the modified document
doc.save('path_to_modified_document.docx')

In this script, replace_text_preserve_style is a function specifically designed to replace text without altering the style of the text. It iterates through all runs in a paragraph and replaces the placeholder text while keeping the style intact. This method ensures that the formatting of the document, including font type, size, and other attributes, remains unchanged.

When preparing your template, make sure each placeholder is inserted as a whole (this means - go into the template and write the placeholders in one go, then save the docx). Avoid breaking it into separate parts, or only writing a part of the placeholder. Word needs to consider the whole placeholder as one unit. This preparation makes it easier for the script to find and replace placeholders without messing with the document’s formatting.

This approach is incredibly effective for automating document generation while maintaining a high standard of presentation. It's particularly useful in business contexts where documents need to be generated rapidly but also require a professional appearance. We use this a lot at rhome to automate all kinds of registration documents.

Published on