Software Testing for Document Engineering

Gregory M. Kapfhammer

September 15, 2025

How do you know if your document engineering tool is correct? You test it!

Software testing for document engineering

  • Review the basics of document engineering
  • Explore software testing techniques
  • Learn how to test document engineering tools
  • Let’s start by learning the importance of correctness!

Document engineering tools and workflows must run correctly

  • Text processing: does each function parse content correctly?
  • Document analysis: do algorithms extract the right information?
  • File operations: are documents read and written correctly?
  • Content validation: do parsers handle malformed input properly?
  • Output generation: does the tool create documents as expected?

Software testing techniques help to ensure that document tools work correctly! Let’s learn more about software testing!

Testing document engineering tools gives confidence in correctness

  • Steps for testing document engineering tools:
    • Create sample document input
    • Set up the tool’s environment
    • Process the input through the tool
    • Collect the output from the tool
    • Compare output to expected results
    • Report any discrepancies as defects
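
The steps above can be sketched in a few lines of Python. This is only a minimal illustration: the tool function `normalize_whitespace`, the helper `run_test_case`, and the sample document are all hypothetical names invented for this example, not part of any course tool.

```python
def normalize_whitespace(text: str) -> str:
    """Hypothetical document tool: collapse runs of whitespace into one space."""
    return " ".join(text.split())


def run_test_case(sample_input: str, expected_output: str) -> list:
    """Process one sample document and return a list of discrepancy reports."""
    defects = []
    # Process the input through the tool and collect the output
    actual_output = normalize_whitespace(sample_input)
    # Compare the output to the expected result
    if actual_output != expected_output:
        # Report the discrepancy as a defect
        defects.append(f"expected {expected_output!r} but got {actual_output!r}")
    return defects


# Create a sample document input along with its expected result
sample = "Document   engineering\n\tis fun  "
expected = "Document engineering is fun"
defects = run_test_case(sample, expected)
print("Defects found:", defects)
```

An empty list of defects means the tool produced the expected output for this one sample; it says nothing yet about other inputs.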

Testing versus benchmarking for document engineering tools

  • Testing: Create and run test cases to confirm document tools produce correct output for sample documents
  • Benchmarking: Measure timing and performance of document processing operations like parsing or formatting
    • Testing and benchmarking are complementary methods
    • Course focuses on testing document engineering tools
    • Explore more about benchmarking in Algorithm Analysis!

How would you test the Doubler?

class Doubler:
    def __init__(self, n):
        self._n = 2 * n

    def n(self):
        return self._n

x = Doubler(5)
print(x.n() == 10)
assert x.n() == 10
y = Doubler(-4)
print(y.n() == -8)
assert y.n() == -8
True
True
  • Establish confidence in the correctness of the Doubler class
  • When testing, is it better to use print or assert statements?

Explore the use of print and assert

  • Key Tasks: After creating assert statements that will pass and fail as expected, decide which you prefer and why! What situations warrant a print statement and which ones require an assert?
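
One way to see the difference is to check a deliberately wrong expectation with both statements. The sketch below reuses the Doubler class from the earlier slide; the wrong expected value of 11 and the variable `assertion_failed` are assumptions added only for this illustration.

```python
class Doubler:
    """Same Doubler class as on the earlier slide."""
    def __init__(self, n):
        self._n = 2 * n

    def n(self):
        return self._n


x = Doubler(5)

# A print statement only displays the result; a person must read it
print(x.n() == 11)

# An assert statement raises an AssertionError when the check fails
try:
    assert x.n() == 11, "expected Doubler(5).n() to equal 11"
    assertion_failed = False
except AssertionError as error:
    assertion_failed = True
    print("Assertion failed:", error)
```

The print statement lets the program continue silently after showing False, while the failing assert interrupts execution unless it is caught, which is what makes it suitable for automated checking.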

Test document processing tools

  • Answer key questions when testing document tools:
    • Does the tool correctly parse document formats?
    • After changing code, does processing still work?
  • Using assertion statements during testing:
    • print statements require manual checking of output
    • assert statements automatically verify correctness
  • Use a testing framework like pytest or unittest
  • Assess test coverage with coverage.py

unittest for DayOfTheWeek

import unittest
from dayoftheweek import DayOfTheWeek

class TestDayOfTheWeek(unittest.TestCase):
    def test_init(self):
        d = DayOfTheWeek('F')
        self.assertEqual(d.name(), 'Friday')
        d = DayOfTheWeek('Th')
        self.assertEqual(d.name(), 'Thursday')

unittest.main(argv=['ignored'], verbosity=2, exit=False)
  • Call unittest.main differently for tests outside Quarto
  • Run test_dayoftheweek.py in slides/weekfour/
  • The OK output in terminal confirms passing assertions
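
In a standalone file, the usual idiom is to end the module with `if __name__ == "__main__": unittest.main()`. As an alternative, the sketch below runs the test case programmatically and inspects the result. It is only an illustration: the `DayOfTheWeek` class here is a condensed stand-in so the example is self-contained, and `run_tests` is a hypothetical helper, not the content of test_dayoftheweek.py.

```python
import unittest


class DayOfTheWeek:
    """Condensed stand-in for the class imported from dayoftheweek."""
    def __init__(self, abbreviation):
        self.abbreviation = abbreviation
        self.name_map = {"Th": "Thursday", "F": "Friday"}

    def name(self):
        return self.name_map.get(self.abbreviation)


class TestDayOfTheWeek(unittest.TestCase):
    def test_init(self):
        self.assertEqual(DayOfTheWeek("F").name(), "Friday")
        self.assertEqual(DayOfTheWeek("Th").name(), "Thursday")


def run_tests() -> unittest.TestResult:
    """Hypothetical helper: load and run the test case, returning the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDayOfTheWeek)
    return unittest.TextTestRunner(verbosity=2).run(suite)


result = run_tests()
print("All tests passed:", result.wasSuccessful())
```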

Explore DayOfTheWeek

class DayOfTheWeek:
    """A class to represent a day of the week."""
    def __init__(self, abbreviation):
        """Create a new DayOfTheWeek object."""
        self.abbreviation = abbreviation
        self.name_map = {
            "M": "Monday",
            "T": "Tuesday",
            "W": "Wednesday",
            "Th": "Thursday",
            "F": "Friday",
            "Sa": "Saturday",
            "Su": "Sunday",
        }

    def name(self):
        return self.name_map.get(self.abbreviation)
  • Support the lookup of a day of the week through an abbreviation like Sa
  • Simple example helps us learn testing before complex document processing

Exploring test-driven development in Python

  • Test-driven development (TDD) states “tests before code”:
    • How will you use a function?
    • What are the function’s inputs and outputs?
    • Can you write code to make the tests pass?
  • The “TDD mantra” is Red-Green-Refactor:
    • Red: The tests fail. You haven’t written the code yet!
    • Green: You get the tests to pass by changing the code.
    • Refactor: You clean up the code, removing duplication.
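
The Red-Green-Refactor cycle can be sketched with a tiny example. The function `count_words` and its test are hypothetical, chosen only to show the order in which the test and the code are written.

```python
# Red: write the test first; calling it fails until count_words exists
def test_count_words():
    assert count_words("") == 0
    assert count_words("one") == 1
    assert count_words("two  words") == 2


# Green: write the simplest code that makes the test pass
def count_words(text: str) -> int:
    """Hypothetical function: count whitespace-separated words."""
    return len(text.split())


# Refactor: with the test passing, clean up the code and re-run the test
test_count_words()
print("test_count_words passed")
```

Running the test before `count_words` is defined would raise a NameError, which is the "red" state; the test then guards every later cleanup of the function.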

How can you refactor Python code?

L1 = [1, 2, 3, 4, 5]
L2 = [6, 7, 8, 9, 10]
avg1 = sum(L1)/len(L1)
avg2 = sum(L2)/len(L2)
print("avg(", L1, ") -->", avg1)
print("avg(", L2, ") -->", avg2)
avg( [1, 2, 3, 4, 5] ) --> 3.0
avg( [6, 7, 8, 9, 10] ) --> 8.0
  • This code will not work for empty lists!
  • And, the code is repetitive and hard to read
  • Can we refactor the program to avoid the defect?
L1 = [1, 2, 3, 4, 5]
L2 = [6, 7, 8, 9, 10]
if len(L1) == 0:
    avg1 = 0
else:
    avg1 = sum(L1) / len(L1)
if len(L2) == 0:
    avg2 = 0
else:
    avg2 = sum(L2) / len(L2)
print("avg(", L1, ") -->", avg1)
print("avg(", L2, ") -->", avg2)
avg( [1, 2, 3, 4, 5] ) --> 3.0
avg( [6, 7, 8, 9, 10] ) --> 8.0
  • This avoids the defect but is repetitive and hard to read!
def avg(L):
    if len(L) == 0:
        return 0
    else:
        return sum(L) / len(L)

L1 = [1, 2, 3, 4, 5]
L2 = [6, 7, 8, 9, 10]
avg1 = avg(L1)
avg2 = avg(L2)
print("avg(", L1, ") -->", avg1)
print("avg(", L2, ") -->", avg2)
avg( [1, 2, 3, 4, 5] ) --> 3.0
avg( [6, 7, 8, 9, 10] ) --> 8.0
  • The avg function avoids the defect and is easier to read!

Bug hunt for average computation

  • Key Tasks: After confirming that the program works for the initial lists in L1 and L2, try to find the defect. Can you make a solution that works for empty lists? How do you know it is correct?
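
One way to gain confidence in a solution is to encode the expectations as assertions, including the empty list. The sketch below follows the `avg` function shown earlier, which returns 0 for an empty list; the test function name and the single-value case are additions made for this illustration.

```python
def avg(values):
    """Return the arithmetic mean of values, or 0 for an empty list."""
    if len(values) == 0:
        return 0
    return sum(values) / len(values)


def test_avg():
    # Inputs from the slides
    assert avg([1, 2, 3, 4, 5]) == 3.0
    assert avg([6, 7, 8, 9, 10]) == 8.0
    # The empty list caused a ZeroDivisionError in the first version
    assert avg([]) == 0
    # A single value is its own average
    assert avg([7]) == 7.0


test_avg()
print("test_avg passed")
```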

Refactoring in document engineering

  • What is refactoring?
    • Defined: Better code structure without changing features
    • Goal: Enhance aspects of document processing tools
  • Why refactor document engineering systems?
    • Readability: Helps others to understand text processing logic
    • Maintainability: Simplifies modifications and debugging
    • Reusability: Promotes code reuse across document tools
    • Performance: Aim to improve text processing efficiency
  • Use test cases to confirm correctness of refactoring!
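
A simple way to confirm that a refactoring preserved behavior is to run the old and new versions on the same sample documents. The two heading-counting functions below are hypothetical examples written for this sketch, and the samples are invented.

```python
def count_headings_original(text: str) -> int:
    """Original version: explicit loop with a counter."""
    count = 0
    for line in text.split("\n"):
        if line.startswith("#"):
            count = count + 1
    return count


def count_headings_refactored(text: str) -> int:
    """Refactored version: same behavior expressed with a generator."""
    return sum(1 for line in text.split("\n") if line.startswith("#"))


# Sample documents, including an empty one and one without headings
samples = ["", "# Title", "# Title\ntext\n## Section", "no headings here"]
for sample in samples:
    assert count_headings_original(sample) == count_headings_refactored(sample)
print("Refactored function matches the original on all samples")
```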

What to test in document tools?

  • For each document processing function, ask these questions:
    • What should happen when processing different document types?
    • How do I want to use this document analysis function?
    • What are the inputs and outputs of text processing?
    • What should be the function’s document inputs and outputs?
    • What are the edge cases for document processing?
  • Test the system’s expected behavior, not its implementation
  • Test the public interface of document processing tools
  • Transform detected defects into repeatable test cases
  • Later, as schedule permits, assess adequacy of tests with coverage.py
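
These questions can be turned into tests that exercise only the public interface. The sketch below assumes a hypothetical function `extract_title`; the typical case, the edge cases, and the "previously reported defect" are all invented for illustration.

```python
from typing import Optional


def extract_title(text: str) -> Optional[str]:
    """Hypothetical function: return the first non-blank line, or None."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            # Drop Markdown-style heading markers from the title
            return stripped.lstrip("#").strip()
    return None


def test_extract_title_typical():
    assert extract_title("# Report\n\nBody text.") == "Report"


def test_extract_title_edge_cases():
    assert extract_title("") is None
    assert extract_title("   \n\n") is None
    assert extract_title("\n\nLate title") == "Late title"


def test_extract_title_regression():
    # Invented defect: a heading marker with no text should give an empty title
    assert extract_title("#\nBody") == ""


test_extract_title_typical()
test_extract_title_edge_cases()
test_extract_title_regression()
print("All extract_title tests passed")
```

Each test checks what the function returns for a given document, not how the loop inside it is written, so the implementation can change without rewriting the tests.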

Testing aids document tool design

  • Software testing helps refine document engineering tool design

  • Interplay between testing and document tool design:

    • See data (documents) and operations (processing functions)
    • Specify what should happen when processing documents
    • Write a unit test case to encode expected behavior
    • Confirm that all test cases pass correctly
    • Refactor code to improve document processing design
    • Repeatedly run test suite to confirm correctness
  • Systems with good designs are easier to test and maintain!

Don’t benchmark until you are done testing!

  • Testing: Use test cases to confirm document tools produce correct output for sample documents and operations
  • Benchmarking: Measure timing and performance of document processing operations like parsing
  • Running experiments on incorrect document tools may compromise results. Always run a small trial first!
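
The ordering of testing before benchmarking can be sketched as follows. The function `count_words`, the synthetic document, and the choice of ten repetitions are assumptions made for this example; `timeit` is from the Python standard library.

```python
import timeit


def count_words(text: str) -> int:
    """Hypothetical function: count whitespace-separated words."""
    return len(text.split())


# Step 1: test on a small sample before measuring anything
assert count_words("Hello world. This is a test.") == 6

# Step 2: run a small benchmarking trial only after the test passes
document = "word " * 10_000
elapsed = timeit.timeit(lambda: count_words(document), number=10)
print(f"10 runs took {elapsed:.4f} seconds")
```

If the assertion in step 1 fails, the script stops before any timing is collected, so no measurements are taken on an incorrect function.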

Test a document analysis function

import re
from typing import Dict, Any

def document_summary(text: str) -> Dict[str, Any]:
    """Generate a comprehensive summary of document statistics."""
    words = [word for word in text.split() if any(char.isalnum() for char in word)]
    word_count = len(words)
    sentences = re.split(r'[.!?]+', text)
    sentence_count = len([s for s in sentences if s.strip()])
    paragraphs = [p for p in text.split('\n\n') if p.strip()]
    paragraph_count = len(paragraphs)
    avg_words_per_sentence = word_count / sentence_count if sentence_count > 0 else 0
    return {
        'word_count': word_count, 'sentence_count': sentence_count,
        'paragraph_count': paragraph_count,
        'avg_words_per_sentence': round(avg_words_per_sentence, 1)
    }

sample_text = "Hello world. This is a test."
result = document_summary(sample_text)
print(f"Words: {result['word_count']}, Sentences: {result['sentence_count']}")
assert result['word_count'] == 6
assert result['sentence_count'] == 2
Words: 6, Sentences: 2

Test the document_summary function

  • Implement a Python function like document_summary
  • Create an input document as a string like sample_text
  • Call the function and collect the output in result
  • Use assert statements to confirm correctness of output
  • Manually inspect printed output for additional confidence
  • Yet, can we adopt a more automated approach to testing? Let’s explore the basics of automated testing with pytest! Frameworks make testing easy and repeatable.
  • You can use uv to add pytest as a project dependency!
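
As a first step toward automation, the checks can be written as functions whose names start with `test_`, the naming pattern that pytest discovers by default. The sketch below uses a condensed copy of `document_summary` so that it is self-contained; the test names and the empty-document case are additions made for this illustration.

```python
import re
from typing import Any, Dict


def document_summary(text: str) -> Dict[str, Any]:
    """Condensed copy of document_summary from the previous slide."""
    words = [word for word in text.split() if any(char.isalnum() for char in word)]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    average = len(words) / len(sentences) if sentences else 0
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "paragraph_count": len(paragraphs),
        "avg_words_per_sentence": round(average, 1),
    }


def test_summary_of_sample_text():
    result = document_summary("Hello world. This is a test.")
    assert result["word_count"] == 6
    assert result["sentence_count"] == 2
    assert result["paragraph_count"] == 1
    assert result["avg_words_per_sentence"] == 3.0


def test_summary_of_empty_document():
    result = document_summary("")
    assert result == {
        "word_count": 0, "sentence_count": 0,
        "paragraph_count": 0, "avg_words_per_sentence": 0,
    }
```

Saved in a file whose name starts with `test_`, these functions can be run with the `pytest` command, with no print statements to inspect by hand.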

Advanced testing for document analysis tools

  • Automated testing frameworks like pytest and unittest:
    • Enable structured definition and running of tests
    • Run parameterized test cases with pytest
    • Support property-based testing with hypothesis

Testing DayOfTheWeek with pytest

from daydetector.dayoftheweek import DayOfTheWeek

def test_init():
    """Test the DayOfTheWeek class."""
    d = DayOfTheWeek("F")
    assert d.name() == "Friday"
    d = DayOfTheWeek("Th")
    assert d.name() == "Thursday"
    d = DayOfTheWeek("W")
    assert d.name() == "Wednesday"
    d = DayOfTheWeek("T")
    assert d.name() == "Tuesday"
    d = DayOfTheWeek("M")
    assert d.name() == "Monday"
  • Standard format for test names and files
  • Automated discovery and running of the tests
  • Extension through the use of plugins

Parameterized Tests with pytest

import pytest
from daydetector.dayoftheweek import DayOfTheWeek

@pytest.mark.parametrize(
    "abbreviation, expected",
    [
        ("M", "Monday"),
        ("T", "Tuesday"),
        ("W", "Wednesday"),
        ("Th", "Thursday"),
        ("F", "Friday"),
        ("Sa", "Saturday"),
        ("Su", "Sunday"),
        ("X", None),
    ],
)
def test_day_name(abbreviation, expected):
    """Use parameterized testing to confirm that lookup works correctly."""
    day = DayOfTheWeek(abbreviation)
    assert day.name() == expected
  • Express the inputs and the expected outputs in a table!
  • Same approach works for testing document processing functions

Property-based test case

import hypothesis.strategies as st
import pytest
from hypothesis import given

from daydetector.dayoftheweek import DayOfTheWeek

@pytest.mark.parametrize(
    "valid_days",
    [["Monday", "Tuesday", "Wednesday", "Thursday",
      "Friday", "Saturday", "Sunday"]],
)
@given(
    st.text(alphabet=st.characters(), min_size=1, max_size=2)
)
def test_abbreviation_maps_to_name(valid_days, abbreviation):
    """Use property-based testing with Hypothesis to confirm mapping."""
    day = DayOfTheWeek(abbreviation)
    assert day.name() in valid_days or day.name() is None
  • Hypothesis strategies generate random character inputs for the abbreviation parameter, thereby increasing the input diversity

Oh, one more thing! You could use a large language model to write tests in test_dayoftheweek.py! Wow!

  • What are the benefits and downsides of using artificial intelligence (AI) to generate tests?

  • What are situations in which you should and should not use AI to generate tests?

  • Tests establish confidence in correctness!

Reminder of course goals

  • Document Creation:
    • Design and implement document generation workflows
    • Test all aspects of documents to ensure quality and accuracy
    • Create frameworks for automated document production
  • Document Analysis:
    • Collect and analyze data about document usage and quality
    • Visualize insights to improve documentation strategies
  • Communicate results and best practices for document engineering
  • References for this week’s content: