docx_plus.revisions.read

Enumerate every tracked change in a document — run-level insertions and deletions, move source/destination wrappers, run- and paragraph-property changes, and paragraph-mark insertions/deletions — each paired with its id, author, timestamp, type, and affected text. Insertion text is read from <w:t>, deletion text from <w:delText>.

Inverse of docx_plus.revisions.mark_insertion / mark_deletion, and the reader for revision marks that Word itself authored. It walks the document body once and reports every w:ins, w:del, w:moveFrom / w:moveTo, w:rPrChange, and w:pPrChange with its id, author, timestamp, type, and affected text.

Run text inside revision wrappers is invisible to python-docx's paragraph.runs, so all text is read through our own XPath — insertions from <w:t> and deletions from <w:delText>.
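The split between <w:t> and <w:delText> can be seen on a hand-written WordprocessingML fragment. The sketch below is a standalone illustration using only the standard library's ElementTree, not the module's own XPath helper; the sample paragraph and the helper name wrapped_text are made up for the example.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written sample paragraph (an assumption for illustration only).
SAMPLE = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">The </w:t></w:r>
  <w:del w:id="1" w:author="Ann" w:date="2024-01-15T10:30:00Z">
    <w:r><w:delText>quick</w:delText></w:r>
  </w:del>
  <w:ins w:id="2" w:author="Ann" w:date="2024-01-15T10:30:00Z">
    <w:r><w:t>slow</w:t></w:r>
  </w:ins>
  <w:r><w:t xml:space="preserve"> fox</w:t></w:r>
</w:p>
"""


def wrapped_text(wrapper, text_tag):
    """Join the text of every w:<text_tag> descendant of a revision wrapper."""
    nodes = wrapper.iterfind(f".//w:{text_tag}", NS)
    return "".join(node.text or "" for node in nodes)


paragraph = ET.fromstring(SAMPLE)

# Inserted runs keep their text in w:t; deleted runs keep it in w:delText.
inserted = [wrapped_text(e, "t") for e in paragraph.iterfind("w:ins", NS)]
deleted = [wrapped_text(e, "delText") for e in paragraph.iterfind("w:del", NS)]

print(inserted, deleted)
```

Querying a w:del wrapper for w:t would return an empty string here, which is why the two element names have to be handled separately.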

This module imports only from docx_plus.core (SPEC §9.1).

RevisionType module-attribute

RevisionType = Literal[
    "insertion",
    "deletion",
    "move_from",
    "move_to",
    "format_run",
    "format_paragraph",
    "paragraph_mark_insertion",
    "paragraph_mark_deletion",
]

TrackedChange dataclass

TrackedChange(
    revision_id: int,
    revision_type: RevisionType,
    author: str,
    timestamp: datetime | None,
    text: str,
    paragraph_index: int,
)

One revision mark paired with the text it affects.

Attributes:

Name Type Description
revision_id int

The w:id value of the revision element.

revision_type RevisionType

One of the RevisionType literals.

author str

The w:author attribute (may be empty).

timestamp datetime | None

The w:date attribute parsed as a timezone-aware UTC datetime, or None if absent or unparseable.

text str

For insertions, the inserted <w:t> text. For deletions, the deleted <w:delText> text. For moves, the moved run text. Empty for format changes and paragraph-mark revisions (the mark itself carries no text).

paragraph_index int

Zero-based index (within doc.paragraphs) of the paragraph containing the revision element, or -1 if it could not be resolved.
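The timestamp contract above (timezone-aware UTC, or None when the attribute is missing or malformed) could be met along the following lines. This is a minimal sketch, not the library's actual parser: the function name parse_w_date is hypothetical, and treating an offset-less value as UTC is an assumption made for the example.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_w_date(value: Optional[str]) -> Optional[datetime]:
    """Parse a w:date string into an aware UTC datetime, or return None."""
    if not value:
        return None
    try:
        # Older Python versions reject a trailing "Z", so normalise it first.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: a value without an offset is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_w_date("2024-01-15T10:30:00Z"))
print(parse_w_date("not a date"))
```

Returning None instead of raising keeps one malformed attribute from aborting the scan of an otherwise readable document.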

read_revisions

read_revisions(doc: Document) -> list[TrackedChange]

Return every tracked change in doc in document order.

Enumerates run-level insertions/deletions, move source/destination wrappers, run- and paragraph-property changes, and paragraph-mark insertions/deletions. Move range markers (the bookmark-like *RangeStart / *RangeEnd delimiters) are not reported as separate entries; only the w:moveFrom / w:moveTo wrapper, which carries the moved text and metadata, is reported.

Parameters:

Name Type Description Default
doc Document

The python-docx Document (docx.document.Document) to scan.

required

Returns:

Type Description
list[TrackedChange]

One TrackedChange per revision element, in document order. Returns [] for a document with no tracked changes.
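Because the result is a plain list, ordinary comprehensions are enough to summarise it. The sketch below uses a local stand-in dataclass (Change, hypothetical) that mirrors the documented TrackedChange fields, together with hand-written sample records rather than output from a real document, so that it runs without docx_plus installed. In real use the list would come from read_revisions(doc).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Change:
    """Hypothetical stand-in with the same fields as TrackedChange."""
    revision_id: int
    revision_type: str
    author: str
    timestamp: Optional[datetime]
    text: str
    paragraph_index: int


stamp = datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)

# Illustrative records only; a real list would be read_revisions(doc).
changes = [
    Change(1, "deletion", "Ann", stamp, "quick", 0),
    Change(2, "insertion", "Ann", stamp, "slow", 0),
    Change(3, "format_run", "Bob", None, "", 2),
    Change(4, "insertion", "Bob", None, "!", -1),
]

# Count revisions per type and per author.
by_type = Counter(c.revision_type for c in changes)
by_author = Counter(c.author for c in changes)

# Collect inserted text, and flag entries whose paragraph was not resolved.
inserted_text = [c.text for c in changes if c.revision_type == "insertion"]
unresolved = [c.revision_id for c in changes if c.paragraph_index == -1]

print(by_type, by_author, inserted_text, unresolved)
```

Checking for paragraph_index == -1 before indexing into doc.paragraphs matters, because -1 is a valid Python index and would silently select the last paragraph.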

Source code in docx_plus/revisions/read.py
def read_revisions(doc: Document) -> list[TrackedChange]:
    """Return every tracked change in ``doc`` in document order.

    Enumerates run-level insertions/deletions, move source/destination
    wrappers, run- and paragraph-property changes, and paragraph-mark
    insertions/deletions. Move *range markers* (the bookmark-like
    ``*RangeStart`` / ``*RangeEnd`` delimiters) are not reported as separate
    entries — the ``w:moveFrom`` / ``w:moveTo`` wrapper that carries the
    moved text and metadata is.

    Args:
        doc: The python-docx :class:`~docx.document.Document` to scan.

    Returns:
        One :class:`TrackedChange` per revision element, in document order.
        Returns ``[]`` for a document with no tracked changes.
    """
    body = doc.element.body
    paragraph_elements = list(xpath(body, ".//w:p"))

    handlers = {
        qn("w:ins"): _read_ins,
        qn("w:del"): _read_del,
        qn("w:moveFrom"): _read_move_from,
        qn("w:moveTo"): _read_move_to,
        qn("w:rPrChange"): _read_rpr_change,
        qn("w:pPrChange"): _read_ppr_change,
    }

    result: list[TrackedChange] = []
    for elem in body.iter():
        handler = handlers.get(elem.tag)
        if handler is None:
            continue
        change = handler(elem, paragraph_elements)
        if change is not None:
            result.append(change)
    return result
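The tag-to-handler dispatch in the listing above also explains why move range markers never appear in the result: their tags have no entry in the mapping, so the loop skips them. The following is a reduced, standalone imitation of that loop using the standard library's ElementTree; the local qn helper, the sample XML, and the tuple returned by the handler are assumptions for illustration, not the module's private helpers.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(tag):
    """Expand 'w:name' into the Clark notation that ElementTree uses for tags."""
    _prefix, local = tag.split(":")
    return f"{{{W}}}{local}"


# Hand-written sample: one move source wrapped by its range markers.
SAMPLE = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:moveFromRangeStart w:id="10" w:name="move1"/>
    <w:moveFrom w:id="11" w:author="Ann">
      <w:r><w:t>moved text</w:t></w:r>
    </w:moveFrom>
    <w:moveFromRangeEnd w:id="10"/>
  </w:p>
</w:body>
"""

# Only the wrapper tag is registered; the range markers are deliberately absent.
handlers = {
    qn("w:moveFrom"): lambda elem: ("move_from", elem.get(qn("w:id"))),
}

body = ET.fromstring(SAMPLE)
reported = []
for elem in body.iter():
    handler = handlers.get(elem.tag)
    if handler is None:
        continue  # paragraphs, runs, and range markers all fall through here
    reported.append(handler(elem))

print(reported)
```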