docx_plus.bookmarks.anchor

Bookmarks are paired <w:bookmarkStart> / <w:bookmarkEnd> body markers that share a w:id; the start marker also carries a human-readable w:name. Cross-references key off the name. python-docx provides no abstraction for bookmarks; this module fills the gap and validates names against Word's rules ([A-Za-z_][A-Za-z0-9_]{0,39}), so an invalid name raises an error instead of silently breaking a cross-reference.

Architecture walkthrough: ARCHITECTURE.md §7.8.

Bookmark anchoring — body-side <w:bookmarkStart> / <w:bookmarkEnd>.

A bookmark is a pair of empty marker elements that share a w:id; the start marker also carries a human-readable w:name. python-docx provides no abstraction for either element, so this module fills the gap with add_bookmark and delete_bookmark. Cross-references key off the bookmark name, which is what REF / PAGEREF field instructions accept.
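To make the marker layout concrete, here is a minimal sketch of the XML shape described above, built with the standard library rather than python-docx or docx_plus. The helpers `w` and `bracket_run` are hypothetical names introduced for illustration; only the element and attribute names come from the text above.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the conventional "w" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Expand a local name such as 'bookmarkStart' to its namespaced form."""
    return f"{{{W_NS}}}{tag}"


def bracket_run(paragraph: ET.Element, run: ET.Element, bookmark_id: int, name: str) -> None:
    """Insert a bookmarkStart before `run` and a bookmarkEnd after it (illustrative only)."""
    index = list(paragraph).index(run)
    start = ET.Element(w("bookmarkStart"), {w("id"): str(bookmark_id), w("name"): name})
    end = ET.Element(w("bookmarkEnd"), {w("id"): str(bookmark_id)})
    paragraph.insert(index, start)    # start marker now precedes the run
    paragraph.insert(index + 2, end)  # run moved to index + 1, so the end marker follows it


paragraph = ET.Element(w("p"))
run = ET.SubElement(paragraph, w("r"))
ET.SubElement(run, w("t")).text = "Section 1 intro"

bracket_run(paragraph, run, bookmark_id=1, name="section_1_intro")
print(ET.tostring(paragraph, encoding="unicode"))
```

Both markers carry the same w:id, while only the start marker carries the w:name that a REF or PAGEREF field would reference.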

This module imports only from docx_plus.core and the sibling docx_plus.bookmarks.registry (SPEC §9.1).

BookmarkTarget module-attribute

BookmarkTarget = Run | Paragraph | tuple[Run, Run]

BookmarkRef dataclass

BookmarkRef(
    bookmark_id: int, name: str, start_element: _Element, end_element: _Element
)

Handle for an inserted bookmark.

Attributes:

    bookmark_id (int): The w:id value shared by the start and end markers.
    name (str): The w:name attribute. Cross-references key off the name, not the id.
    start_element (_Element): The <w:bookmarkStart> lxml element.
    end_element (_Element): The <w:bookmarkEnd> lxml element.

add_bookmark

add_bookmark(
    target: BookmarkTarget,
    name: str,
    *,
    id_registry: BookmarkIdRegistry | None = None,
) -> BookmarkRef

Anchor a bookmark to a run, paragraph, or run range.

Writes a paired <w:bookmarkStart> / <w:bookmarkEnd> bracketing the target. The bookmark id is minted from id_registry (or a fresh one if not supplied). The name is validated against Word's bookmark name rules: it must start with a letter or underscore, contain only letters, digits, and underscores, and be at most 40 characters long.
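The naming rule above can be checked in isolation. The following is a small sketch, assuming the documented pattern is applied to the whole string; `NAME_PATTERN` and `is_valid_bookmark_name` are hypothetical names and are not part of docx_plus.

```python
import re

# Pattern quoted in the documentation above; the constant name is hypothetical.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,39}")


def is_valid_bookmark_name(name: str) -> bool:
    """Return True when the entire string satisfies the documented pattern."""
    # fullmatch rejects trailing characters that a prefix-only match would accept.
    return NAME_PATTERN.fullmatch(name) is not None


samples = ["section_1_intro", "_Toc123", "1st_section", "has space", "a" * 40, "a" * 41, ""]
results = {name: is_valid_bookmark_name(name) for name in samples}
for name, ok in results.items():
    print(f"{name!r}: {ok}")
```

Names that start with a digit, contain a space, exceed 40 characters, or are empty are rejected; the other samples pass.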

Parameters:

    target (BookmarkTarget, required): Where the bookmark anchors. Same shapes as docx_plus.comments.add_comment: a single Run, a Paragraph (must have at least one run), or a (start_run, end_run) tuple.
    name (str, required): Bookmark name. Must match [A-Za-z_][A-Za-z0-9_]{0,39}. Names violating Word's rules silently break cross-references, so this is enforced.
    id_registry (BookmarkIdRegistry | None, default None): Pre-existing registry to share across an editing session.

Returns:

    BookmarkRef: A BookmarkRef capturing the assigned id and the body elements.

Raises:

    ValueError: For invalid names, empty paragraph targets, or unsupported target shapes.

Example

from docx import Document
from docx_plus.bookmarks import add_bookmark

doc = Document()
p = doc.add_paragraph("Section 1 intro")
ref = add_bookmark(p, "section_1_intro")
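The id_registry parameter exists so that several add_bookmark calls can draw ids from one source. The documentation shows only that BookmarkIdRegistry is constructed from a document and exposes next(); how it picks ids is not shown. The sketch below is therefore an assumption about one plausible strategy (scan existing w:id values, then count upward), using a hypothetical `SketchIdRegistry` over a standard-library element tree.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W_ID = f"{{{W_NS}}}id"
W_NAME = f"{{{W_NS}}}name"
W_BOOKMARK_START = f"{{{W_NS}}}bookmarkStart"


class SketchIdRegistry:
    """Hypothetical stand-in: mints ids above any already present in the body."""

    def __init__(self, body: ET.Element) -> None:
        # Assumption: every existing bookmarkStart carries a numeric w:id.
        existing = [int(el.get(W_ID, "0")) for el in body.iter(W_BOOKMARK_START)]
        self._last = max(existing, default=-1)

    def next(self) -> int:
        """Return the next unused id."""
        self._last += 1
        return self._last


body = ET.Element(f"{{{W_NS}}}body")
ET.SubElement(body, W_BOOKMARK_START, {W_ID: "7", W_NAME: "existing"})

registry = SketchIdRegistry(body)
first, second = registry.next(), registry.next()
print(first, second)
```

Under this assumption, a single shared instance keeps ids distinct across a whole editing session without rescanning the document for each new bookmark.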

Source code in docx_plus/bookmarks/anchor.py
def add_bookmark(
    target: BookmarkTarget,
    name: str,
    *,
    id_registry: BookmarkIdRegistry | None = None,
) -> BookmarkRef:
    """Anchor a bookmark to a run, paragraph, or run range.

    Writes a paired ``<w:bookmarkStart>`` / ``<w:bookmarkEnd>``
    bracketing the target. The bookmark id is minted from
    ``id_registry`` (or a fresh one if not supplied). The name is
    validated against Word's bookmark name rules: it must start with a
    letter or underscore, contain only letters, digits, and underscores,
    and be at most 40 characters long.

    Args:
        target: Where the bookmark anchors. Same shapes as
            :func:`docx_plus.comments.add_comment` — a single ``Run``, a
            ``Paragraph`` (must have at least one run), or a
            ``(start_run, end_run)`` tuple.
        name: Bookmark name. Must match
            ``[A-Za-z_][A-Za-z0-9_]{0,39}``. Names violating Word's
            rules silently break cross-references, so this is enforced.
        id_registry: Pre-existing registry to share across an editing
            session.

    Returns:
        A :class:`BookmarkRef` capturing the assigned id and the body
        elements.

    Raises:
        ValueError: For invalid names, empty paragraph targets, or
            unsupported target shapes.

    Example:
        >>> from docx import Document
        >>> from docx_plus.bookmarks import add_bookmark
        >>> doc = Document()
        >>> p = doc.add_paragraph("Section 1 intro")
        >>> ref = add_bookmark(p, "section_1_intro")
    """
    # fullmatch, so a valid prefix followed by illegal characters is rejected.
    if not _BOOKMARK_NAME_RE.fullmatch(name):
        raise ValueError(f"bookmark name {name!r} must match {_BOOKMARK_NAME_RE.pattern}")

    start_anchor, end_anchor, doc = _normalize_target(target)

    if id_registry is None:
        id_registry = BookmarkIdRegistry(doc)
    bookmark_id = id_registry.next()
    bid = str(bookmark_id)

    start = el("w:bookmarkStart", **{"w:id": bid, "w:name": name})
    end = el("w:bookmarkEnd", **{"w:id": bid})

    start_anchor.addprevious(start)
    end_anchor.addnext(end)

    return BookmarkRef(
        bookmark_id=bookmark_id,
        name=name,
        start_element=start,
        end_element=end,
    )

delete_bookmark

delete_bookmark(doc: Document, name: str) -> None

Remove every bookmark with the given name from doc.

Idempotent — removing a missing bookmark is a no-op. Removing by name (not id) matches the cross-reference key, so a stale name referenced by a REF field can be cleared with a single call.

Parameters:

    doc (Document, required): A python-docx Document.
    name (str, required): Bookmark name to remove. Matches case-sensitively.

Source code in docx_plus/bookmarks/anchor.py
def delete_bookmark(doc: Document, name: str) -> None:
    """Remove every bookmark with the given name from ``doc``.

    Idempotent — removing a missing bookmark is a no-op. Removing by
    *name* (not id) matches the cross-reference key, so a stale name
    referenced by a ``REF`` field can be cleared with a single call.

    Args:
        doc: A python-docx Document.
        name: Bookmark name to remove. Matches case-sensitively.
    """
    body = doc.element.body
    starts = xpath(body, ".//w:bookmarkStart[@w:name=$name]", name=name)
    ids = {s.get(qn("w:id")) for s in starts}
    for start in starts:
        remove(start)
    if not ids:
        return
    # Match each end by id (bookmarkEnd has no name attribute).
    for end in xpath(body, ".//w:bookmarkEnd"):
        if end.get(qn("w:id")) in ids:
            remove(end)
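
The two-pass removal above (find start markers by name, then end markers by id) can be tried without python-docx. This is a rough standard-library sketch of the same idea, not the module's implementation; `delete_by_name` and `w` are hypothetical helpers, and the parent map stands in for the parent pointers that lxml provides.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Expand a local name to its namespaced form."""
    return f"{{{W_NS}}}{tag}"


def delete_by_name(body: ET.Element, name: str) -> int:
    """Remove matching start markers and their end markers; return how many were removed."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in body.iter() for child in parent}
    starts = [s for s in body.iter(w("bookmarkStart")) if s.get(w("name")) == name]
    ids = {s.get(w("id")) for s in starts}
    # bookmarkEnd has no name attribute, so ends are matched by id.
    ends = [e for e in body.iter(w("bookmarkEnd")) if e.get(w("id")) in ids]
    for marker in starts + ends:
        parents[marker].remove(marker)
    return len(starts) + len(ends)


body = ET.Element(w("body"))
p = ET.SubElement(body, w("p"))
ET.SubElement(p, w("bookmarkStart"), {w("id"): "1", w("name"): "intro"})
ET.SubElement(p, w("r"))
ET.SubElement(p, w("bookmarkEnd"), {w("id"): "1"})
ET.SubElement(p, w("bookmarkStart"), {w("id"): "2", w("name"): "Intro"})
ET.SubElement(p, w("bookmarkEnd"), {w("id"): "2"})

removed_first = delete_by_name(body, "intro")  # case-sensitive: only the id=1 pair matches
removed_again = delete_by_name(body, "intro")  # second call finds nothing to remove
remaining = [child.tag for child in p]
print(removed_first, removed_again, len(remaining))
```

The second call removing nothing illustrates the idempotence described above, and the surviving "Intro" pair illustrates the case-sensitive match.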