`docx_plus.bookmarks.read`

Read every bookmark in a document. Each `BookmarkInfo` carries the bookmark's id, its name, the anchored text (what a `REF bookmark_name` field would resolve to), and the index of the paragraph where the `bookmarkStart` marker sits.

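Returns one `BookmarkInfo` per `<w:bookmarkStart>`, paired with its matching `<w:bookmarkEnd>` when one exists. The anchored text is what a `REF bookmark_name` field would resolve to; it is empty when no matching end marker is found. Start markers with a missing or non-integer `w:id` are skipped.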
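
The pairing described above can be tried without `docx_plus` or python-docx installed. The following is a minimal standard-library sketch, not the library's implementation: the helper names `w` and `anchored_texts` and the sample XML are made up for illustration, and it collects text in a single pass instead of looking up each end marker by id.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(local: str) -> str:
    """Clark-notation name for a WordprocessingML tag or attribute."""
    return f"{{{W_NS}}}{local}"


# Hypothetical sample: one closed bookmark and one with no end marker.
SAMPLE_XML = f"""<w:document xmlns:w="{W_NS}"><w:body>
  <w:p>
    <w:r><w:t>Before </w:t></w:r>
    <w:bookmarkStart w:id="0" w:name="intro"/>
    <w:r><w:t>Hello</w:t></w:r>
    <w:bookmarkEnd w:id="0"/>
    <w:r><w:t> after</w:t></w:r>
  </w:p>
  <w:p>
    <w:bookmarkStart w:id="1" w:name="dangling"/>
    <w:r><w:t>No end marker</w:t></w:r>
  </w:p>
</w:body></w:document>"""


def anchored_texts(xml: str) -> dict[str, str]:
    """Map each bookmark name to the text between its start and end markers."""
    body = ET.fromstring(xml).find(w("body"))
    # Ids that actually have an end marker somewhere in the body.
    closed_ids = {end.get(w("id")) for end in body.iter(w("bookmarkEnd"))}
    open_names: dict[str, str] = {}    # id -> name of bookmarks currently open
    parts: dict[str, list[str]] = {}   # name -> text pieces collected so far
    result: dict[str, str] = {}
    for el in body.iter():             # iter() walks in document order
        if el.tag == w("bookmarkStart"):
            bid, name = el.get(w("id")), el.get(w("name")) or ""
            result[name] = ""          # unclosed bookmarks keep empty text
            if bid in closed_ids:
                open_names[bid] = name
                parts[name] = []
        elif el.tag == w("bookmarkEnd"):
            name = open_names.pop(el.get(w("id")), None)
            if name is not None:
                result[name] = "".join(parts.pop(name))
        elif el.tag == w("t") and el.text:
            for name in open_names.values():
                parts[name].append(el.text)
    return result


print(anchored_texts(SAMPLE_XML))
```

Text outside the start/end pair ("Before " and " after") is left out, and the bookmark without an end marker gets an empty string, matching the behaviour documented for unclosed bookmarks.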

This module imports only from docx_plus.core (SPEC §9.1).

BookmarkInfo `dataclass`

BookmarkInfo(
    bookmark_id: int, name: str, anchored_text: str, paragraph_index: int
)

A bookmark with the text it anchors to.

Attributes:

Name	Type	Description
`bookmark_id`	`int`	The `w:id` value.
`name`	`str`	The `w:name` attribute. Cross-references key off the name, not the id.
`anchored_text`	`str`	The text between `bookmarkStart` and `bookmarkEnd`. Empty for unclosed or empty-range bookmarks.
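`paragraph_index`	`int`	Zero-based index of the paragraph that contains the `bookmarkStart` marker, counted over every `w:p` element in the document body in document order. Because that count includes paragraphs nested in tables, it can differ from the position in `doc.paragraphs`, which lists only top-level body paragraphs. `-1` if the marker sits outside any paragraph (rare).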
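
Since cross-references key off the name, a name-keyed lookup is a natural way to consume these records. The sketch below uses a stand-in dataclass with the same four fields so it runs without `docx_plus` installed; the helpers `index_by_name` and `resolve_ref` and the sample values are hypothetical and not part of the library.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in mirroring the documented fields; in real code, import BookmarkInfo
# from docx_plus instead of redefining it.
@dataclass
class BookmarkInfo:
    bookmark_id: int
    name: str
    anchored_text: str
    paragraph_index: int


# Hypothetical sample records, shaped like the output of read_bookmarks.
bookmarks = [
    BookmarkInfo(0, "intro", "Introduction", 0),
    BookmarkInfo(1, "fig_1", "Figure 1", 4),
    BookmarkInfo(2, "dangling", "", -1),  # unclosed and outside any paragraph
]


def index_by_name(items: list[BookmarkInfo]) -> dict[str, BookmarkInfo]:
    """Map bookmark name -> record; a later duplicate name overwrites an earlier one."""
    return {b.name: b for b in items}


def resolve_ref(by_name: dict[str, BookmarkInfo], name: str) -> Optional[str]:
    """Text a REF field targeting `name` would show, or None if the name is unknown."""
    info = by_name.get(name)
    return info.anchored_text if info is not None else None


by_name = index_by_name(bookmarks)
print(resolve_ref(by_name, "fig_1"))
print(resolve_ref(by_name, "missing"))
```

Returning `None` for an unknown name keeps it distinguishable from a known bookmark whose `anchored_text` is empty.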

read_bookmarks

read_bookmarks(doc: Document) -> list[BookmarkInfo]

Return every bookmark in doc paired with the text it anchors to.

Parameters:

Name	Type	Description	Default
`doc`	`Document`	The python-docx `docx.document.Document` to scan.	required

Returns:

Type	Description
`list[BookmarkInfo]`	One `BookmarkInfo` per bookmark, in document order.

Source code in docx_plus/bookmarks/read.py

def read_bookmarks(doc: Document) -> list[BookmarkInfo]:
    """Return every bookmark in ``doc`` paired with the text it anchors to.

    Args:
        doc: The python-docx :class:`~docx.document.Document` to scan.

    Returns:
        One :class:`BookmarkInfo` per bookmark, in document order.
    """
    body = doc.element.body
    paragraph_elements = list(xpath(body, ".//w:p"))

    # Build an id → end-element map so we don't re-scan for each start.
    ends_by_id: dict[str, etree._Element] = {}
    for end in xpath(body, ".//w:bookmarkEnd"):
        bid = end.get(qn("w:id"))
        if bid is not None:
            ends_by_id[bid] = end

    result: list[BookmarkInfo] = []
    for start in xpath(body, ".//w:bookmarkStart"):
        bid_raw = start.get(qn("w:id"))
        name = start.get(qn("w:name")) or ""
        if bid_raw is None:
            continue
        try:
            bid = int(bid_raw)
        except ValueError:
            continue

        end = ends_by_id.get(bid_raw)
        anchored_text = _text_between(body, start, end) if end is not None else ""

        paragraph_index = -1
        ancestor = start.getparent()
        while ancestor is not None and ancestor.tag != qn("w:p"):
            ancestor = ancestor.getparent()
        if ancestor is not None:
            try:
                paragraph_index = paragraph_elements.index(ancestor)
            except ValueError:
                paragraph_index = -1

        result.append(
            BookmarkInfo(
                bookmark_id=bid,
                name=name,
                anchored_text=anchored_text,
                paragraph_index=paragraph_index,
            )
        )
    return result
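
The paragraph-index step in the source above (walk up from the start marker to the nearest `w:p`, then look up its position among all `w:p` elements) can be reproduced with the standard library alone. This is a sketch under assumptions, not the library's code: `paragraph_indexes`, `w`, and the sample XML are hypothetical, and because `xml.etree.ElementTree` has no `getparent()`, a parent map stands in for it.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(local: str) -> str:
    """Clark-notation name for a WordprocessingML tag or attribute."""
    return f"{{{W_NS}}}{local}"


# Hypothetical sample: a bookmark inside a table cell and one between paragraphs.
SAMPLE_XML = f"""<w:document xmlns:w="{W_NS}"><w:body>
  <w:p><w:r><w:t>Top-level paragraph</w:t></w:r></w:p>
  <w:tbl><w:tr><w:tc>
    <w:p>
      <w:bookmarkStart w:id="0" w:name="in_table"/>
      <w:r><w:t>Cell</w:t></w:r>
      <w:bookmarkEnd w:id="0"/>
    </w:p>
  </w:tc></w:tr></w:tbl>
  <w:bookmarkStart w:id="1" w:name="between_paragraphs"/>
  <w:bookmarkEnd w:id="1"/>
  <w:p><w:r><w:t>Last paragraph</w:t></w:r></w:p>
</w:body></w:document>"""


def paragraph_indexes(xml: str) -> dict[str, int]:
    """Map each bookmark name to the index of its enclosing w:p, or -1."""
    body = ET.fromstring(xml).find(w("body"))
    # Every w:p under the body in document order, like the ".//w:p" query.
    index_of = {p: i for i, p in enumerate(body.iter(w("p")))}
    # ElementTree elements have no getparent(), so build child -> parent once.
    parent_of = {child: parent for parent in body.iter() for child in parent}
    result: dict[str, int] = {}
    for start in body.iter(w("bookmarkStart")):
        node = parent_of.get(start)
        while node is not None and node.tag != w("p"):
            node = parent_of.get(node)
        # node is None when no w:p ancestor exists, which yields -1.
        result[start.get(w("name")) or ""] = index_of.get(node, -1)
    return result


print(paragraph_indexes(SAMPLE_XML))
```

In this sample the table-cell paragraph is counted, so `in_table` gets index 1, while the marker that sits directly under the body gets `-1`. Looking positions up in a dictionary also avoids a linear `list.index` scan per bookmark, which may matter for documents with many paragraphs.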
