Concepts
Four ideas drive almost every API decision in wordlive. If you understand these, the rest of the surface follows.
Politeness model
The user is editing the same document as your script. Naïve automation clobbers their cursor, their scroll position, and their undo history. wordlive refuses to do that.
Every Document.edit() scope:
- Snapshots the user's Selection (start / end character offsets) and ActiveWindow.VerticalPercentScrolled.
- Runs your mutations.
- Restores the snapshot on the way out.
The captured snapshot is a plain dataclass:
from wordlive import SelectionSnapshot
# Captured at the start of every doc.edit() block.
SelectionSnapshot(start=412, end=412, vertical_percent=37)
If you genuinely want to move the user — say, jumping their cursor to a new section after inserting it — opt in explicitly:
with doc.edit("Add risk paragraph") as scope:
    doc.heading("Risks").insert_paragraph_after("New risk identified.")
    scope.allow_cursor_move()  # don't restore on exit
    doc.go_to(doc.heading("Risks"))
Snapshots use best-effort restoration: if the document shrank and the old offset is now invalid, wordlive collapses to the start of the snapshot range rather than raising.
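The fallback can be pictured with a small, self-contained sketch. None of this is wordlive's actual code: `Snapshot` and `restore_target` are hypothetical names, and clamping an out-of-range start to the document length is an assumption the description above does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for wordlive's SelectionSnapshot."""
    start: int
    end: int
    vertical_percent: int


def restore_target(snap: Snapshot, doc_length: int) -> tuple[int, int]:
    """Return the (start, end) offsets to reselect after the edit."""
    if snap.end <= doc_length:
        # The whole range still exists: restore it unchanged.
        return (snap.start, snap.end)
    # The document shrank: collapse to the start of the range.
    # Assumption: a start that is itself out of range is clamped
    # to the new document length instead of raising.
    start = min(snap.start, doc_length)
    return (start, start)


snap = Snapshot(start=412, end=429, vertical_percent=37)
intact = restore_target(snap, doc_length=1000)    # range still valid
collapsed = restore_target(snap, doc_length=420)  # end offset is gone
print(intact, collapsed)
```

The point of the design is that restoration never turns a successful edit into a failure: the worst case is a collapsed cursor near where the user was.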
Implementation
The snapshot dataclass, plus the snapshot() / restore() helpers, live in
src/wordlive/_selection.py.
Semantic anchors over Selection
The Word object model encourages you to drive everything through the live
Selection — the user's cursor. That's hostile to both humans (your script
fights their typing) and LLM agents (the cursor is invisible state).
wordlive operates on anchors instead: named handles for ranges that don't depend on the cursor.
| Anchor type | What it names | Persistence |
|---|---|---|
| Bookmark | A bookmark by name | Stored in the .docx |
| ContentControl | A structured field by Title (or Tag) | Stored in the .docx |
| Heading | A heading paragraph by visible text | Reads the doc structure |
| Cell | A table cell by (table, row, column) | Reads the doc structure |
| HeaderFooter | A section's header/footer by (section, which) | Reads the doc structure |
| RangeAnchor | An arbitrary character span by offsets | Ephemeral (resolved live) |
They all subclass Anchor and share the
same operations:
addr = doc.bookmarks["Address"]
addr.text # read
addr.set_text("123 Main St") # replace
addr.insert_before("Mailing: ") # insert without replacing
addr.insert_after(" (verified)")
addr.delete()
addr.com # raw COM Range — escape hatch
Why not Selection-driven? Two reasons:
- Idempotent operations are easier to reason about. "Set the Address bookmark to X" is repeatable; "type X at the cursor" is not.
- LLM tool use needs stable identifiers. A bookmark name is stable; a character offset isn't.
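The idempotence argument can be seen without Word at all. The sketch below uses plain Python stand-ins (a dict of named spans and a string with a cursor offset); `set_named` and `type_at_cursor` are hypothetical helpers for illustration, not wordlive functions.

```python
def set_named(spans: dict, name: str, value: str) -> None:
    """Replace the content of a named span; repeating it changes nothing."""
    spans[name] = value


def type_at_cursor(text: str, cursor: int, value: str) -> tuple[str, int]:
    """Insert at a cursor offset; return the new text and advanced cursor."""
    return text[:cursor] + value + text[cursor:], cursor + len(value)


# Anchor-style: running the same operation twice leaves one copy.
named = {"Address": "old address"}
set_named(named, "Address", "123 Main St")
set_named(named, "Address", "123 Main St")

# Cursor-style: running the same operation twice duplicates the text.
body, cursor = "Address: ", 9
body, cursor = type_at_cursor(body, cursor, "123 Main St")
body, cursor = type_at_cursor(body, cursor, "123 Main St")

print(named["Address"])
print(body)
```

A retried tool call is harmless in the first style and corrupts the document in the second, which is why the anchor form suits agents that may repeat a step.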
Anchor IDs
Each anchor kind has its own collection (doc.bookmarks,
doc.content_controls, doc.tables, doc.sections, doc.paragraphs,
doc.heading(name), …). For programmatic addressing across all of them —
especially from JSON tool-use payloads — wordlive uses a single string scheme:
heading:3 # 1-based paragraph index of a heading
para:5 # 1-based index of any paragraph (same index space as heading:N)
bookmark:Address # bookmark by name
cc:Signatory # content control by Title (or Tag)
table:1:2:3 # cell at row 2, column 3 of the 1st table
range:412-429 # arbitrary character span (the form find() emits)
header:1:primary # primary header of section 1
footer:2:first # first-page footer of section 2
start # the position before the first paragraph (the prepend target)
end # the position past the last paragraph (the append target)
para:N and heading:N index the same paragraph stream, so a heading at
para:5 is also heading:5 — the difference is that heading:N refuses to
resolve a non-heading paragraph, while para:N resolves any paragraph.
doc.outline() emits the heading-only view;
doc.paragraphs.list() (and
outline --all) emits every paragraph with offsets.
The bare table:N form is deliberately not an anchor — a whole table is a
collection, not a single range — so it's addressed through doc.tables[N] and
the table CLI group instead. Only cells (table:N:R:C) resolve via
anchor_by_id. Header/footer ids take a section index S and a WHICH of
primary / first / even; the bare section:N is likewise a collection, not
an anchor (use doc.sections[N]).
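To make the grammar concrete, here is a minimal parser sketch for the id strings listed above. It is not how `Document.anchor_by_id` is implemented; `parse_anchor_id` and its tuple return shape are hypothetical, chosen only to show which forms are accepted and which are rejected.

```python
import re

WHICH = ("primary", "first", "even")


def parse_anchor_id(anchor_id: str) -> tuple:
    """Split an anchor id string into a (kind, *parts) tuple."""
    if anchor_id in ("start", "end"):
        # The two bare keywords have no kind:value shape.
        return (anchor_id,)
    kind, sep, rest = anchor_id.partition(":")
    if not sep or not rest:
        raise ValueError(f"malformed anchor id: {anchor_id!r}")
    if kind in ("heading", "para"):
        return (kind, int(rest))
    if kind in ("bookmark", "cc"):
        return (kind, rest)
    if kind == "table":
        parts = rest.split(":")
        if len(parts) != 3:
            # Bare table:N names a collection, not a single range.
            raise ValueError(f"expected table:N:R:C, got {anchor_id!r}")
        table, row, col = (int(p) for p in parts)
        return (kind, table, row, col)
    if kind == "range":
        match = re.fullmatch(r"(\d+)-(\d+)", rest)
        if match is None:
            raise ValueError(f"expected range:START-END, got {anchor_id!r}")
        return (kind, int(match.group(1)), int(match.group(2)))
    if kind in ("header", "footer"):
        section, _, which = rest.partition(":")
        if which not in WHICH:
            raise ValueError(f"expected {kind}:S:WHICH, got {anchor_id!r}")
        return (kind, int(section), which)
    # section:N and anything else unrecognised ends up here.
    raise ValueError(f"unknown anchor kind: {kind!r}")


print(parse_anchor_id("table:1:2:3"))
print(parse_anchor_id("footer:2:first"))
```

In this sketch both `table:1` and `section:1` raise `ValueError`, mirroring the rule that collections are not anchors.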
The range:START-END form is what find()
emits for each hit, and it round-trips: feed it back into replace --anchor-id
or comments.add to act on exactly the span that was found. Range offsets are
live — they're resolved against the document on each use, so an edit that
shifts the text earlier can leave a stale range pointing at the wrong place.
Resolve, act, discard.
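The staleness problem is easy to reproduce with a plain string standing in for the document. The `find` and `span` helpers below are hypothetical and only imitate the `range:START-END` shape; they are not wordlive's `find()`.

```python
def find(text: str, needle: str) -> str:
    """Return a range:START-END id for the first hit."""
    start = text.index(needle)
    return f"range:{start}-{start + len(needle)}"


def span(text: str, range_id: str) -> str:
    """Resolve a range id against the text as it is right now."""
    start, end = range_id.removeprefix("range:").split("-")
    return text[int(start):int(end)]


doc = "Risks: none. Owner: TBD."
hit = find(doc, "TBD")
fresh = span(doc, hit)                  # resolves to the span that was found

doc = "DRAFT " + doc                    # an earlier edit shifts every offset
stale = span(doc, hit)                  # same id, now a different span
refound = span(doc, find(doc, "TBD"))   # searching again recovers the hit

print(fresh, repr(stale), refound)
```

Holding on to `hit` across the edit is the mistake; searching again after each mutation is the "resolve, act, discard" pattern in miniature.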
The bare start and end keywords are the two schemes without a kind:value
shape: they name the positions before the first and past the last paragraph —
the spots no content names — resolving to a
StartAnchor /
EndAnchor whose insert verbs all prepend /
append. They back doc.prepend_paragraph /
doc.append_paragraph (and the matching
inline doc.prepend /
doc.append) plus the wordlive prepend /
append commands, so building a document from either end needs no .com drop.
These IDs are emitted directly by doc.outline():
doc.outline()
# [
# {"level": 1, "text": "Introduction", "anchor_id": "heading:1"},
# {"level": 2, "text": "Context", "anchor_id": "heading:3"},
# {"level": 1, "text": "Risks", "anchor_id": "heading:8"},
# ]
They are consumed by doc.anchor_by_id() and by
every CLI command that takes --anchor-id.
Why a paragraph index for headings instead of the heading text? Two headings
can share the same text ("Background", "Background") and the index
disambiguates. The heading:N form always refers to the Nth paragraph in
the document, which is stable across the lifetime of a session.
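The sketch below illustrates the disambiguation on a toy paragraph stream. It is an assumption-laden model, not wordlive's code: the `(style, text)` tuples, the "style name starts with Heading" test, and the `outline` / `resolve` helpers are all hypothetical.

```python
# Hypothetical paragraph stream: (style, text) pairs in document order.
paragraphs = [
    ("Heading 1", "Background"),
    ("Normal", "First body paragraph."),
    ("Heading 1", "Background"),
    ("Normal", "Second body paragraph."),
]


def outline(paras):
    """List headings with ids based on their 1-based paragraph index."""
    return [
        {"text": text, "anchor_id": f"heading:{i}"}
        for i, (style, text) in enumerate(paras, start=1)
        if style.startswith("Heading")
    ]


def resolve(paras, anchor_id):
    """Resolve para:N or heading:N against the shared index space."""
    kind, _, number = anchor_id.partition(":")
    style, text = paras[int(number) - 1]
    if kind == "heading" and not style.startswith("Heading"):
        # heading:N refuses a non-heading paragraph; para:N would accept it.
        raise ValueError(f"{anchor_id} is not a heading paragraph")
    return text


ids = [entry["anchor_id"] for entry in outline(paragraphs)]
print(ids)  # two distinct ids for two headings with identical text
```

Both headings read "Background", yet each gets its own id, and `para:3` and `heading:3` land on the same paragraph.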
Implementation
Resolution is centralised in
Document.anchor_by_id; see
src/wordlive/_document.py.
EditScope and atomic undo
doc.edit("label") returns an EditScope.
Inside the with block, wordlive opens Application.UndoRecord so every
mutation is bundled into a single Ctrl-Z step labelled with your string.
with doc.edit("Replace boilerplate"):
    doc.bookmarks["Greeting"].set_text("Hello,")
    doc.bookmarks["Closing"].set_text("Best,")
    doc.heading("Footer").set_text("Signed electronically.")
# In Word's undo dropdown: a single entry, "Replace boilerplate".
Two responsibilities are bundled into the same context manager:
- UndoRecord: start/end the recording. On Word versions that don't support UndoRecord (pre-2010), wordlive silently falls back to running the ops without atomic undo; everything still works, you just get N undo entries instead of one.
- SelectionSnapshot: see Politeness model.
The scope object itself exposes one knob:
with doc.edit("Insert and jump") as scope:
    doc.heading("Introduction").insert_paragraph_after("…")
    scope.allow_cursor_move()  # skip the snapshot restore
    doc.go_to(doc.heading("Introduction"))
Most code never touches the scope; just write with doc.edit("label"): and put
your mutations inside the block.
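How one context manager can carry both responsibilities, including the fallback when undo recording is unavailable, can be sketched with fake objects. This is an illustration under stated assumptions rather than the contents of `_edit.py`: `FakeUndoRecord`, `Scope`, `edit`, and the `cursor` attribute are hypothetical, and the order of ending the record before restoring the cursor is a guess. Only the `StartCustomRecord` / `EndCustomRecord` method names come from Word's UndoRecord object.

```python
from contextlib import contextmanager
from types import SimpleNamespace


class FakeUndoRecord:
    """Stand-in that records the calls a real UndoRecord would receive."""

    def __init__(self):
        self.calls = []

    def StartCustomRecord(self, label):
        self.calls.append(("start", label))

    def EndCustomRecord(self):
        self.calls.append(("end",))


class Scope:
    def __init__(self):
        self.restore_cursor = True

    def allow_cursor_move(self):
        self.restore_cursor = False


@contextmanager
def edit(app, label):
    scope = Scope()
    saved_cursor = app.cursor                  # take the snapshot
    undo = getattr(app, "UndoRecord", None)    # None models an old Word
    if undo is not None:
        undo.StartCustomRecord(label)
    try:
        yield scope
    finally:
        if undo is not None:
            undo.EndCustomRecord()
        if scope.restore_cursor:
            app.cursor = saved_cursor          # put the user back


modern = SimpleNamespace(cursor=412, UndoRecord=FakeUndoRecord())
with edit(modern, "Replace boilerplate"):
    modern.cursor = 0                          # a mutation moves the cursor

legacy = SimpleNamespace(cursor=10)            # no UndoRecord attribute
with edit(legacy, "Insert and jump") as scope:
    scope.allow_cursor_move()
    legacy.cursor = 99

print(modern.cursor, modern.UndoRecord.calls, legacy.cursor)
```

Using `try` / `finally` means the record is closed and the cursor restored even if a mutation raises, which is the behaviour a polite scope would want.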
Implementation
EditScope lives in
src/wordlive/_edit.py.
The .com escape hatch
wordlive deliberately covers a small surface. When you need something it
doesn't, every wrapper exposes the raw COM object via .com:
import wordlive as wl

with wl.attach() as word:
    doc = word.documents.active
    # Anything wordlive covers, use the wordlive API.
    with doc.edit("Bold the first ten characters"):
        # Anything it doesn't, drop to COM.
        doc.com.Range(0, 10).Font.Bold = True
word.com, doc.com, anchor.com, and selection.com all return the
underlying pywin32 dispatch object. Treat this as a forward-compatibility
seam: as wordlive grows, today's COM call may become tomorrow's high-level
helper, but the escape hatch is permanent.