Usage
select_nodes(x, ...)
select_descendants(x, ...)
select_children(x, ...)
select_first(x, ...)
walk_nodes(x, ..., .f)
map_nodes(x, ..., .f)
replace_nodes(x, ..., .with)
delete_nodes(x, ...)
splice_nodes(x, ..., .f)
insert_before(x, ..., .what)
insert_after(x, ..., .what)
Arguments
- x
A pandoc, pandoc_node, pandoc_blocks, pandoc_inlines, ts_tree, ts_node, ts_nodes, or a plain list of nodes from a previous selection. For a plain list, each mutation verb is applied to every element in turn (insert and splice flatten their multi-node results back into one list).
- ...
Predicate expressions, combined with &. May be empty to match every node (use with care).
- .f
A function (or rlang formula like ~ ...) called with each matching node. Its return value follows the mutation contract (see Details).
- .with
A constant replacement node or list of nodes.
- .what
The siblings to insert. May be a single node, a list of nodes, or a function called as .what(node).
Details
A tidyselect-style API for querying and rewriting either of q2r's
two AST representations: the pandoc S7 hierarchy and the
ts_tree tree-sitter AST. Each verb is an S7 generic with methods
on the relevant node types; the same verb name works on both ASTs.
Predicates are unquoted R expressions evaluated against each candidate
node with a per-AST data mask. The mask exposes the node's S7 slots
as bare names (level, url, text, kind, is_named, ...) plus a
set of helper functions (is(), has_class(), has_id(),
has_attr(), has_text(), has_label(), is_leaf()). Multiple
predicates are combined with & (logical AND).
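The data-mask idea can be sketched in base R. This is only an illustration under assumptions: the node is a plain named list standing in for S7 slots, and matches_node() is a hypothetical helper, not a q2r function. q2r's real mask also installs helper functions and handles missing slots, which this sketch does not.

```r
# Toy node: a named list standing in for an S7 node's slots (assumed shape).
node <- list(kind = "Header", level = 2L, text = "Exercise 1")

# Hypothetical helper: evaluate unquoted predicates with the node's fields
# visible as bare names, and AND the results together.
matches_node <- function(node, ...) {
  preds <- as.list(substitute(list(...)))[-1L]  # capture predicates unevaluated
  caller <- parent.frame()
  for (p in preds) {
    ok <- eval(p, envir = node, enclos = caller)
    if (!isTRUE(ok)) return(FALSE)              # any failing predicate rejects
  }
  TRUE                                          # no predicates: everything matches
}

matches_node(node, kind == "Header", level == 2L)
matches_node(node, level == 3L)
```

The first call evaluates both expressions against the node's fields and accepts it; the second rejects it because level is 2.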
Selection verbs
- select_nodes() descends the whole tree (including the root) and returns a flat list of matching nodes.
- select_descendants() is the same but excludes the root. Accepts a list of nodes (so pipe chains work).
- select_children() only checks the direct children.
- select_first() returns the first match or NULL.
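The difference in scope between the selection verbs can be shown on a toy tree. This is a base R sketch, not q2r code: node() and collect() are hypothetical names, nodes are plain lists, and the pre-order result order is an assumption of the sketch.

```r
# Toy node: list(kind, children); q2r's real nodes are S7 objects.
node <- function(kind, ...) list(kind = kind, children = list(...))

# Collect matches while descending; `include_root` mirrors the difference
# between a select_nodes()-like and a select_descendants()-like scope.
collect <- function(n, pred, include_root = TRUE) {
  hits <- if (include_root && pred(n)) list(n) else list()
  for (child in n$children) {
    hits <- c(hits, collect(child, pred, include_root = TRUE))
  }
  hits
}

tree <- node("Div", node("Para"), node("Div", node("Para")))
is_div <- function(n) n$kind == "Div"

with_root    <- collect(tree, is_div)                        # root and nested Div
without_root <- collect(tree, is_div, include_root = FALSE)  # nested Div only
direct       <- Filter(is_div, tree$children)                # direct children only
first        <- if (length(with_root)) with_root[[1]] else NULL
```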
Iteration and mutation verbs
- walk_nodes() applies a side-effect function to every match; returns its input invisibly.
- map_nodes() rewrites every match via .f. The function may return a single node (in-place replacement), a list of nodes (spliced in at the match site), NULL (delete), or the original node (no-op). Applied to a document / tree / wrapper the result is the same class as the input; applied to a bare node the result follows the mutation contract directly (it may be a node, a list, or NULL).
- replace_nodes() is map_nodes() with a constant replacement.
- delete_nodes() is map_nodes() with \(x) NULL.
- splice_nodes() is map_nodes() whose .f must return a list.
- insert_before() / insert_after() inject siblings around each match.
Mutation contract
.f is called with each matching node as its argument. Its return
value is interpreted by the walker:
- A single node of the appropriate kind replaces the original.
- A list of nodes is spliced into the parent's child list at the match's position.
- NULL removes the match from the parent.
- Returning the original node (or an == equivalent) is a no-op.
The mutation walker traverses bottom-up (post-order), so when a parent is checked its children have already been rewritten. This matches Pandoc Lua filters' default.
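The contract and the bottom-up order can be illustrated with a toy walker over nested lists. This is a minimal sketch under assumptions, not q2r's implementation: node(), is_node() and rewrite() are hypothetical names, and nodes are plain lists rather than S7 objects.

```r
# Hypothetical toy node constructor; q2r's real nodes are S7 objects.
node <- function(kind, ..., text = NULL) {
  list(kind = kind, text = text, children = list(...))
}
is_node <- function(x) is.list(x) && !is.null(x$kind)

# Post-order rewrite: children are rewritten before `pred` sees the parent.
# Returns a list of zero or more nodes that take the place of `n`.
rewrite <- function(n, pred, f) {
  kids <- list()
  for (child in n$children) {
    kids <- c(kids, rewrite(child, pred, f))
  }
  n$children <- kids
  if (!pred(n)) return(list(n))
  out <- f(n)
  if (is.null(out)) {
    list()          # NULL: remove the match
  } else if (is_node(out)) {
    list(out)       # single node: replace in place
  } else {
    out             # list of nodes: splice at the match's position
  }
}

doc <- node("Doc",
  node("Para", text = "keep"),
  node("Note", text = "drop"),
  node("Para", text = "split"))

res <- rewrite(doc,
  pred = function(n) n$kind %in% c("Note", "Para"),
  f = function(n) {
    if (n$kind == "Note") return(NULL)
    if (identical(n$text, "split")) {
      return(list(node("Para", text = "a"), node("Para", text = "b")))
    }
    n               # returning the node unchanged is a no-op
  })[[1]]

texts <- vapply(res$children, function(k) k$text, character(1))
```

Here the Note is deleted, the last Para is replaced by two spliced siblings, and the first Para is left as it was, so texts holds "keep", "a", "b".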
On a ts_tree, the three grammar-gap content kinds (pandoc_math,
pandoc_display_math, code_fence_content) round-trip through their
verbatim source bytes, so mutating their children is a no-op on
to_qmd() output.
Predicate helpers (only available inside ...)
These shadow nothing in the global R namespace because they are
installed into the predicate's data mask, not the package
namespace. Outside a select_*/map_nodes/etc. predicate they
are unavailable.
- is(<S7 class>) honours S7 inheritance, so is(pandoc_block) matches any block. The attribute- and text-based helpers (has_class, has_id, has_attr, has_text, has_label) resolve @attr / ast_text(), which exist only on the pandoc AST, so on a ts_tree they are a silent no-match. Use ts_query() or bare-slot predicates (kind, text) for tree-sitter queries.
- has_class("foo") / has_class(c("foo", "bar")) test @attr@classes membership (pandoc only).
- has_id("intro") tests @attr@id (pandoc only).
- has_attr("key") / has_attr("key", "val") test @attr@attributes (pandoc only).
- has_text("Exercise") tests the node's flattened text (ast_text()) against one or more regex patterns (fixed = TRUE for literal matching); the analog of parsermd's has_heading() (pandoc only).
- has_label("fig-*") glob-matches the node's @attr@id, where Quarto labels surface as #id; for code cells without an attr id it falls back to the cell's label option. The analog of parsermd's has_label() (pandoc only).
- is_code_cell() matches an executable Quarto cell (see code_cell).
- has_option("eval") / has_option("eval", FALSE) test a cell's #| options.
- has_engine("r") / has_engine(c("r", "python")) test a cell's engine (cell_engine()).
- is_leaf() matches nodes with no children.
- is_named (a bare slot, tree-sitter only, not a function call) is the ts_node named/anonymous flag.
- starts_with(), ends_with(), matches(), contains(): string tests usable as e.g. starts_with("http", url).
- any_of(x) and all_of(x): splice a character vector for use with %in%.
- Bare slot access: level, url, title, text, format, kind, class, quote_type, math_type, etc. Missing slots resolve to NULL, so a comparison such as level == 2 on a node without a level slot is a non-match, not an error.
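In plain R, a comparison against NULL produces a zero-length logical vector, which is neither an error nor a literal FALSE. The base R sketch below shows one way such a result can be reduced to a plain non-match. Using isTRUE() here is an assumption for illustration; how q2r itself coerces the value is not stated on this page.

```r
missing_slot <- NULL          # stands in for a slot the node does not have
raw <- missing_slot == 2      # zero-length logical: no error is raised
is_match <- isTRUE(raw)       # reduce "no answer" to a plain non-match
```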