--- a/docs/Writerside/topics/tree.h.md Tue Dec 30 13:50:55 2025 +0100 +++ b/docs/Writerside/topics/tree.h.md Tue Dec 30 21:44:23 2025 +0100 @@ -3,28 +3,27 @@ UCX provides several low-level functions for working with arbitrary tree structures, as well as some high-level functions for trees that induce a certain order on the data they store. -The following convenience macros allow you to declare and use simple tree structures. +The following convenience macro allows you to declare and use simple tree structures. ```C -#define CX_TREE_NODE_BASE(Type) - -#define cx_tree_node_base_layout +#define CX_TREE_NODE(Type) ``` -The `CX_TREE_NODE_BASE` macro declares four pointers of type `Type*`: -`parent`, `children`, `last_child`, `prev`, and `next`, -which must be placed as first member into your struct. +It declares four pointers of type `Type*`: +`parent`, `children`, `last_child`, `prev`, and `next`. +It can be placed anywhere in your node structure, but it is recommended +to use them as first members to avoid padding issues. You can use it, for example, like this: ```C typedef struct my_tree_node_s { - CX_TREE_NODE_BASE(struct my_tree_node_s); + CX_TREE_NODE(struct my_tree_node_s); int value; char *desc; } MyTreeNode; ``` -The macro `cx_tree_node_base_layout` expands to the offsets of the above-mentioned pointers. +The macro `cx_tree_node_layout(name)` expands to the offsets of the above-mentioned pointers. It will become handy when calling the low-level tree functions which expect all offsets to be passed as arguments. > In all functions, the `last_child` and `prev` pointers are completely optional. @@ -34,81 +33,41 @@ > The `children` pointer (which points to the first child), and the `next` pointer are mandatory, > and so is the `parent` pointer. -Before diving into the function definitions, there are four function pointer declarations you should know. - -```C -#include <cx/tree.h> - -typedef void *(*cx_tree_node_create_func)( - const void *data, void *context); - -typedef int (*cx_tree_search_data_func)( - const void *node, const void *data); - -typedef int (*cx_tree_search_func)( - const void *node, const void *new_node); - -typedef void (*cx_tree_relink_func)( - void *node, const void *old_parent, const void *new_parent); -``` - -A `cx_tree_node_create_func` takes a pointer to the `data` the new node shall contain, and a `context` (which might be an allocator, for example). -It shall allocate and return the created node, or `NULL` when node creation is not possible. - -A `cx_tree_search_data_func` shall check if the `node` contains the `data`, or if a child node might contain the data. -It shall return zero, when `node` contains the `data`. -When a child node _might_ contain the data, the returned positive number shall indicate how close the match is. -It is _not_ relevant if the data can actually be found in the subtree - the positive number also indicates that the data could be inserted in that subtree. -Only when the entire subtree starting with `node` cannot contain the `data`, a negative number is supposed to be returned. - -A `cx_tree_search_func` behaves exactly the same, except that it does expect a pointer to the `data` but a pointer to a node structure. -In combination with the `cx_tree_node_create_func`, when inserting nodes with the high-level functions, a `new_node` is created first, -and then an appropriate insertion point is searched with the `cx_tree_search_func`. - -A `cx_tree_relink_func` is called when an intermediate node was removed from the tree and it's children need to be detached from -the removed `old_parent` and attached to a `new_parent`. -The function is called for every child of the removed node and can be used, for example, to update the contents of the node when needed. - ## Create ```C #include <cx/tree.h> CxTree *cxTreeCreate(const CxAllocator *allocator, - cx_tree_node_create_func create_func, - cx_tree_search_func search_func, - cx_tree_search_data_func search_data_func, - ptrdiff_t loc_parent, - ptrdiff_t loc_children, ptrdiff_t loc_last_child, - ptrdiff_t loc_prev, ptrdiff_t loc_next); - -CxTree *cxTreeCreateSimple(const CxAllocator *allocator, - cx_tree_node_create_func create_func, - cx_tree_search_func search_func, - cx_tree_search_data_func search_data_func); - -CxTree *cxTreeCreateWrapped(const CxAllocator *allocator, - void *root, - ptrdiff_t loc_parent, - ptrdiff_t loc_children, ptrdiff_t loc_last_child, + size_t node_size, size_t elem_size, + void *root, ptrdiff_t loc_data, + ptrdiff_t loc_parent, ptrdiff_t loc_children, + ptrdiff_t loc_last_child, ptrdiff_t loc_prev, ptrdiff_t loc_next); ``` -The function `cxTreeCreate()` creates a `CxTree` structure -where each node created by the `create_func` has the layout specified by the offset arguments. -The function `cxTreeCreateSimple()` is exactly the same, except that it assumes the `cx_tree_node_base_layout`. +The function `cxTreeCreate()` creates a `CxTree` structure with the specified layout. +You can use the `cx_tree_node_layout()` macro to fill in the locations starting with `loc_parent`. + +> The `node_size` is used by `cxTreeCreateRootData()` and `cxTreeAddData()` to allocate memory for new tree nodes. +> The `elem_size` is used to copy the data into the new node. +> See [below](#add-nodes) for details. -In both cases the tree will be created empty, that is with no nodes, not even a root node. -On the other hand, `cxTreeCreateWrapped()` creates a `CxTree` structure with the specified layout and `root` node -where `root` may already be the root of a larger tree. +The `loc_data` location is optional. +If specified, the functions `cxTreeCreateRootData()`, `cxTreeAddData()`, `cxTreeFind()`, and `cxTreeFindInSubtree()` +will resolve the location to copy or compare the data. +A location of zero means that a pointer to the entire node is used +(in that case the `elem_size` MUST be a positive number and SHOULD equal the `node_size`). -If your wrapped tree structure does not have a `last_child` or a `prev` pointer, +If your tree structure does not have a `last_child` or a `prev` pointer, you may specify a negative location to indicate the missing pointer. All other pointers are mandatory and a non-negative location is expected. -Note that a wrapped tree by default has no create or search functions assigned. -Therefore, if you wish to use one of the functions below that needs those function pointers set, -you will have to set them manually by assigning to the respective fields in the `CxTree` structure. +The `root` node is optional. +If not specified, the tree is initialized empty and `cxFree` is registered as advanced destructor. +If a `root` is specified, the initial tree size is calculated and stored in the collection's metadata. +Also, no destructors are specified and the nodes will not be deallocated when destroying the tree to avoid +use-after-free bugs when the `root` is still in use elsewhere. ### Example for wrapping a libxml2 tree @@ -119,9 +78,9 @@ #include <cx/tree.h> CxTree *wrap_xml_tree(xmlNode *root) { - return cxTreeCreateWrapped( + return cxTreeCreate( cxDefaultAllocator, // or you can just write NULL - root, + sizeof(xmlNode), sizeof(xmlNode), root, 0, offsetof(xmlNode, parent), offsetof(xmlNode, children), offsetof(xmlNode, last), @@ -131,9 +90,6 @@ } ``` -You do not need to specify any function pointers or destructors in the `CxTree` structure, -if you just want to use the resulting `CxTree` for [iterating](#iterator-and-visitor) over the nodes. - You can, for example, print the XML structure with the following code: ```C @@ -164,95 +120,43 @@ ```C #include <cx/tree.h> -void cx_tree_link(void *parent, void *node, ptrdiff_t loc_parent, +void cx_tree_add(void *parent, void *node, ptrdiff_t loc_parent, ptrdiff_t loc_children, ptrdiff_t loc_last_child, ptrdiff_t loc_prev, ptrdiff_t loc_next); -extern unsigned int cx_tree_add_look_around_depth; +void *cxTreeAddData(CxTree *tree, void *parent, const void *data); -size_t cx_tree_add_iter(struct cx_iterator_base_s *iter, size_t n, - cx_tree_search_func sfunc, cx_tree_node_create_func cfunc, - void *cdata, void **failed, void *root, - ptrdiff_t loc_parent, - ptrdiff_t loc_children, ptrdiff_t loc_last_child, - ptrdiff_t loc_prev, ptrdiff_t loc_next); - -size_t cx_tree_add_array(const void *src, size_t n, size_t elem_size, - cx_tree_search_func sfunc, cx_tree_node_create_func cfunc, - void *cdata, void **failed, void *root, - ptrdiff_t loc_parent, - ptrdiff_t loc_children, ptrdiff_t loc_last_child, - ptrdiff_t loc_prev, ptrdiff_t loc_next); - -int cx_tree_add(const void *src, - cx_tree_search_func sfunc, cx_tree_node_create_func cfunc, - void *cdata, void **cnode, void *root, - ptrdiff_t loc_parent, - ptrdiff_t loc_children, ptrdiff_t loc_last_child, - ptrdiff_t loc_prev, ptrdiff_t loc_next); +void cxTreeAddNode(CxTree *tree, void *parent, void *child); + +void cxTreeSetParent(CxTree *tree, void *parent, void *child); ``` -The low-level function `cx_tree_link()` adds a `node` to a new `parent`, unlinking it from its previous parent, if required. - -The functions `cx_tree_add_array()` and `cx_tree_add_iter()` are equivalent, -except that the former adds `n` items of size `elem_size` from the `src` array, -and the latter adds _up to_ `n` items from the iterator `iter`. -For each element, the `cfun` is called with the element as first argument and the `cdata` context as second argument. -The `cfun` function is supposed to return a node for which then the `sfunc` is used to find the location where to insert the node. -If a node could not be added, because `sfunc` always returns a negative number, it is stored at the location where `failed` points to. - -> The functions `cx_tree_add_array()` and `cx_tree_add_iter()` are optimized for adding multiple nodes to -> the same subtree without starting the search all over from the root node. -> This behavior can be controlled with `cx_tree_add_look_around_depth`, which specifies for how many parent nodes -> the algorithm backtracks in the current subtree before restarting the search from the root node. - -The function `cx_tree_add()` is similar to the other add-functions, -except that it does only add one single item to the tree and always stores the created node at the location where `cnode` points to. - -> By design, the add-functions can only be used to add children to the tree. -> If you want to add a new root node, this is much easier by simply calling `cx_tree_link(newroot, oldroot, layout...)` . - -The following high-level functions can be used to add nodes to a `CxTree`. +The low-level function `cx_tree_add()` adds a `node` to a new `parent`, unlinking it from its previous parent, if required. -```C -#include <cx/tree.h> - -int cxTreeAddChild(CxTree *tree, void *parent, const void *data); - -void cxTreeAddChildNode(CxTree *tree, void *parent, void *child); - -void cxTreeSetParent(CxTree *tree, void *parent, void *child); - -int cxTreeInsert(CxTree *tree, const void *data); - -size_t cxTreeInsertIter(CxTree *tree, CxIteratorBase *iter, size_t n); - -size_t cxTreeInsertArray(CxTree *tree, const void *data, - size_t elem_size, size_t n); -``` - -The functions `cxTreeAddChild()`, `cxTreeAddChildNode()`, and `cxTreeSetParent()` are similar to `cx_tree_link()`. -The difference between `cxTreeAddChild()` and `cxTreeAddChildNode()` is, -that `cxTreeAddChild()` uses the `cx_tree_node_create_func` of the `CxTree` to create the node first, before adding it. -The function returns non-zero in case the creation of the node fails. -The difference between `cxTreeSetParent()` and `cxTreeAddChildNode()` is, +The functions `cxTreeAddData()`, `cxTreeAddNode()`, and `cxTreeSetParent()` are similar to `cx_tree_add()`. +The difference between `cxTreeAddData()` and `cxTreeAddNode()` is, +that `cxTreeAddData()` uses the allocator of the `CxTree` to create the node first, before adding it. +The function returns `NULL` in case the creation of the node fails, or the created node when creation was successful. +The difference between `cxTreeSetParent()` and `cxTreeAddNode()` is, that `cxTreeSetParent()` does not increase the tree size, when `child` already had a parent. -> Use `cxTreeAddChildNode()` to add _new_ nodes to the tree and `cxTreeSetParent()` +> Use `cxTreeAddNode()` to add _new_ nodes to the tree and `cxTreeSetParent()` > to relink a subtree of the `tree` with a new parent. > > Using `cxTreeSetParent()` on a `child` which already has a parent in a _different_ tree is a mistake. -The functions `cxTreeInsert()`, `cxTreeInsertIter()`, and `cxTreeInsertArray()` behave -like `cx_tree_add()`, `cx_tree_add_iter()`, and `cx_tree_add_array()`. -As creation context `cdata` for the `cx_tree_node_create_func` a pointer to the `CxTree` is passed. -Instead of returning a `failed` node, like the low-level functions, the high-level functions automatically free the memory for the failed node. ## Size and Depth ```C #include <cx/tree.h> +size_t cx_tree_size(void *root, + ptrdiff_t loc_children, ptrdiff_t loc_next); + +size_t cx_tree_depth(void *root, + ptrdiff_t loc_children, ptrdiff_t loc_next); + size_t cxTreeSubtreeSize(CxTree *tree, void *subtree_root); size_t cxTreeSubtreeDepth(CxTree *tree, void *subtree_root); @@ -269,39 +173,41 @@ The functions `cxTreeSize()` and `cxTreeDepth()` are equivalent to `cxTreeSubtreeSize()` and `cxTreeSubtreeDepth()`, respectively, where `subtree_root` is the root node of the entire tree. -> Passing a `NULL` pointer as `subtree_root` makes those functions return zero. +The functions `cx_tree_size()` and `cx_tree_depth()` are the low-level counterparts. + +> Passing a `NULL` pointer as `root` or `subtree_root` makes those functions return zero. + +> Since the size of a `CxTree` is stored in the [collection](collection.h.md) metadata, `cxTreeSize()` is actually a constant time operation. +> This is not the case for `cxTreeSubtreeSize()` or `cx_tree_size()` where the size has to be calculated by traversing the tree. ## Search ```C #include <cx/tree.h> -#define CX_TREE_SEARCH_INFINITE_DEPTH +#define CX_TREE_SEARCH_INFINITE_DEPTH 0 -int cx_tree_search_data(const void *root, size_t max_depth, - const void *data, cx_tree_search_data_func sfunc, - void **result, ptrdiff_t loc_children, ptrdiff_t loc_next); +typedef int (*cx_tree_search_func)( + const void *node, const void *data); int cx_tree_search(const void *root, size_t max_depth, - const void *node, cx_tree_search_func sfunc, - void **result, ptrdiff_t loc_children, ptrdiff_t loc_next); + const void *data, cx_tree_search_func sfunc, void **result, + bool use_dfs, ptrdiff_t loc_children, ptrdiff_t loc_next); + +void *cxTreeFindFast(CxTree *tree, const void *data, + cx_tree_search_func sfunc); -void *cxTreeFind(CxTree *tree, const void *data); +void *cxTreeFind(CxTree *tree, const void *data, bool use_dfs); void *cxTreeFindInSubtree(CxTree *tree, const void *data, - void *subtree_root, size_t max_depth); + void *subtree_root, size_t max_depth, bool use_dfs); ``` -The functions `cx_tree_search_data()` and `cx_tree_search()` are equivalent, -except that `cx_tree_search_data()` compares a currently investigated node against the `data` with the specified `cx_tree_search_data_func`, -and `cx_tree_search()` compares against a `node` with the specified `cx_tree_search_func`. - -Usually you the latter is rarely needed. -It is used, for example, by `cx_tree_add()` to find the correct position where to add a freshly created node. +The function `cx_tree_search()` efficiently finds a node in the tree. The search proceeds as follows, starting with the `root` node: -1. The current node is compared with `data` / `node` using the `sfunc`. +1. The current node is compared with `data` using the `sfunc`. 2. If `sfunc` returns zero, a matching node has been found and the pointer to that node is stored at the location given by `result`. 3. If `sfunc` returns a negative number, the entire subtree is skipped. 4. If `sfunc` returns a positive number, the subtree is searched, unless the number is larger than the number of an already searched subtree. The best match will be stored at the location given by `result`. @@ -309,14 +215,16 @@ > If the `sfunc` implements some kind of metric, the search functions will return the best match in the tree with respect to that metric. > When a positive number is returned, it means that a new node with the searched data could be added as a child of the found node. -> This is exactly how `cx_tree_add()` finds the best position to add a node to the tree. + +The function `cxTreeFindFast()` uses the `sfunc` to find the `data` in the tree, and returns a pointer to the node when the data was found (or `NULL`, otherwise). -The function `cxTreeFind()` uses the `search_data_func` of the `CxTree` -to find the `data` in the tree, and returns a pointer to the node when the data was found (or `NULL`, otherwise). +The function `cxTreeFind()` instead traverses the entire tree (either with breadth-first search or depth-first search) +and compares the `data` with the collections [comparator function](collection.h.md#comparator-functions). +For this it is necessary that the `loc_data` was specified when creating the tree, or the function will immediately return `NULL`. -The function `cxTreeFindInSubtree()` is equivalent, except that it restricts the search to nodes +The functions `cxTreeFindFastInSubtree()` and `cxTreeFindInSubtree()` are similar, except that they restrict the search to nodes in the subtree starting with (and including) `subtree_root`, and skipping all nodes below the `max_depth`. -Note, that the `max_depth` is specified in relation to the `subtree_root` and not in relation to the entire tree. +Note, that the `max_depth` is specified in relation to the `subtree_root` and not in relation to the entire tree. ## Iterator and Visitor @@ -331,51 +239,56 @@ CxTreeIterator cxTreeIterateSubtree(CxTree *tree, void *node, bool visit_on_exit); +CxTreeIterator cx_tree_visitor(void *root, + ptrdiff_t loc_children, ptrdiff_t loc_next); + +CxTreeIterator cxTreeVisit(CxTree *tree); + +CxTreeIterator cxTreeVisitSubtree(CxTree *tree, void *node) + #define cxTreeIteratorContinue(iter) void cxTreeIteratorDispose(CxTreeIterator *iter); - - -CxTreeVisitor cx_tree_visitor(void *root, - ptrdiff_t loc_children, ptrdiff_t loc_next); - -CxTreeVisitor cxTreeVisit(CxTree *tree); - -CxTreeVisitor cxTreeVisitSubtree(CxTree *tree, void *node) - -#define cxTreeVisitorContinue(visitor) - -void cxTreeVisitorDispose(CxTreeVisitor *visitor); ``` There are two different kinds of iterators for trees: one for depth-first iteration and one for breadth-first iteration. -The `CxTreeIterator` is performing a depth-first iteration with the capability of visiting a node twice: +An iterator created with the _iterate_ functions is performing a depth-first iteration with the capability of visiting a node twice: first when the iterator enters the node coming from its parent, and secondly when the iterator tracks back from its last child. This behavior is controlled via the `visit_on_exit` argument. When set to `true`, the iterators `exiting` flag can be checked during iteration to see whether the iterator is currently entering or leaving the node. The above [example](#example-for-wrapping-a-libxml2-tree) for iterating through an XML tree illustrates this. -On the other hand, the `CxTreeVisitor` performs a breadth-first iteration and visits every node only once. +On the other hand, an iterator created with the _visit_ functions performs a breadth-first iteration and visits every node only once. Both iterators do not support the removal of nodes. Since tree iteration needs to keep track of a stack (depth-first) or queue (breadth-first), internal memory is allocated. This memory is _automatically_ disposed of when the iteration _completes_. -If you break from the iteration early, you must call `cxTreeIteratorDispose()` or `cxTreeVisitorDispose()`, respectively, to deallocate the memory. +If you break from the iteration early, you must call `cxTreeIteratorDispose()` to deallocate the memory. > It is recommended to always invoke the dispose functions, even when it seems not necessary. > In the best case it just does nothing, but calling them guarantees that no memory can be leaking, even when your code will change in the future. >{style="note"} -The macros `cxTreeIteratorContinue()` and `cxTreeVisitorContinue()` equivalently instruct the iterator/visitor to skip the subtree below the currently inspected node. +The macro `cxTreeIteratorContinue()` instruct the iterator to skip the subtree below the currently inspected node. + +> In contrast to iterators over [lists](list.h.md#iterators) or [maps](map.h.md#iterators), +> a tree iterator is _always_ iterating over the nodes. +> That means, the pointer in a `cx_foreach` loop is always expected to be a pointer to a node and not a pointer to +> the element / payload within the node. +> +>{style="warning"} ## Remove ```C #include <cx/tree.h> -void cx_tree_unlink(void *node, ptrdiff_t loc_parent, +typedef void (*cx_tree_relink_func)( + void *node, const void *old_parent, const void *new_parent); + +void cx_tree_remove(void *node, ptrdiff_t loc_parent, ptrdiff_t loc_children, ptrdiff_t loc_last_child, ptrdiff_t loc_prev, ptrdiff_t loc_next); @@ -385,7 +298,7 @@ void cxTreeRemoveSubtree(CxTree *tree, void *node); ``` -The low-level function `cx_tree_unlink()` removes the specified `node`, +The low-level function `cx_tree_remove()` removes the specified `node`, as well as the entire subtree beneath that node, from its parent. The high-level counterpart is `cxTreeRemoveSubtree()`. @@ -403,6 +316,9 @@ ```C #include <cx/tree.h> +typedef void (*cx_tree_relink_func)( + void *node, const void *old_parent, const void *new_parent); + int cxTreeDestroyNode(CxTree *tree, void *node, cx_tree_relink_func relink_func); @@ -426,11 +342,12 @@ The function `cxTreeFree()` behaves like `cxTreeClear()` and then deallocates the memory for the `CxTree` structure. -> Although `CxTree` supports the general concept of [destructor functions](collection.h.md#destructor-functions), -> it is not based on `CX_COLLECTION_BASE`. -> Therefore, the `cxSetDestructor()` and `cxSetAdvancedDestructor()` macros cannot be used on a `CxTree` and -> the fields must be set manually. ->{style="note"} +> In contrast to other collections, the destructor functions of a `CxTree` are _always_ invoked with pointers to the _nodes_ +> and not to pointers to the elements / the payload of the nodes. +> +> Also, the nodes are _not_ deallocated by the tree and _MUST_ be deallocated by the destructor functions. +> When the tree was created by `cxTreeCreate()` without specifying an existing `root` node, `cxFree` is automatically used as the destructor function. +> {style="note"} <seealso> <category ref="apidoc">