docs/Writerside/topics/tree.h.md

Thu, 23 Oct 2025 17:38:44 +0200

author
Mike Becker <universe@uap-core.de>
date
Thu, 23 Oct 2025 17:38:44 +0200
changeset 1439
8e7fe85febc0
parent 1429
6e0c3a8a914a
permissions
-rw-r--r--

add tests for cxMapClone() - relates to #743

# Tree

UCX provides several low-level functions for working with arbitrary tree structures,
as well as some high-level functions for trees that induce a certain order on the data they store. 

The following convenience macros allow you to declare and use simple tree structures.

```C
#define CX_TREE_NODE_BASE(Type)

#define cx_tree_node_base_layout
```

The `CX_TREE_NODE_BASE` macro declares four pointers of type `Type*`:
`parent`, `children`, `last_child`, `prev`, and `next`,
which must be placed as first member into your struct.

You can use it, for example, like this:
```C
typedef struct my_tree_node_s {
    CX_TREE_NODE_BASE(struct my_tree_node_s);
    int value;
    char *desc;
} MyTreeNode;
```

The macro `cx_tree_node_base_layout` expands to the offsets of the above-mentioned pointers.
It will become handy when calling the low-level tree functions which expect all offsets to be passed as arguments. 

> In all functions, the `last_child` and `prev` pointers are completely optional.
> If your tree structure does not contain the `last_child` pointer, the last child will be determined by
> traversing all children from the first child.
> The same happens when it does not have a `prev` pointer, and the left sibling of a child is needed.
> The `children` pointer (which points to the first child), and the `next` pointer are mandatory,
> and so is the `parent` pointer.

Before diving into the function definitions, there are four function pointer declarations you should know.

```C
#include <cx/tree.h>

typedef void *(*cx_tree_node_create_func)(
        const void *data, void *context);

typedef int (*cx_tree_search_data_func)(
        const void *node, const void *data);

typedef int (*cx_tree_search_func)(
        const void *node, const void *new_node);

typedef void (*cx_tree_relink_func)(
        void *node, const void *old_parent, const void *new_parent);
```

A `cx_tree_node_create_func` takes a pointer to the `data` the new node shall contain, and a `context` (which might be an allocator, for example).
It shall allocate and return the created node, or `NULL` when node creation is not possible.

A `cx_tree_search_data_func` shall check if the `node` contains the `data`, or if a child node might contain the data.
It shall return zero, when `node` contains the `data`.
When a child node _might_ contain the data, the returned positive number shall indicate how close the match is.
It is _not_ relevant if the data can actually be found in the subtree - the positive number also indicates that the data could be inserted in that subtree.
Only when the entire subtree starting with `node` cannot contain the `data`, a negative number is supposed to be returned.

A `cx_tree_search_func` behaves exactly the same, except that it does expect a pointer to the `data` but a pointer to a node structure.
In combination with the `cx_tree_node_create_func`, when inserting nodes with the high-level functions, a `new_node` is created first,
and then an appropriate insertion point is searched with the `cx_tree_search_func`.

A `cx_tree_relink_func` is called when an intermediate node was removed from the tree and it's children need to be detached from
the removed `old_parent` and attached to a `new_parent`.
The function is called for every child of the removed node and can be used, for example, to update the contents of the node when needed.

## Create

```C
#include <cx/tree.h>

CxTree *cxTreeCreate(const CxAllocator *allocator,
        cx_tree_node_create_func create_func,
        cx_tree_search_func search_func,
        cx_tree_search_data_func search_data_func,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

CxTree *cxTreeCreateSimple(const CxAllocator *allocator,
        cx_tree_node_create_func create_func,
        cx_tree_search_func search_func,
        cx_tree_search_data_func search_data_func);

CxTree *cxTreeCreateWrapped(const CxAllocator *allocator,
        void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
```

The function `cxTreeCreate()` creates a `CxTree` structure
where each node created by the `create_func` has the layout specified by the offset arguments.
The function `cxTreeCreateSimple()` is exactly the same, except that it assumes the `cx_tree_node_base_layout`.

In both cases the tree will be created empty, that is with no nodes, not even a root node.
On the other hand, `cxTreeCreateWrapped()` creates a `CxTree` structure with the specified layout and `root` node
where `root` may already be the root of a larger tree.

If your wrapped tree structure does not have a `last_child` or a `prev` pointer,
you may specify a negative location to indicate the missing pointer.
All other pointers are mandatory and a non-negative location is expected.

Note that a wrapped tree by default has no create or search functions assigned.
Therefore, if you wish to use one of the functions below that needs those function pointers set,
you will have to set them manually by assigning to the respective fields in the `CxTree` structure.

### Example for wrapping a libxml2 tree

In this example we wrap the XML tree of the commonly used libxml2 library.

```C
#include <libxml/tree.h>
#include <cx/tree.h>

CxTree *wrap_xml_tree(xmlNode *root) {
    return cxTreeCreateWrapped(
        cxDefaultAllocator, // or you can just write NULL
        root,
        offsetof(xmlNode, parent),
        offsetof(xmlNode, children),
        offsetof(xmlNode, last),
        offsetof(xmlNode, prev),
        offsetof(xmlNode, next)
    );
}
```

You do not need to specify any function pointers or destructors in the `CxTree` structure,
if you just want to use the resulting `CxTree` for [iterating](#iterator-and-visitor) over the nodes.

You can, for example, print the XML structure with the following code:

```C
// continued from above

void print_xml_structure(CxTree *tree) {
    // specify visit_on_exit argument = true
    // so that we are visiting each node twice: on enter and on exit
    CxTreeIterator iter = cxTreeIterate(tree, true);
    
    cx_foreach(xmlNode*, node, iter) {
        if (node->type == XML_ELEMENT_NODE) {
            // the exiting field from the iterator indicates
            // whether we are entring or leaving the subtree
            // of the current element 
            printf("%s - %s\n",
                node->name,
                iter.exiting ? "start" : "end"
            );
        }
    }
    cxTreeIteratorDispose(&iter);
}
```

## Add Nodes

```C
#include <cx/tree.h>

void cx_tree_link(void *parent, void *node, ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

extern unsigned int cx_tree_add_look_around_depth;

size_t cx_tree_add_iter(struct cx_iterator_base_s *iter, size_t n,
        cx_tree_search_func sfunc, cx_tree_node_create_func cfunc,
        void *cdata, void **failed, void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
        
size_t cx_tree_add_array(const void *src, size_t n, size_t elem_size,
        cx_tree_search_func sfunc, cx_tree_node_create_func cfunc,
        void *cdata, void **failed, void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
        
int cx_tree_add(const void *src,
        cx_tree_search_func sfunc, cx_tree_node_create_func cfunc,
        void *cdata, void **cnode, void *root,
        ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
```

The low-level function `cx_tree_link()` adds a `node` to a new `parent`, unlinking it from its previous parent, if required.

The functions `cx_tree_add_array()` and `cx_tree_add_iter()` are equivalent,
except that the former adds `n` items of size `elem_size` from the `src` array,
and the latter adds _up to_ `n` items from the iterator `iter`.
For each element, the `cfun` is called with the element as first argument and the `cdata` context as second argument.
The `cfun` function is supposed to return a node for which then the `sfunc` is used to find the location where to insert the node.
If a node could not be added, because `sfunc` always returns a negative number, it is stored at the location where `failed` points to.

> The functions `cx_tree_add_array()` and `cx_tree_add_iter()` are optimized for adding multiple nodes to
> the same subtree without starting the search all over from the root node.
> This behavior can be controlled with `cx_tree_add_look_around_depth`, which specifies for how many parent nodes
> the algorithm backtracks in the current subtree before restarting the search from the root node.

The function `cx_tree_add()` is similar to the other add-functions,
except that it does only add one single item to the tree and always stores the created node at the location where `cnode` points to.

> By design, the add-functions can only be used to add children to the tree.
> If you want to add a new root node, this is much easier by simply calling `cx_tree_link(newroot, oldroot, layout...)` .

The following high-level functions can be used to add nodes to a `CxTree`.

```C
#include <cx/tree.h>

int cxTreeAddChild(CxTree *tree, void *parent, const void *data);

void cxTreeAddChildNode(CxTree *tree, void *parent, void *child);

void cxTreeSetParent(CxTree *tree, void *parent, void *child);

int cxTreeInsert(CxTree *tree, const void *data);

size_t cxTreeInsertIter(CxTree *tree, CxIteratorBase *iter, size_t n);

size_t cxTreeInsertArray(CxTree *tree, const void *data,
        size_t elem_size, size_t n);
```

The functions `cxTreeAddChild()`, `cxTreeAddChildNode()`, and `cxTreeSetParent()` are similar to `cx_tree_link()`.
The difference between `cxTreeAddChild()` and `cxTreeAddChildNode()` is,
that `cxTreeAddChild()` uses the `cx_tree_node_create_func` of the `CxTree` to create the node first, before adding it.
The function returns non-zero in case the creation of the node fails.
The difference between `cxTreeSetParent()` and `cxTreeAddChildNode()` is,
that `cxTreeSetParent()` does not increase the tree size, when `child` already had a parent.

> Use `cxTreeAddChildNode()` to add _new_ nodes to the tree and `cxTreeSetParent()`
> to relink a subtree of the `tree` with a new parent.
> 
> Using `cxTreeSetParent()` on a `child` which already has a parent in a _different_ tree is a mistake.

The functions `cxTreeInsert()`, `cxTreeInsertIter()`, and `cxTreeInsertArray()` behave
like `cx_tree_add()`, `cx_tree_add_iter()`, and `cx_tree_add_array()`.
As creation context `cdata` for the `cx_tree_node_create_func` a pointer to the `CxTree` is passed.
Instead of returning a `failed` node, like the low-level functions, the high-level functions automatically free the memory for the failed node.

## Size and Depth

```C
#include <cx/tree.h>

size_t cxTreeSubtreeSize(CxTree *tree, void *subtree_root);

size_t cxTreeSubtreeDepth(CxTree *tree, void *subtree_root);

size_t cxTreeSize(CxTree *tree);

size_t cxTreeDepth(CxTree *tree);
```

The function `cxTreeSubtreeSize()` counts all nodes belonging to the subtree starting with `subtree_root`.

The function `cxTreeSubtreeDepth()` reports the maximum depth of the subtree starting with `subtree_root`.
If the `subtree_root` does not have any children, this results in a depth of one.

The functions `cxTreeSize()` and `cxTreeDepth()` are equivalent to `cxTreeSubtreeSize()` and `cxTreeSubtreeDepth()`, respectively, where `subtree_root` is the root node of the entire tree.

> Passing a `NULL` pointer as `subtree_root` makes those functions return zero.

## Search

```C
#include <cx/tree.h>

#define CX_TREE_SEARCH_INFINITE_DEPTH

int cx_tree_search_data(const void *root, size_t max_depth,
        const void *data, cx_tree_search_data_func sfunc,
        void **result, ptrdiff_t loc_children, ptrdiff_t loc_next);

int cx_tree_search(const void *root, size_t max_depth,
        const void *node, cx_tree_search_func sfunc,
        void **result, ptrdiff_t loc_children, ptrdiff_t loc_next);
        
void *cxTreeFind(CxTree *tree, const void *data);

void *cxTreeFindInSubtree(CxTree *tree, const void *data,
        void *subtree_root, size_t max_depth);
```

The functions `cx_tree_search_data()` and `cx_tree_search()` are equivalent,
except that `cx_tree_search_data()` compares a currently investigated node against the `data` with the specified `cx_tree_search_data_func`,
and `cx_tree_search()` compares against a `node` with the specified `cx_tree_search_func`.

Usually you the latter is rarely needed.
It is used, for example, by `cx_tree_add()` to find the correct position where to add a freshly created node.

The search proceeds as follows, starting with the `root` node:

1. The current node is compared with `data` / `node` using the `sfunc`.
2. If `sfunc` returns zero, a matching node has been found and the pointer to that node is stored at the location given by `result`.
3. If `sfunc` returns a negative number, the entire subtree is skipped.
4. If `sfunc` returns a positive number, the subtree is searched, unless the number is larger than the number of an already searched subtree. The best match will be stored at the location given by `result`. 
5. The return value will be the return value of the `sfunc` for node that was stored in the `result`, or a negative number when no match was found

> If the `sfunc` implements some kind of metric, the search functions will return the best match in the tree with respect to that metric.
> When a positive number is returned, it means that a new node with the searched data could be added as a child of the found node.
> This is exactly how `cx_tree_add()` finds the best position to add a node to the tree.

The function `cxTreeFind()` uses the `search_data_func` of the `CxTree`
to find the `data` in the tree, and returns a pointer to the node when the data was found (or `NULL`, otherwise).

The function `cxTreeFindInSubtree()` is equivalent, except that it restricts the search to nodes
in the subtree starting with (and including) `subtree_root`, and skipping all nodes below the `max_depth`.
Note, that the `max_depth` is specified in relation to the `subtree_root` and not in relation to the entire tree. 

## Iterator and Visitor

```C
#include <cx/tree.h>

CxTreeIterator cx_tree_iterator(void *root, bool visit_on_exit,
        ptrdiff_t loc_children, ptrdiff_t loc_next);

CxTreeIterator cxTreeIterate(CxTree *tree, bool visit_on_exit);

CxTreeIterator cxTreeIterateSubtree(CxTree *tree,
        void *node, bool visit_on_exit);

#define cxTreeIteratorContinue(iter)

void cxTreeIteratorDispose(CxTreeIterator *iter);


CxTreeVisitor cx_tree_visitor(void *root,
        ptrdiff_t loc_children, ptrdiff_t loc_next);

CxTreeVisitor cxTreeVisit(CxTree *tree);

CxTreeVisitor cxTreeVisitSubtree(CxTree *tree, void *node)

#define cxTreeVisitorContinue(visitor)

void cxTreeVisitorDispose(CxTreeVisitor *visitor);
```

There are two different kinds of iterators for trees: one for depth-first iteration and one for breadth-first iteration.

The `CxTreeIterator` is performing a depth-first iteration with the capability of visiting a node twice:
first when the iterator enters the node coming from its parent, and secondly when the iterator tracks back from its last child.
This behavior is controlled via the `visit_on_exit` argument.
When set to `true`, the iterators `exiting` flag can be checked during iteration to see whether the iterator is currently entering or leaving the node. 
The above [example](#example-for-wrapping-a-libxml2-tree) for iterating through an XML tree illustrates this.

On the other hand, the `CxTreeVisitor` performs a breadth-first iteration and visits every node only once.

Both iterators do not support the removal of nodes.

Since tree iteration needs to keep track of a stack (depth-first) or queue (breadth-first), internal memory is allocated.
This memory is _automatically_ disposed of when the iteration _completes_.
If you break from the iteration early, you must call `cxTreeIteratorDispose()` or `cxTreeVisitorDispose()`, respectively, to deallocate the memory.

> It is recommended to always invoke the dispose functions, even when it seems not necessary.
> In the best case it just does nothing, but calling them guarantees that no memory can be leaking, even when your code will change in the future.
>{style="note"}

The macros `cxTreeIteratorContinue()` and `cxTreeVisitorContinue()` equivalently instruct the iterator/visitor to skip the subtree below the currently inspected node.

## Remove

```C
#include <cx/tree.h>

void cx_tree_unlink(void *node, ptrdiff_t loc_parent,
        ptrdiff_t loc_children, ptrdiff_t loc_last_child,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

int cxTreeRemoveNode(CxTree *tree, void *node,
        cx_tree_relink_func relink_func);

void cxTreeRemoveSubtree(CxTree *tree, void *node);
```

The low-level function `cx_tree_unlink()` removes the specified `node`,
as well as the entire subtree beneath that node, from its parent.
The high-level counterpart is `cxTreeRemoveSubtree()`.

The function `cxTreeRemoveNode()`, on the other hand, _only_ removes the `node` from the tree,
links all children of `node` to the former parent of `node`, 
and calls the optional `relink_func` for every former child of `node` which will be relinked to a new parent.
Therefore, calling `cxTreeRemoveNode()` on the `root` node is an error and the function returns non-zero.
In all other cases, the function returns zero.

> When your tree is storing a scene graph, for example, a possible use-case for the `relink_func` might be updating
> the world transform matrices in the subtree of the removed node. 

## Dispose

```C
#include <cx/tree.h>

int cxTreeDestroyNode(CxTree *tree, void *node,
        cx_tree_relink_func relink_func);

void cxTreeDestroySubtree(CxTree *tree, void *node);

void cxTreeClear(CxTree *tree);

void cxTreeFree(CxTree *tree);
```

The function `cxTreeDestroyNode()` first [removes](#remove) the `node` from the tree, like `cxTreeRemoveNode()`,
and then invokes the [destructor functions](collection.h.md#destructor-functions).
It is guaranteed that the simple destructor is called before the advanced destructor.
If no destructor function is registered at all, the behavior is identical to `cxTreeRemoveNode()`.
That means this function does not deallocate the memory for the node on its own and leaves that entirely to the destructor functions.

The function `cxTreeDestroySubtree()` performs the above actions for the entire subtree starting with `node`.
The order in which the destructor functions for the nodes of the subtree are called is determined by a [tree iterator](#iterator-and-visitor).

The function `cxTreeClear()` is a shorthand for invoking `cxTreeDestroySubtree()` on the root node of the tree.

The function `cxTreeFree()` behaves like `cxTreeClear()` and then deallocates the memory for the `CxTree` structure.

> Although `CxTree` supports the general concept of [destructor functions](collection.h.md#destructor-functions),
> it is not based on `CX_COLLECTION_BASE`.
> Therefore, the `cxDefineDestructor()` and `cxDefineAdvancedDestructor()` macros cannot be used on a `CxTree` and
> the fields must be set manually.
>{style="note"}

<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/tree_8h.html">tree.h</a>
</category>
</seealso>

mercurial