docs/Writerside/topics/linked_list.h.md

Thu, 23 Oct 2025 17:54:17 +0200

author
Mike Becker <universe@uap-core.de>
date
Thu, 23 Oct 2025 17:54:17 +0200
changeset 1441
78ec3e2243e4
parent 1424
563033aa998c
permissions
-rw-r--r--

add documentation for cxListClone() - relates to #744

# Linked List

On top of implementing the list interface, this header also defines several low-level functions that
work with arbitrary structures.

The high-level [list interface](list.h.md) is documented on a separate page and explains how lists are used
that are created by one of the following functions.

```C
#include <cx/linked_list.h>

CxList *cxLinkedListCreate(const CxAllocator *allocator,
        cx_compare_func comparator, size_t elem_size);
        
CxList *cxLinkedListCreateSimple(size_t elem_size);
```

The remaining content of this page concentrates on the low-level functions. 

## Basic Structure

All functions described on this page work with nodes that have the following structure.
```C
struct node {
    node *next;
    node *prev; // optional
    // ... the element data goes here ...
};
```

Each node must have at least one pointer that contains the address of the subsequent node.
Optionally, for doubly-linked lists, there may be a second pointer that points to the predecessor.

The functions usually expect a `loc_prev` and a `loc_next` offset.
In the example structure from above you can obtain them with `offsetof(struct node, next)` and `offsetof(struct node, prev)`.
In all functions, `loc_prev` is optional in the sense, that when you do not have a `prev` pointer, you can specify a negative value.
When a function expects a `loc_advance` offset, you can freely choose if you want to pass the offset of the `next` or the `prev` pointer,
depending on what you want to do.

If a function expects a `void** begin` and a `void** end` pointer, they are usually both optional, unless otherwise specified.
If non-`NULL`, they point to the variables where the addresses of the first, or the last, node of a list are stored, respectively.
When a list operation results in a new first (or last) node, the addresses are overwritten.
In simple scenarios, you usually keep a pointer to the beginning of a list, and hence you would usually pass `NULL` to the `end` argument.
If you are designing a stack-like linked-list, it may happen that you only want to store the last node of a list (the top of stack),
hence passing `NULL` to the `begin` argument and the address of your top-of-stack pointer to the `end` argument.
Either way, most functions allow you a big deal of flexibility - please still read the documentation of each function carefully to learn which combinations are allowed.

When you are working with a singly-linked list, it is still sometimes necessary to access the predecessor of a node.
In the absence of a `prev` pointer, the only chance is to start at the beginning of the list and search for that node.
This is exactly what `cx_linked_list_prev()` does.
```C
#include <cx/linked_list.h>

void *cx_linked_list_prev(const void *begin,
        ptrdiff_t loc_next, const void *node);
```

Starting with the `begin` node, the function checks if the next node is exactly the `node` (by pointer-comparison).
If true, the function terminates and returns the current node.
Otherwise, it moves on with the search.
If `begin` is already the searched `node`, this function immediately returns `NULL` as there is no predecessor.
Note carefully that the behavior of this function is not defined when `node` is not contained in the list that starts with `begin`.

> It is advisable to use the low-level functions inside own custom functions that you define particularly for your lists.
> Otherwise you will get a lot of boilerplate code when specifying the offsets to the pointers in your node structure in every call
> across your entire program.

## Link and Unlink

```C
#include <cx/linked_list.h>

void cx_linked_list_link(void *left, void *right,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

void cx_linked_list_unlink(void *left, void *right,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);
```

When you have two nodes `left` and `right` you can link or unlink them with the functions shown above.

When linking `left` and `right` you should make sure that `left` as currently no successor and `right` has no predecessor,
because the pointers will be overwritten without unlinking possible existing links, first.

On the other hand, when unlinking `left` and `right`, they must actually be linked right now.
This is even asserted in debug builds.

Usually you should not need those functions when working with the [insert](#insert) and [remove](#remove) functions below. 

## Insert

```C
#include <cx/linked_list.h>

void cx_linked_list_add(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *new_node);

void cx_linked_list_prepend(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *new_node);

void cx_linked_list_insert(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *node, void *new_node);
        
void cx_linked_list_insert_chain(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *node, void *insert_begin, void *insert_end);

void cx_linked_list_insert_sorted(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *new_node, cx_compare_func cmp_func);

void cx_linked_list_insert_sorted_chain(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *insert_begin, cx_compare_func cmp_func);
        
int cx_linked_list_insert_unique(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *new_node, cx_compare_func cmp_func);

void *cx_linked_list_insert_unique_chain(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *insert_begin, cx_compare_func cmp_func);
```

The above functions can be used to insert one or more elements into a linked list.

The functions `cx_linked_list_add()` and `cx_linked_list_prepend()` add the `new_node` to the beginning, or the end, of the list, respectively.
Either `begin` or `end` needs to be non-`NULL`.
If you pass `NULL` to `end` in `cx_linked_list_add()`, you have to specify `begin` and `loc_next`, instead.
On the other hand, if you pass `NULL` to `begin` in `cx_linked_list_prepend()`, you have to specify `end` and `loc_prev`, instead.

The function `cx_linked_list_insert()` behaves like `cx_linked_list_insert_chain()` where both
`insert_begin` and `insert_end` are set to `new_node`.

The function `cx_linked_list_insert_chain()` inserts the chain of nodes starting with `insert_begin` after the node `node`.
If `insert_end` is `NULL`, the end is automatically detected, in which case `loc_next` must be available.
If you pass `NULL` to `node`, the entire chain is prepended to the list, in which case either `begin` must be non-`NULL`,
or `end` must be non-`NULL` and `loc_prev` must be available to determine the start of the list.

The functions `cx_linked_list_insert_sorted()` and `cx_linked_list_insert_sorted_chain()` are equivalent,
except that `begin` and `loc_next` are always required, and the target list must already be sorted.
The order is determined by the `cmp_func` to which the pointers to the nodes are passed.

The functions `cx_linked_list_insert_unique()` and `cx_linked_list_insert_unique_chain()` are similar to
`cx_linked_list_insert_sorted()` and `cx_linked_list_insert_sorted_chain()`, except that they only insert the node
if it is not already contained in the list.
When a duplicate is found, `cx_linked_list_insert_unique()` returns non-zero and `cx_linked_list_insert_unique_chain()`
returns a pointer to a new chain starting with the first node that was not inserted.

> The `cx_linked_list_insert_sorted_chain()` function does not have an `insert_end` argument, because
> it cannot take advantage of simply inserting the entire chain as-is, as the chain might need to be broken
> to maintain the sort order.

## Access and Find

```C
#include <cx/linked_list.h>

void *cx_linked_list_first(const void *node, ptrdiff_t loc_prev);

void *cx_linked_list_last(const void *node, ptrdiff_t loc_next);

void *cx_linked_list_at(const void *start,
        size_t start_index, ptrdiff_t loc_advance, size_t index);

void *cx_linked_list_find(const void *start,
        ptrdiff_t loc_advance, ptrdiff_t loc_data,
        cx_compare_func cmp_func,
        const void *elem, size_t *found_index);

size_t cx_linked_list_size(const void *node, ptrdiff_t loc_next);
```

The functions `cx_linked_list_first()` and `cx_linked_list_last()` can be used to find the first, or last,
node in your list in case you are not keeping track of them separately.
They can start at an arbitrary node within the list.

The function `cx_linked_list_at()` starts at an arbitrary node `start` which is _specified_ to have the index `start_index`,
and finds the node at `index`.
If `index` is larger than `start_index`, you must pass the offset of the `next` pointer to `loc_advanced`.
On the other hand, if `index` is smaller than `start_index`, you must pass the offset of the `prev` pointer.
If both indices match, the function simply returns `start`.
And if `index` is out-of-bounds, the function returns `NULL`.

The function `cx_linked_list_find()` starts a search at the `start` node and compares each element with `elem`.
If `loc_data` is non-zero, the data-type of `elem` must be a pointer to data which is compatible to the data located at the specified offset in the node.
If `loc_data` is zero, `elem` is expected to point to a node structure (which is usually artificially created for the sake of comparison and not contained in the list).
When the searched element is found, a pointer to the node is returned and the index (assuming `start` has index zero) is written to the optional `found_index`, if non-`NULL`.

The size of a list, or sub-list, can be determined with `cx_linked_list_size()` which may start at an arbitrary `node“ in the list.

> A creative way of using `cx_linked_list_size()` in doubly-linked lists is to use the offset of the `prev` pointer
> for `loc_next`, in which case the function will return the index of the node within the list plus one.

## Remove

```C
#include <cx/linked_list.h>

void cx_linked_list_remove(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next, void *node);

size_t cx_linked_list_remove_chain(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        void *node, size_t num);
```

You can either remove a single element or a specified number of subsequent elements from the list.

The function `cx_linked_list_remove()` is equivalent to `cx_linked_list_remove_chain()` where `num` is set to one.

The function `cx_linked_list_remove_chain()` removes up to `num` number of nodes from the list where `node` is contained, including `node` itself.
It returns the actual number of removed elements, which might be smaller when the list does not contain at least `num` elements.

The `prev` and `next` pointers of _all_ removed nodes are kept completely intact, allowing traversal within the removed chain,
as well as identifying the formerly adjacent nodes with the list from which the chain was removed.  

> Both `begin` and `end` pointers are optional, if you specify both `loc_prev` and `loc_next`.
> In case your list does not have a `prev` pointer, specifying `begin` is mandatory (because there would be no other way to determine the predecessor of `node`).
>{style="note"} 

> While specifying _only_ `loc_prev` and `end` is technically illegal, you can simply swap roles
> and use the offset of your `prev` pointer as `loc_next` and the address of your `end` pointer as `begin`.
> The list is then traversed backwards, but otherwise everything works as expected. 

## Reorder

```C
#include <cx/linked_list.h>

void cx_linked_list_reverse(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next);

void cx_linked_list_sort(void **begin, void **end,
        ptrdiff_t loc_prev, ptrdiff_t loc_next,
        ptrdiff_t loc_data, cx_compare_func cmp_func);
```

The function `cx_linked_list_reverse()` swaps all links of all nodes in the specified list, effectively reversing the order of elements.

The function `cx_linked_list_sort()` uses a recursive _natural mergesort_ implementation to sort the list with respect to the specified `cmp_fnc`.
The non-negative `loc_data` offset is used to calculate the pointers that are passed to the compare function.
If you choose `loc_data` to be zero, the pointers to the nodes themselves are passed.

> The `begin` pointer is required in all of the above functions while the `end` pointer is still optional.
> {style="note"}

> Sorting uses [small buffer optimization](install.md#small-buffer-optimizations) for small list sizes.
> If a list contains no more than `CX_LINKED_LIST_SORT_SBO_SIZE` number of elements, no additional heap memory is allocated during the entire operation.
> Otherwise, merge operations still do not allocate extra memory as long as the length of the merged sub-list is not larger.

## Compare

```C
#include <cx/linked_list.h>

int cx_linked_list_compare(
        const void *begin_left, const void *begin_right,
        ptrdiff_t loc_advance, ptrdiff_t loc_data,
        cx_compare_func cmp_func
);
```

For comparing two linked list, you need to specify where to start,
and the offsets for the pointer to the next node and the data.

In the natural case, `begin_left` and `begin_right` point to the first node of either list,
and `loc_advance` is the offset of the `next` pointer.
But it is also possible to start with the _last_ node of both lists and use the `prev` pointer to compare them backwards.

The `loc_data` offset is used to calculate the pointer that is passed to the `cmp_fnc`.
This can either be the offset of a specific field in the struct or simply zero, in which case the pointers to the nodes themselves are passed to the compare function.

<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/linked__list_8h.html">linked_list.h</a>
</category>
</seealso>

mercurial