Mon, 10 Mar 2025 11:54:46 +0100
complete list.h documentation
relates to #451
# List Interface The `list.h` header defines a common interface for all list implementations. UCX already comes with two common list implementations ([linked list](linked_list.h.md) and [array list](array_list.h.md)) that should cover most use cases. But if you feel the need to implement an own list, you will find instructions [below](#implement-own-list-structures). ```C #include <cx/linked_list.h> CxList *cxLinkedListCreate(const CxAllocator *allocator, cx_compare_func comparator, size_t elem_size); CxList *cxLinkedListCreateSimple(size_t elem_size); #include <cx/array_list.h> CxList *cxArrayListCreate(const CxAllocator *allocator, cx_compare_func comparator, size_t elem_size, size_t initial_capacity); CxList *cxArrayListCreateSimple(size_t elem_size, size_t initial_capacity); ``` The function `cxLinkedListCreate()` creates a new linked list with the specified `allocator` which stores elements of size `elem_size`. The elements are supposed to be compared with the specified `comparator` (cf. [](#access-and-find)). The function `cxLinkedListCreateSimple()` will use the stdlib default allocator and does not specify a compare function. Array lists can be created with `cxArrayListCreate()` where the additional argument `initial_capacity` specifies for how many elements the underlying array shall be initially allocated. If `CX_STORE_POINTERS` is used as `elem_size`, the actual element size will be `sizeof(void*)` and the list will behave slightly differently when accessing elements. Lists that are storing pointers will always return the stored pointer directly, instead of returning a pointer into the list's memory, thus saving you from unnecessary dereferencing. Conversely, when adding elements to the list, the pointers you specify (e.g. in `cxListAdd()` or `cxListInsert()`) are directly added to the list, instead of copying the contents pointed to by the argument. > When you create a list which is storing pointers and do not specify a compare function, `cx_cmp_ptr` will be used by default. ## Example In the following example we create a linked-list of regular expressions for filtering data. ```C #include <cx/linked_list.h> #include <regex.h> CxList *create_pattern_list(void) { // create a linked list to store patterns CxList *list = cxLinkedListCreateSimple(sizeof(regex_t)); // automatically free the pattern when list gets destroyed cxDefineDestructor(list, regfree); return list; } int add_pattern(CxList *list, const char *value) { // compile pattern and add it to the list, if successful regex_t regex; if (regcomp(®ex, value, REG_EXTENDED|REG_NOSUB)) { return 1; } else { // take address of local variable // since we are not storing pointers in this list, // we get a copy of the data directly in the list cxListAdd(list, ®ex); return 0; } } bool matches_any(CxList *list, const char *text) { CxIterator iter = cxListIterator(list); cx_foreach(regex_t*, pat, iter) { if (regexec(pat, text, 0, NULL, 0) == 0) { return true; } } return false; } ``` If in the above example the list was created with `CX_STORE_POINTERS` instead of `sizeof(regex_t)`, the `add_pattern()` function would need to be changed as follows: ```C int add_pattern(CxList *list, const char *value) { // allocate memory here, because now we are storing pointers regex_t *regex = malloc(sizeof(regex_t)); if (!regex || regcomp(regex, value, REG_EXTENDED|REG_NOSUB)) { return 1; } else { cxListAdd(list, regex); return 0; } } ``` Also, just registering `regfree()` as destructor is not sufficient anymore, because the `regex_t` structure also needs to be freed. Therefore, we would need to wrap the calls to `regfree()` and `free()` into an own destructor, which we then register with the list. However, it should be clear by now that using `CX_STORE_POINTERS` is a bad choice for this use case to begin with. As a rule of thumb: if you allocate memory for an element that you immediately put into the list, consider storing the element directly. And if you are getting pointers to already allocated memory from somewhere else, and you just want to organize those elements in a list, then consider using `CX_STORE_POINTERS`. > If you want to lazy-initialize lists, you can use the global `cxEmptyList` symbol as a placeholder instead of using a `NULL`-pointer. > While you *must not* insert elements into that list, you can safely access this list or create iterators. > This allows you to write clean code without checking for `NULL`-pointer everywhere. > You still need to make sure that the placeholder is replaced with an actual list before inserting elements. ## Insert ```C #include <cx/list.h> int cxListAdd(CxList *list, const void *elem); int cxListInsert(CxList *list, size_t index, const void *elem); int cxListInsertSorted(CxList *list, const void *elem); size_t cxListAddArray(CxList *list, const void *array, size_t n); size_t cxListInsertArray(CxList *list, size_t index, const void *array, size_t n); size_t cxListInsertSortedArray(CxList *list, const void *array, size_t n); int cxListInsertAfter(CxIterator *iter, const void *elem); int cxListInsertBefore(CxIterator *iter, const void *elem); ``` The function `cxListAdd()` appends the element `elem` to the list and returns zero on success or non-zero when allocating the memory for the element fails. Similarly, `cxListInsert()` adds the element at the specified `index`, and `cxListInsertSorted()` adds the element at the correct position so that the list remains sorted according to the list's compare function. When you are currently iterating through a list, you can insert elements before or after the current position of the iterator with `cxListInsertBefore()` or `cxListInsertAfter()`, respectively. This _should_ be done with [mutating iterators](iterator.h.md#mutating-iterators) only, but is also defined for normal iterators. If the list is storing pointers (cf. `cxCollectionStoresPointers()`), the pointer `elem` is directly added to the list. Otherwise, the contents at the location pointed to by `elem` are copied to the list's memory with the element size specified during creation of the list. On the other hand the `array` argument for `cxListAddArray()`, `cxListInsertArray()`, and `cxListInsertSortedArray()` must always point to an array of elements to be added (i.e. either an array of pointers, or an array of actual elements). A special requirement for `cxListInsertSortedArray()` is, that the `array` must already be sorted. > Implementations are required to optimize the insertion of a sorted array into a sorted list in linear time, > while inserting each element separately via `cxListInsertSorted()` may take quadratic time. > The functions `cxListInsertSorted()` and `cxListInsertSortedArray()` do neither check if the list is already sorted, nor do they actively sort the list. > {style="note"} ## Access and Find <link-summary>Functions for searching and accessing elements.</link-summary> ```C #include <cx/list.h> void *cxListAt(const CxList *list, size_t index); size_t cxListFind(const CxList *list, const void *elem); size_t cxListFindRemove(CxList *list, const void *elem); bool cxListIndexValid(const CxList *list, size_t index); size_t cxListSize(const CxList *list); ``` The function `cxListAt()` accesses the element at the specified `index`. If the list is storing pointers (i.e. `cxCollectionStoresPointers()` returns `true`), the pointer at the specified index is directly returned. Otherwise, a pointer to element at the specified index is returned. If the index is out-of-bounds, the function returns `NULL`. On the other hand, `cxListFind()` searches for the element pointed to by `elem` by comparing each element in the list with the list`s compare function, and returns the first index when the element was found. Otherwise, the function returns the list size. The function `cxListFindRemove()` behaves like `cxListFind()`, except that it also removes the first occurrence of the element from the list. This will _also_ call destructor functions, if specified, so take special care when you continue to use `elem`, or when the list is storing pointers and the element appears more than once in the list. With `cxListIndexValid()` you can check the index returned by `cxListFind()` or `cxListFindRemove()`, which is more convenient than comparing the return value if the return value of `cxListSize()`. ## Remove ```C #include <cx/list.h> int cxListRemove(CxList *list, size_t index); size_t cxListRemoveArray(CxList *list, size_t index, size_t num); int cxListRemoveAndGet(CxList *list, size_t index, void *targetbuf); size_t cxListRemoveArrayAndGet(CxList *list, size_t index, size_t num, void *targetbuf); size_t cxListFindRemove(CxList *list, const void *elem); void cxListClear(CxList *list); ``` The function `cxListRemove()` removes the element at the specified index and returns zero, or returns non-zero if the index is out-of-bounds. The function `cxListRemoveArray()` removes `num` elements starting at `index` and returns the number of elements that have been removed. For each removed element, the [destructor functions](collection.h.md#destructor-functions) are called. On the other hand, `cxListRemoveAndGet()` and `cxListRemoveArrayAndGet()` do not invoke the destructor functions and instead copy the elements into the `targetbuf`, which must be large enough to hold the removed elements. The function `cxListFindRemove()` finds the first element within the list that is equivalent to the element pointed to by `elem` and removes it from the list. This will _also_ call destructor functions, if specified, so take special care when you continue to use `elem`. The function `cxListClear()` simply removes all elements from the list, invoking the destructor functions. It behaves equivalently, but is usually more efficient than calling `cxListRemove()` for every index. ## Iterators ```C #include <cx/list.h> CxIterator cxListIterator(const CxList *list); CxIterator cxListBackwardsIterator(const CxList *list); CxIterator cxListIteratorAt(const CxList *list, size_t index); CxIterator cxListBackwardsIteratorAt(const CxList *list, size_t index); CxIterator cxListMutIterator(CxList *list); CxIterator cxListMutBackwardsIterator(CxList *list); CxIterator cxListMutIteratorAt(CxList *list, size_t index); CxIterator cxListMutBackwardsIteratorAt(CxList *list, size_t index); ``` The functions `cxListIterator()` and `cxListBackwardsIterator()` create iterators that start and the first or the last element in the list and iterate through the entire list. The functions `cxListIteratorAt()` and `cxListBackwardsIteratorAt()` start with the element at the specified index and iterate until the end, or the beginning of the list, respectively. The functions with `Mut` in are equivalently, except that they create a [mutating iterator](iterator.h.md#mutating-iterators). Removing elements via a mutating iterator will cause an invocation of the [destructor functions](collection.h.md#destructor-functions) for the removed element. If is safe to specify an out-of-bounds index in which case an iterator is returned for which `cxIteratorValid()` returns `false`, immediately. ## Reorder ```C #include <cx/list.h> int cxListSwap(CxList *list, size_t i, size_t j); void cxListReverse(CxList *list); void cxListSort(CxList *list); ``` The function `cxListSwap()` swaps two elements specified by the indices `i` and `j`. The return value is non-zero if one of the indices is out-of-bounds. The function `cxListReverse()` reorders all elements, so that they appear in exactly the opposite order after invoking this function. The function `cxListSort()` sorts all elements with respect to the list's compare function, unless the list is already sorted (cf. `cxCollectionSorted()`), in which case the function immediately returns. Default UCX implementations of the list interface make use of [small buffer optimizations](install.md#small-buffer-optimizations) when swapping elements. > An invocation of `cxListSort` sets the `sorted` flag of the [collection](collection.h.md). > Implementations usually make use of this flag to optimize search operations, if possible. > For example the [array list](array_list.h.md) implementation will use binary search > for `cxListFind()` and similar operations, when the list is sorted. ## Compare ```C #include <cx/list.h> int cxListCompare(const CxList *list, const CxList *other); ``` Arbitrary lists can be compared element-wise with `cxListCompare()`, as long as the compare functions of both lists are equivalent. That means, you can compare an UCX [array list](array_list.h.md) with a [linked list](linked_list.h.md), and you could even compare lists that are storing pointers with lists that are not storing pointers. However, the optimized list-internal compare implementation is only used, when both the compare functions and the list classes are identical. Otherwise, `cxListCompare()` will behave as if you were iterating through both lists and manually comparing the elements. The return value of `cxListCompare()` is zero, if the lists are element-wise equivalent. If they are not, the non-zero return value equals the return value of the used compare function for the first pair of elements that are not equal. ## Dispose ```C #include <cx/list.h> void cxListFree(CxList *list); ``` No matter with which function a `CxList` has been created, you can _always_ deallocate the entire memory with a call to `cxListFree()`. The effect is equivalent to invoking `cxListClear()` plus deallocating the memory for the list structure. That means, for each element in the list, the [destructor functions](collection.h.md#destructor-functions) are called before deallocating the list. When the list has been storing pointers, make sure that either another reference to the same memory exist in your program, or any of the destructor functions deallocates the memory. Otherwise, there is a risk of a memory leak. ## Implement own List Structures If you want to create your own list implementation, this is extremely easy. You just need to define a function for creating your list and assign a `cx_list_class` structure with the pointers to your implementation. Then you call the `cx_list_init()` helper function to initialize the collection sture. This also automatically adds support for `CX_STORE_POINTERS` to your list. ```C // define the class with pointers to your functions static cx_list_class my_list_class = { my_list_destructor, my_list_insert_element, my_list_insert_array, my_list_insert_sorted, my_list_insert_iter, my_list_remove, my_list_clear, my_list_swap, my_list_at, my_list_find_remove, my_list_sort, my_list_compare, my_list_reverse, my_list_iterator, }; typedef struct { struct cx_list_s base; // your custom list data goes here } my_list; CxList *my_list_create( const CxAllocator *allocator, cx_compare_func cmpfun, size_t elem_size ) { if (allocator == NULL) { allocator = cxDefaultAllocator; } my_list *list = cxCalloc(allocator, 1, sizeof(my_list)); if (list == NULL) return NULL; cx_list_init((CxList*)list, &my_list_class, allocator, cmpfun, elem_size); // initialize your custom list data here return (CxList *) list; } ``` The required behavior for the implementations is described in the following table. You can always look at the source code of the UCX implementations to get inspiration. | Function | Description | |------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `clear` | Invoke destructor functions on all elements and remove them from the list. | | `destructor` | Invoke destructor functions on all elements and deallocate the entire list memory. | | `insert_element` | Insert a single element at the specified index. Return zero on success and non-zero on failure. | | `insert_array` | Insert an array of elements starting at the specified index. Return the number of elements inserted. | | `insert_sorted` | Insert an array of sorted elements into a sorted list. Return the number of elements inserted. | | `insert_iter` | Insert a single element depending on the iterator position. The third argument to this function is zero when the element shall be inserted after the iterator position and non-zero if it shall be inserted before the iterator position. The implementation is also responsible for adjusting the iterator, respectively. | | `remove` | Removes a multiple elements starting at the specified index. If a target buffer is specified, copy the elements to that buffer. Otherwise, invoke the destructor functions for the elements. Return the number of elements actually removed. | | `swap` | Swap two elements by index. Return zero on success or non-zero when any index was out-of-bounds. | | `at` | Return a pointer to the element at the specified index or `NULL` when the index is out-of-bounds. | | `find_remove` | Search for the specified element with the list's compare function and return the index if found. If the `remove` argument is true, invoke the destructor functions and remove the element. Return the list size if the element is not found. | | `sort` | Sort all elements in the list with respect to the list's compare function. | | `compare` | Only function pointer that may be `NULL`, in which case an unoptimized fallback is used. You can implement an optimized compare function that compares the list of another list of the same class. | | `reverse` | Reorders all elements in the list so that they appear in exactly the opposite order. | | `iterator` | Creates an iterator starting at the specified index. The Boolean argument indicates whether iteration is supposed to traverse backwards. | > If you initialize your list with `cx_list_init()` you do not have to worry about making a > difference between storing pointers and storing elements, because your implementation will > be automatically wrapped. > This means, you only have to handle the one single case described above. ### Default Class Function Implementations If you are satisfied with some of the default implementations, you can use some pre-defined functions. Note, however, that those are not optimized for any specific list structure and just serve as a quick and convenient solution if performance does not matter for your use case. | Default Function | Description | |---------------------------------|-----------------------------------------------------------------------------------| | `cx_list_default_insert_array` | Falls back to multiple calls of `insert_element`. | | `cx_list_default_insert_sorted` | Uses linear search to find insertion points. | | `cx_list_default_sort` | Copies all elements to an array, applies `qsort()`, and copies the elements back. | | `cx_list_default_swap` | Uses a temporarily allocated buffer to perform a three-way-swap. | <seealso> <category ref="apidoc"> <a href="https://ucx.sourceforge.io/api/list_8h.html">list.h</a> </category> </seealso>