implement a new allocation strategy for array lists and add cxListReserve() and cxListShrink()

Mon, 10 Nov 2025 21:36:15 +0100

author
Mike Becker <universe@uap-core.de>
date
Mon, 10 Nov 2025 21:36:15 +0100
changeset 1482
6769cb72521b
parent 1481
1dda1eed1899
child 1483
97a6cf1520ba

implement a new allocation strategy for array lists and add cxListReserve() and cxListShrink()

resolves #758

CHANGELOG file | annotate | diff | comparison | revisions
docs/Writerside/topics/about.md file | annotate | diff | comparison | revisions
docs/Writerside/topics/list.h.md file | annotate | diff | comparison | revisions
src/array_list.c file | annotate | diff | comparison | revisions
src/cx/array_list.h file | annotate | diff | comparison | revisions
src/cx/list.h file | annotate | diff | comparison | revisions
src/kv_list.c file | annotate | diff | comparison | revisions
src/linked_list.c file | annotate | diff | comparison | revisions
src/list.c file | annotate | diff | comparison | revisions
--- a/CHANGELOG	Sun Nov 09 16:29:22 2025 +0100
+++ b/CHANGELOG	Mon Nov 10 21:36:15 2025 +0100
@@ -8,6 +8,7 @@
  * adds support for comparing arbitrary strings without explicit call to cx_strcast()
  * adds clone, union, difference, and intersection functions for CxList and CxMap
  * adds cxListContains() and cxMapContains()
+ * adds cxListReserve() and cxListShrink()
  * adds cxListSet()
  * adds cxListFirst() and cxListLast()
  * adds cxListRemoveAndGetFirst() and cxListRemoveAndGetLast(),
@@ -35,6 +36,8 @@
  * changes insert_element member function of CxList to accept NULL source and return a pointer to the inserted element
  * changes the compare function wrapper for pointer lists so that it no longer invokes the actual compare function for NULL pointers
  * changes struct cx_array_reallocator_s by replacing the four generic data members with two specifically named members
+ * changes cx_array_reserve() so that it reserves exactly the requested capacity (i.e., without overallocation)
+ * changes all other array functions to perform smart overallocation to avoid too many subsequent allocations
  * fixes critical memory overflow in the stack-based array reallocator (this unfortunately breaks the function signature)
  * fixes critical bug in cx_array_insert_sorted() that caused an infinite loop when inserting duplicates
  * fixes mempool implementation not supporting NULL as argument for realloc
--- a/docs/Writerside/topics/about.md	Sun Nov 09 16:29:22 2025 +0100
+++ b/docs/Writerside/topics/about.md	Mon Nov 10 21:36:15 2025 +0100
@@ -35,6 +35,7 @@
 * adds support for comparing arbitrary strings without explicit call to cx_strcast()
 * adds clone, union, difference, and intersection functions for CxList and CxMap
 * adds cxListContains() and cxMapContains()
+* adds cxListReserve() and cxListShrink()
 * adds cxListSet()
 * adds cxListFirst() and cxListLast()
 * adds cxListRemoveAndGetFirst() and cxListRemoveAndGetLast(),
@@ -62,6 +63,8 @@
 * changes insert_element member function of CxList to accept NULL source and return a pointer to the inserted element
 * changes the compare function wrapper for pointer lists so that it no longer invokes the actual compare function for NULL pointers
 * changes struct cx_array_reallocator_s by replacing the four generic data members with two specifically named members
+* changes cx_array_reserve() so that it reserves exactly the requested capacity (i.e., without overallocation)
+* changes all other array functions to perform smart overallocation to avoid too many subsequent allocations
 * fixes critical memory overflow in the stack-based array reallocator (this unfortunately breaks the function signature)
 * fixes critical bug in cx_array_insert_sorted() that caused an infinite loop when inserting duplicates
 * fixes mempool implementation not supporting NULL as argument for realloc
--- a/docs/Writerside/topics/list.h.md	Sun Nov 09 16:29:22 2025 +0100
+++ b/docs/Writerside/topics/list.h.md	Mon Nov 10 21:36:15 2025 +0100
@@ -432,6 +432,33 @@
 > Passing the same pointer for different list arguments may result in erroneous behavior.
 >{style="warning"}
 
+## Reserve and Shrink
+
+```C
+#include <cx/list.h>
+
+int cxListReserve(CxList *list, size_t count);
+
+int cxListShrink(CxList *list);
+```
+
+If a list implementation allows overallocation, the function `cxListReserve()` can be used to reserve memory for a total of `count` elements.
+On the other hand, you can use `cxListShrink()` to dispose of any overallocated memory and reduce the capacity to the actual number of currently stored elements.
+Calling `cxListReserve()` with a `count` argument that is less than the current size of the list has no effect and the function returns zero.
+
+If allocating memory fails, `cxListReserve()` returns a non-zero value.
+Since shrinking should never fail, `cxListShrink()` will usually always return zero,
+but depending on the list (or allocator) implementation, the possibility for a non-zero return value is there.
+
+List implementations that do not overallocate are advised to simply return zero.
+That means, using those functions never does any harm and calls to them can be added whenever it seems appropriate.
+Even if the currently used list implementation does not support overallocation, it may become extremely useful when the concrete implementation is exchanged later.
+
+> The function `cxListReserve()` should not be confused with `cxListEmplaceArray()`.
+> The former will only increase the capacity of a list which supports overallocation without changing the size.
+> The latter will actually add real, uninitialized elements.
+>{style="note"}
+
 ## Dispose
 
 ```C
@@ -474,6 +501,7 @@
         my_list_sort,
         my_list_compare,
         my_list_reverse,
+        my_list_change_capacity,
         my_list_iterator,
 };
 
@@ -505,23 +533,24 @@
 The required behavior for the implementations is described in the following table.
 You can always look at the source code of the UCX implementations to get inspiration.
 
-| Function         | Description                                                                                                                                                                                                                                                                                                                |
-|------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `clear`          | Invoke destructor functions on all elements and remove them from the list.                                                                                                                                                                                                                                                 |
-| `deallocate`     | Invoke destructor functions on all elements and deallocate the entire list memory.                                                                                                                                                                                                                                         |
-| `insert_element` | Insert or allocate a single element at the specified index. Return a pointer to the allocated element or `NULL` on failure.                                                                                                                                                                                                |
-| `insert_array`   | Insert or allocate an array of elements starting at the specified index. Return the number of successfully inserted or allocated elements.                                                                                                                                                                                 |
-| `insert_sorted`  | Insert an array of sorted elements into a sorted list. Return the number of elements processed (equals the number of elements inserted in this case).                                                                                                                                                                      |
-| `insert_unique`  | Insert an array of sorted unique elements into a sorted list. Return the number of elements processed (not the number of elements inserted, which might be lower).                                                                                                                                                         |
-| `insert_iter`    | Insert a single element depending on the iterator position. The third argument to this function is zero when the element shall be inserted after the iterator position and non-zero if it shall be inserted before the iterator position. The implementation is also responsible for adjusting the iterator, respectively. |
-| `remove`         | Removes a multiple elements starting at the specified index. If a target buffer is specified, copy the elements to that buffer. Otherwise, invoke the destructor functions for the elements. Return the number of elements actually removed.                                                                               |
-| `swap`           | Swap two elements by index. Return zero on success or non-zero when any index was out-of-bounds.                                                                                                                                                                                                                           |
-| `at`             | Return a pointer to the element at the specified index or `NULL` when the index is out-of-bounds.                                                                                                                                                                                                                          |
-| `find_remove`    | Search for the specified element with the list's compare function and return the index if found. If the `remove` argument is true, invoke the destructor functions and remove the element. Return the list size if the element is not found.                                                                               |
-| `sort`           | Sort all elements in the list with respect to the list's compare function.                                                                                                                                                                                                                                                 |
-| `compare`        | Only function pointer that may be `NULL`, in which case an unoptimized fallback is used. You can implement an optimized compare function that compares the list of another list of the same class.                                                                                                                         |
-| `reverse`        | Reorders all elements in the list so that they appear in exactly the opposite order.                                                                                                                                                                                                                                       |
-| `iterator`       | Creates an iterator starting at the specified index. The Boolean argument indicates whether iteration is supposed to traverse backwards.                                                                                                                                                                                   |
+| Function          | Description                                                                                                                                                                                                                                                                                                                |
+|-------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `clear`           | Invoke destructor functions on all elements and remove them from the list.                                                                                                                                                                                                                                                 |
+| `deallocate`      | Invoke destructor functions on all elements and deallocate the entire list memory.                                                                                                                                                                                                                                         |
+| `insert_element`  | Insert or allocate a single element at the specified index. Return a pointer to the allocated element or `NULL` on failure.                                                                                                                                                                                                |
+| `insert_array`    | Insert or allocate an array of elements starting at the specified index. Return the number of successfully inserted or allocated elements.                                                                                                                                                                                 |
+| `insert_sorted`   | Insert an array of sorted elements into a sorted list. Return the number of elements processed (equals the number of elements inserted in this case).                                                                                                                                                                      |
+| `insert_unique`   | Insert an array of sorted unique elements into a sorted list. Return the number of elements processed (not the number of elements inserted, which might be lower).                                                                                                                                                         |
+| `insert_iter`     | Insert a single element depending on the iterator position. The third argument to this function is zero when the element shall be inserted after the iterator position and non-zero if it shall be inserted before the iterator position. The implementation is also responsible for adjusting the iterator, respectively. |
+| `remove`          | Remove multiple elements starting at the specified index. If a target buffer is specified, copy the elements to that buffer. Otherwise, invoke the destructor functions for the elements. Return the number of elements actually removed.                                                                                  |
+| `swap`            | Swap two elements by index. Return zero on success or non-zero when any index was out-of-bounds.                                                                                                                                                                                                                           |
+| `at`              | Return a pointer to the element at the specified index or `NULL` when the index is out-of-bounds.                                                                                                                                                                                                                          |
+| `find_remove`     | Search for the specified element with the list's compare function and return the index if found. If the `remove` argument is true, invoke the destructor functions and remove the element. Return the list size if the element is not found.                                                                               |
+| `sort`            | Sort all elements in the list with respect to the list's compare function.                                                                                                                                                                                                                                                 |
+| `compare`         | This function pointer can be `NULL`, in which case an unoptimized fallback is used. You can implement an optimized compare function that compares the list of another list of the same class.                                                                                                                              |
+| `reverse`         | Reorder all elements in the list so that they appear in exactly the opposite order.                                                                                                                                                                                                                                        |
+| `change_capacity` | When your list supports overallocation, the capacity shall be changed to the specified value. Return non-zero on error and zero on success. When the list does not support overallocation, this function pointer can be `NULL`.                                                                                            |
+| `iterator`        | Create an iterator starting at the specified index. The Boolean argument indicates whether iteration is supposed to traverse backwards.                                                                                                                                                                                    |
 
 > If you initialize your list with `cx_list_init()`, you do not have to worry about making a
 > difference between storing pointers and storing elements, because your implementation will
--- a/src/array_list.c	Sun Nov 09 16:29:22 2025 +0100
+++ b/src/array_list.c	Mon Nov 10 21:36:15 2025 +0100
@@ -103,20 +103,31 @@
 // LOW LEVEL ARRAY LIST FUNCTIONS
 
 /**
- * Increases the capacity until it is a multiple of a some alignment or reaches the maximum.
+ * Intelligently calculates a new capacity, reserving some more
+ * elements than required to prevent too many allocations.
  *
- * @param cap the required capacity (must not be larger than @p max)
- * @param alignment the alignment
- * @param max the maximum capacity
- * @return the aligned capacity
+ * @param current_capacity the current capacity of the array
+ * @param needed_capacity the required capacity of the array
+ * @param maximum_capacity the maximum capacity (given by the data type)
+ * @return the new capacity
  */
-static size_t cx_array_align_capacity(
-        size_t cap,
-        size_t alignment,
-        size_t max
+static size_t cx_array_grow_capacity(
+    size_t current_capacity,
+    size_t needed_capacity,
+    size_t maximum_capacity
 ) {
-    if (cap > max - alignment) {
-        return cap;
+    if (current_capacity >= needed_capacity) {
+        return current_capacity;
+    }
+    size_t cap = needed_capacity;
+    size_t alignment;
+    if (cap < 128) alignment = 16;
+    else if (cap < 1024) alignment = 64;
+    else if (cap < 8192) alignment = 512;
+    else alignment = 1024;
+
+    if (cap - 1 > maximum_capacity - alignment) {
+        return maximum_capacity;
     } else {
         return cap - (cap % alignment) + alignment;
     }
@@ -184,10 +195,6 @@
 
     // reallocate if possible
     if (newcap > oldcap) {
-        // calculate new capacity (next number divisible by 16)
-        newcap = cx_array_align_capacity(newcap, 16, max_size);
-
-        // perform reallocation
         void *newmem = reallocator->realloc(
                 *array, oldcap, newcap, elem_size, reallocator
         );
@@ -277,22 +284,18 @@
     }
 
     // check if resize is required
-    size_t minsize = index + elem_count;
-    size_t newsize = oldsize < minsize ? minsize : oldsize;
+    const size_t minsize = index + elem_count;
+    const size_t newsize = oldsize < minsize ? minsize : oldsize;
 
-    // reallocate if possible
-    size_t newcap = oldcap;
-    if (newsize > oldcap) {
-        // check, if we need to repair the src pointer
+    // reallocate if necessary
+    const size_t newcap = cx_array_grow_capacity(oldcap, newsize, max_size);
+    if (newcap > oldcap) {
+        // check if we need to repair the src pointer
         uintptr_t targetaddr = (uintptr_t) *target;
         uintptr_t srcaddr = (uintptr_t) src;
         bool repairsrc = targetaddr <= srcaddr
                          && srcaddr < targetaddr + oldcap * elem_size;
 
-        // calculate new capacity (next number divisible by 16)
-        newcap = cx_array_align_capacity(newsize, 16, max_size);
-        assert(newcap > newsize);
-
         // perform reallocation
         void *newmem = reallocator->realloc(
                 *target, oldcap, newcap, elem_size, reallocator
@@ -375,16 +378,16 @@
     }
 
     // store some counts
-    size_t old_size = *size;
-    size_t old_capacity = *capacity;
+    const size_t old_size = *size;
+    const size_t old_capacity = *capacity;
     // the necessary capacity is the worst case assumption, including duplicates
-    size_t needed_capacity = old_size + elem_count;
+    const size_t needed_capacity = cx_array_grow_capacity(old_capacity,
+        old_size + elem_count, SIZE_MAX);
 
     // if we need more than we have, try a reallocation
     if (needed_capacity > old_capacity) {
-        size_t new_capacity = cx_array_align_capacity(needed_capacity, 16, SIZE_MAX);
         void *new_mem = reallocator->realloc(
-                *target, old_capacity, new_capacity, elem_size, reallocator
+                *target, old_capacity, needed_capacity, elem_size, reallocator
         );
         if (new_mem == NULL) {
             // give it up right away, there is no contract
@@ -392,7 +395,7 @@
             return 1;  // LCOV_EXCL_LINE
         }
         *target = new_mem;
-        *capacity = new_capacity;
+        *capacity = needed_capacity;
     }
 
     // now we have guaranteed that we can insert everything
@@ -789,8 +792,8 @@
 
     // guarantee enough capacity
     if (arl->capacity < list->collection.size + n) {
-        size_t new_capacity = list->collection.size + n;
-        new_capacity = cx_array_align_capacity(new_capacity, 16, SIZE_MAX);
+        const size_t new_capacity = cx_array_grow_capacity(arl->capacity,
+            list->collection.size + n, SIZE_MAX);
         if (cxReallocateArray(
                 list->collection.allocator,
                 &arl->data, new_capacity,
@@ -1137,6 +1140,14 @@
     }
 }
 
+static int cx_arl_change_capacity(
+        struct cx_list_s *list,
+        size_t new_capacity
+) {
+    cx_array_list *arl = (cx_array_list *)list;
+    return cxReallocateArray(list->collection.allocator,
+        &arl->data, new_capacity, list->collection.elem_size);
+}
 
 static struct cx_iterator_s cx_arl_iterator(
         const struct cx_list_s *list,
@@ -1174,6 +1185,7 @@
         cx_arl_sort,
         cx_arl_compare,
         cx_arl_reverse,
+        cx_arl_change_capacity,
         cx_arl_iterator,
 };
 
--- a/src/cx/array_list.h	Sun Nov 09 16:29:22 2025 +0100
+++ b/src/cx/array_list.h	Mon Nov 10 21:36:15 2025 +0100
@@ -245,6 +245,9 @@
  * Supported are 0, 1, 2, and 4, as well as 8 if running on a 64-bit
  * architecture. If set to zero, the native word width is used.
  *
+ * @note This function will reserve the minimum required capacity to hold
+ * the additional elements and does not perform an overallocation.
+ *
  * @param array a pointer to the target array
  * @param size a pointer to the size of the array
  * @param capacity a pointer to the capacity of the array
@@ -280,6 +283,9 @@
  * Supported are 0, 1, 2, and 4, as well as 8 if running on a 64-bit
  * architecture. If set to zero, the native word width is used.
  *
+ * @note When this function does reallocate the array, it may allocate more
+ * space than required to avoid further allocations in the near future.
+ *
  * @param target a pointer to the target array
  * @param size a pointer to the size of the target array
  * @param capacity a pointer to the capacity of the target array
@@ -293,6 +299,7 @@
  * @retval zero success
  * @retval non-zero failure
  * @see cx_array_reallocator()
+ * @see cx_array_reserve()
  */
 cx_attr_nonnull_arg(1, 2, 3, 6)
 CX_EXPORT int cx_array_copy(void **target, void *size, void *capacity, unsigned width,
--- a/src/cx/list.h	Sun Nov 09 16:29:22 2025 +0100
+++ b/src/cx/list.h	Mon Nov 10 21:36:15 2025 +0100
@@ -170,6 +170,12 @@
     void (*reverse)(struct cx_list_s *list);
 
     /**
+     * Optional member function for changing the capacity.
+     * If the list does not support overallocation, this can be set to @c NULL.
+     */
+    int (*change_capacity)(struct cx_list_s *list, size_t new_capacity);
+
+    /**
      * Member function for returning an iterator pointing to the specified index.
      */
     struct cx_iterator_s (*iterator)(const struct cx_list_s *list, size_t index, bool backward);
@@ -1147,6 +1153,40 @@
 cx_attr_nonnull
 CX_EXPORT int cxListUnionSimple(CxList *dst, const CxList *src, const CxList *other);
 
+/**
+ * Asks the list to reserve enough memory for a given total number of elements.
+ *
+ * List implementations are free to choose if reserving memory upfront makes
+ * sense.
+ * For example, array-based implementations usually do support reserving memory
+ * for additional elements while linked lists usually don't.
+ *
+ * @note When the requested capacity is smaller than the current size,
+ * this function returns zero without performing any action.
+ *
+ * @param list the list
+ * @param capacity the expected total number of elements
+ * @retval zero on success or when overallocation is not supported
+ * @retval non-zero when an allocation error occurred
+ * @see cxListShrink()
+ */
+cx_attr_nonnull
+CX_EXPORT int cxListReserve(CxList *list, size_t capacity);
+
+/**
+ * Advises the list to free any overallocated memory.
+ *
+ * Lists that do not support overallocation simply return zero.
+ *
+ * This function usually returns zero, except for very special and custom
+ * list and/or allocator implementations where freeing memory can fail.
+ *
+ * @param list the list
+ * @return usually zero
+ */
+cx_attr_nonnull
+CX_EXPORT int cxListShrink(CxList *list);
+
 #ifdef __cplusplus
 } // extern "C"
 #endif
--- a/src/kv_list.c	Sun Nov 09 16:29:22 2025 +0100
+++ b/src/kv_list.c	Mon Nov 10 21:36:15 2025 +0100
@@ -509,6 +509,15 @@
     return iter;
 }
 
+static int cx_kvl_change_capacity(struct cx_list_s *list,
+        cx_attr_unused size_t cap) {
+    // since our backing list is a linked list, we don't need to do much here,
+    // but rehashing the map is quite useful
+    cx_kv_list *kv_list = (cx_kv_list*)list;
+    cxMapRehash(&kv_list->map->map_base.base);
+    return 0;
+}
+
 static cx_list_class cx_kv_list_class = {
     cx_kvl_deallocate,
     cx_kvl_insert_element,
@@ -524,6 +533,7 @@
     cx_kvl_sort,
     NULL,
     cx_kvl_reverse,
+    cx_kvl_change_capacity,
     cx_kvl_iterator,
 };
 
--- a/src/linked_list.c	Sun Nov 09 16:29:22 2025 +0100
+++ b/src/linked_list.c	Mon Nov 10 21:36:15 2025 +0100
@@ -1252,6 +1252,7 @@
         cx_ll_sort,
         cx_ll_compare,
         cx_ll_reverse,
+        NULL, // no overallocation supported
         cx_ll_iterator,
 };
 
--- a/src/list.c	Sun Nov 09 16:29:22 2025 +0100
+++ b/src/list.c	Mon Nov 10 21:36:15 2025 +0100
@@ -185,6 +185,10 @@
     return ptr == NULL ? NULL : *ptr;
 }
 
+static int cx_pl_change_capacity(struct cx_list_s *list, size_t cap) {
+    return list->climpl->change_capacity(list, cap);
+}
+
 static struct cx_iterator_s cx_pl_iterator(
         const struct cx_list_s *list,
         size_t index,
@@ -211,6 +215,7 @@
         cx_pl_sort,
         cx_pl_compare,
         cx_pl_reverse,
+        cx_pl_change_capacity,
         cx_pl_iterator,
 };
 // </editor-fold>
@@ -267,6 +272,7 @@
         cx_emptyl_noop,
         NULL,
         cx_emptyl_noop,
+        NULL,
         cx_emptyl_iterator,
 };
 
@@ -1102,3 +1108,20 @@
 int cxListUnionSimple(CxList *dst, const CxList *src, const CxList *other) {
     return cxListUnion(dst, src, other, use_simple_clone_func(src));
 }
+
+int cxListReserve(CxList *list, size_t capacity) {
+    if (list->cl->change_capacity == NULL) {
+        return 0;
+    }
+    if (capacity <= cxCollectionSize(list)) {
+        return 0;
+    }
+    return list->cl->change_capacity(list, capacity);
+}
+
+int cxListShrink(CxList *list) {
+    if (list->cl->change_capacity == NULL) {
+        return 0;
+    }
+    return list->cl->change_capacity(list, cxCollectionSize(list));
+}
\ No newline at end of file

mercurial