Mercurial > hg > ucx / file revision

# Iterators

Iterators generalize the iteration over elements of an arbitrary collection.
This allows iteration over arrays, lists, maps, trees, etc. in a unified way.

Creating an iterator is as simple as creating a `CxIterator` struct and setting the fields in a meaningful way.
The UCX collections provide various functions to create such iterators.

If the predefined fields are not enough (or introduce too much bloat) for your use case,
you can alternatively create your own iterator structure
and place the `CX_ITERATOR_BASE` macro as first member of that structure.

```C
#include <cx/iterator.h>

struct my_fancy_iterator_s {
    CX_ITERATOR_BASE; // the base members used by cx_foreach()
    // ... custom fields ...
};
```

## Creating an Iterator

The following functions create iterators over plain C arrays.

```C
#include <cx/iterator.h>

CxIterator cxIterator(const void *array,
        size_t elem_size, size_t elem_count,
        bool remove_keeps_order);

CxIterator cxIteratorPtr(const void *array,
        size_t elem_count,
        bool remove_keeps_order);
```

The `cxIterator()` function creates an iterator over the elements of `array` where
each element is `elem_size` bytes large and the array contains a total of `elem_count` elements.

The `cxIteratorPtr()` function is equivalent to `cxIterator()`, except it assumes `sizeof(void*)` as the `elem_size`.

The UCX collections also define functions for creating iterators over their items.
You can read more about them in the respective Sections of the documentation.

## Using an Iterator

The following macros work with arbitrary structures using `CX_ITERATOR_BASE`
and invoke the respective function pointers `valid`, `current`, or `next`.
```C
cxIteratorValid(iter)
cxIteratorCurrent(iter)
cxIteratorNext(iter)
```

You may use them for manual iterator, but usually you do not need them.
Every iterator can be used with the `cx_foreach` macro.

```C
#include <cx/iterator.h>

// some custom array and its size
MyData *array = // ...
size_t size = // ...

CxIterator iter = cxIterator(array, sizeof(MyData), size);
cx_foreach(MyData*, elem, iter) {
    // .. do something with elem ..
}
```

The macro takes three arguments:
1. the pointer-type of a pointer to an element,
2. the name of the variable you want to use for accessing the element,
3. and the iterator.

> An iterator does not necessarily need to iterate over the concrete elements of a collection.
> Map iterators, for example, can iterator over the key/value pairs,
> but they can also iterate over just the values or just the keys.
>
> You should read the documentation of the function creating the iterator to learn
> what exactly the iterator is iterating over.

## Removing Elements via Iterators

Usually an iterator is not mutating the collection it is iterating over.
But sometimes it is desirable to remove an element from the collection while iterating over it.

For this purpose, most collections allow to use `cxIteratorFlagRemoval()`, which instructs the iterator to remove
the current element from the collection on the next call to `cxIteratorNext()`.
If you are implementing your own iterator, it is up to you to implement this behavior.

## Passing Iterators to Functions

To eliminate the need of memory management for iterators, the structures are usually used by value.
This does not come with additional costs because iteration is implemented entirely by using macros.

However, sometimes it is necessary to pass an iterator to another function.
To make that possible in a generalized way, such functions should accept a `CxIteratorBase*` pointer
which can be obtained with the `cxIteratorRef()` macro on the calling site.

In the following example, elements from a list are inserted into a tree:

```C
CxList *list = // ...
CxTree *tree = // ...

CxIterator iter = cxListIterator(list);
cxTreeInsertIter(tree, cxIteratorRef(iter), cxListSize(list));
```

> This is the reason, why `CX_ITERATOR_BASE` must be the first member of any iterator structure.
> Otherwise, the address taken by `cxIteratorRef()` would not equal the address of the iterator.
{style="note"}

## Custom Iterators

The base structure is defined as follows:
```C
struct cx_iterator_base_s {
    bool (*valid)(const void *);
    void *(*current)(const void *);
    void (*next)(void *);
    void *(*current_impl)(const void *);
    void (*next_impl)(void *);
    bool allow_remove;
    bool remove;
};

typedef struct cx_iterator_base_s CxIteratorBase;
```

The `valid` function indicates whether the iterator is currently pointing to an element in the collection.
The `current` function is supposed to return that element,
and the `next` function shall advance the iterator to the next element.

The booleans `allow_remove` and `remove` are used for [removing elements](#removing-elements-via-iterators) as explained above.
When an iterator is created, the `allow_remove` field is set to indicate if removal of elements is supported.
The `remove` field is set to indicate if the current element should be removed on the next call to `next()` (see `cxIteratorFlagRemoval()`).

Iterators may be wrapped in which case the original implementation can be stored in `current_impl` and `next_impl`.
They can then be called by a wrapper implementation pointed to by `current` and `next`, respectively.
This can be useful when you want to support the `store_pointer` field of the [](collection.h.md) API.

A specialized, simple, and fast iterator over an array of a certain type
that does not support removing elements can be implemented as follows:
```C
#include <cx/iterator.h>

typedef struct my_foo_s {
    // ... your data ...
} MyFoo;

typedef struct my_foo_iterator_s {
    CX_ITERATOR_BASE;
    MyFoo *array;
    size_t index;
    size_t elem_count;
} MyFooIterator;

static bool my_foo_iter_valid(const void *it) {
    const MyFooIterator *iter = it;
    return iter->index < iter->elem_count;
}

static void *my_foo_iter_current(const void *it) {
    const MyFooIterator *iter = it;
    return &iter->array[iter->index];
}

static void my_foo_iter_next(void *it) {
    MyFooIterator *iter = it;
    iter->index++;
}

MyFooIterator myFooIterator(MyFoo *array, size_t elem_count) {
    MyFooIterator iter;

    // base fields
    iter.base.valid = my_foo_iter_valid;
    iter.base.current = my_foo_iter_current;
    iter.base.next = my_foo_iter_next;
    iter.base.remove = false;
    iter.base.allow_remove = false;

    // custom fields
    iter.index = 0;
    iter.elem_count = elem_count;

    return iter;
}
```

> Note, that the behavior of `current` is undefined when `valid` returns `false`.
> That means, on the one hand, `current` does not need to check for validity of the iterator,
> but on the other hand, it is forbidden to invoke `current` when `valid` would return `false`.
{style="note"}

> If performance matters in your application, it is recommended that you indeed create specialized iterators
> for your collections. The default UCX implementations trade some performance for generality.

<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/iterator_8h.html">iterator.h</a>
</category>
</seealso>