| 12 The following listing shows basic string functions. |
12 The following listing shows basic string functions. |
| 13 |
13 |
| 14 > To simplify documentation, we introduce the pseudo-type `AnyStr` with the meaning that |
14 > To simplify documentation, we introduce the pseudo-type `AnyStr` with the meaning that |
| 15 > any UCX string and any C string are supported. |
15 > any UCX string and any C string are supported. |
| 16 > The implementation is actually hidden behind a macro which uses `cx_strcast()` to guarantee compatibility. |
16 > The implementation is actually hidden behind a macro which uses `cx_strcast()` to guarantee compatibility. |
| 17 > Similarly we introduce `UcxStr` with the meaning that it is either a `cxstring` or a `cxmutstr`, |
17 > Similarly, we introduce `UcxStr` with the meaning that it is either a `cxstring` or a `cxmutstr`, |
| 18 > created by `cx_strcast_m()`. |
18 > created by `cx_strcast_m()`. |
| 19 {style="note"} |
19 {style="note"} |
| 20 |
20 |
| 21 ```C |
21 ```C |
| 22 #include <cx/string.h> |
22 #include <cx/string.h> |
| 64 |
64 |
| 65 The function `cx_strdup_a()` allocates new memory with the given `allocator` and copies the given `string` |
65 The function `cx_strdup_a()` allocates new memory with the given `allocator` and copies the given `string` |
| 66 and guarantees that the result string is zero-terminated. |
66 and guarantees that the result string is zero-terminated. |
| 67 The function `cx_strdup()` is equivalent to `cx_strdup_a()`, except that it uses the [default allocator](allocator.h.md#default-allocator). |
67 The function `cx_strdup()` is equivalent to `cx_strdup_a()`, except that it uses the [default allocator](allocator.h.md#default-allocator). |
| 68 |
68 |
| 69 The functions `cx_strcpy_a()` and `cx_strcpy()` copy the contents of the `source` string to the `dest` string, |
69 The functions `cx_strcpy_a()` and `cx_strcpy()` copy the contents of the `source` string to the `dest` string |
| 70 and also guarantee zero-termination of the resulting string. |
70 and also guarantee zero-termination of the resulting string. |
| 71 The memory in `dest` is either freshly allocated or re-allocated to fit the size of the string plus the terminator. |
71 The memory in `dest` is either freshly allocated or re-allocated to fit the size of the string plus the terminator. |
| 72 |
72 |
| 73 Allocated strings are always of type `cxmutstr` and can be deallocated by a call to `cx_strfree()` or `cx_strfree_a()`. |
73 Allocated strings always have the type `cxmutstr` and can be deallocated by a call to `cx_strfree()` or `cx_strfree_a()`. |
| 74 The caller must make sure to use the correct allocator for deallocating a string. |
74 The caller must make sure to use the correct allocator for deallocating a string. |
| 75 It is safe to call these functions multiple times on a given string, as the pointer will be nulled and the length set to zero. |
75 It is safe to call these functions multiple times on a given string, as the pointer will be nulled and the length set to zero. |
| 76 It is also safe to call the functions with a `NULL`-pointer, just like any other `free()`-like function. |
76 It is also safe to call the functions with a `NULL`-pointer, just like any other `free()`-like function. |
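
The reset-on-free behavior described above can be illustrated with a minimal, self-contained sketch. The type `demo_mutstr` and the functions `demo_strdup()` and `demo_strfree()` are hypothetical stand-ins written for this example; they are not the UCX implementation and only mirror the documented behavior (zero-terminated copy, pointer nulled and length zeroed on free, `NULL` accepted).

```C
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a mutable, length-based string (not the UCX type). */
typedef struct {
    char *ptr;
    size_t length;
} demo_mutstr;

/* Copies the first len bytes of src and appends a zero terminator. */
static demo_mutstr demo_strdup(const char *src, size_t len) {
    demo_mutstr result = {NULL, 0};
    char *mem = malloc(len + 1);
    if (mem == NULL) return result;   /* allocation failure: return an empty string */
    memcpy(mem, src, len);
    mem[len] = '\0';
    result.ptr = mem;
    result.length = len;
    return result;
}

/* Frees the string and resets it, so a repeated call is harmless. */
static void demo_strfree(demo_mutstr *str) {
    if (str == NULL) return;          /* NULL is accepted, like free() */
    free(str->ptr);                   /* free(NULL) is a no-op */
    str->ptr = NULL;
    str->length = 0;
}
```

Because the pointer is nulled after the first call, a second call only passes `NULL` to `free()`, which is why calling the function twice on the same string is safe in this sketch.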
| 77 |
77 |
| 78 When you want to use a UCX string in a `printf`-like function, you can use the macro `CX_PRIstr` for the format specifier, |
78 When you want to use a UCX string in a `printf`-like function, you can use the macro `CX_PRIstr` for the format specifier, |
| 109 ``` |
109 ``` |
| 110 |
110 |
| 111 The `cx_strcmp()` function compares two strings lexicographically |
111 The `cx_strcmp()` function compares two strings lexicographically |
| 112 and returns an integer greater than, equal to, or less than 0, if `s1` is greater than, equal to, or less than `s2`, respectively. |
112 and returns an integer greater than, equal to, or less than 0, if `s1` is greater than, equal to, or less than `s2`, respectively. |
| 113 |
113 |
| 114 The `cx_strcmp_p()` function takes pointers to UCX strings (i.e., only to `cxstring` and `cxmutstr`) and the signature is compatible with `cx_compare_func`. |
114 The `cx_strcmp_p()` function takes pointers to UCX strings (i.e., only to `cxstring` and `cxmutstr`), and the signature is compatible with `cx_compare_func`. |
| 115 Use this as a compare function for lists or other data structures. |
115 Use this as a compare function for lists or other data structures. |
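
To show why a pointer-based variant is useful as a compare function, here is a small sketch with hypothetical stand-ins (`demo_str`, `demo_strcmp()`, `demo_strcmp_p()`); none of these names belong to UCX. The sketch assumes that a string which is a proper prefix of another compares as smaller, and it uses the standard `qsort()` comparator signature to represent a generic compare function.

```C
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an immutable, length-based string (not the UCX type). */
typedef struct {
    const char *ptr;
    size_t length;
} demo_str;

/* Lexicographic comparison of two length-based strings.
 * Assumption of this sketch: a proper prefix sorts before the longer string. */
static int demo_strcmp(demo_str s1, demo_str s2) {
    size_t n = s1.length < s2.length ? s1.length : s2.length;
    int r = n == 0 ? 0 : memcmp(s1.ptr, s2.ptr, n);
    if (r != 0) return r;
    if (s1.length == s2.length) return 0;
    return s1.length < s2.length ? -1 : 1;
}

/* Pointer-based wrapper whose signature matches the qsort() comparator. */
static int demo_strcmp_p(const void *a, const void *b) {
    return demo_strcmp(*(const demo_str *)a, *(const demo_str *)b);
}
```

With this wrapper, an array of `demo_str` can be sorted with `qsort(items, count, sizeof(demo_str), demo_strcmp_p)`.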
| 116 |
116 |
| 117 The functions `cx_strprefix()` and `cx_strsuffix()` check if `string` starts with `prefix` or ends with `suffix`, respectively. |
117 The functions `cx_strprefix()` and `cx_strsuffix()` check if `string` starts with `prefix` or ends with `suffix`, respectively. |
| 118 |
118 |
| 119 The functions `cx_strcasecmp()`, `cx_strcasecmp_p()`, `cx_strcaseprefix()`, and `cx_strcasesuffix()` are equivalent, |
119 The functions `cx_strcasecmp()`, `cx_strcasecmp_p()`, `cx_strcaseprefix()`, and `cx_strcasesuffix()` are equivalent, |
| 138 The `cx_strcat_a()` function takes `count` UCX strings (`cxstring` or `cxmutstr` - not pointers!), |
138 The `cx_strcat_a()` function takes `count` UCX strings (`cxstring` or `cxmutstr` - not pointers!), |
| 139 reallocates the memory in `str` for a concatenation of those strings _with a single allocation_, |
139 reallocates the memory in `str` for a concatenation of those strings _with a single allocation_, |
| 140 and appends the contents of the strings to `str`. |
140 and appends the contents of the strings to `str`. |
| 141 `cx_strcat()` is equivalent, except that it uses the [default allocator](allocator.h.md#default-allocator). |
141 `cx_strcat()` is equivalent, except that it uses the [default allocator](allocator.h.md#default-allocator). |
| 142 |
142 |
| 143 When there is no `str` where the other strings shall be appended to, you can pass `CX_NULLSTR` as first argument. |
143 When there is no `str` where the other strings shall be appended to, you can pass `CX_NULLSTR` as the first argument. |
| 144 In that case, a completely new string is allocated. |
144 In that case, a completely new string is allocated. |
| 145 |
145 |
| 146 Example usage: |
146 Example usage: |
| 147 ```C |
147 ```C |
| 148 cxmutstr str = cx_strcat(CX_NULLSTR, 2, |
148 cxmutstr str = cx_strcat(CX_NULLSTR, 2, |
| 237 and writes the substrings into the pre-allocated `output` array. |
237 and writes the substrings into the pre-allocated `output` array. |
| 238 The maximum number of resulting strings can be specified with `limit`. |
238 The maximum number of resulting strings can be specified with `limit`. |
| 239 That means, at most `limit-1` splits are performed. |
239 That means, at most `limit-1` splits are performed. |
| 240 The function returns the actual number of items written to `output`. |
240 The function returns the actual number of items written to `output`. |
| 241 |
241 |
| 242 On the other hand, `cx_strsplit_a()` uses the specified `allocator` to allocate the output array, |
242 On the other hand, `cx_strsplit_a()` uses the specified `allocator` to allocate the output array |
| 243 and writes the pointer to the allocated memory to `output`. |
243 and writes the pointer to the allocated memory to `output`. |
| 244 |
244 |
| 245 > The type of the `UcxStr` must be the same for `string` and `output` (i.e., either both `cxstring` or both `cxmutstr`). |
245 > The type of the `UcxStr` must be the same for `string` and `output` (i.e., either both `cxstring` or both `cxmutstr`). |
| 246 > {style="note"} |
246 > {style="note"} |
| 247 |
247 |
| 248 > The `allocator` in `cx_strsplit_a()` is _only_ used to allocate the output array. |
248 > The `allocator` in `cx_strsplit_a()` is _only_ used to allocate the output array. |
| 249 > The strings will always point into the original `string` |
249 > The strings will always point into the original `string`, |
| 250 > and you need to use `cx_strdup()` or `cx_strdup_a()` if you want copies or zero-terminated strings after performing the split. |
250 > and you need to use `cx_strdup()` or `cx_strdup_a()` if you want copies or zero-terminated strings after performing the split. |
| 251 {style="note"} |
251 {style="note"} |
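
The `limit` semantics and the fact that the resulting strings point into the original string can be sketched as follows. The type `demo_str` and the function `demo_split()` are hypothetical and written only for this illustration; they are not the UCX implementation, and edge cases such as empty tokens or an empty delimiter may be handled differently by the library.

```C
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an immutable, length-based string (not the UCX type). */
typedef struct {
    const char *ptr;
    size_t length;
} demo_str;

/* Splits `string` at `delim` and writes at most `limit` views into `output`.
 * At most limit-1 splits are performed; the last view holds the remainder.
 * No memory is copied: every view points into `string`.
 * Returns the number of views written. */
static size_t demo_split(demo_str string, demo_str delim,
                         size_t limit, demo_str *output) {
    if (limit == 0) return 0;
    size_t count = 0;
    const char *cur = string.ptr;
    size_t remaining = string.length;
    while (count + 1 < limit && delim.length > 0 && remaining >= delim.length) {
        /* search for the next occurrence of the delimiter */
        size_t i = 0;
        int found = 0;
        for (; i + delim.length <= remaining; i++) {
            if (memcmp(cur + i, delim.ptr, delim.length) == 0) {
                found = 1;
                break;
            }
        }
        if (!found) break;
        output[count].ptr = cur;
        output[count].length = i;
        count++;
        cur += i + delim.length;
        remaining -= i + delim.length;
    }
    /* whatever is left becomes the final view */
    output[count].ptr = cur;
    output[count].length = remaining;
    return count + 1;
}
```

Since the views are not zero-terminated and share memory with the input, a caller who needs independent or zero-terminated strings has to copy them afterwards, which corresponds to the role of `cx_strdup()` and `cx_strdup_a()` mentioned in the note above.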
| 252 |
252 |
| 253 ## Complex Tokenization |
253 ## Complex Tokenization |
| 254 |
254 |
| 261 const cxstring *delim, size_t count); |
261 const cxstring *delim, size_t count); |
| 262 |
262 |
| 263 bool cx_strtok_next(CxStrtokCtx *ctx, UcxStr* token); |
263 bool cx_strtok_next(CxStrtokCtx *ctx, UcxStr* token); |
| 264 ``` |
264 ``` |
| 265 |
265 |
| 266 You can tokenize a string by creating a _tokenization_ context with `cx_strtok()`, |
266 You can tokenize a string by creating a _tokenization_ context with `cx_strtok()` |
| 267 and calling `cx_strtok_next()` as long as it returns `true`. |
267 and calling `cx_strtok_next()` as long as it returns `true`. |
| 268 |
268 |
| 269 The tokenization context is initialized with the string `str` to tokenize, |
269 The tokenization context is initialized with the string `str` to tokenize, |
| 270 one delimiter `delim`, and a `limit` for the maximum number of tokens. |
270 one delimiter `delim`, and a `limit` for the maximum number of tokens. |
| 271 When `limit` is reached, the remaining part of `str` is returned as one single token. |
271 When `limit` is reached, the remaining part of `str` is returned as one single token. |
| 272 |
272 |
| 273 You can add additional delimiters to the context by calling `cx_strtok_delim()`, and |
273 You can add additional delimiters to the context by calling `cx_strtok_delim()` and |
| 274 specifying an array of delimiters to use. |
274 specifying an array of delimiters to use. |
| 275 |
275 |
| 276 > Regardless of how the context was initialized, you can use `cx_strtok_next()` |
276 > Regardless of how the context was initialized, you can use `cx_strtok_next()` |
| 277 > with pointers to `cxstring` or `cxmutstr`. However, keep in mind that modifying |
277 > with pointers to `cxstring` or `cxmutstr`. However, keep in mind that modifying |
| 278 > characters in a `cxmutstr` has only defined behavior, when the |
278 > characters in a `cxmutstr` has only defined behavior, when the |