# String

UCX strings store character arrays together with a length and come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`).

The functions of UCX are designed to work with immutable strings by default, but where necessary,
the API also provides alternative functions that work directly with mutable strings.
Functions that change a string in-place, of course, only accept mutable strings.

When you use UCX functions or define your own, you will sometimes face the "problem"
that a function only accepts arguments of type `cxstring` but you only have a `cxmutstr` at hand.
In this case you _should not_ introduce a wrapper function that accepts the `cxmutstr`;
instead, use the `cx_strcast()` function to cast the argument to the correct type.

In general, UCX strings are *not* necessarily zero-terminated.
If a function guarantees to return a zero-terminated string, it is explicitly mentioned in the documentation.
As a rule of thumb, you _should not_ pass a character array of a UCX string structure to another API without explicitly
ensuring that the string is zero-terminated.

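To illustrate why the zero-terminator matters, here is a minimal sketch that deliberately does not use the UCX headers. The type `str_view` and the helper `str_view_to_cstr()` are hypothetical stand-ins for a pointer-plus-length string; they only show the kind of explicit copy that is needed before a length-delimited character array can be handed to an API that expects a C string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a length-delimited string: a pointer plus a length.
 * This is NOT the UCX type, only an illustration of the idea. */
typedef struct {
    const char *ptr;
    size_t length;
} str_view;

/* Hypothetical helper: returns a freshly allocated, zero-terminated copy
 * of the view, or NULL if the allocation fails. The caller frees it. */
static char *str_view_to_cstr(str_view s) {
    assert(s.ptr != NULL || s.length == 0);
    char *copy = malloc(s.length + 1);
    if (copy == NULL) return NULL;
    if (s.length > 0) memcpy(copy, s.ptr, s.length);
    copy[s.length] = '\0'; /* the terminator the view itself does not have */
    return copy;
}
```

A view of the first five characters of `"hello world"` points into the middle of a longer array, so passing its `ptr` directly to a function like `strlen()` would read past the intended end. Copying through a helper of this kind yields a proper C string.
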
## Basics

> To make documentation simpler, we introduce the pseudo-type `AnyStr` with the meaning that
> both `cxstring` and `cxmutstr` are accepted for that argument.
> The implementation is actually hidden behind a macro which uses `cx_strcast()` to guarantee compatibility.
{style="note"}

```C
#include <cx/string.h>

struct cx_string_s {const char *ptr; size_t length;};

struct cx_mutstr_s {char *ptr; size_t length;};

typedef struct cx_string_s cxstring;

typedef struct cx_mutstr_s cxmutstr;

cxstring cx_str(const char *cstring);

cxstring cx_strn(const char *cstring, size_t length);

cxmutstr cx_mutstr(char *cstring);

cxmutstr cx_mutstrn(char *cstring, size_t length);

cxstring cx_strcast(AnyStr str);

cxmutstr cx_strdup(AnyStr string);

cxmutstr cx_strdup_a(const CxAllocator *allocator, AnyStr string);

void cx_strfree(cxmutstr *str);

void cx_strfree_a(const CxAllocator *alloc, cxmutstr *str);
```

> Documentation work in progress.
>{style="warning"}

> When you want to convert a string _literal_ into a UCX string, you can also use the `CX_STR(lit)` macro.
> This macro uses the fact that `sizeof(lit)` for a string literal `lit` is always the string length plus one,
> effectively saving an invocation of `strlen()`.
> However, this only works for literals - in all other cases you must use `cx_str()` or `cx_strn()`.

## Comparison

```C
#include <cx/string.h>

int cx_strcmp(cxstring s1, cxstring s2);

int cx_strcmp_p(const void *s1, const void *s2);

bool cx_strprefix(cxstring string, cxstring prefix);

bool cx_strsuffix(cxstring string, cxstring suffix);

int cx_strcasecmp(cxstring s1, cxstring s2);

int cx_strcasecmp_p(const void *s1, const void *s2);

bool cx_strcaseprefix(cxstring string, cxstring prefix);

bool cx_strcasesuffix(cxstring string, cxstring suffix);
```

> Documentation work in progress.
>{style="warning"}

## Concatenation

```C
#include <cx/string.h>

cxmutstr cx_strcat(size_t count, ... );

cxmutstr cx_strcat_a(const CxAllocator *alloc, size_t count, ... );

cxmutstr cx_strcat_m(cxmutstr str, size_t count, ... );

cxmutstr cx_strcat_ma(const CxAllocator *alloc,
        cxmutstr str, size_t count, ... );

size_t cx_strlen(size_t count, ...);
```

> Documentation work in progress.
>{style="warning"}

## Find Characters and Substrings

```C
#include <cx/string.h>

cxstring cx_strchr(cxstring string, int chr);

cxmutstr cx_strchr_m(cxmutstr string, int chr);

cxstring cx_strrchr(cxstring string, int chr);

cxmutstr cx_strrchr_m(cxmutstr string, int chr);

cxstring cx_strstr(cxstring haystack, cxstring needle);

cxmutstr cx_strstr_m(cxmutstr haystack, cxstring needle);

cxstring cx_strsubs(cxstring string, size_t start);

cxstring cx_strsubsl(cxstring string, size_t start, size_t length);

cxmutstr cx_strsubs_m(cxmutstr string, size_t start);

cxmutstr cx_strsubsl_m(cxmutstr string, size_t start, size_t length);

cxstring cx_strtrim(cxstring string);

cxmutstr cx_strtrim_m(cxmutstr string);
```

> Documentation work in progress.
>{style="warning"}

## Replace Substrings

```C
#include <cx/string.h>

cxmutstr cx_strreplace(cxstring str, cxstring pattern, cxstring repl);

cxmutstr cx_strreplace_a(const CxAllocator *allocator, cxstring str,
        cxstring pattern, cxstring repl);

cxmutstr cx_strreplacen(cxstring str, cxstring pattern, cxstring repl,
        size_t replmax);

cxmutstr cx_strreplacen_a(const CxAllocator *allocator, cxstring str,
        cxstring pattern, cxstring repl, size_t replmax);
```

> Documentation work in progress.
>{style="warning"}

## Basic Splitting

```C
#include <cx/string.h>

size_t cx_strsplit(cxstring string, cxstring delim,
        size_t limit, cxstring *output);

size_t cx_strsplit_a(const CxAllocator *allocator,
        cxstring string, cxstring delim,
        size_t limit, cxstring **output);

size_t cx_strsplit_m(cxmutstr string, cxstring delim,
        size_t limit, cxmutstr *output);

size_t cx_strsplit_ma(const CxAllocator *allocator,
        cxmutstr string, cxstring delim,
        size_t limit, cxmutstr **output);
```

> Documentation work in progress.
>{style="warning"}

## Complex Tokenization

```C
#include <cx/string.h>

CxStrtokCtx cx_strtok(AnyStr str, AnyStr delim, size_t limit);

void cx_strtok_delim(CxStrtokCtx *ctx,
        const cxstring *delim, size_t count);

bool cx_strtok_next(CxStrtokCtx *ctx, cxstring *token);

bool cx_strtok_next_m(CxStrtokCtx *ctx, cxmutstr *token);
```

> Documentation work in progress.
>{style="warning"}

## Conversion to Numbers

For each integer type, as well as `float` and `double`, there are functions to convert a UCX string to a number of that type.

Integer conversion comes in two flavours:
```C
int cx_strtoi(AnyStr str, int *output, int base);

int cx_strtoi_lc(AnyStr str, int *output, int base,
        const char *groupsep);
```

The basic variant takes a string of any UCX string type, a pointer to the `output` integer, and the `base` (one of 2, 8, 10, or 16).
Conversion is attempted with respect to the specified `base` and respects possible special notations for that base.
Hexadecimal numbers may be prefixed with `0x`, `x`, or `#`, and binary numbers may be prefixed with `0b` or `b`.

The `_lc` versions of the integer conversion functions are equivalent, except that they allow the specification of an
array of group separator chars, each of which is simply ignored during conversion.
The default group separator for the basic version is a comma `,`.

The signature for the floating point conversions is quite similar:
```C
int cx_strtof(AnyStr str, float *output);

int cx_strtof_lc(AnyStr str, float *output,
        char decsep, const char *groupsep);
```

The two differences are that the floating point versions do not support different bases,
and the `_lc` variant allows specifying not only an array of group separators,
but also the character used for the decimal separator.

In the basic variant, the group separator is again a comma `,`, and the decimal separator is a dot `.`.

> The floating point conversions of UCX 3.1 do not achieve the same precision as standard library implementations
> which usually use more sophisticated algorithms.
> The precision might increase in future UCX releases,
> but until then be aware of slight inaccuracies, in particular when working with `double`.
{style="warning"}

> The UCX string to number conversions intentionally do not consider any locale settings
> and are therefore independent of any global state.
{style="note"}

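The idea of ignoring group separators during conversion can be sketched in plain C. The function `parse_int_grouped()` below is a hypothetical illustration, not the UCX implementation: it handles base 10 only, knows none of the prefix notations, and takes a raw pointer and length instead of a UCX string. Like the conversions described above, it does not consult any locale settings.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of UCX: parses a base-10 integer from a
 * length-delimited character array and ignores every character listed
 * in groupsep. Returns 0 on success and -1 on failure. */
static int parse_int_grouped(const char *ptr, size_t length,
                             int *output, const char *groupsep) {
    assert(ptr != NULL && output != NULL && groupsep != NULL);
    size_t i = 0;
    int negative = 0;
    if (i < length && (ptr[i] == '-' || ptr[i] == '+')) {
        negative = ptr[i] == '-';
        i++;
    }
    long long value = 0;
    size_t digits = 0;
    for (; i < length; i++) {
        char c = ptr[i];
        /* group separators are simply skipped */
        if (c != '\0' && strchr(groupsep, c) != NULL) continue;
        if (c < '0' || c > '9') return -1;
        value = value * 10 + (c - '0');
        digits++;
        /* stop early once the magnitude can no longer fit into an int */
        if (value > (long long) INT_MAX + 1) return -1;
    }
    if (digits == 0) return -1;
    if (negative) value = -value;
    if (value > INT_MAX || value < INT_MIN) return -1;
    *output = (int) value;
    return 0;
}
```

Called with the separator array `","`, the input `1,234,567` yields 1234567; with `". "`, the input `-1.000 000` yields -1000000. Because the length is passed explicitly, the input does not need to be zero-terminated.
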
110 <seealso> |
238 <seealso> |
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/string_8h.html">string.h</a>
</category>