docs/Writerside/topics/string.h.md

changeset 1217
23f25e91d367
parent 1190
a7b913d5d589
child 1218
cbb48edaf433
1216:151c1b7d5d50 1217:23f25e91d367
1 # String 1 # String
2 2
3 <warning> 3 UCX strings store character arrays together with a length and come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`).
4 Outdated Section - will be updated soon! 4
5 </warning> 5 In general, UCX strings are *not* necessarily zero-terminated.
6 6 If a function guarantees to return a zero-terminated string, it is explicitly mentioned in the documentation.
7 UCX strings come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`). 7 As a rule of thumb, you _should not_ pass a character array of a UCX string structure to another API without explicitly
8 The functions of UCX are designed to work with immutable strings by default but in situations where it is necessary,
9 the API also provides alternative functions that work directly with mutable strings.
10 Functions that change a string in-place are, of course, only accepting mutable strings.
11
12 When you are using UCX functions, or defining your own functions, you are sometimes facing the "problem",
13 that the function only accepts arguments of type `cxstring` but you only have a `cxmutstr` at hand.
14 In this case you _should not_ introduce a wrapper function that accepts the `cxmutstr`,
15 but instead you should use the `cx_strcast()` function to cast the argument to the correct type.
16
17 In general, UCX strings are **not** necessarily zero-terminated. If a function guarantees to return zero-terminated
18 string, it is explicitly mentioned in the documentation of the respective function.
19 As a rule of thumb, you _should not_ pass the strings of a UCX string structure to another API without explicitly
20 ensuring that the string is zero-terminated. 8 ensuring that the string is zero-terminated.
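
The rule of thumb above can be illustrated with a minimal sketch. It does not use UCX itself: `view_t`, `view_format()` and `view_to_cstr()` are hypothetical stand-ins that only mirror the pointer-plus-length layout of a UCX string. The sketch shows two ways to hand such a string to an API that expects C strings: pass an explicit precision to the `printf` family, or create a zero-terminated copy first.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a pointer-plus-length string (assumption: NOT the UCX type). */
typedef struct {
    const char *ptr;
    size_t length;
} view_t;

/* Formats the view into buf. The "%.*s" precision limits how many characters
 * are read, so the source array does not need a terminator.
 * Returns the number of characters written (as reported by snprintf). */
int view_format(char *buf, size_t bufsize, view_t s) {
    assert(s.length <= INT_MAX); /* the precision argument is an int */
    return snprintf(buf, bufsize, "[%.*s]", (int) s.length, s.ptr);
}

/* Returns a newly allocated, zero-terminated copy of the view,
 * or NULL if the allocation fails. The caller must free() the result. */
char *view_to_cstr(view_t s) {
    char *copy = malloc(s.length + 1);
    if (copy == NULL) {
        return NULL;
    }
    if (s.length > 0) {
        memcpy(copy, s.ptr, s.length);
    }
    copy[s.length] = '\0';
    return copy;
}
```

A view that covers only the middle of a larger array, for example `(view_t){ raw + 1, 3 }`, is never followed by a terminator of its own, which is why the explicit copy is needed before calling functions such as `strlen()` or `fopen()`.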
21 9
22 <!--
23 ## Basics 10 ## Basics
24 11
25 ### cx_mutstr 12 > To make documentation simpler, we introduce the pseudo-type `AnyStr` with the meaning that
26 ### cx_mutstrn 13 > both `cxstring` and `cxmutstr` are accepted for that argument.
27 ### cx_str 14 > The implementation is actually hidden behind a macro which uses `cx_strcast()` to guarantee compatibility.
28 ### cx_strn 15 {style="note"}
29 ### cx_strcast 16
30 ### cx_strfree 17 ```C
31 ### cx_strfree_a 18 #include <cx/string.h>
32 ### cx_strdup 19
33 ### cx_strdup_a 20 struct cx_string_s {const char *ptr; size_t length;};
34 ### cx_strlen 21
35 ### cx_strtrim 22 struct cx_mutstr_s {char *ptr; size_t length;};
36 ### cx_strtrim_m 23
37 ### cx_strlower 24 typedef struct cx_string_s cxstring;
38 ### cx_strupper 25
26 typedef struct cx_mutstr_s cxmutstr;
27
28 cxstring cx_str(const char *cstring);
29
30 cxstring cx_strn(const char *cstring, size_t length);
31
32 cxmutstr cx_mutstr(char *cstring);
33
34 cxmutstr cx_mutstrn(char *cstring, size_t length);
35
36 cxstring cx_strcast(AnyStr str);
37
38 cxmutstr cx_strdupa(AnyStr string);
39
40 cxmutstr cx_strdup_a(const CxAllocator *allocator, AnyStr string);
41
42 void cx_strfree(cxmutstr *str);
43
44 void cx_strfree_a(const CxAllocator *alloc, cxmutstr *str);
45 ```
46
47 > Documentation work in progress.
48 >{style="warning"}
49
50 > When you want to convert a string _literal_ into a UCX string, you can also use the `CX_STR(lit)` macro.
51 > This macro uses the fact that `sizeof(lit)` for a string literal `lit` is always the string length plus one,
52 > effectively saving an invocation of `strlen()`.
53 > However, this only works for literals - in all other cases you must use `cx_str()` or `cx_strn()`.
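
The `sizeof` observation behind `CX_STR(lit)` can be sketched without UCX. The names `view_t`, `VIEW_LIT` and `view_from_cstr()` below are hypothetical, and the macro is not the actual UCX definition. It only shows how the length of a literal can be obtained at compile time, while an arbitrary `char` pointer still needs `strlen()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a pointer-plus-length string (assumption: NOT the UCX type). */
typedef struct {
    const char *ptr;
    size_t length;
} view_t;

/* For a string literal, sizeof counts the characters plus the terminator. */
static_assert(sizeof("abc") == 4, "literal size includes the terminator");

/* Hypothetical macro: computes the length at compile time.
 * The adjacent empty literals make the macro fail to compile when the
 * argument is not a string literal, e.g. a char pointer, for which
 * sizeof would yield the pointer size instead of the string length. */
#define VIEW_LIT(lit) ((view_t){ "" lit "", sizeof(lit) - 1 })

/* Run-time counterpart for arbitrary zero-terminated strings. */
view_t view_from_cstr(const char *s) {
    return (view_t){ s, strlen(s) };
}
```

Both `VIEW_LIT("hello")` and `view_from_cstr("hello")` produce a length of 5, but only the second one scans the characters at run time.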
39 54
40 ## Comparison 55 ## Comparison
41 56
42 ### cx_strcmp 57 ```C
43 ### cx_strcmp_p 58 #include <cx/string.h>
44 ### cx_strcasecmp 59
45 ### cx_strcasecmp_p 60 int cx_strcmp(cxstring s1, cxstring s2);
46 ### cx_strprefix 61
47 ### cx_strsuffix 62 int cx_strcmp_p(const void *s1, const void *s2);
48 ### cx_strcaseprefix 63
49 ### cx_strcasesuffix 64 bool cx_strprefix(cxstring string, cxstring prefix);
65
66 bool cx_strsuffix(cxstring string, cxstring suffix);
67
68 int cx_strcasecmp(cxstring s1, cxstring s2);
69
70 int cx_strcasecmp_p(const void *s1, const void *s2);
71
72 bool cx_strcaseprefix(cxstring string, cxstring prefix);
73
74 bool cx_strcasesuffix(cxstring string, cxstring suffix);
75 ```
76
77 > Documentation work in progress.
78 >{style="warning"}
50 79
51 ## Concatenation 80 ## Concatenation
52 81
53 ### cx_strcat_ma 82 ```C
83 #include <cx/string.h>
84
85 cxmutstr cx_strcat(size_t count, ... );
86
87 cxmutstr cx_strcat_a(const CxAllocator *alloc, size_t count, ... );
88
89 cxmutstr cx_strcat_m(cxmutstr str, size_t count, ... );
90
91 cxmutstr cx_strcat_ma(const CxAllocator *alloc,
92 cxmutstr str, size_t count, ... );
93
94 size_t cx_strlen(size_t count, ...);
95 ```
96
97 > Documentation work in progress.
98 >{style="warning"}
54 99
55 ## Find Characters and Substrings 100 ## Find Characters and Substrings
56 101
57 ### cx_strchr 102 ```C
58 ### cx_strchr_m 103 #include <cx/string.h>
59 ### cx_strrchr 104
60 ### cx_strrchr_m 105 cxstring cx_strchr(cxstring string, int chr);
61 ### cx_strstr 106
62 ### cx_strstr_m 107 cxmutstr cx_strchr_m(cxmutstr string, int chr);
63 ### cx_strsubs 108
64 ### cx_strsubsl 109 cxstring cx_strrchr(cxstring string, int chr);
65 ### cx_strsubsl_m 110
66 ### cx_strsubs_m 111 cxmutstr cx_strrchr_m(cxmutstr string, int chr);
112
113 cxstring cx_strstr(cxstring haystack, cxstring needle);
114
115 cxmutstr cx_strstr_m(cxmutstr haystack, cxstring needle);
116
117 cxstring cx_strsubs(cxstring string, size_t start);
118
119 cxstring cx_strsubsl(cxstring string, size_t start, size_t length);
120
121 cxmutstr cx_strsubs_m(cxmutstr string, size_t start);
122
123 cxmutstr cx_strsubsl_m(cxmutstr string, size_t start, size_t length);
124
125 cxstring cx_strtrim(cxstring string);
126
127 cxmutstr cx_strtrim_m(cxmutstr string);
128 ```
129
130 > Documentation work in progress.
131 >{style="warning"}
67 132
68 ## Replace Substrings 133 ## Replace Substrings
69 134
70 ### cx_strreplacen_a 135 ```C
136 #include <cx/string.h>
137
138 cxmutstr cx_strreplace(cxstring str, cxstring pattern, cxstring repl);
139
140 cxmutstr cx_strreplace_a(const CxAllocator *allocator, cxstring str,
141 cxstring pattern, cxstring repl);
142
143 cxmutstr cx_strreplacen(cxstring str, cxstring pattern, cxstring repl,
144 size_t replmax);
145
146 cxmutstr cx_strreplacen_a(const CxAllocator *allocator, cxstring str,
147 cxstring pattern, cxstring repl, size_t replmax);
148 ```
149
150 > Documentation work in progress.
151 >{style="warning"}
71 152
72 ## Basic Splitting 153 ## Basic Splitting
73 154
74 ### cx_strsplit 155 ```C
75 ### cx_strsplit_a 156 #include <cx/string.h>
76 ### cx_strsplit_m 157
77 ### cx_strsplit_ma 158 size_t cx_strsplit(cxstring string, cxstring delim,
159 size_t limit, cxstring *output);
160
161 size_t cx_strsplit_a(const CxAllocator *allocator,
162 cxstring string, cxstring delim,
163 size_t limit, cxstring **output);
164
165 size_t cx_strsplit_m(cxmutstr string, cxstring delim,
166 size_t limit, cxmutstr *output);
167
168 size_t cx_strsplit_ma(const CxAllocator *allocator,
169 cxmutstr string, cxstring delim,
170 size_t limit, cxmutstr **output);
171 ```
172
173 > Documentation work in progress.
174 >{style="warning"}
78 175
79 ## Complex Tokenization 176 ## Complex Tokenization
80 177
81 ### cx_strtok_ 178 ```C
82 ### cx_strtok_delim 179 #include <cx/string.h>
83 ### cx_strtok_next 180
84 ### cx_strtok_next_m 181 CxStrtokCtx cx_strtok(AnyStr str, AnyStr delim, size_t limit);
182
183 void cx_strtok_delim(CxStrtokCtx *ctx,
184 const cxstring *delim, size_t count);
185
186 bool cx_strtok_next(CxStrtokCtx *ctx, cxstring *token);
187
188 bool cx_strtok_next_m(CxStrtokCtx *ctx, cxmutstr *token);
189 ```
190
191 > Documentation work in progress.
192 >{style="warning"}
85 193
86 ## Conversion to Numbers 194 ## Conversion to Numbers
87 195
88 ### cx_strtod_lc_ 196 For each integer type, as well as `float` and `double`, there are functions to convert a UCX string to a number of that type.
89 ### cx_strtof_lc_ 197
90 ### cx_strtoi16_lc_ 198 Integer conversion comes in two flavours:
91 ### cx_strtoi32_lc_ 199 ```C
92 ### cx_strtoi64_lc_ 200 int cx_strtoi(AnyStr str, int *output, int base);
93 ### cx_strtoi8_lc_ 201
94 ### cx_strtoi_lc_ 202 int cx_strtoi_lc(AnyStr str, int *output, int base,
95 ### cx_strtol_lc 203 const char *groupsep);
96 ### cx_strtoll_lc 204 ```
97 ### cx_strtos_lc 205
98 ### cx_strtou16_lc 206 The basic variant takes a string of any UCX string type, a pointer to the `output` integer, and the `base` (one of 2, 8, 10, or 16).
99 ### cx_strtou32_lc 207 The conversion uses the specified `base` and accepts the special notations for that base.
100 ### cx_strtou64_lc 208 Hexadecimal numbers may be prefixed with `0x`, `x`, or `#`, and binary numbers may be prefixed with `0b` or `b`.
101 ### cx_strtou8_lc 209
102 ### cx_strtou_lc 210 The `_lc` versions of the integer conversion functions are equivalent, except that they allow the specification of an
103 ### cx_strtoul_lc 211 array of group separator chars, each of which is simply ignored during conversion.
104 ### cx_strtoull_lc 212 The default group separator for the basic version is a comma `,`.
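
To make the role of the group separators concrete, here is a minimal stand-alone sketch. It is not the UCX implementation: `parse_grouped_int()` is a hypothetical helper that handles base 10 only, and its return convention (0 on success, -1 on failure) is an assumption made for this example. It reads a length-delimited character array and skips every character that appears in the `groupsep` array.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if c is one of the characters in the zero-terminated groupsep array. */
static bool is_group_sep(char c, const char *groupsep) {
    return c != '\0' && strchr(groupsep, c) != NULL;
}

/* Hypothetical helper (NOT a UCX function): parses a base-10 integer from
 * the first len characters of s and ignores all group separators.
 * Returns 0 on success, -1 on malformed input or overflow. */
int parse_grouped_int(const char *s, size_t len, int *output,
                      const char *groupsep) {
    assert(s != NULL && output != NULL && groupsep != NULL);
    size_t i = 0;
    bool negative = false;
    if (i < len && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        i++;
    }
    long long value = 0;
    bool seen_digit = false;
    for (; i < len; i++) {
        char c = s[i];
        if (is_group_sep(c, groupsep)) {
            continue; /* separators carry no value */
        }
        if (c < '0' || c > '9') {
            return -1;
        }
        value = value * 10 + (c - '0');
        seen_digit = true;
        /* stop early so that the accumulator itself cannot overflow */
        if (value > (long long) INT_MAX + 1) {
            return -1;
        }
    }
    if (!seen_digit) {
        return -1;
    }
    if (negative) {
        value = -value;
    }
    if (value > INT_MAX || value < INT_MIN) {
        return -1;
    }
    *output = (int) value;
    return 0;
}
```

With `","` as the separator array, the input `1,234,567` yields 1234567. With `".'"`, both `1.000.000` and `1'000'000` yield one million. Because the function takes an explicit length, it never reads past the end of a string that is not zero-terminated.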
105 ### cx_strtous_lc 213
106 ### cx_strtouz_lc 214 The signature for the floating point conversions is quite similar:
107 ### cx_strtoz_lc 215 ```C
108 --> 216 int cx_strtof(AnyStr str, float *output);
217
218 int cx_strtof_lc(AnyStr str, float *output,
219 char decsep, const char *groupsep);
220 ```
221
222 The two differences are that the floating point versions do not support different bases,
223 and the `_lc` variant allows specifying not only an array of group separators,
224 but also the character used for the decimal separator.
225
226 In the basic variant, the group separator is again a comma `,`, and the decimal separator is a dot `.`.
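
The interplay of decimal separator and group separators can be sketched as follows. This is a deliberately naive stand-alone example and makes no claim about how UCX performs the conversion: `parse_decimal()` is a hypothetical helper, it does not support exponents, and its return convention (0 on success, -1 on failure) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (NOT a UCX function): parses a decimal number from
 * the first len characters of s. decsep marks the start of the fraction,
 * every character in groupsep is ignored.
 * Returns 0 on success, -1 on malformed input. */
int parse_decimal(const char *s, size_t len, double *output,
                  char decsep, const char *groupsep) {
    assert(s != NULL && output != NULL && groupsep != NULL);
    size_t i = 0;
    bool negative = false;
    if (i < len && (s[i] == '+' || s[i] == '-')) {
        negative = (s[i] == '-');
        i++;
    }
    double mantissa = 0.0; /* all digits, read as one integer */
    double scale = 1.0;    /* 10^(number of fraction digits) */
    bool seen_digit = false;
    bool in_fraction = false;
    for (; i < len; i++) {
        char c = s[i];
        if (c == decsep) {
            if (in_fraction) {
                return -1; /* second decimal separator */
            }
            in_fraction = true;
        } else if (c >= '0' && c <= '9') {
            mantissa = mantissa * 10.0 + (c - '0');
            if (in_fraction) {
                scale *= 10.0;
            }
            seen_digit = true;
        } else if (c != '\0' && strchr(groupsep, c) != NULL) {
            continue; /* group separators carry no value */
        } else {
            return -1;
        }
    }
    if (!seen_digit) {
        return -1;
    }
    double value = mantissa / scale;
    *output = negative ? -value : value;
    return 0;
}
```

Calling it with `decsep = ','` and `groupsep = "."` reads `1.234,5` as 1234.5, while `decsep = '.'` and `groupsep = ","` reads `1,234.5` as the same number. A digit-by-digit accumulation like this one rounds at every step once the input has more digits than a `double` can represent exactly, so it is less precise than the algorithms typically found in standard library implementations.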
227
228 > The floating point conversions of UCX 3.1 do not achieve the same precision as standard library implementations
229 > which usually use more sophisticated algorithms.
230 > The precision might increase in future UCX releases,
231 > but until then be aware of slight inaccuracies, in particular when working with `double`.
232 {style="warning"}
233
234 > The UCX string-to-number conversions intentionally do not consider any locale settings
235 > and are therefore independent of any global state.
236 {style="note"}
109 237
110 <seealso> 238 <seealso>
111 <category ref="apidoc"> 239 <category ref="apidoc">
112 <a href="https://ucx.sourceforge.io/api/string_8h.html">string.h</a> 240 <a href="https://ucx.sourceforge.io/api/string_8h.html">string.h</a>
113 </category> 241 </category>
