ucx: comparison docs/Writerside/topics/string.h.md

-:151c1b7d5d50
+:23f25e91d367
 # String
-<warning>
+UCX strings store character arrays together with a length and come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`).
-Outdated Section - will be updated soon!
-</warning>
+In general, UCX strings are *not* necessarily zero-terminated.
+If a function guarantees to return a zero-terminated string, it is explicitly mentioned in the documentation.
-UCX strings come in two variants: immutable (`cxstring`) and mutable (`cxmutstr`).
+As a rule of thumb, you _should not_ pass a character array of a UCX string structure to another API without explicitly
-The functions of UCX are designed to work with immutable strings by default but in situations where it is necessary,
-the API also provides alternative functions that work directly with mutable strings.
-Functions that change a string in-place are, of course, only accepting mutable strings.
-When you are using UCX functions, or defining your own functions, you are sometimes facing the "problem",
-that the function only accepts arguments of type `cxstring` but you only have a `cxmutstr` at hand.
-In this case you _should not_ introduce a wrapper function that accepts the `cxmutstr`,
-but instead you should use the `cx_strcast()` function to cast the argument to the correct type.
-In general, UCX strings are **not** necessarily zero-terminated. If a function guarantees to return zero-terminated
-string, it is explicitly mentioned in the documentation of the respective function.
-As a rule of thumb, you _should not_ pass the strings of a UCX string structure to another API without explicitly
 ensuring that the string is zero-terminated.
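The rule of thumb above can be illustrated with a minimal, self-contained sketch. It does not use the UCX headers: `struct str_view`, `view_to_cstr()` and `view_format()` are hypothetical stand-ins that only mirror the pointer-plus-length layout. The sketch shows two common ways to hand such a string to an API that expects a C string: copying it into a terminated buffer, or formatting it with an explicit precision.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a (pointer, length) string; not the UCX type. */
struct str_view {
    const char *ptr;
    size_t length;
};

/* Returns a newly allocated, zero-terminated copy of the view.
 * The caller must free() the result. Returns NULL if allocation fails. */
char *view_to_cstr(struct str_view s) {
    char *copy = malloc(s.length + 1);
    if (copy == NULL) return NULL;
    if (s.length > 0) memcpy(copy, s.ptr, s.length);
    copy[s.length] = '\0';
    return copy;
}

/* Formats the view into buf without reading past s.length characters.
 * The "%.*s" precision limits how many characters printf-style functions read. */
int view_format(char *buf, size_t bufsize, struct str_view s) {
    assert(s.length <= INT_MAX); /* the precision argument must be an int */
    return snprintf(buf, bufsize, "%.*s", (int) s.length, s.ptr);
}
```

Passing `s.ptr` directly to a function such as `strlen()` or `puts()` would read beyond `s.length` whenever the view covers only part of a larger buffer.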
-<!--
 ## Basics
-### cx_mutstr
+> To make documentation simpler, we introduce the pseudo-type `AnyStr` with the meaning that
-### cx_mutstrn
+> both `cxstring` and `cxmutstr` are accepted for that argument.
-### cx_str
+> The implementation is actually hidden behind a macro which uses `cx_strcast()` to guarantee compatibility.
-### cx_strn
+{style="note"}
-### cx_strcast
-### cx_strfree
+```C
-### cx_strfree_a
+#include <cx/string.h>
-### cx_strdup
-### cx_strdup_a
+struct cx_string_s {const char *ptr; size_t length;};
-### cx_strlen
-### cx_strtrim
+struct cx_mutstr_s {char *ptr; size_t length;};
-### cx_strtrim_m
-### cx_strlower
+typedef struct cx_string_s cxstring;
-### cx_strupper
+typedef struct cx_mutstr_s cxmutstr;
+cxstring cx_str(const char *cstring);
+cxstring cx_strn(const char *cstring, size_t length);
+cxmutstr cx_mutstr(char *cstring);
+cxmutstr cx_mutstrn(char *cstring, size_t length);
+cxstring cx_strcast(AnyStr str);
+cxmutstr cx_strdup(AnyStr string);
+cxmutstr cx_strdup_a(const CxAllocator *allocator, AnyStr string);
+void cx_strfree(cxmutstr *str);
+void cx_strfree_a(const CxAllocator *alloc, cxmutstr *str);
+```
+> Documentation work in progress.
+>{style="warning"}
+> When you want to convert a string _literal_ into a UCX string, you can also use the `CX_STR(lit)` macro.
+> This macro uses the fact that `sizeof(lit)` for a string literal `lit` is always the string length plus one,
+> effectively saving an invocation of `strlen()`.
+> However, this only works for literals - in all other cases you must use `cx_str()` or `cx_strn()`.
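The `sizeof` trick described in the tip can be sketched as follows. This is not the UCX macro: `struct lit_view`, `LIT_VIEW` and `view_from_cstr()` are hypothetical names used only for illustration. The empty string placed before `lit` is an extra safeguard assumed for this sketch, so that the macro fails to compile when given anything other than a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an immutable (pointer, length) string. */
struct lit_view {
    const char *ptr;
    size_t length;
};

/* Only valid for string literals: sizeof(lit) counts the terminator,
 * so the length is sizeof(lit) - 1 and is known at compile time.
 * The "" prefix relies on literal concatenation and rejects pointers. */
#define LIT_VIEW(lit) ((struct lit_view){ "" lit, sizeof(lit) - 1 })

/* For non-literals, sizeof would yield the pointer size,
 * so the length has to be computed at run time. */
struct lit_view view_from_cstr(const char *s) {
    return (struct lit_view){ s, strlen(s) };
}
```

Both constructors produce the same length for ordinary text. The macro merely avoids the run-time scan that `strlen()` performs.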
 ## Comparison
-### cx_strcmp
+```C
-### cx_strcmp_p
+#include <cx/string.h>
-### cx_strcasecmp
-### cx_strcasecmp_p
+int cx_strcmp(cxstring s1, cxstring s2);
-### cx_strprefix
-### cx_strsuffix
+int cx_strcmp_p(const void *s1, const void *s2);
-### cx_strcaseprefix
-### cx_strcasesuffix
+bool cx_strprefix(cxstring string, cxstring prefix);
+bool cx_strsuffix(cxstring string, cxstring suffix);
+int cx_strcasecmp(cxstring s1, cxstring s2);
+int cx_strcasecmp_p(const void *s1, const void *s2);
+bool cx_strcaseprefix(cxstring string, cxstring prefix);
+bool cx_strcasesuffix(cxstring string, cxstring suffix);
+```
+> Documentation work in progress.
+>{style="warning"}
 ## Concatenation
-### cx_strcat_ma
+```C
+#include <cx/string.h>
+cxmutstr cx_strcat(size_t count, ... );
+cxmutstr cx_strcat_a(const CxAllocator *alloc, size_t count, ... );
+cxmutstr cx_strcat_m(cxmutstr str, size_t count, ... );
+cxmutstr cx_strcat_ma(const CxAllocator *alloc,
+cxmutstr str, size_t count, ... );
+size_t cx_strlen(size_t count, ...);
+```
+> Documentation work in progress.
+>{style="warning"}
 ## Find Characters and Substrings
-### cx_strchr
+```C
-### cx_strchr_m
+#include <cx/string.h>
-### cx_strrchr
-### cx_strrchr_m
+cxstring cx_strchr(cxstring string, int chr);
-### cx_strstr
-### cx_strstr_m
+cxmutstr cx_strchr_m(cxmutstr string, int chr);
-### cx_strsubs
-### cx_strsubsl
+cxstring cx_strrchr(cxstring string, int chr);
-### cx_strsubsl_m
-### cx_strsubs_m
+cxmutstr cx_strrchr_m(cxmutstr string, int chr);
+cxstring cx_strstr(cxstring haystack, cxstring needle);
+cxmutstr cx_strstr_m(cxmutstr haystack, cxstring needle);
+cxstring cx_strsubs(cxstring string, size_t start);
+cxstring cx_strsubsl(cxstring string, size_t start, size_t length);
+cxmutstr cx_strsubs_m(cxmutstr string, size_t start);
+cxmutstr cx_strsubsl_m(cxmutstr string, size_t start, size_t length);
+cxstring cx_strtrim(cxstring string);
+cxmutstr cx_strtrim_m(cxmutstr string);
+```
+> Documentation work in progress.
+>{style="warning"}
 ## Replace Substrings
-### cx_strreplacen_a
+```C
+#include <cx/string.h>
+cxmutstr cx_strreplace(cxstring str, cxstring pattern, cxstring repl);
+cxmutstr cx_strreplace_a(const CxAllocator *allocator, cxstring str,
+cxstring pattern, cxstring repl);
+cxmutstr cx_strreplacen(cxstring str, cxstring pattern, cxstring repl,
+size_t replmax);
+cxmutstr cx_strreplacen_a(const CxAllocator *allocator, cxstring str,
+cxstring pattern, cxstring repl, size_t replmax);
+```
+> Documentation work in progress.
+>{style="warning"}
 ## Basic Splitting
-### cx_strsplit
+```C
-### cx_strsplit_a
+#include <cx/string.h>
-### cx_strsplit_m
-### cx_strsplit_ma
+size_t cx_strsplit(cxstring string, cxstring delim,
+size_t limit, cxstring *output);
+size_t cx_strsplit_a(const CxAllocator *allocator,
+cxstring string, cxstring delim,
+size_t limit, cxstring **output);
+size_t cx_strsplit_m(cxmutstr string, cxstring delim,
+size_t limit, cxmutstr *output);
+size_t cx_strsplit_ma(const CxAllocator *allocator,
+cxmutstr string, cxstring delim,
+size_t limit, cxmutstr **output);
+```
+> Documentation work in progress.
+>{style="warning"}
 ## Complex Tokenization
-### cx_strtok_
+```C
-### cx_strtok_delim
+#include <cx/string.h>
-### cx_strtok_next
-### cx_strtok_next_m
+CxStrtokCtx cx_strtok(AnyStr str, AnyStr delim, size_t limit);
+void cx_strtok_delim(CxStrtokCtx *ctx,
+const cxstring *delim, size_t count);
+bool cx_strtok_next(CxStrtokCtx *ctx, cxstring *token);
+bool cx_strtok_next_m(CxStrtokCtx *ctx, cxmutstr *token);
+```
+> Documentation work in progress.
+>{style="warning"}
 ## Conversion to Numbers
-### cx_strtod_lc_
+For each integer type, as well as `float` and `double`, there are functions to convert a UCX string to a number of that type.
-### cx_strtof_lc_
-### cx_strtoi16_lc_
+Integer conversion comes in two flavours:
-### cx_strtoi32_lc_
+```C
-### cx_strtoi64_lc_
+int cx_strtoi(AnyStr str, int *output, int base);
-### cx_strtoi8_lc_
-### cx_strtoi_lc_
+int cx_strtoi_lc(AnyStr str, int *output, int base,
-### cx_strtol_lc
+const char *groupsep);
-### cx_strtoll_lc
+```
-### cx_strtos_lc
-### cx_strtou16_lc
+The basic variant takes a string of any UCX string type, a pointer to the `output` integer, and the `base` (one of 2, 8, 10, or 16).
-### cx_strtou32_lc
+Conversion is attempted in the specified `base` and accepts the special prefix notations for that base.
-### cx_strtou64_lc
+Hexadecimal numbers may be prefixed with `0x`, `x`, or `#`, and binary numbers may be prefixed with `0b` or `b`.
-### cx_strtou8_lc
-### cx_strtou_lc
+The `_lc` versions of the integer conversion functions are equivalent, except that they allow the specification of an
-### cx_strtoul_lc
+array of group separator chars, each of which is simply ignored during conversion.
-### cx_strtoull_lc
+The default group separator for the basic version is a comma `,`.
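To make the role of the group separators concrete, here is a minimal sketch of a decimal parser that skips every character listed in a separator array. It is not the UCX implementation: `parse_int_grouped()` is a hypothetical helper, it handles base 10 only, and its return convention (zero on success, non-zero on failure) is an assumption made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: parses a decimal int from a length-delimited buffer.
 * Every character contained in groupsep is skipped.
 * Returns 0 on success, non-zero on invalid input or overflow. */
int parse_int_grouped(const char *ptr, size_t length,
                      int *output, const char *groupsep) {
    size_t i = 0;
    int negative = 0;
    int digits = 0;
    long long acc = 0;

    if (length > 0 && (ptr[0] == '-' || ptr[0] == '+')) {
        negative = (ptr[0] == '-');
        i = 1;
    }
    for (; i < length; i++) {
        char c = ptr[i];
        /* strchr() would also match the terminator, so exclude '\0' */
        if (c != '\0' && strchr(groupsep, c) != NULL) continue;
        if (c < '0' || c > '9') return 1;
        acc = acc * 10 + (c - '0');
        /* allow INT_MAX + 1 for now, it is valid when negated */
        if (acc > (long long) INT_MAX + 1) return 1;
        digits++;
    }
    if (digits == 0) return 1;
    if (negative) acc = -acc;
    if (acc > INT_MAX) return 1;
    *output = (int) acc;
    return 0;
}
```

With `","` as the separator array, the input `1,234,567` parses to 1234567. With `"' "` both apostrophes and spaces are ignored. Because the buffer is length-delimited, the sketch never relies on a zero terminator.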
-### cx_strtous_lc
-### cx_strtouz_lc
+The signature for the floating point conversions is quite similar:
-### cx_strtoz_lc
+```C
--->
+int cx_strtof(AnyStr str, float *output);
+int cx_strtof_lc(AnyStr str, float *output,
+char decsep, const char *groupsep);
+```
+The two differences are that the floating point versions do not support different bases,
+and the `_lc` variant allows specifying not only an array of group separators,
+but also the character used for the decimal separator.
+In the basic variant, the group separator is again a comma `,`, and the decimal separator is a dot `.`.
+> The floating point conversions of UCX 3.1 do not achieve the same precision as standard library implementations
+> which usually use more sophisticated algorithms.
+> The precision might increase in future UCX releases,
+> but until then be aware of slight inaccuracies, in particular when working with `double`.
+{style="warning"}
+> The UCX string to number conversions intentionally ignore all locale settings
+> and are therefore independent of any global state.
+{style="note"}
 <seealso>
 <category ref="apidoc">
 <a href="https://ucx.sourceforge.io/api/string_8h.html">string.h</a>
 </category>
