Sat, 01 Mar 2025 15:02:57 +0100
complete the properties documentation
relates to #451
# Properties

The UCX properties parser can be used to parse line-based key/value strings.

## Supported Syntax

Key/value pairs must be line-based and separated by a single-character delimiter.
The parser supports up to three different characters which introduce comments.
All characters starting with a comment character up to the end of the line are ignored.
Blank lines are also ignored.

An example properties file looks like this:

```properties
# Comment line at start of file
key1 = value1
key2 = value2
# next is a blank line and will be ignored

keys_are_trimmed = and_so_are_values # also a comment
```

> Delimiter and comment characters are configured with the `CxPropertiesConfig` structure.
> There is also a field reserved for `continuation` which will be used as a line continuation character
> in a future version of UCX.
> In UCX 3.1 this is not implemented.

## Basic Parsing

```C
#include <cx/properties.h>

typedef struct cx_properties_config_s {
    char delimiter;
    char comment1;
    char comment2;
    char comment3;
    // reserved for future use - not implemented in UCX 3.1
    char continuation;
} CxPropertiesConfig;

void cxPropertiesInit(CxProperties *prop, CxPropertiesConfig config);

void cxPropertiesInitDefault(CxProperties *prop);

void cxPropertiesDestroy(CxProperties *prop);

void cxPropertiesReset(CxProperties *prop);

int cxPropertiesFilln(CxProperties *prop,
        const char *buf, size_t len);

// where S is one of cxstring, cxmutstr, char*, const char*
int cxPropertiesFill(CxProperties *prop, S string);

CxPropertiesStatus cxPropertiesNext(CxProperties *prop,
        cxstring *key, cxstring *value);

void cxPropertiesUseStack(CxProperties *prop,
        char *buf, size_t capacity);
```

The first step is to initialize a `CxProperties` structure with a call to `cxPropertiesInit()` using the desired config.
The shorthand `cxPropertiesInitDefault()` creates a default configuration with the equals sign `'='` as delimiter
and the hash symbol `'#'` as comment symbol (the other two comment symbols remain unused in the default config).

> In a future UCX version, the default `continuation` character will be a backslash `'\'`.
> In UCX 3.1 this feature is not implemented yet.

The actual parsing is an interleaved invocation of the `cxPropertiesFill()` (or `cxPropertiesFilln()`) and `cxPropertiesNext()` functions.
The `cxPropertiesFill()` function is a convenience function that accepts UCX strings and normal zero-terminated C strings
and otherwise behaves like `cxPropertiesFilln()`.

Filling the input buffer is cost-free if there is no data already in the input buffer.
In that case, the input buffer only stores the pointer to the original data without creating a copy.
Calling `cxPropertiesNext()` returns `CX_PROPERTIES_NO_ERROR` (= zero) for each key/value pair that is successfully parsed,
and stores the pointers and lengths for both the key and the value in the structures pointed to by the `key` and `value` arguments.
When all the data from the input buffer has been consumed successfully, `cxPropertiesNext()` returns `CX_PROPERTIES_NO_DATA`.

> This is all still free of any copies and allocations.
> That means the pointers in `key` and `value` after `cxPropertiesNext()` returns will point into the input buffer.
> If you intend to store the key and/or the value somewhere else, it is strongly recommended to create a copy with `cx_strdup()`,
> because you will otherwise soon end up with a dangling pointer.
>
{style="note"}

If `cxPropertiesNext()` returns `CX_PROPERTIES_INCOMPLETE_DATA`, it means that the input buffer is exhausted,
but the last line did not contain a full key/value pair.
In that case, you can call `cxPropertiesFill()` again to add more data and continue with `cxPropertiesNext()`.
Note that adding more data to a non-empty input buffer will lead to an allocation,
unless you specified some stack memory with `cxPropertiesUseStack()`.
The stack capacity must be large enough to contain the longest line in your data.
If the internal buffer is not large enough to contain a single line, it is extended.
If that is not possible for some reason, `cxPropertiesNext()` fails and returns `CX_PROPERTIES_BUFFER_ALLOC_FAILED`.

If you want to reuse a `CxProperties` structure with the same config, you can call `cxPropertiesReset()`, even if the last operation was a failure.
Otherwise, you should always call `cxPropertiesDestroy()` when you are done with the parser.

> It is strongly recommended to always call `cxPropertiesDestroy()` when you are done with the parser,
> even if you did not expect any allocations because you used `cxPropertiesUseStack()`.

### List of Status Codes

Below is a full list of status codes for `cxPropertiesNext()`.

| Status Code                             | Meaning |
|-----------------------------------------|---------|
| CX_PROPERTIES_NO_ERROR                  | A key/value pair was found and returned. |
| CX_PROPERTIES_NO_DATA                   | The input buffer does not contain more data. |
| CX_PROPERTIES_INCOMPLETE_DATA           | The input ends unexpectedly. This can happen when the last line does not terminate with a line break, or when the input ends with a parsed key but no value. Use `cxPropertiesFill()` to add more data before retrying. |
| CX_PROPERTIES_NULL_INPUT                | The input buffer was never initialized. You probably forgot to call `cxPropertiesFill()` at least once. |
| CX_PROPERTIES_INVALID_EMPTY_KEY         | Only whitespace was found on the left-hand side of the delimiter. Keys must not be empty. |
| CX_PROPERTIES_INVALID_MISSING_DELIMITER | A line contains data, but no delimiter. |
| CX_PROPERTIES_BUFFER_ALLOC_FAILED       | More internal buffer was needed, but could not be allocated. |

## Sources and Sinks

```C
#include <cx/properties.h>

CxPropertiesSource cxPropertiesStringSource(cxstring str);

CxPropertiesSource cxPropertiesCstrSource(const char *str);

CxPropertiesSource cxPropertiesCstrnSource(const char *str, size_t len);

CxPropertiesSource cxPropertiesFileSource(FILE *file, size_t chunk_size);

CxPropertiesSink cxPropertiesMapSink(CxMap *map);

CxPropertiesStatus cxPropertiesLoad(CxProperties *prop,
        CxPropertiesSink sink, CxPropertiesSource source);
```

The basic idea of `cxPropertiesLoad()` is that key/value pairs are extracted from a _source_ and ingested by a _sink_.
For the most common scenarios where properties data is read from a string or a file and put into a map, several functions are available.
You can also specify your [own sources and sinks](#creating-own-sources-and-sinks).

The following example shows a simple function which loads all properties data from a file.
The `chunk_size` argument when creating the file source specifies
how many bytes are read from the file and filled into the properties parser in one read/sink cycle.

```C
#include <stdio.h>
#include <cx/properties.h>
#include <cx/hash_map.h>

int load_props_from_file(const char *filename, CxMap *map) {
    FILE *f = fopen(filename, "r");
    if (!f) return -1;
    CxProperties prop;
    cxPropertiesInitDefault(&prop);
    CxPropertiesSink sink = cxPropertiesMapSink(map);
    CxPropertiesSource src = cxPropertiesFileSource(f, 512);
    CxPropertiesStatus status = cxPropertiesLoad(&prop, sink, src);
    // always destroy the parser when done, as recommended above
    cxPropertiesDestroy(&prop);
    fclose(f);
    return status;
}

// usage:
CxMap *map = cxHashMapCreateSimple(CX_STORE_POINTERS);
if (load_props_from_file("my-props.properties", map)) {
    // error handling
} else {
    // assuming my-props.properties contains the following line:
    // my-key = some value
    char *value = cxMapGet(map, "my-key");
}
```

> The function `cxPropertiesLoad()` should usually not return `CX_PROPERTIES_INCOMPLETE_DATA` because the parser is automatically refilled from the source.
> If it does, it could mean that the source was unable to provide all the data, or the properties data ended unexpectedly.
> The most common status code is `CX_PROPERTIES_NO_ERROR`, which means that at least one key/value pair was found.
> If `cxPropertiesLoad()` returns `CX_PROPERTIES_NO_DATA`, it means that the source did not provide any key/value pair.
> There are several special status codes which are documented [below](#additional-status-codes).
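
The status semantics described in the note above can be illustrated without UCX. The following is a minimal standalone sketch and not UCX's implementation: all `toy_*` and `TOY_*` names are hypothetical, the delimiter `'='` and the comment character `'#'` are hard-coded, and the whole input is assumed to be available as one zero-terminated string, so there is no counterpart to `CX_PROPERTIES_INCOMPLETE_DATA`. The real parser may behave differently in edge cases.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

// Hypothetical status codes for this sketch only; they are NOT the UCX enum.
typedef enum {
    TOY_NO_ERROR,           // at least one pair was delivered to the sink
    TOY_NO_DATA,            // the input contained no key/value pair
    TOY_EMPTY_KEY,          // nothing but whitespace left of the delimiter
    TOY_MISSING_DELIMITER,  // a line has data but no delimiter
    TOY_SINK_FAILED         // the sink callback returned non-zero
} ToyStatus;

// Hypothetical sink callback: receives one trimmed key/value pair.
typedef int (*toy_sink_func)(void *data, const char *key, size_t klen,
                             const char *val, size_t vlen);

// Shrinks the range [*s, *s + *len) until no surrounding whitespace is left.
static void toy_trim(const char **s, size_t *len) {
    while (*len > 0 && isspace((unsigned char) (*s)[0])) { (*s)++; (*len)--; }
    while (*len > 0 && isspace((unsigned char) (*s)[*len - 1])) (*len)--;
}

// Parses '='-delimited lines with '#' comments and feeds each pair to the sink.
ToyStatus toy_load(const char *text, toy_sink_func sink, void *data) {
    assert(text != NULL && sink != NULL);
    size_t found = 0;
    while (*text != '\0') {
        // determine the extent of the current line and advance past it
        size_t linelen = strcspn(text, "\n");
        const char *line = text;
        text += linelen;
        if (*text == '\n') text++;

        // cut off the comment, then skip lines that are blank
        const char *hash = memchr(line, '#', linelen);
        size_t len = hash ? (size_t) (hash - line) : linelen;
        toy_trim(&line, &len);
        if (len == 0) continue;

        // split at the first delimiter and trim both halves
        const char *delim = memchr(line, '=', len);
        if (delim == NULL) return TOY_MISSING_DELIMITER;
        const char *key = line;
        size_t klen = (size_t) (delim - line);
        const char *val = delim + 1;
        size_t vlen = len - klen - 1;
        toy_trim(&key, &klen);
        toy_trim(&val, &vlen);
        if (klen == 0) return TOY_EMPTY_KEY;

        // hand the pair to the sink; a non-zero result aborts the load
        if (sink(data, key, klen, val, vlen) != 0) return TOY_SINK_FAILED;
        found++;
    }
    // mirror the documented semantics: success needs at least one pair
    return found > 0 ? TOY_NO_ERROR : TOY_NO_DATA;
}

// Example sink that merely counts the pairs it receives.
int toy_count_sink(void *data, const char *key, size_t klen,
                   const char *val, size_t vlen) {
    (void) key; (void) klen; (void) val; (void) vlen;
    (*(size_t *) data)++;
    return 0;
}
```

With `size_t n = 0;`, a call like `toy_load("key1 = value1\n", toy_count_sink, &n)` returns `TOY_NO_ERROR` and leaves `n` at 1, while an input that consists only of comments and blank lines yields `TOY_NO_DATA`. The key and value pointers handed to the sink point into the input string, which is the same reason the documentation above recommends copying them before storing them elsewhere.
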

### Creating own Sources and Sinks

```C
#include <cx/properties.h>

typedef int(*cx_properties_read_init_func)(CxProperties *prop,
        CxPropertiesSource *src);

typedef int(*cx_properties_read_func)(CxProperties *prop,
        CxPropertiesSource *src, cxstring *target);

typedef void(*cx_properties_read_clean_func)(CxProperties *prop,
        CxPropertiesSource *src);

typedef int(*cx_properties_sink_func)(CxProperties *prop,
        CxPropertiesSink *sink, cxstring key, cxstring value);

typedef struct cx_properties_source_s {
    void *src;
    void *data_ptr;
    size_t data_size;
    cx_properties_read_func read_func;
    cx_properties_read_init_func read_init_func;
    cx_properties_read_clean_func read_clean_func;
} CxPropertiesSource;

typedef struct cx_properties_sink_s {
    void *sink;
    void *data;
    cx_properties_sink_func sink_func;
} CxPropertiesSink;
```

You can create your own sources and sinks by initializing the respective structures.

For a source, only the `read_func` is mandatory. The other two functions are optional and are used for initialization and cleanup, if required.
The file source created by `cxPropertiesFileSource()`, for example,
uses the `read_init_func` to allocate the read buffer and the `read_clean_func` to free it.

Since the default map sink created by `cxPropertiesMapSink()` stores `char*` pointers into a map,
the following example uses a different sink, which stores them as `cxmutstr` values,
automatically freeing them when the map gets destroyed.
The example also uses a custom source that memory-maps the file instead of reading it in chunks.

```C
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

#include <cx/properties.h>
#include <cx/hash_map.h>

static int prop_mmap(CxProperties *prop, CxPropertiesSource *src) {
    struct stat s;
    int fd = open(src->src, O_RDONLY);
    if (fd < 0) return -1;
    // re-use the src field to store the fd
    // there are cleaner ways, but this is just for illustration
    src->src = (void*) fd;
    fstat(fd, &s);

    // memory map the entire file
    // and store the address and length in the properties source
    src->data_ptr = mmap(0, s.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    src->data_size = s.st_size;
    // mmap() reports failure with MAP_FAILED, not with NULL
    return src->data_ptr == MAP_FAILED;
}

static int prop_read(CxProperties *prop, CxPropertiesSource *src,
        cxstring *target) {
    // copy the address and length of the mapped data to the target
    target->ptr = src->data_ptr;
    target->length = src->data_size;

    // set the new size to zero to indicate that there is no more data
    src->data_size = 0;

    return 0;
}

static void prop_unmap(CxProperties *prop, CxPropertiesSource *src) {
    // prop_read() has set data_size to zero by now,
    // so query the file size again to unmap the whole mapping
    int fd = (int) src->src;
    struct stat s;
    if (fstat(fd, &s) == 0) {
        munmap(src->data_ptr, s.st_size);
    }
    // close the file
    close(fd);
}

static int prop_sink(CxProperties *prop, CxPropertiesSink *sink,
        cxstring key, cxstring value) {
    CxMap *map = sink->sink;

    // copy the string and store it into the map
    cxmutstr v = cx_strdup(value);
    int r = cxMapPut(map, key, &v);
    if (r != 0) cx_strfree(&v);
    return r;
}

int load_props_from_file(const char *filename, CxMap *map) {
    CxProperties prop;
    cxPropertiesInitDefault(&prop);

    CxPropertiesSource src;
    src.src = (void*) filename;
    src.read_init_func = prop_mmap;
    src.read_func = prop_read;
    src.read_clean_func = prop_unmap;

    CxPropertiesSink sink;
    sink.sink = map;
    sink.sink_func = prop_sink;

    CxPropertiesStatus status = cxPropertiesLoad(&prop, sink, src);
    // always destroy the parser when done
    cxPropertiesDestroy(&prop);
    return status;
}

int main(void) {
    // in contrast to the default map sink,
    // this one here stores the UCX strings by value
    CxMap *map = cxHashMapCreateSimple(sizeof(cxmutstr));

    // automatically free the UCX string when removed from the map
    cxDefineDestructor(map, cx_strfree);

    // use our custom load function to load the data from the file
    if (load_props_from_file("my-props.properties", map)) {
        fputs("Error reading properties.\n", stderr);
        return 1;
    }

    // output the read key/value pairs for illustration
    CxMapIterator iter = cxMapIterator(map);
    cx_foreach(CxMapEntry *, entry, iter) {
        cxstring k = cx_strn(entry->key->data, entry->key->len);
        cxmutstr *v = entry->value;
        printf("%.*s = %.*s\n",
            (int) k.length, k.ptr, (int) v->length, v->ptr);
    }

    // freeing the map also frees the strings
    // because we have registered cx_strfree() as destructor function
    cxMapFree(map);

    return 0;
}
```

> A cleaner implementation that does not produce a warning for bluntly casting an `int` to a `void*`
> can be achieved by declaring a struct that contains the information, allocating memory for
> that struct, and storing the pointer in `data_ptr`.
> For illustrating how properties sources and sinks can be implemented, this was not necessary.

### Additional Status Codes

For sources and sinks there are three additional special status codes,
which only appear as return values for `cxPropertiesLoad()`.

| Status Code                    | Meaning |
|--------------------------------|---------|
| CX_PROPERTIES_READ_INIT_FAILED | Initializing the properties source failed and the `cx_properties_read_init_func` returned non-zero. |
| CX_PROPERTIES_READ_FAILED      | Reading from a properties source failed and the `cx_properties_read_func` returned non-zero. |
| CX_PROPERTIES_SINK_FAILED      | Sinking a key/value pair failed and the `cx_properties_sink_func` returned non-zero. |

<seealso>
<category ref="apidoc">
<a href="https://ucx.sourceforge.io/api/properties_8h.html">properties.h</a>
</category>
</seealso>