
Moving size_and_flags inside data

Instead of making size_and_flags a member of Str, it could have been embedded in the buffer that Str::data points to. This approach has a few advantages:

An empty string would take 4 bytes to store instead of 8, since Str itself would no longer carry size_and_flags.
It would be possible to implement a copy-on-write technique: when one string is assigned to another (i.e. x = y), the target simply points to the same data, and a str_reader_count counter in the string's data buffer is incremented. If any of the strings sharing that buffer needs to write to it, it makes its own copy of the data at that point and decrements str_reader_count in the original buffer. This would speed up string-to-string copies and would provide a fast path for equals() and similar functions, but it would also add complexity and overhead in other places.
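
The copy-on-write idea can be sketched roughly as follows. This is a minimal illustration, not the library's code: CowStr, CowHeader, set_char() and readers() are hypothetical names, size_and_flags holds only the length here (no flag bits), and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical header placed at the start of the shared buffer;
// the characters follow it directly.
struct CowHeader {
    std::uint32_t str_reader_count;  // number of strings sharing this buffer
    std::uint32_t size_and_flags;    // in this sketch: just the string length
};

class CowStr {
public:
    explicit CowStr(const char* s) : hdr_(make(s, std::strlen(s))) {}
    CowStr(const CowStr& other) : hdr_(other.hdr_) { ++hdr_->str_reader_count; }
    CowStr& operator=(const CowStr& other) {
        if (hdr_ != other.hdr_) {      // x = y: share y's buffer, copy nothing
            release();
            hdr_ = other.hdr_;
            ++hdr_->str_reader_count;
        }
        return *this;
    }
    ~CowStr() { release(); }

    const char* c_str() const { return chars(hdr_); }
    std::size_t size() const { return hdr_->size_and_flags; }
    std::uint32_t readers() const { return hdr_->str_reader_count; }

    // Fast path: strings that share a buffer are equal without a byte compare.
    bool equals(const CowStr& other) const {
        if (hdr_ == other.hdr_) return true;
        return size() == other.size() &&
               std::memcmp(c_str(), other.c_str(), size()) == 0;
    }

    // Write access: take a private copy first if the buffer is shared.
    void set_char(std::size_t i, char c) {
        assert(i < size());
        if (hdr_->str_reader_count > 1) {
            CowHeader* copy = make(chars(hdr_), size());
            --hdr_->str_reader_count;  // leave the original to the other readers
            hdr_ = copy;
        }
        chars(hdr_)[i] = c;
    }

private:
    static char* chars(CowHeader* h) { return reinterpret_cast<char*>(h + 1); }
    static CowHeader* make(const char* s, std::size_t n) {
        CowHeader* h =
            static_cast<CowHeader*>(std::malloc(sizeof(CowHeader) + n + 1));
        if (!h) std::abort();
        h->str_reader_count = 1;
        h->size_and_flags = static_cast<std::uint32_t>(n);
        std::memcpy(chars(h), s, n);
        chars(h)[n] = '\0';
        return h;
    }
    void release() {
        if (--hdr_->str_reader_count == 0) std::free(hdr_);
    }
    CowHeader* hdr_;  // the only member: header and characters live in the buffer
};
```

Note that every write has to check the counter first, which is one example of the overhead mentioned above.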

The advantages of keeping size_and_flags separate are:

In the separation model, the buffer pointed to by data is essentially a plain char* string. This is intuitive for the programmer and is especially helpful when converting between Str objects and char* strings.
The separation model allows the attach() and detach() functions to exist. These functions let a programmer wrap a Str around an existing char* string, which is a very useful and efficient technique.
If size_and_flags were a part of data then there would be a lot of size terminology associated with data, including the size of the string, the size of the string buffer, and the size of the buffer that includes size_and_flags. This increases complexity and creates ambiguity. For example, should buff_size report the size of the string buffer or the size of the entire buffer? One could argue either way.
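
To illustrate why the separation model makes attach() and detach() possible, here is a minimal sketch. SimpleStr is a hypothetical stand-in for Str; the real signatures and ownership rules of attach() and detach() are not given in this section, so the ones below (attach takes a NUL-terminated char*, detach returns it, nothing is freed) are assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Str in the separation model: data is a plain
// char buffer, and size_and_flags lives next to the pointer, not inside it.
class SimpleStr {
public:
    SimpleStr() : data_(nullptr), size_and_flags_(0) {}

    // Wrap an existing NUL-terminated buffer without copying it.
    // Assumption: the caller keeps ownership of the buffer.
    void attach(char* s) {
        data_ = s;
        size_and_flags_ = static_cast<std::uint32_t>(std::strlen(s));
    }

    // Hand the raw buffer back and leave this object empty.
    char* detach() {
        char* s = data_;
        data_ = nullptr;
        size_and_flags_ = 0;
        return s;
    }

    const char* c_str() const { return data_ ? data_ : ""; }
    std::size_t size() const { return size_and_flags_; }

    // In-place edit, to show that writes go straight to the attached buffer.
    void to_upper() {
        for (std::size_t i = 0; i < size(); ++i) {
            data_[i] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(data_[i])));
        }
    }

private:
    char* data_;                    // essentially a char*, as described above
    std::uint32_t size_and_flags_;  // in this sketch: just the length
};
```

If size_and_flags were stored inside the buffer, an arbitrary char* could not be attached this way, because it would lack the header in front of the characters.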


2007-05-05