Non-Constant char*: attach() and detach()

It is also possible to convert a Str into a regular, non-const char*. This is necessary when you want to call a function that modifies the char* data. Unfortunately, a simple cast operator will not work here, because modifying the character data directly does not update the Str::size_and_flags field. If this field is out of sync with the NUL terminator at the end of the string data, many functions in Str will not work correctly.
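The failure mode can be shown with a small sketch. CachedLenString below is a hypothetical stand-in, not the real Str: it only assumes that the class caches its length, in the way Str::size_and_flags is described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a string class that caches its length.
// This is not the real Str, only an illustration of the failure mode.
struct CachedLenString {
    char        data[32];
    std::size_t cached_len;

    explicit CachedLenString(const char* s) : cached_len(std::strlen(s)) {
        assert(cached_len < sizeof data);      // sketch only: no growth logic
        std::memcpy(data, s, cached_len + 1);  // copy text plus terminator
    }

    // O(1) size query that trusts the cached value.
    std::size_t size() const { return cached_len; }
};

// Truncates the text behind the object's back and reports whether the
// cached length still agrees with the actual terminator position.
bool cache_matches_after_raw_edit() {
    CachedLenString s("hello world");
    s.data[5] = '\0';                          // raw write: text is now "hello"
    return s.size() == std::strlen(s.data);    // compares 11 with 5
}
```

After the raw write the cached length is still 11 while the text is only 5 characters long, so the function returns false. This is the kind of mismatch that detach() and attach() are meant to prevent.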

The solution is to detach() the char* buffer from the string, modify it, and then attach() it again. Here is the basic template:

Str x = "hello world";
unsigned long buff_size;
bool allocated;

char* buff = x.detach(buff_size, allocated);
// modify the string
capitalize(buff);
// reattach to x
x.attach(buff, buff_size, false, allocated);

There are many variations to the pattern above, such as detaching and not reattaching or reattaching to a completely different Str object.
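Two of these variations are sketched below. MockStr is a hypothetical stand-in that copies the documented detach() and attach() signatures so the example can compile on its own; its internals (a malloc'ed buffer with 16 spare bytes, a cached length recomputed with strlen()) are assumptions and may not match the real Str. The helper names move_edited() and detach_and_release() are also made up for the illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// MockStr is a hypothetical stand-in that mimics the documented
// detach()/attach() signatures; it is not the real Str implementation.
class MockStr {
    char*         buf_   = nullptr;
    unsigned long cap_   = 0;      // total buffer size
    unsigned long len_   = 0;      // cached string length
    bool          owned_ = false;  // true if the buffer must be free()d

public:
    MockStr() = default;
    explicit MockStr(const char* s)
        : cap_(std::strlen(s) + 16), len_(std::strlen(s)), owned_(true) {
        buf_ = static_cast<char*>(std::malloc(cap_));
        assert(buf_ != nullptr);
        std::memcpy(buf_, s, len_ + 1);
    }
    MockStr(const MockStr&) = delete;
    MockStr& operator=(const MockStr&) = delete;
    ~MockStr() { if (owned_) std::free(buf_); }

    // Hand the buffer to the caller and leave this object empty.
    char* detach(unsigned long& buff_size, bool& allocated) {
        char* p   = buf_;
        buff_size = cap_;
        allocated = owned_;
        buf_ = nullptr; cap_ = 0; len_ = 0; owned_ = false;
        return p;
    }

    // Adopt a buffer and resynchronize the cached length with its contents.
    void attach(char* buff, unsigned long buff_size,
                bool initialize = true, bool deallocate = false) {
        if (owned_) std::free(buf_);
        buf_ = buff; cap_ = buff_size; owned_ = deallocate;
        if (buf_ != nullptr && initialize && cap_ > 0) buf_[0] = '\0';
        len_ = (buf_ != nullptr) ? std::strlen(buf_) : 0;
    }

    unsigned long length() const { return len_; }
    const char*   c_str()  const { return buf_ != nullptr ? buf_ : ""; }
};

// Variation 1: detach from one object, edit, reattach to another.
// Assumes `from` holds at least five characters.
unsigned long move_edited(MockStr& from, MockStr& to) {
    unsigned long size = 0;
    bool allocated = false;
    char* buff = from.detach(size, allocated);
    buff[5] = '\0';                        // raw edit: truncate after 5 chars
    to.attach(buff, size, false, allocated);
    return to.length();
}

// Variation 2: detach and never reattach; the caller releases the buffer.
unsigned long detach_and_release(MockStr& from) {
    unsigned long size = 0;
    bool allocated = false;
    char* buff = from.detach(size, allocated);
    unsigned long n = std::strlen(buff);
    if (allocated) std::free(buff);        // assumed: ownership moved to us
    return n;
}
```

In both variations the source object is left empty after detach(). In the second one, the sketch assumes that a buffer reported as allocated becomes the caller's responsibility to free.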

The prototypes for attach() and detach() are:

char* detach(unsigned long& buff_size, bool& allocated);
void attach(char* buff, unsigned long buff_size, bool initialize=true,
                bool deallocate=false);

In the detach() case, the buff_size and allocated parameters hold information about the string that would otherwise be lost during the conversion to char*. The buff_size parameter gives the total buffer size, which will be larger than the string's length. If set, the allocated parameter indicates that the char* data is located on the heap and used to be owned by the Str object. If the buffer is not heap-based, or if it was not owned by the Str, allocated is set to false.

The buff, buff_size and deallocate parameters of attach() correspond directly to the values returned by detach(). These values can be passed straight through when reattaching a string. If the initialize parameter is true, then the buffer will be reinitialized to an empty string. This is useful when attaching to an uninitialized buffer, but it should generally be set to false when attaching to a buffer that contains a valid character string.
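The effect of the initialize parameter can be modelled in isolation. The helper below, length_after_attach(), is hypothetical and is not part of Str. It assumes that with initialize set to false the string's length is recovered by locating the terminator inside the buffer, which the text does not state explicitly.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper modelling only the `initialize` decision of attach().
// The real Str::attach() is not shown in the text and may differ.
unsigned long length_after_attach(char* buff, unsigned long buff_size,
                                  bool initialize) {
    assert(buff != nullptr && buff_size > 0);
    if (initialize) {
        buff[0] = '\0';   // reset to an empty string; old contents are ignored
        return 0;
    }
    // Keep the contents: the length is wherever the terminator already is.
    const void* nul = std::memchr(buff, '\0', buff_size);
    assert(nul != nullptr);   // the buffer must already hold a valid string
    return static_cast<unsigned long>(static_cast<const char*>(nul) - buff);
}
```

Calling it on a 16-byte buffer holding "abc" with initialize set to false yields a length of 3. Calling it with initialize set to true yields 0 and overwrites the first byte, which is why false is the right choice when reattaching a buffer that holds valid text.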

Another use of attach() is to wrap a string object around a buffer that you allocated yourself. Here is an example that uses getline():

#include "Str.hpp"
#include <stdio.h>

int main(int argc, char* argv[]) {

    char* buff       = 0;
    size_t buff_size = 0;
    int err = 0;
    unsigned int token_count = 0;
    Str str;

    if (getline(&buff, &buff_size, stdin) >= 0) {

        // Attach the getline() buffer to the string
        
        str.attach(buff, buff_size, false, true);

        // Tokenize the string

        Str* tokens = str.getAllTokens(token_count);
        
        if (tokens) {
            unsigned int i = 0;
            for (; i < token_count; ++i) {
                fprintf(stdout, "%s\n", static_cast<const char*>(tokens[i]));
            }
            delete[] tokens;
        }
        
    } else {
        err = 1;
    }

    return err;
    
}

The program above reads one line from standard input and prints each word on its own line. The POSIX getline() function reads the line into buff; because buff starts out as a null pointer, getline() allocates the buffer itself, and it would automatically reallocate a buffer that was too small. The program then uses attach() to wrap a Str around this buffer.

Note that in this case buffer ownership is transferred to str by setting the deallocate parameter of attach() to true. This means that str will call free(buff) automatically when it is destructed.

One final note on attach is that, if you attach to a string that already has a buffer attached, the old buffer is discarded and free() is automatically called, if appropriate.
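The replacement rule can be sketched as follows. BufferSlot is a hypothetical model, not the real Str, and it assumes that "if appropriate" means "if the previous buffer was attached with deallocate set to true". The frees counter exists only so the behaviour can be observed.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical model of the replace-on-attach rule; not the real Str.
struct BufferSlot {
    char* buf   = nullptr;
    bool  owned = false;
    int   frees = 0;   // counts releases, for illustration only

    void attach(char* buff, bool deallocate) {
        if (buf != nullptr && owned) {   // assumed meaning of "if appropriate"
            std::free(buf);
            ++frees;
        }
        buf   = buff;
        owned = deallocate;
    }
};

// Attaches a caller-owned stack buffer, then an owned heap buffer,
// then replaces the heap buffer and reports how many frees happened.
int count_frees() {
    char stack_buf[8] = "abc";
    BufferSlot slot;
    slot.attach(stack_buf, false);
    char* heap_buf = static_cast<char*>(std::malloc(8));
    assert(heap_buf != nullptr);
    slot.attach(heap_buf, true);    // stack buffer is dropped, not freed
    slot.attach(nullptr, false);    // heap buffer is freed here
    return slot.frees;
}
```

Only the heap buffer that was handed over with deallocate set to true is freed; the stack buffer is simply forgotten, so count_frees() returns 1.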

attach() constructor

The attach() function also has a constructor form:

    Str(char* buff, unsigned long buff_size, bool initialize=true,
        bool deallocate=false);

This is equivalent to:

Str x;
x.attach(buff, buff_size, initialize, deallocate);

Refer to chapter 5 for more detail on string constructors.


2007-05-05