ptypes

string


Intro

The string class implements dynamically allocated, reference-counted strings of 8-bit characters. It combines length-counted (L-type) and null-terminated strings: each string carries both a length indicator and a terminating null character. In theory the length of a string is limited to INT_MAX; in practice it is limited by the amount of virtual memory available to the application.

A string object itself contains only a reference to the first character of the string buffer. It converts implicitly to a null-terminated string, whether it is assigned to a variable or passed as an actual parameter, so you can mix PTypes strings and traditional C strings in your application. The conversion to char* or const char* never yields a NULL pointer: the result is guaranteed to point to valid data, even if the string is zero-sized.

The reference counting mechanism (also known as copy-on-write) works transparently and is safe with regard to multithreading. You can manipulate string objects as if each object had its own copy of the string data. Whenever you modify a string object, the library safely detaches the buffer from all other string objects that may be sharing it and creates a unique copy, so that the change does not affect the other "holders" of this string.
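The copy-on-write idea described above can be sketched in standard C++. This is only an illustration under stated assumptions, not the PTypes implementation: the class cow_string, its members and the function cow_demo() are hypothetical names, the shared buffer is modelled with std::shared_ptr, and the sketch is single-threaded.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical illustration class; NOT the PTypes string implementation.
// Single-threaded sketch: the use_count() check below is not synchronized.
class cow_string {
    std::shared_ptr<std::string> buf_;   // shared, reference-counted buffer

    // Make sure this object is the only holder before writing.
    void detach() {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
        assert(buf_.use_count() == 1);
    }

public:
    cow_string(const char* s = "")
        : buf_(std::make_shared<std::string>(s ? s : "")) {}

    // The implicit copy constructor only copies the shared_ptr, so copying
    // increments the reference count and does not copy the character data.

    const char* c_str() const { return buf_->c_str(); }   // never null
    long holders() const { return buf_.use_count(); }

    // Any modification detaches first, so other holders are unaffected.
    void append(char c) {
        detach();
        buf_->push_back(c);
    }
};

// Self-check: the buffer is shared before a write and private after it.
inline bool cow_demo() {
    cow_string a("abc");
    cow_string b = a;                      // shares the buffer with a
    bool shared = (a.c_str() == b.c_str()) && a.holders() == 2;
    b.append('d');                         // b detaches and gets its own copy
    return shared
        && std::string(a.c_str()) == "abc"
        && std::string(b.c_str()) == "abcd"
        && a.holders() == 1;
}
```

Copying stays cheap because only the reference is duplicated; the cost of a real copy is paid once, at the first modification of a shared buffer.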

NOTE on multithreading: the dynamic string objects themselves are NOT thread-safe. In other words, each thread should manipulate only the string objects (variables) within its own scope. However, it is safe to pass strings as (copy) parameters when, for example, sending a message to a concurrent thread through a message queue. Whenever the recipient thread tries to modify the string, the shared data buffer is safely detached.

The string class is declared in <ptypes.h>.

Constructor/destructors

Operators

A string object can be constructed in 5 different ways:

  • default constructor string() creates an empty string.
  • copy constructor string(const string& s) creates a copy of the given string s. Actually this constructor only increments the reference count by 1 and no memory allocation takes place.
  • string(char c) constructs a new string consisting of one character c.
  • string(const char* s) constructs a new string object from a null-terminated string. If s is either NULL or is a pointer to a null character, an empty string object is created. This constructor can be used to assign a string literal to a string object (see examples below).
  • string(const char* s, int len) copies len bytes of data from buffer s to the newly allocated string buffer.

Destructor ~string() decrements the reference count of the string buffer and frees the buffer from dynamic memory if no other string object refers to it.

Examples:

string s1;             // empty string
string s2 = s1;        // copy
string s3 = 'A';       // single character
string s4 = "ABCabc";  // string literal
const char* p = "ABCabc";
string s5 = p;         // null-terminated string
string s6(p, 3);       // buffer/length

Typecasts

A string object can be assigned to a variable or passed as an actual parameter of type const char* implicitly (by default). Such assignments should be used carefully, since the library does not keep track of whether a char pointer refers to the given string buffer. To make sure that a char pointer refers to a valid string buffer, always make the scope of a char pointer variable smaller than or equal to the scope of a string object. In most cases passing a string object to a system or API call is safe (see examples below). This typecast operator does not perform any actions and simply returns a pointer to the string buffer.

The value of the char pointer is guaranteed to be non-NULL. Even if the string is empty, the char pointer refers to a null character.

A string buffer cannot be modified through a constant char pointer. If you want to modify the string buffer through a char pointer, use the unique(string&) function instead. This function always returns a reference to a unique string buffer (i.e. one whose reference count is 1).

Compatibility note: MSVC and GCC may treat type casts in different ways when passing a string object to a function that takes (...) parameters, e.g. printf() or outstm::putf(). You should explicitly instruct the compiler to cast the string object to (const char*) to avoid this problem. PTypes provides a shorter typedef pconst for this.

Examples:

void assignment_example()
{
   string s = "abcdef";
   const char* p = s;
   // do string manipulation here...
}

void function_call_example()
{
   string s = "abcdef";
   puts(s);
   printf("%s\n", pconst(s));
}
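The compatibility note about (...) parameters can be illustrated without PTypes. In the sketch below, text is a hypothetical stand-in for any class with an implicit const char* conversion, and format_name() is a hypothetical helper; neither is part of the library. A parameter with a declared type triggers the conversion automatically, while a variadic argument has no declared type, so the cast has to be written out.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a class with an implicit const char* conversion.
class text {
    std::string data_;
public:
    text(const char* s = "") : data_(s ? s : "") {}
    operator const char*() const { return data_.c_str(); }  // never null
};

// Writes "<name>:<length>" into buf and returns the number of characters
// that the full output needs (the snprintf convention).
inline int format_name(char* buf, std::size_t size, const text& name) {
    assert(buf != nullptr && size > 0);
    // strlen() declares a const char* parameter, so the conversion is implicit.
    unsigned long len = static_cast<unsigned long>(std::strlen(name));
    // Variadic arguments have no declared type, so the cast is spelled out.
    return std::snprintf(buf, size, "%s:%lu",
                         static_cast<const char*>(name), len);
}
```

In PTypes code the explicit cast would be written with pconst(s), as in the example above.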

Manipulation

Conversion

PTypes provides 3 different string-to-int conversion functions: stringtoi(), stringtoie() and stringtoue(). The first function stringtoi() is for non-negative decimal numbers; it returns -1 on error. The other two functions with a suffix 'e' in their names ('e' is for 'exception') may throw exceptions, but they accept the full range of 64-bit values.

These functions replace the C runtime library functions atoll() and strtoll(), which are not implemented on all systems.

Both function families, string-to-int and int-to-string, accept numeration bases in the range 2 to 64. The 64-digit numeration uses all digits, letters and also '.' and '/'. It may be useful for representing, for example, MD5 checksums in a compact printable form (see function outmd5::get_digest() in src/pmd5.cxx).

string itostring(value) converts the given ordinal value to a string. Various overloaded versions of this function accept ordinal values of different sizes and signedness.

string itostring(value, int base, int width = 0, char pad = ' ') converts an integer value to a string using the numeration base specified by base, which can be in the range 2 - 64. To right-justify the resulting string, specify the width and pad parameters. If the numeration base is greater than 36, itostring() uses '.' and '/' characters and lowercase letters in addition to digits and uppercase letters. For numeration bases other than 10 the parameter value is always treated as unsigned.
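The following sketch shows one way base conversion with right-justification can work. It is not the PTypes code: to_base() is a hypothetical helper, it treats the value as unsigned, and it covers only bases 2 to 36 because the digit ordering for larger bases is not specified here. The use of uppercase letters for digits above 9 is likewise an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, NOT the PTypes itostring(): converts an unsigned value
// to text in bases 2..36 and right-justifies it to `width` using `pad`.
inline std::string to_base(unsigned long long value, int base,
                           int width = 0, char pad = ' ') {
    assert(base >= 2 && base <= 36);
    // Assumed digit alphabet: digits followed by uppercase letters.
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    std::string out;
    do {                                   // least significant digit first
        out.push_back(digits[value % base]);
        value /= base;
    } while (value != 0);
    std::reverse(out.begin(), out.end());

    // Right-justify: pad on the left when the result is shorter than width.
    if (width > 0 && out.size() < static_cast<std::size_t>(width))
        out = std::string(static_cast<std::size_t>(width) - out.size(), pad) + out;
    return out;
}
```

With this sketch, to_base(5, 2, 8, '0') yields "00000101" and to_base(255, 16) yields "FF". A width smaller than the number of digits leaves the result unchanged rather than truncating it.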

large stringtoi(const string& s) converts a string to a 64-bit integer. This function accepts only non-negative decimal numbers. It returns -1 if the string does not represent a valid non-negative number or if an overflow occurred.
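The "return -1 on error" contract can be sketched as follows. parse_nonneg() is a hypothetical helper written for illustration, not the library function; treating an empty string as an error is an assumption of the sketch. Because valid results are never negative, -1 is free to serve as the error value.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical helper mirroring the described contract, NOT PTypes code:
// returns the parsed value, or -1 for empty input, non-digits, or overflow.
inline long long parse_nonneg(const std::string& s) {
    if (s.empty())
        return -1;                         // assumption: empty input is an error
    const long long max = std::numeric_limits<long long>::max();
    long long result = 0;
    for (char c : s) {
        if (c < '0' || c > '9')
            return -1;                     // signs and spaces are rejected too
        int digit = c - '0';
        if (result > (max - digit) / 10)
            return -1;                     // result * 10 + digit would overflow
        result = result * 10 + digit;
    }
    assert(result >= 0);
    return result;
}
```

The overflow test is done before the multiplication, so the sketch never relies on signed overflow, which is undefined behavior in C++.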

large stringtoie(const string& s) converts a string to a 64-bit integer. This function accepts signed decimal large numbers. Unlike stringtoi(), this function may throw an exception of type (econv*) if the string does not represent a valid number or an overflow occurred.

unsigned large stringtoue(const string& s, int base) converts a string to an unsigned 64-bit integer using the numeration base specified by base. This function accepts unsigned large numbers. It may throw an exception of type (econv*) if the string does not represent a valid number or an overflow occurred. Base can be in the range 2 - 64. For numeration bases from 2 to 36 this function uses digits and letters, and the letter case is insignificant. For numeration bases greater than 36, '.', '/' and lowercase letters are used additionally.
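An exception-based parser with a configurable base might look like the sketch below. parse_unsigned() is a hypothetical helper, not the library function: it handles only bases 2 to 36, and it throws standard exceptions by value, whereas the functions described above throw (econv*). The letter ranges assume an ASCII-compatible character set.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper, NOT the PTypes stringtoue(): parses an unsigned value
// in bases 2..36, case-insensitively, and throws on bad input or overflow.
inline unsigned long long parse_unsigned(const std::string& s, int base) {
    assert(base >= 2 && base <= 36);
    if (s.empty())
        throw std::invalid_argument("empty string");

    const unsigned long long max =
        std::numeric_limits<unsigned long long>::max();
    unsigned long long result = 0;
    for (char c : s) {
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'z') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'Z') digit = c - 'A' + 10;
        else throw std::invalid_argument("invalid character");
        if (digit >= base)
            throw std::invalid_argument("digit out of range for base");
        if (result > (max - digit) / base)
            throw std::out_of_range("overflow");
        result = result * base + digit;
    }
    return result;
}
```

Throwing lets the function return the full unsigned range, since no value has to be reserved as an error code.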

string lowercase(const string& s) converts all characters of the given string s to lower case. The current version of the library "understands" only lower ASCII characters; all other characters remain unchanged. This function efficiently detects whether all characters in the string are already in lower case, to avoid unnecessary string allocations.
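The "no allocation when already lower case" optimization can be sketched with a shared buffer. The alias shared_text and the helper lowercase_ascii() are hypothetical names, and std::shared_ptr stands in for the library's reference-counted buffer; this is an illustration of the idea, not the PTypes code.

```cpp
#include <cassert>
#include <memory>
#include <string>

using shared_text = std::shared_ptr<const std::string>;  // hypothetical alias

// Hypothetical helper: lowercases 'A'..'Z' only and leaves other bytes alone.
// If nothing needs changing, the original shared buffer is returned as-is.
inline shared_text lowercase_ascii(const shared_text& s) {
    assert(s != nullptr);
    // Scan for the first uppercase ASCII letter.
    std::size_t i = 0;
    while (i < s->size() && !((*s)[i] >= 'A' && (*s)[i] <= 'Z'))
        ++i;
    if (i == s->size())
        return s;                          // already lower case: no allocation

    // Copy once, then convert from the first uppercase letter onward.
    auto copy = std::make_shared<std::string>(*s);
    for (; i < copy->size(); ++i) {
        char& c = (*copy)[i];
        if (c >= 'A' && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
    }
    return copy;
}
```

When the input is already lower case, the caller receives another reference to the same buffer, which is the same cheap operation as copying a reference-counted string.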

Created: 10 years 9 months ago
by Natalie Adams

Updated: 10 years 9 months ago
by Natalie Adams
