ptypes

string


Intro

The string class implements dynamically allocated, reference-counted strings of 8-bit characters. It is a hybrid of L-type (length-prefixed) and null-terminated strings: each string carries both a length indicator and a terminating null character. The length of a string is theoretically limited to INT_MAX; in practice it is limited by the amount of virtual memory available to the application.

A string object itself contains only a reference to the first character of the string buffer. A string object converts implicitly to a null-terminated string, whether it is assigned to a variable or passed as an actual parameter, so PTypes strings and traditional C strings can be mixed in the same application. A string object converted to char* or const char* never yields a NULL pointer: the result is guaranteed to point to some data, even if the string is zero-sized.

Reference counting (a technique also known as copy-on-write) works transparently and is safe with regard to multithreading. You can manipulate string objects as if each object had its own copy of the string data. Whenever you modify a string object, the library safely detaches the buffer from all other string objects that may be sharing it and creates a unique copy, so that the change does not affect the other "holders" of this string.
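The detach-on-write behaviour described above can be illustrated with a minimal sketch in standard C++. This is not the PTypes implementation: the class cow_string and its members are hypothetical names, std::shared_ptr stands in for the library's own reference counter, and the sketch is meant for single-threaded use only.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical illustration class; not part of PTypes.
class cow_string {
    std::shared_ptr<std::string> buf_;   // shared, reference-counted buffer
public:
    explicit cow_string(const char* s)
        : buf_(std::make_shared<std::string>(s)) { assert(s != nullptr); }

    // Copying shares the buffer: only the reference count changes.
    cow_string(const cow_string&) = default;
    cow_string& operator=(const cow_string&) = default;

    // Detach before writing: if other holders exist, take a private copy.
    void make_unique() {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
    }

    // Every modifying operation detaches first.
    void append(char c) { make_unique(); buf_->push_back(c); }

    const char* c_str() const { return buf_->c_str(); }
    long holders() const { return buf_.use_count(); }
    bool shares_buffer_with(const cow_string& other) const {
        return buf_ == other.buf_;
    }
};
```

Copying a cow_string leaves both objects pointing at one buffer; the first call to append() on either of them gives that object a private copy, and the other object keeps the original data.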

NOTE on multithreading: the dynamic string objects themselves are NOT thread-safe. In other words, an object (variable) of type string should be manipulated only by the thread in whose scope it lives. However, it is safe to pass strings as (copy) parameters when, for example, sending a message to a concurrent thread through a message queue. Whenever the recipient thread tries to modify the string, the shared data buffer is safely detached.

The string class is declared in <ptypes.h>.

Constructors/destructor

#include <ptypes.h>
 
class string {
    string();
    string(const string&);
    string(char);
    string(const char*);
    string(const char*, int);
    ~string();
};

Operators

#include <ptypes.h>
 
class string {
    // assignment
    string& operator =(const char*);
    string& operator =(char);
    string& operator =(const string&);
    friend void assign(string&, const char* buf, int len);
 
    // concatenation
    string& operator +=(const char*);
    string& operator +=(char);
    string& operator +=(const string&);
    string  operator +(const char*) const;
    string  operator +(char) const;
    string  operator +(const string&) const;
    friend  string operator +(const char*, const string&);
    friend  string operator +(char c, const string&);
 
    // comparison
    bool operator ==(const char*) const;
    bool operator ==(char) const;
    bool operator ==(const string&) const;
    bool operator !=(const char*) const;
    bool operator !=(char c) const;
    bool operator !=(const string&) const;
 
    // indexed character access, 0-based
    char& operator[] (int index);
    const char& operator[] (int index) const;
};

A string object can be constructed in 5 different ways:

  • default constructor string() creates an empty string.
  • copy constructor string(const string& s) creates a copy of the given string s. Actually this constructor only increments the reference count by 1 and no memory allocation takes place.
  • string(char c) constructs a new string consisting of one character c.
  • string(const char* s) constructs a new string object from a null-terminated string. If s is either NULL or is a pointer to a null character, an empty string object is created. This constructor can be used to assign a string literal to a string object (see examples below).
  • string(const char* s, int len) copies len bytes of data from buffer s to the newly allocated string buffer.

Destructor ~string() decrements the reference count of the string buffer and frees the buffer when no other string object refers to it.

Examples:

string s1;             // empty string
string s2 = s1;        // copy
string s3 = 'A';       // single character
string s4 = "ABCabc";  // string literal
const char* p = "ABCabc";
string s5 = p;         // null-terminated string
string s6(p, 3);       // buffer/length

Typecasts

#include <ptypes.h>
 
class string {
    operator const char*() const;
};

A string object can be assigned to a variable or passed as an actual parameter of type const char* implicitly (by default). Such assignments should be used carefully, since the library does not keep track of whether a char pointer refers to the given string buffer. To make sure that a char pointer refers to a valid string buffer, always make the scope of a char pointer variable smaller than or equal to the scope of a string object. In most cases passing a string object to a system or API call is safe (see examples below). This typecast operator does not perform any actions and simply returns a pointer to the string buffer.

The value of the char pointer is guaranteed to be non-NULL. Even if the string is empty, the char pointer will refer to a null-symbol.

A string buffer cannot be modified through a constant char pointer. If you want to modify the string buffer through a char pointer, use the unique(string&) function instead. This function always returns a pointer to a unique string buffer (i.e. one whose reference count is 1).

Compatibility note: MSVC and GCC may treat type casts in different ways when passing a string object to a function that takes (...) parameters, e.g. printf() or outstm::putf(). You should explicitly instruct the compiler to cast the string object to (const char*) to avoid this problem. PTypes provides a shorter typedef pconst for this.

Examples:

void assignment_example()
{
   string s = "abcdef";
   const char* p = s;
   // do string manipulation here...
}
 
void function_call_example()
{
   string s = "abcdef";
   puts(s);
   printf("%s\n", pconst(s));
}
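The compatibility note above concerns functions that take (...) parameters. The following sketch in standard C++ shows the explicit cast in isolation. The class text and the function describe() are hypothetical stand-ins, not PTypes names; the point is only that arguments matched by "..." do not go through user-defined conversions, so the cast has to be written out.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a string class with an implicit
// conversion to const char*; not the PTypes class itself.
class text {
    std::string data_;
public:
    text(const char* s) : data_(s) {}
    operator const char*() const { return data_.c_str(); }
};

// Formats "name=<value>" through the variadic snprintf().
std::string describe(const text& value) {
    char buf[64];
    // The cast is explicit: the compiler cannot know that "%s" expects
    // a const char*, so it will not apply the conversion operator itself.
    int n = std::snprintf(buf, sizeof buf, "name=%s",
                          static_cast<const char*>(value));
    assert(n >= 0);   // snprintf reports encoding errors with a negative value
    (void)n;
    return buf;       // output longer than the buffer is truncated
}
```

Non-variadic calls such as puts() need no cast, because the parameter type tells the compiler which conversion to apply.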

Manipulation

#include <ptypes.h>
 
// get/set length and misc.
int    length(const string& s);
char*  setlength(string&, int);
char*  unique(string&);
void   clear(string& s);
bool   isempty(const string& s);
 
// concatenate
void   concat(string& s, const char* sc, int catlen);
 
// copy (get substring by position and length)
string copy(const string& s, int from [, int cnt ] );
 
// insert string or character
void   ins(const char* s1, string& s, int at);
void   ins(char s1, string& s, int at);
void   ins(const string& s1, string& s, int at);
void   ins(const char* s1, int s1len, string& s, int at);
 
// delete substring
void   del(string& s, int at [, int cnt ] );
 
// find substring or character
int    pos(const char* s1, const string& s);
int    pos(char s1, const string& s);
int    pos(const string& s1, const string& s);
int    rpos(char s1, const string& s);
 
// compare substring
bool   contains(const char* s1, const string& s, int at);
bool   contains(char s1, const string& s, int at);
bool   contains(const string& s1, const string& s, int at);
bool   contains(const char* s1, int s1len, const string& s, int at);

int length(const string& s) returns the actual length of the string, not counting the terminating null character.

char* setlength(string&, int) changes the actual length of the string. The content of the original string is preserved; the content of any extra characters added during reallocation is undefined. This function returns a pointer to a unique buffer (i.e. one whose reference count is 1), like the function unique() below. Even if the length of the string does not change, setlength() guarantees that the string is made unique.

char* unique(string&) makes the string buffer unique: if necessary, a new buffer is allocated and the data is copied, so that the reference count after the call is guaranteed to be 1. All string manipulation functions call unique() whenever they modify the string buffer. You may need to call this function explicitly to obtain a character pointer to the buffer; in all other cases the reference counting mechanism works transparently.

bool isempty(const string& s) returns true if the given string is empty. Using this function is preferable to comparing the string with the empty string literal "".

void clear(string& s) makes the given string empty. Using this function is preferable to assigning the empty string literal "".

void concat(string& s, const char* sc, int catlen) appends the buffer sc of length catlen to the string object s. To concatenate characters, null-terminated strings and string objects, use the operators + and += instead.

string copy(const string& s, int from [, int cnt ] ) returns a substring of s starting at position from and containing cnt characters. If cnt is omitted, the rest of the string s starting at position from is returned.

void ins(..., string& s, int at) inserts a character, a null-terminated string, a string object or a buffer of specified length into the string object s at position at. If the position is out of bounds, ins() does nothing.

void del(string& s, int at [, int cnt ] ) deletes cnt characters of the string s starting at position at. If cnt is omitted, the rest of the string starting at position at is deleted.

int pos(..., const string& s) returns the position of the first occurrence of a character, a null-terminated string or a string object (first parameter) in the source string s, or -1 if it is not found. Function rpos() searches in reverse, starting from the end of the string.

bool contains(..., const string& s, int at) returns true if the given character, null-terminated string, string object or buffer (first parameter) equals the substring of s at position at.
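To make the argument order and 0-based positions concrete, here is a sketch of a few of these operations expressed with std::string. The namespace sketch and everything in it are hypothetical and are not the PTypes functions. How the real library treats out-of-range arguments to copy() and del() is not stated above, so the handling chosen here is an assumption.

```cpp
#include <cassert>
#include <string>

namespace sketch {   // hypothetical namespace; these are not PTypes functions

// copy(s, from, cnt): substring of cnt characters starting at position from.
std::string copy(const std::string& s, int from, int cnt) {
    assert(from >= 0 && from <= static_cast<int>(s.size()) && cnt >= 0);
    return s.substr(from, cnt);
}

// ins(s1, s, at): insert s1 into s at position at; do nothing if out of bounds.
void ins(const std::string& s1, std::string& s, int at) {
    if (at < 0 || at > static_cast<int>(s.size())) return;
    s.insert(at, s1);
}

// del(s, at, cnt): delete cnt characters starting at position at.
// Ignoring out-of-range arguments is an assumption of this sketch.
void del(std::string& s, int at, int cnt) {
    if (at < 0 || at >= static_cast<int>(s.size()) || cnt <= 0) return;
    s.erase(at, cnt);
}

// pos(s1, s): position of the first occurrence of s1 in s, or -1.
int pos(const std::string& s1, const std::string& s) {
    std::string::size_type p = s.find(s1);
    return p == std::string::npos ? -1 : static_cast<int>(p);
}

// contains(s1, s, at): true if s1 matches s exactly at position at.
bool contains(const std::string& s1, const std::string& s, int at) {
    if (at < 0 || at > static_cast<int>(s.size())) return false;
    return s.compare(at, s1.size(), s1) == 0;
}

} // namespace sketch
```

Note that the searched-for or inserted value comes first and the subject string second, mirroring the parameter order in the declarations above.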

Conversion

#include <ptypes.h>
 
string itostring(<ordinal> value);
string itostring(<ordinal> value, int base, int width = 0, char pad = ' ');
large  stringtoi(const string& s);
large  stringtoie(const string& s);
ularge stringtoue(const string& s, int base);
string lowercase(const string& s);

PTypes provides 3 different string-to-int conversion functions: stringtoi(), stringtoie() and stringtoue(). The first function stringtoi() is for non-negative decimal numbers; it returns -1 on error. The other two functions with a suffix 'e' in their names ('e' is for 'exception') may throw exceptions, but they accept the full range of 64-bit values.

These functions replace the CRTL functions atoll() and strtoll() which are not implemented on all systems.

Both function families, string-to-int and int-to-string, accept numeration bases in the range 2 to 64. The 64-digit numeration uses all digits, letters and also '.' and '/'. It may be useful for representing, for example, MD5 checksums in a compact printable form (see function outmd5::get_digest() in src/pmd5.cxx).

string itostring(<ordinal> value) converts the given ordinal value to a string. Various overloaded versions of this function accept ordinal values of different sizes and signedness.

string itostring(<ordinal> value, int base, int width = 0, char pad = ' ') converts an integer value to a string using the numeration base specified by base, which can be in the range 2 to 64. To right-justify the resulting string, specify the width and pad parameters. If the numeration base is greater than 36, itostring() uses the '.' and '/' characters and lowercase letters in addition to digits and uppercase letters. For numeration bases other than 10 the parameter value is always treated as unsigned.
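The base, width and pad parameters can be illustrated with a small sketch in standard C++. The function to_base() is a hypothetical name, not the PTypes routine. In particular, the order of the 64 digit characters below is an assumption made for illustration; the text above only says which characters are used, not how they are ordered.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper; not the PTypes itostring().
// Converts an unsigned value to text in the given base (2..64),
// right-justified to at least `width` characters using `pad`.
std::string to_base(unsigned long long value, int base,
                    int width = 0, char pad = ' ') {
    assert(base >= 2 && base <= 64);
    // ASSUMED digit order: '.' and '/' first, then 0-9, A-Z, a-z.
    static const char digits[] =
        "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    // Bases up to 36 skip '.' and '/' and never reach the lowercase letters.
    const char* alphabet = base > 36 ? digits : digits + 2;

    std::string out;
    do {                                   // emits "0" for value == 0
        out.insert(out.begin(), alphabet[value % base]);
        value /= base;
    } while (value != 0);

    // Left-pad to the requested width; longer results are not truncated.
    if (static_cast<int>(out.size()) < width)
        out = std::string(width - out.size(), pad) + out;
    return out;
}
```

For example, to_base(255, 16) yields "FF" and to_base(5, 2, 8, '0') yields "00000101".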

large stringtoi(const string& s) converts a string to a 64-bit integer. This function accepts only positive decimal large numbers and 0. It returns -1 if the string does not represent a valid positive number or an overflow occurred.
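The "-1 on error" contract can be sketched in standard C++ as follows. The function parse_nonnegative() is a hypothetical name and not the library routine. Treating an empty string as invalid is an assumption of this sketch.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical helper mirroring the documented contract of stringtoi():
// accepts only decimal digits, returns -1 on invalid input or overflow.
long long parse_nonnegative(const std::string& s) {
    if (s.empty()) return -1;              // assumption: empty input is invalid
    const long long max = std::numeric_limits<long long>::max();
    long long result = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return -1; // signs, spaces, letters rejected
        int digit = c - '0';
        // Check before multiplying so the arithmetic itself never overflows.
        if (result > (max - digit) / 10) return -1;
        result = result * 10 + digit;
    }
    return result;
}
```

Because -1 is reserved as the error value, a function with this contract cannot accept negative input; that is the gap the exception-throwing variants fill.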

large stringtoie(const string& s) converts a string to a 64-bit integer. This function accepts signed decimal large numbers. Unlike stringtoi(), this function may throw an exception of type (econv*) if the string does not represent a valid number or an overflow occurred.

ularge stringtoue(const string& s, int base) converts a string to an unsigned 64-bit integer using the numeration base specified by base. This function accepts unsigned large numbers. It may throw an exception of type (econv*) if the string does not represent a valid number or an overflow occurred. Base can be in the range 2 to 64. For numeration bases from 2 to 36 this function uses digits and letters, and the letter case is insignificant. For numeration bases greater than 36, '.', '/' and lowercase letters are used additionally.

string lowercase(const string& s) converts all characters of the given string s to lower case. The current version of the library "understands" only lower ASCII characters; all other characters remain unchanged. The function detects when all characters in the string are already lower case and in that case avoids an unnecessary string allocation.
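The allocation-avoiding behaviour can be sketched in standard C++ by scanning first and copying only when a change is needed. The function lowercase_shared() is a hypothetical name, and std::shared_ptr again stands in for the library's reference-counted buffer; this is not the PTypes implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper; not the PTypes lowercase().
// Returns the same shared buffer if there is nothing to convert,
// otherwise a newly allocated lower-case copy.
std::shared_ptr<const std::string>
lowercase_shared(const std::shared_ptr<const std::string>& s) {
    assert(s != nullptr);
    // Scan for the first upper-case ASCII letter.
    std::size_t i = 0;
    while (i < s->size() && !((*s)[i] >= 'A' && (*s)[i] <= 'Z'))
        ++i;
    if (i == s->size())
        return s;                          // already lower case: no allocation

    // Copy once, then convert from the first upper-case letter onwards.
    auto out = std::make_shared<std::string>(*s);
    for (; i < out->size(); ++i) {
        char& c = (*out)[i];
        if (c >= 'A' && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
    }
    return out;                            // non-ASCII bytes are left unchanged
}
```

When the input is already lower case the caller gets back the very same buffer, so the cost is one scan and a reference-count increment.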

Created: 11 years 1 month ago
by Natalie Adams

Updated: 11 years 1 month ago
by Natalie Adams
