UMString Class Reference
[LeJa]

String class for handling multibyte and Unicode strings.

#include <UMString.h>

Public Member Functions

UMString ()

Default Constructor for the UMString-Class.

UMString (const UMString &tmp_umstring)

Constructor for the UMString-Class which takes a UMString object as init-value.

UMString (const char *tmp_mb_string)

UMString Class Constructor for a Multibyte string passed at creation of a UMString object.

UMString (const wstring tmp_uc_string)

UMString Class Constructor for Unicode string (wstring) passed at creation.

UMString (const wchar_t *tmp_uc_string)

UMString Class Constructor for Unicode string passed at creation.

UMString (const string tmp_stlstring)

UMString Class Constructor for STL-String passed at creation.

~UMString ()

Destructor for UMString Class.

UINT GetLengthUC () const

Reads out the number of characters in the stored Unicode string.

UINT GetLengthMB () const

Reads out the number of characters in the stored Multibyte string.

wchar_t * PushString (const wchar_t *tmp_uc_string)

Loads a Unicode string into the UMString object.

char * PushString (const char *tmp_mb_string)

Loads a Multibyte string into the UMString object.

wchar_t * GetStringUC () const

Returns the Unicode string to the calling function.

char * GetStringMB () const

Returns the Multibyte string to the calling function.

void Clear ()

Clears all stored string data in the object.

bool ReplaceSubstring (const UINT First, const UINT Last, const UMString NewString)

Replaces the substring beginning at character "First" up to character "Last".

bool SI_Set (const UINT x)

Sets the "sliding index" by passing an absolute position.

UINT SI_Get () const

Gets the "sliding index"-position (0-based).

UINT SI_Forward ()

Sets the sliding index to next character.

UINT SI_Backward ()

Sets the sliding index to previous character.

bool SI_NextHira ()

Sets the "sliding index" to the next hiragana-character in the string after the current "sliding index"-position.

bool SI_NextKata ()

Sets the "sliding index" to the next katakana-character in the string after the current "sliding index"-position.

bool SI_NextKanji ()

Sets the "sliding index" to the next kanji-character in the string after the current "sliding index"-position.

bool SI_NextDifferent ()

Sets the "sliding index" to the next character, which is of different CHARTYPE compared to the current.

bool SI_PrevHira ()

Sets the "sliding index" to the previous hiragana-character in the string before the current "sliding index"-position.

bool SI_PrevKata ()

Sets the "sliding index" to the previous katakana-character in the string before the current "sliding index"-position.

bool SI_PrevKanji ()

Sets the "sliding index" to the previous kanji-character in the string before the current "sliding index"-position.

bool SI_PrevDifferent ()

Sets the "sliding index" to the nearest preceding character whose CHARTYPE differs from that of the current character.

bool SI_TokenStart ()

Sets the sliding index pointer to the beginning of the current token.

bool SI_TokenEnd ()

Sets the sliding index pointer to the end of the current token.

const UMString SI_GetToken ()

Gets the next token consisting of characters like the one the sliding index points to.

const UMString SI_GetTokenPosAfter ()

Gets the next token and positions the sliding index after it.

const UMString SI_GetRemaining () const

Gets the string remaining after the "sliding index" position.

const UMString SI_GetSizedString (const UINT first, const UINT last) const

Gets a slice of the string.

const wchar_t SI_GetChar () const

Returns the Unicode-Character at the sliding index position.

const wchar_t SI_GetCharNext () const

Returns the Unicode-Character after the sliding index position.

void SI_SetChar (const wchar_t tmp_uc_char)

Replaces the Unicode-Character at the sliding index position.

int SI_GetPosChar (wchar_t &ucchar, UINT offset=0)

Searches for a character in the string, starting at an offset position.

bool SI_SetPosChar (wchar_t &ucchar)

Sets the sliding index to the next occurrence of ucchar.

bool SI_SplitUMString (UMString &, UMString &, const UINT)

Splits the UMString into two at position x.

const UMString & operator= (const UMString &right)

Assignment operator for the UMString class.

bool operator== (const UMString &right)

Comparison operator for UMString class.

const UMString operator+ (const UMString &right)

Concatenation operator for UMString class.

operator const char * () const

Cast to const char *.

operator const wchar_t * () const

Cast to const wchar_t *.

operator const string () const

Cast to const string.

operator char * ()

Cast to char *.

operator wchar_t * ()

Cast to wchar_t *.

operator string ()

Cast to string.

Private Member Functions

void MB2UC ()

Synchronizes the Unicode string with the Multibyte string (Multibyte is the source).

void UC2MB ()

Synchronizes the Multibyte string with the Unicode string (Unicode is the source).

Private Attributes

wchar_t * uc_string

char * mb_string

UINT si

Detailed Description

String class for handling multibyte and unicode strings.

Version: 1.0

Date: August 2002-2003

Author: Torben Pastuch <superbaer@t-online.de>, (slightly) modified by Iris Vogel <iris@urz.uni-heidelberg.de>

Version: 0.8/15

Warning: some of the functions are Windows-specific

See also: UMString.h

CPP-File for the UMString library. This library provides a new class "UMString". The UMString type can be used to store Unicode and Multibyte strings. All Multibyte strings loaded into a UMString are automatically available as Unicode strings and vice versa. This class also provides various functions for Unicode and Multibyte string manipulation, such as easy concatenation, searching, regular expressions, ...

Constructor & Destructor Documentation

UMString::UMString ( )

Default Constructor for the UMString-Class.
Unicode and Multibyte char arrays are initialized with an array size of 1 char/wchar_t and a '\0' in element [0].

UMString::UMString ( const UMString & tmp_umstring )

Constructor for the UMString-Class which takes a UMString object as init-value.

UMString::UMString ( const char * tmp_mb_string )

UMString Class Constructor for a Multibyte string passed at creation of a UMString object.

Parameters:

tmp_mb_string -> Initializing Multibyte string

UMString::UMString ( const wstring tmp_uc_string )

UMString Class Constructor for Unicode string (wstring) passed at creation.

Parameters:

tmp_uc_string -> Initializing String

UMString::UMString ( const wchar_t * tmp_uc_string )

UMString Class Constructor for Unicode string passed at creation.

Parameters:

tmp_uc_string -> Initializing String

UMString::UMString ( const string tmp_stlstring )

UMString Class Constructor for STL-String passed at creation.

Parameters:

tmp_stlstring -> Initializing String

UMString::~UMString ( )

Destructor for UMString Class.

Member Function Documentation

void UMString::Clear ( )

Clears all stored string data in the object.

UINT UMString::GetLengthMB ( ) const

Reads out the number of characters in the stored Multibyte string.

Returns:
(UINT) number of characters in the Multibyte string

UINT UMString::GetLengthUC ( ) const

Reads out the number of characters in the stored Unicode string.

Returns:
(UINT) number of characters in the Unicode string

char * UMString::GetStringMB ( ) const

Returns the Multibyte string to the calling function.

Returns:
(char*) the Multibyte string stored in the object

wchar_t * UMString::GetStringUC ( ) const

Returns the Unicode string to the calling function.

Returns:
(wchar_t*) the Unicode string stored in the object

void UMString::MB2UC ( ) [private]

Synchronizes the MultiByte and the Unicode string.
Synchronizes the MultiByte and the Unicode string in the UMString-object by taking the MultiByte string as the original pattern. This routine is only used internally and is private to this class

Warning:
This function still uses Windows-specific functions. These should be replaced once everything else works.
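
As a possible starting point for that replacement, the sketch below converts a multibyte string to a wide string with the standard C library function std::mbstowcs instead of a Windows API call. The helper name MultibyteToWide is hypothetical and not part of UMString; the conversion follows the current C locale, so a program that handles non-ASCII text would first have to select a suitable locale with std::setlocale.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not part of UMString: multibyte -> wide conversion
// using only the standard library.
std::wstring MultibyteToWide(const char *mb) {
    if (mb == nullptr) return std::wstring();
    // A multibyte string of n bytes decodes to at most n wide characters,
    // so n + 1 elements always leave room for the terminator.
    const std::size_t n = std::strlen(mb);
    std::vector<wchar_t> buf(n + 1, L'\0');
    const std::size_t written = std::mbstowcs(buf.data(), mb, n + 1);
    // (size_t)-1 signals an invalid multibyte sequence for the current locale.
    if (written == static_cast<std::size_t>(-1)) return std::wstring();
    return std::wstring(buf.data(), written);
}
```

Returning an empty string on an invalid sequence is a choice made for this sketch; the real MB2UC may report errors differently.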

UMString::operator char * ( ) [inline]

Cast to char *.

UMString::operator const char * ( ) const [inline]

Cast to const char *.

UMString::operator const string ( ) const [inline]

Cast to const string.

UMString::operator const wchar_t * ( ) const [inline]

Cast to const wchar_t *.

UMString::operator string ( ) [inline]

Cast to string.

UMString::operator wchar_t * ( ) [inline]

Cast to wchar_t *.

const UMString UMString::operator+ ( const UMString & right )

Concatenation operator for UMString class.

Parameters:

right -> Reference to the right-hand string of the "+"-operation.

Returns:
(UMString) The concatenated UMString

const UMString & UMString::operator= ( const UMString & right )

Assignment operator for the UMString class.

Parameters:

right -> Reference to the right-hand string of the "="-operation.

Returns:
(UMString&) A reference to the left-hand object for further processing.

bool UMString::operator== ( const UMString & right )

Comparison operator for UMString class.

Parameters:

right -> Reference to the right-hand string of the "=="-operation.

Returns:
(bool) true, if the UCStrings are equal, false if the UCStrings are not equal

char * UMString::PushString ( const char * tmp_mb_string )

Loads a Multibyte string into the UMString object.

Parameters:

tmp_mb_string Multibyte string to store

Returns:
(char*) the string that was loaded is also returned

wchar_t * UMString::PushString ( const wchar_t * tmp_uc_string )

Loads a Unicode string into the UMString object.

Parameters:

tmp_uc_string Unicode string to store

Returns:
(wchar_t*) the string that was loaded is also returned

bool UMString::ReplaceSubstring ( const UINT First,

const UINT Last,

const UMString NewString

)

Replaces the substring beginning at character "First" up to character "Last".

Parameters:

First The first character to be replaced

Last End of substring to be replaced

NewString The replacing string

Returns:
true = Operation successful
false = Operation unsuccessful
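
The replacement semantics can be illustrated on a plain std::wstring. This is only a sketch: it assumes that "Last" is inclusive and that an out-of-range request fails without modifying the string, neither of which is confirmed by this reference. The function name ReplaceRange is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for ReplaceSubstring, operating on std::wstring.
// Assumption: the range [first, last] is inclusive at both ends.
bool ReplaceRange(std::wstring &s, std::size_t first, std::size_t last,
                  const std::wstring &replacement) {
    // Assumption: invalid ranges are rejected and leave the string untouched.
    if (first > last || last >= s.size()) return false;
    s.replace(first, last - first + 1, replacement);
    return true;
}
```

With these assumptions, replacing positions 0 to 4 of "hello world" with "HI" yields "HI world".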

UINT UMString::SI_Backward ( )

Sets the sliding index to previous character.

Returns:
(UINT) : new si-position (0-based)

UINT UMString::SI_Forward ( )

Sets the sliding index to next character.

Returns:
(UINT) : new si-position (0-based)

UINT UMString::SI_Get ( ) const

Gets the "sliding index"-position (0-based).

Returns:
(UINT) the index position

const wchar_t UMString::SI_GetChar ( ) const

Returns the Unicode-Character at the sliding index position.

Returns:
(wchar_t) The character at the sliding index position

const wchar_t UMString::SI_GetCharNext ( ) const

Returns the Unicode-Character after the sliding index position.

Returns:
(wchar_t) The character after the sliding index position

int UMString::SI_GetPosChar ( wchar_t & ucchar,

UINT offset = 0

)

Searches for a character in the string, starting at an offset position.

Parameters:

ucchar wchar_t character to look for

offset UINT offset for search

Returns:
int : position of the requested character,
int : -1 if the character was not found

Note:
added by Iris
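
The return convention described above (a position, or -1 when nothing is found) can be sketched with std::wstring::find. The helper name FindCharFrom is hypothetical, and the sketch assumes the returned position is 0-based and counted from the start of the string rather than from the offset.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for SI_GetPosChar, operating on std::wstring.
// Assumption: the result is an absolute 0-based position, or -1 if not found.
int FindCharFrom(const std::wstring &s, wchar_t ch, std::size_t offset = 0) {
    const std::size_t pos = s.find(ch, offset);
    return pos == std::wstring::npos ? -1 : static_cast<int>(pos);
}
```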

const UMString UMString::SI_GetRemaining ( ) const

Gets the string remaining after the "sliding index" position.

Returns:
(UMString) the remaining string in a UMString-object

const UMString UMString::SI_GetSizedString ( const UINT first,

const UINT last

) const

Gets a slice of the string.
Returns a new string object copied out of the original string, beginning at position first (0-based) and extending to position last (also 0-based).

Parameters:

first first character to be copied out

last last character to be copied out

Returns:
(UMString) : The new string object which contains the string indicated by "first" & "last"
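
Because both bounds are described as characters "to be copied out", the slice appears to be inclusive at both ends. The sketch below shows that reading on a std::wstring; the name SliceInclusive is hypothetical, and clamping an oversized "last" to the end of the string is an assumption, not documented behaviour.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for SI_GetSizedString, operating on std::wstring.
// Copies the characters at positions first..last, both inclusive.
std::wstring SliceInclusive(const std::wstring &s, std::size_t first,
                            std::size_t last) {
    if (first > last || first >= s.size()) return std::wstring();
    // Assumption: an out-of-range "last" is clamped to the final character.
    if (last >= s.size()) last = s.size() - 1;
    return s.substr(first, last - first + 1);
}
```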

const UMString UMString::SI_GetToken ( )

Gets the next token consisting of characters like the one the sliding index points to.

Returns:
(UMString) the token in a UMString-object

const UMString UMString::SI_GetTokenPosAfter ( )

Gets the next token.
Gets the next token consisting of characters like the one the sliding index points to. Afterwards, the sliding index is positioned after the token that was read.

Returns:
(UMString) the token in a UMString-object
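
A token in this sense is a run of characters of the same type. The sketch below reads such a run from a std::wstring and advances an index past it. It is an illustration only: the name TakeToken is hypothetical, the classifier is supplied by the caller because CHARTYPE is not defined in this reference, and the sketch assumes the token starts exactly at the index position.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for SI_GetTokenPosAfter, operating on std::wstring.
// classify maps a character to any equality-comparable "type" value.
template <typename ClassifyFn>
std::wstring TakeToken(const std::wstring &s, std::size_t &index,
                       ClassifyFn classify) {
    if (index >= s.size()) return std::wstring();
    const auto kind = classify(s[index]);
    std::size_t end = index;
    // Extend the run while the characters keep the same type.
    while (end < s.size() && classify(s[end]) == kind) ++end;
    std::wstring token = s.substr(index, end - index);
    index = end;  // leave the index just after the token
    return token;
}
```

Called repeatedly, a function of this shape walks through a string one same-type run at a time.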

bool UMString::SI_NextDifferent ( )

Sets the "sliding index" to the next character, which is of different CHARTYPE compared to the current.

Returns:
(bool) : true = a different character was found and the si was set
(bool) : false = a different character was not found and the si was not changed

bool UMString::SI_NextHira ( )

Sets the "sliding index" to the next hiragana-character in the string after the current "sliding index"-position.

Returns:
(bool) : true = a hiragana-character was found and the si was set
(bool) : false = no hiragana-character was found and the si was not changed
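
How the class recognises hiragana, katakana and kanji is not stated in this reference. One plausible approach, sketched below, is to test the code point against the Unicode blocks Hiragana (U+3040 to U+309F), Katakana (U+30A0 to U+30FF) and CJK Unified Ideographs (U+4E00 to U+9FFF). The names ScriptKind, Classify and NextOfKind are hypothetical; the search mirrors the documented contract of leaving the index unchanged when nothing is found.

```cpp
#include <cstddef>
#include <string>

// Hypothetical character classes; the real CHARTYPE is not documented here.
enum class ScriptKind { Hiragana, Katakana, Kanji, Other };

// Classifies a character by Unicode block (assumed approach).
ScriptKind Classify(wchar_t c) {
    const unsigned long u = static_cast<unsigned long>(c);
    if (u >= 0x3040UL && u <= 0x309FUL) return ScriptKind::Hiragana;
    if (u >= 0x30A0UL && u <= 0x30FFUL) return ScriptKind::Katakana;
    if (u >= 0x4E00UL && u <= 0x9FFFUL) return ScriptKind::Kanji;
    return ScriptKind::Other;
}

// Moves index to the next character of the given kind after index.
// Returns false and leaves index unchanged if there is none.
bool NextOfKind(const std::wstring &s, std::size_t &index, ScriptKind kind) {
    for (std::size_t i = index + 1; i < s.size(); ++i) {
        if (Classify(s[i]) == kind) {
            index = i;
            return true;
        }
    }
    return false;
}
```

The same helper covers the katakana and kanji variants by passing a different ScriptKind; a backward search would run the loop in the other direction.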

bool UMString::SI_NextKanji ( )

Sets the "sliding index" to the next kanji-character in the string after the current "sliding index"-position.

Returns:
(bool) : true = a kanji-character was found and the si was set
(bool) : false = no kanji-character was found and the si was not changed

bool UMString::SI_NextKata ( )

Sets the "sliding index" to the next katakana-character in the string after the current "sliding index"-position.

Returns:
(bool) : true = a katakana-character was found and the si was set
(bool) : false = no katakana-character was found and the si was not changed

bool UMString::SI_PrevDifferent ( )

Sets the "sliding index" to the nearest preceding character whose CHARTYPE differs from that of the current character.

Returns:
(bool) : true = a different character was found and the si was set
(bool) : false = a different character was not found and the si was not changed

bool UMString::SI_PrevHira ( )

Sets the "sliding index" to the previous hiragana-character in the string before the current "sliding index"-position.

Returns:
(bool) : true = a hiragana-character was found and the si was set
(bool) : false = no hiragana-character was found and the si was not changed

bool UMString::SI_PrevKanji ( )

Sets the "sliding index" to the previous kanji-character in the string before the current "sliding index"-position.

Returns:
(bool) : true = a kanji-character was found and the si was set
(bool) : false = no kanji-character was found and the si was not changed

bool UMString::SI_PrevKata ( )

Sets the "sliding index" to the previous katakana-character in the string before the current "sliding index"-position.

Returns:
(bool) : true = a katakana-character was found and the si was set
(bool) : false = no katakana-character was found and the si was not changed

bool UMString::SI_Set ( const UINT x )

Sets the "sliding index" by passing an absolute position.

Parameters:

x -> the absolute position (0-based)

void UMString::SI_SetChar ( const wchar_t tmp_uc_char )

Replaces the Unicode-Character at the sliding index position.

Parameters:

tmp_uc_char : Replacement character

bool UMString::SI_SetPosChar ( wchar_t & ucchar )

Sets the sliding index to the next occurrence of ucchar.

Parameters:

ucchar : character to look for

Returns:
bool : true if found
bool : false if not found

Note:
added by Iris

bool UMString::SI_SplitUMString ( UMString & str1,

UMString & str2,

const UINT x

)

Splits the UMString into two at position x.

Parameters:

UMString& str1 : part of UMString from position 0 to x-1

UMString& str2 : part of UMString from position x to end of string

const UINT x : point to split

Returns:
bool : true if found
false : if not found

Note:
added by Iris
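
The split described by the parameters (str1 receives positions 0 to x-1, str2 receives position x to the end) can be sketched on a std::wstring as follows. The name SplitAt is hypothetical, and treating a split point beyond the end of the string as a failure is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for SI_SplitUMString, operating on std::wstring.
// head receives [0, x), tail receives [x, end).
bool SplitAt(const std::wstring &s, std::size_t x, std::wstring &head,
             std::wstring &tail) {
    // Assumption: a split point past the end is rejected.
    if (x > s.size()) return false;
    head = s.substr(0, x);
    tail = s.substr(x);
    return true;
}
```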

bool UMString::SI_TokenEnd ( )

Sets the sliding index pointer to the end of the current token.

Returns:
(bool)

bool UMString::SI_TokenStart ( )

Sets the sliding index pointer to the beginning of the current token.

void UMString::UC2MB ( ) [private]

Synchronizes the MultiByte and the Unicode string.
Synchronizes the MultiByte and the Unicode string in the UMString-object by taking the Unicode string as the original pattern. This routine is only used internally and is private to this class

Warning:
This function still uses Windows-specific functions. These should be replaced once everything else works.
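
A portable route for this direction is the standard C library function std::wcstombs. The sketch below is one way it might look; the helper name WideToMultibyte is hypothetical and not part of UMString. The output encoding is whatever the current C locale specifies, so non-ASCII text requires an appropriate std::setlocale call beforehand.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper, not part of UMString: wide -> multibyte conversion
// using only the standard library.
std::string WideToMultibyte(const wchar_t *wc) {
    if (wc == nullptr) return std::string();
    const std::size_t n = std::wcslen(wc);
    // Each wide character needs at most MB_CUR_MAX bytes in the current locale.
    std::vector<char> buf(n * MB_CUR_MAX + 1, '\0');
    const std::size_t written = std::wcstombs(buf.data(), wc, buf.size());
    // (size_t)-1 signals a character that the locale's encoding cannot represent.
    if (written == static_cast<std::size_t>(-1)) return std::string();
    return std::string(buf.data(), written);
}
```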

Member Data Documentation

char* UMString::mb_string [private]

The stored MultiByte-String

UINT UMString::si [private]

The "sliding index"-Pointer

wchar_t* UMString::uc_string [private]

The stored UNICODE-String

Generated on Mon Aug 18 19:27:10 2003 for LeJa by Doxygen 1.3.3

