Character conversion between the multibyte representation and the wide
character representation uses a conversion state, of type mbstate_t.
Conversion of a string uses a finite-state machine; when it is interrupted
after the complete conversion of a number of characters, it may need to
save a state for processing the remaining characters.
Such a conversion state is needed for stateful encodings such as
ISO-2022 and UTF-7.
The initial state is the state at the beginning of conversion of a string.
There are two kinds of state: the one used by multibyte to wide character
conversion functions, such as mbsrtowcs(3), and the one used by wide
character to multibyte conversion functions, such as wcsrtombs(3),
but they both fit in an mbstate_t, and they both have the same
representation for an initial state.
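
As a minimal sketch of the point above, the following helper starts both
conversion directions from a zero-filled mbstate_t and round-trips a short
ASCII string. The function name roundtrip_with_zeroed_states and the
64-element buffers are hypothetical, chosen only for illustration; the
library calls are the standard mbsrtowcs(), wcsrtombs() and mbsinit().

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts `input` to wide characters and back,
   starting each direction from a zero-filled (initial) mbstate_t.
   Returns 1 if the round trip reproduces `input` and both states are
   initial again afterwards, 0 otherwise (including input that does not
   fit in the illustrative 64-element buffers). */
int roundtrip_with_zeroed_states(const char *input)
{
    wchar_t wide[64];
    char back[64];
    mbstate_t to_wide, to_multibyte;

    /* The same zero representation serves as the initial state
       for both kinds of conversion. */
    memset(&to_wide, 0, sizeof to_wide);
    memset(&to_multibyte, 0, sizeof to_multibyte);

    /* Multibyte to wide: src becomes NULL once the terminator is converted. */
    const char *src = input;
    size_t n = mbsrtowcs(wide, &src, 63, &to_wide);
    if (n == (size_t)-1 || src != NULL)
        return 0;

    /* Wide to multibyte: wsrc becomes NULL once the terminator is converted. */
    const wchar_t *wsrc = wide;
    n = wcsrtombs(back, &wsrc, sizeof back, &to_multibyte);
    if (n == (size_t)-1 || wsrc != NULL)
        return 0;

    return strcmp(input, back) == 0
        && mbsinit(&to_wide) != 0
        && mbsinit(&to_multibyte) != 0;
}
```

Calling roundtrip_with_zeroed_states("hello") is expected to succeed in any
locale, since plain ASCII input needs no shift state.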
For 8-bit encodings, all states are equivalent to the initial state.
For multibyte encodings like UTF-8, EUC-*, BIG5 or SJIS, the wide character
to multibyte conversion functions never produce non-initial states, but the
multibyte to wide-character conversion functions like mbrtowc(3)
produce non-initial states when interrupted in the middle of a character.
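
The interrupted case can be observed with a sketch like the one below, which
feeds a two-byte UTF-8 sequence to mbrtowc() one byte at a time and checks
the state with mbsinit() after each call. The function name
state_after_partial_character is hypothetical, and the locale names
"C.UTF-8" and "en_US.UTF-8" are assumptions about what the system provides;
if neither is available, the function reports that instead of guessing.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the state is non-initial after the
   first byte of a two-byte character and initial again after the second,
   0 if that is not observed, and -1 if no UTF-8 locale could be selected.
   Note: this changes LC_CTYPE for the whole process. */
int state_after_partial_character(void)
{
    /* Assumed locale names; adjust for the target system. */
    if (setlocale(LC_CTYPE, "C.UTF-8") == NULL &&
        setlocale(LC_CTYPE, "en_US.UTF-8") == NULL)
        return -1;

    const char bytes[] = "\xC3\xA9";   /* UTF-8 encoding of U+00E9 */
    mbstate_t state;
    wchar_t wc = 0;
    memset(&state, 0, sizeof state);   /* start in the initial state */

    /* Only the first byte is supplied: the character is incomplete,
       so mbrtowc() returns (size_t)-2 and keeps the partial input
       in `state`. */
    size_t r = mbrtowc(&wc, bytes, 1, &state);
    int interrupted = (r == (size_t)-2) && mbsinit(&state) == 0;

    /* Supplying the remaining byte completes the character and
       brings the state back to initial. */
    r = mbrtowc(&wc, bytes + 1, 1, &state);
    int completed = (r == 1) && mbsinit(&state) != 0;

    return interrupted && completed;
}
```

The same mbstate_t object has to be passed to both calls; resetting it
between them would discard the first byte of the character.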
One possible way to create an mbstate_t
in initial state is to set it to zero: