What are char, unsigned char, and signed char?
The char type in C has a size of 1 byte. The number of bits in a byte, as defined on a given machine, can be viewed by checking the macro CHAR_BIT in the limits.h header.
/* Excerpt from limits.h */
#define CHAR_BIT 8 /* Number of bits in a char */
Typically, a byte in C on a given machine is formed of 8 bits. The C standard requires at least 8.
The char type is an integer type that is used to store the encoding of characters. For example, the encoding of the character a, when a is considered to be part of the ASCII character set, is 97 in decimal, or 01100001 in binary. If a is considered to belong to a different character set, it might have a different encoding, in which case a variable of the char type would store a different encoding value for the character a.
The char type can be signed or unsigned; this is implementation-defined. The C standard defines the minimum range that the char type must have, and an implementation can define a larger range.
If the char type is unsigned, then it can only contain nonnegative values, and its minimum range as defined by the C standard is between 0 and 255. If the char type is signed, then it can contain negative, zero, and positive values, and its minimum range as defined by the C standard is between -127 and 127.
Besides the char type, C also has the unsigned char and signed char types. All three are distinct types, but they have the same size of 1 byte. The unsigned char type can only store nonnegative integer values; it has a minimum range between 0 and 255, as defined by the C standard. The signed char type can store negative, zero, and positive integer values. It has a minimum range between -127 and 127, as defined by the C standard.
Character literals
A character literal is formed of a character, such as a, enclosed in single quotes.
char achar = 'a';
A character literal can contain escape sequences. An escape sequence is a way to represent characters of the execution character set that cannot be typed directly, for example a new line that must appear on a console or a terminal.
Escape sequence | Action | Description |
---|---|---|
\a | alert | Causes an audible sound to be heard, such as a beep. |
\f | form feed | Causes the carriage to go to the start of a new page. |
\r | carriage return | Causes the carriage to go back to the start of the current line. |
\n | new line | Causes the carriage to advance to the start of the next line. |
\b | backspace | Causes the carriage to go back one space. |
\t | horizontal tab | Causes the carriage to move forward horizontally to the next tab stop. A tab stop is usually every 8 characters, counting from 0. |
\v | vertical tab | Causes the carriage to move forward vertically to the next vertical tab stop. |
#include <stdio.h>
int main(void) {
    char alert = '\a'; /* audible alert */
    printf("Hello world %c\n", alert);
    return 0;
}
/* Output: Hello world (and an audible alert, if the terminal supports it) */
Escape sequences are also used to represent characters that cannot appear directly in character literals, such as the single quote, or in string literals, such as the double quote.
Escape sequence | Action | Description |
---|---|---|
\' | single quote | A single quote cannot be placed directly in a character literal; it must be escaped. |
\" | double quote | A double quote cannot be placed directly in a string literal; it must be escaped. |
\\ | backslash | A backslash cannot appear directly in a string or a character literal; it must be escaped. |
#include <stdio.h>
int main(void) {
    char quote = '\'';
    printf("To quote : %c Limitation , Definition , "
           "Construction , Knowledge , Usability %c \n",
           quote, quote);
    return 0;
}
/* Output:
To quote : ' Limitation , Definition , Construction , Knowledge , Usability ' */
Escape sequences are also used to prevent some special characters, such as ?, from being interpreted specially.
Escape sequence | Action | Description |
---|---|---|
\? | question mark | Prevents a question mark from being read as part of a trigraph. A trigraph is formed of two question marks followed by a character. |
#include <stdio.h>
int main(void) {
    /* ??/ is a trigraph, and is replaced before preprocessing by the
       character \ , so it is as if the character literal is written as '\?'.
       Trigraphs were removed in C23, and GCC only translates them with
       -trigraphs or a strict -std= option such as -std=c99. */
    char escape_interrogation_mark_using_trigraph = '??/?';
    printf("%c\n", escape_interrogation_mark_using_trigraph);
    return 0;
}
/* Output: ? */
Escape sequences are also used as a way to enter characters by their encoding, instead of the character itself.
Escape sequence | Action | Description |
---|---|---|
\{1 to 3 octal digits} | Encoding as octal digits | For single-byte character types, such as char, the encoding values that can be used are between \000 and \377. For wide character types, such as wchar_t, the encoding values that can be used are between \0 and \777. |
\x{1 or more hexadecimal digits} | Encoding as hexadecimal digits | For single-byte character types, such as char, the encoded value can be between \x00 and \xff. For wide character types, such as wchar_t, the values that can be used depend on the size of the wide character type. |
#include <wchar.h>
int main(void) {
    char character = '\141'; /* a, written in octal */
    character = '\x61'; /* a, written in hexadecimal */
    wchar_t wide_character = L'\7'; /* alert */
    wide_character = L'\x0000ab11'; /* ꬑ */
    return 0;
}
Escape sequences can also be used to enter a universal character name. In this case, the character usually needs a wide character type, since most such characters do not fit in a single char.
Escape sequence | Action | Description |
---|---|---|
\uhhhh | Character as universal character name | The universal character name is the character's Unicode code point, also known as its short identifier, written as 4 hex digits. |
\Uhhhhhhhh | Character as universal character name | The universal character name is the character's Unicode code point, also known as its short identifier, written as 8 hex digits. |
#include <wchar.h>
int main(void) {
    wchar_t wide_character = L'\u0800'; /* Samaritan letter alaf */
    wide_character = L'\U00000800'; /* Samaritan letter alaf */
    return 0;
}
Since signed char, unsigned char, char, and the wide character types are integer types, they can be initialized with an integer literal. In that case, the integer literal holds the value of the character's encoding.
#include <uchar.h>
int main(void) {
    unsigned char x = 97; /* 97 is the encoding of the character a in ASCII */
    char16_t wide_character = 97;
    return 0;
}