What are char, unsigned char, and signed char?
The char type in C has a size of 1 byte. The number of bits in a byte, as defined on a given machine, can be found by checking the macro CHAR_BIT in the limits.h header.
/* Excerpt from limits.h */
#define CHAR_BIT 8 /* Number of bits in a char */
Typically, a byte in C on a given machine is formed of 8 bits. The C standard requires at least 8.
The char type is an integer type, used to store the encoding of characters. For example, the encoding of the character a, when a is considered to be part of the ASCII character set, is 97 in decimal, or 01100001 in binary. If a is considered to belong to a different character set, it might have a different encoding, so a variable of the char type would store a different encoding value for the character a.
The char type can be signed or unsigned; this is implementation-defined. The C standard defines the minimum range that the char type must have; an implementation can define a larger range.
If the char type is unsigned, it can only contain nonnegative values, and its minimum range as defined by the C standard is 0 to 255. If the char type is signed, it can contain negative, zero, and positive values, and its minimum range as defined by the C standard is -127 to 127.
Besides the char type, C also has the unsigned char and the signed char types. All three types are distinct, but they have the same size of 1 byte. The unsigned char type can only store nonnegative integer values; it has a minimum range of 0 to 255, as defined by the C standard. The signed char type can store negative, zero, and positive integer values. It has a minimum range of -127 to 127, as defined by the C standard.
Character literals
A character literal is formed of a character, such as a, enclosed in single quotes.
char achar = 'a';
A character literal can contain escape sequences. An escape sequence is a way to represent characters of the execution character set that cannot easily be typed directly, for example a new line that must appear on a console or on a terminal.
| Escape sequence | Meaning | Description |
|---|---|---|
| `\a` | alert | Causes an audible sound to be heard, such as a beep. |
| `\f` | form feed | Causes the carriage to go to the start of a new page. |
| `\r` | carriage return | Causes the carriage to go back to the start of the current line. |
| `\n` | new line | Causes the carriage to advance to the start of the next line. |
| `\b` | backspace | Causes the carriage to go back one space. |
| `\t` | horizontal tab | Causes the carriage to move forward horizontally to the next tab stop. A tab stop is usually every 8 characters, counting from 0. |
| `\v` | vertical tab | Causes the carriage to move forward vertically to the next vertical tab stop. |
#include <stdio.h>

int main(void) {
    char alert = '\a';
    printf("Hello world %c\n", alert);
}
/* Output :
Hello world */
Escape sequences are also used to represent characters that cannot appear directly in character literals, such as the single quote, or in string literals, such as the double quote.
| Escape sequence | Meaning | Description |
|---|---|---|
| `\'` | single quote | A single quote cannot be placed directly in a character literal; it must be escaped. |
| `\"` | double quote | A double quote cannot be placed directly in a string literal; it must be escaped. |
| `\\` | backslash | A backslash cannot appear directly in a string or a character literal; it must be escaped. |
#include <stdio.h>

int main(void) {
    char quote = '\'';
    printf("To quote : %c Limitation , Definition , "
           "Construction , Knowledge , Usability %c \n",
           quote, quote);
}
/* Output :
To quote : ' Limitation , Definition , Construction , Knowledge , Usability ' */
Escape sequences are also used to prevent some special characters, such as ?, from being interpreted specially.
| Escape sequence | Meaning | Description |
|---|---|---|
| `\?` | question mark | This is used to avoid forming a trigraph. A trigraph is formed of two question marks followed by a character. |
#include <stdio.h>

int main(void) {
    char escape_interrogation_mark_using_trigraph = '??/?';
    /* ??/ is a trigraph, and is replaced before preprocessing
       by the character \, so it is as if the character
       literal is written as '\?'. Trigraphs must be enabled
       in the compiler, for example with the -trigraphs option
       in GCC; they were removed from the language in C23. */
    printf("%c\n", escape_interrogation_mark_using_trigraph);
}
/* Output :
? */
Escape sequences are also used as a way to enter characters by their encoding instead of the character itself.
| Escape sequence | Meaning | Description |
|---|---|---|
| `\ooo` (1 to 3 octal digits) | Encoding as octal digits | For single byte character types, such as char, the encoding values that can be used are between `\000` and `\377`. For wide character types, such as wchar_t, the encoding values that can be used are between `\0` and `\777`. |
| `\xh...` (1 or more hexadecimal digits) | Encoding as hexadecimal digits | For single byte character types, such as char, the encoded value can be between `\x00` and `\xff`. For wide character types, such as wchar_t, the values that can be used depend on the size of the wide character type. |
#include <wchar.h>

int main(void) {
    char character = '\141';         // a
    character = '\x61';              // a
    wchar_t wide_character = L'\7';  // alert
    wide_character = L'\x0000ab11';  /* ꬑ */
}
Escape sequences can also be used to enter a character by its universal character name. In the examples here, the character is stored in a wide character type, since its code point does not fit in a single byte.
| Escape sequence | Meaning | Description |
|---|---|---|
| `\uhhhh` | Character as universal character name | The universal character name is the character's Unicode code point, also known as its short identifier, written as 4 hex digits. |
| `\Uhhhhhhhh` | Character as universal character name | The universal character name is the character's Unicode code point, also known as its short identifier, written as 8 hex digits. |
#include <wchar.h>

int main(void) {
    wchar_t wide_character = L'\u0800';  /* Samaritan letter alaf */
    wide_character = L'\U00000800';      /* Samaritan letter alaf */
}
Since signed char, unsigned char, char, and the wide character types are integer types, they can be initialized using an integer literal. The integer literal in such a case holds the value of the encoding of the character.
#include <uchar.h>

int main(void) {
    unsigned char x = 97;
    /* 97 is the encoding of the
       character a in ASCII */
    char16_t wide_character = 97;
}
