The arithmetic data types in C are the integer types, such as int or unsigned long, and the floating point types, such as float or long double.
Conversion of the integer types
What is widening ?
Widening only applies to the signed and unsigned integer types. It does not apply to other types, such as float or double.
Widening is not about converting from signed to unsigned, or from unsigned to signed. It is about expanding the width of an integer type, from a smaller number of bits to a larger one. The signedness of the integer type does not change.
For the signed type, widening is done by what is called sign extension. If the value of the signed type is negative, it is extended by filling the new bits with 1. If the value of the signed type is nonnegative, then it is extended by filling the new bits with 0.
For the unsigned type, the newly allocated bits are filled with 0.
What is truncation ?
Truncation happens only between integer types that have the same signedness, when passing from a larger integer type to a smaller integer type, such as when passing from int to signed char, or from unsigned int to unsigned char.
Whether the type is signed or unsigned, the larger type is made to fit the smaller type by discarding the bits of the larger type that lie outside the width of the smaller type, keeping only the low-order bits.
Truncation as described , which is keeping only a limited number of bits ,
doesn’t happen when converting from a floating point type , to an integer type , or
between floating point types .
When converting from a floating type , to an integer type , the fractional part is discarded , the
floating type bits are not made to fit the integer type width , but the floating type is transformed from
a floating point representation , to an integer representation .
The integer conversion procedure
Conversion between integer types consists of first performing widening or truncation, and then reinterpreting the bits. This model assumes the two's complement representation used by virtually all current machines.
So first truncation or widening is performed. Truncation or widening does not change the signedness of the type; the value is only made to fit a larger or narrower width of the same signedness.
After widening or truncation, the resulting bits, now in the target width, are simply reinterpreted as belonging to a new integer type, the target type.
The only exception to this rule is when converting to the _Bool type: any nonzero value is converted to 1, and any zero value is converted to 0.
Having explained how conversion happens for the integer types, let us explain the concept of integer rank, before explaining when conversion to another integer type takes place.
What is a rank ?
Each integer type in C has a rank.
The signed and unsigned versions of the same integer type have the same rank. For example, int and unsigned int have the same rank.
The order of the ranks of the integer types is as follows:
_Bool < signed char < short < int < long < long long
The rank of any standard integer type, outlined above, is larger than the rank of any implementation-defined extended integer type having the same width.
The char and signed char integer types have the same rank.
The rank of an enum type is equal to the rank of its compatible integer type, which is implementation-defined.
Integer promotion
Each integer type that has a rank smaller than int is promoted either to the int type, if the int type can represent all its possible values, or to the unsigned int type otherwise.
Integer types having a rank smaller than int are promoted when an operator expects one of its operands to be of an arithmetic type. Arithmetic types in C are the integer and floating point types. Examples of such operators are the subtraction operator - and the unary negation operator -.
Integer promotion does not describe how an integer is converted to another type; it describes when an integer is converted to another type, in this case because it has a lower rank than int.
unsigned char x = 1;
signed char y = -x;
/*
x is an unsigned char, which has a rank lower than int. The unary negation operator is used, so x must be promoted.
An int on 32-bit systems has a typical width of 32 bits, so it can store all the possible values of an unsigned char, which is limited to only 8 bits. The target promotion type is therefore int.
Now that the target type is decided, and since int has a larger width than unsigned char, widening must be performed. Widening does not change signedness, so the unsigned char is widened to an unsigned int by using zero fill. The resulting value is 1, and has a bit representation of:
00000000000000000000000000000001
The resulting bits are interpreted as being of the type signed int, and the negation operator is applied. The result of the negation operator is -1, which has a bit representation of:
11111111111111111111111111111111
The value is to be stored in a signed char. Both int and signed char have the same signedness, so truncation is applied. The width of a signed char is 8 bits, so the leading 24 bits are discarded, and the result is:
11111111
which is equal to -1.
*/
Function call or return value
When making a function call , and an argument is of an integer type , different from the target parameter integer type , type conversion occurs .
The argument is converted to the parameter type by first truncating or widening it to the width of the parameter type, and then reinterpreting the bits as being of the parameter type.
#include <stdio.h>

void trivialFunction(unsigned int val)
{
    if (val == 4294967295) {
        printf("%u is equal to 4294967295\n", val);
    } else {
        printf("%u is different from 4294967295\n", val);
    }
}

int main(void)
{
    signed char x = -1;
    trivialFunction(x);
    unsigned char y = x;
    trivialFunction(y);
}

/* Output:
4294967295 is equal to 4294967295
255 is different from 4294967295 */

/*
In the first call to trivialFunction, the passed argument is of type signed char. trivialFunction expects its argument to be of the type unsigned int, so the argument must be converted.
First, widening to the width of unsigned int takes place. x is a signed char, so it is widened to a signed int by sign extension. The value of x is -1, and it has a bit representation of 11111111. The resulting value is -1, and it has a bit representation of:
11111111111111111111111111111111
This bit pattern is next reinterpreted as an unsigned int, which has a value of 4294967295. This is why trivialFunction prints:
4294967295 is equal to 4294967295

The variable y has a type of unsigned char. It is assigned the value of x, which is a signed char. Both x and y have the same width, so no widening occurs; the bits of x are only reinterpreted as being unsigned. x has a value of -1, which is 11111111, and when reinterpreted as unsigned this yields the value 255.
Next, trivialFunction is called with y as an argument. Widening occurs, because y is an unsigned char and the function parameter is an unsigned int. It is done by using zero fill. Hence the value 255, which has a bit representation of 11111111, is widened to the value 255 with a bit representation of:
00000000000000000000000011111111
The function prints:
255 is different from 4294967295
*/
When the return value of a function is different from its return type , the return value is converted to the function’s return type .
Assignment and initialization
When performing assignment or initialization of an integer variable using the = operator, and the expression to be assigned is of a different integer type, integer conversion takes place.
unsigned char x = 1;
/*
1 is an integer constant, it is of the type int. The type int has a typical width of 32 bits, while an unsigned char has a typical width of 8 bits.
The bits of the int type
00000000000000000000000000000001
are reinterpreted as being the bits of an unsigned int type. The result is:
00000000000000000000000000000001
The unsigned int value is truncated to 8 bits, and the resulting value is 00000001, which is assigned to the unsigned char x.
*/
For further information , about the type of integer literals , you can check this article .
Arithmetic operators
When one of the following operators , is being used , an integer conversion will occur .
* / % + -
/* Multiplication, division, modulo, addition, subtraction. */
< <= > >= == !=
/* Relational: less, less or equal, larger, larger or equal, equal, not equal. */
& ^ |
/* Bitwise and, xor, or. */
?:
/* The ternary operator expression must return a value of a specific type, so the second and third operands are converted to the same type. */
+= -= *= /= %= &= ^= |=
/* Operate-and-assign operators. */
In the case of these operators , a common type for the operands and the result must be determined . This is done as specified by the following table .
Unsigned | Signed | Rank | Operands and result type |
---|---|---|---|
uT | sT | uT >= sT | uT |
uT | sT | uT < sT | sT, if sT is capable of holding all the possible values of uT. The unsigned type corresponding to sT, if sT is not capable of holding all the possible values of uT. |

uT means an unsigned type, such as unsigned int.
sT means a signed type, such as int.
The result of relational operators such as < is always 1 for true and 0 for false, and it is always of the type int.
int si = -1;
unsigned int ui = 0;
ui = ui + si;
/*
int and unsigned int have the same rank, one is signed and the other is unsigned, so both operands must be converted to the unsigned int type, and the result of the operation is of the unsigned int type.
ui is unsigned, so no conversion is necessary.
To convert si to the unsigned int type, and since both si and ui have the same width, the bits of si, which are
11111111111111111111111111111111
are kept as is. They are only reinterpreted as belonging to an unsigned type, so now they have a value of 4294967295.
The addition is performed, and the result is 4294967295 + 0 = 4294967295, of the type unsigned int.
ui is of type unsigned int, so no conversion is necessary, and the result 4294967295 is stored in ui.
*/

long long lli = -1;
ui = lli + ui;
/*
lli is of the type long long, which has a higher rank than unsigned int. long long can hold all the values of unsigned int, so both operands must be of the type long long.
lli is of the type long long, so no conversion is necessary.
ui is an unsigned int with a bit representation of
11111111111111111111111111111111
It is first extended to unsigned long long by using zero fill:
0000000000000000000000000000000011111111111111111111111111111111
After that, the resulting bits are reinterpreted as being of type long long. The resulting value is the same as the value of ui, which is 4294967295.
The addition is performed between the two long long values, and the result is -1 + 4294967295, which is equal to 4294967294, of the type long long.
ui is of the type unsigned int, so the result, which is of the type long long, must be converted to unsigned int. It is first converted to unsigned long long; the bit pattern does not change, so it remains:
0000000000000000000000000000000011111111111111111111111111111110
Now that it is of the type unsigned long long, it is truncated to the type of ui, which is unsigned int. It has the bit pattern
11111111111111111111111111111110
and a value of 4294967294.
*/

long int li = -1;
li = li + ui;
/*
ui is an unsigned int, whereas li is a long int. A long int has a higher rank than an unsigned int. Assuming that on this machine both long int and unsigned int have a width of 32 bits, long int is not capable of representing all the possible values of the unsigned int type, so the operands and the result must be of the unsigned long type.
The bits of li are kept as is, and only reinterpreted as being of the type unsigned long, so li will have the value 4294967295.
ui is converted to unsigned long by widening. unsigned int and unsigned long have the same width, so the bits of ui remain the same:
11111111111111111111111111111110
Adding 4294967295 + 4294967294 = 8589934589. The result is of the type unsigned long, but it cannot fit the width of the unsigned long type, which has a maximum value of 4294967295, so overflow has occurred. The result is taken modulo 2 to the power of the number of bits of the unsigned long type. This is equal to 8589934589 % 4294967296 = 4294967293, which is:
11111111111111111111111111111101
The result must be stored in li. li is of the long type and the computed value is of the type unsigned long, so the bits are kept as is, and the result is only reinterpreted as being a signed long. The value of li is therefore -3.
*/
For information on overflow of the signed , and unsigned integer types , you can check this , and this article .
Cast operator
Explicit conversion happens when using the cast operator: (Type) expression. As an example, explicitly casting the int literal 1 to the long type:
(long) 1
When the cast operator is used, the conversion is called explicit. All other cases of conversion are called implicit.
Conversion of the floating point types
Converting from one floating type , to another
Each floating point type , has a different range , and precision . As such , when passing from floating point types , with higher precision and range , to floating point types with smaller precision and range , a loss of precision , an underflow , or an overflow , can occur .
double dbl = 16777217;
float fl = dbl;
/* Loss of precision, fl value is 16777216. */
dbl = 1.2e-50;
fl = dbl;
/* Underflow occurs, the result is implementation defined, in this case fl has a value of 0. */
dbl = 3.4e100;
fl = dbl;
/* Overflow occurs: the value is outside the range of float, so the C standard leaves the behavior undefined. On implementations that follow IEEE 754, fl has a value of positive infinity. */
When passing from a floating point type , with a smaller range and precision , to a floating point type , with a higher range and precision , no precision loss , underflow , or overflow , occurs . The precision and range are preserved .
float fl = 1.099511627776e12f;
double dbl = fl;
/* Precision and range are preserved, dbl has a value of 1.099511627776e12, which is 2 to the power 40. */
Converting from a floating point type , to an integer type
The process of converting a floating point type , to an integer type , can be thought of , as if , the number is first converted to the decimal notation , then the fractional part is discarded , then it is represented in its signed or unsigned representation .
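double dbl = 1.3f;
unsigned char uc = (unsigned char) dbl;
// uc is equal to 1
dbl = -1.3f;
uc = dbl;
// On a typical implementation uc is equal to 255, but the C standard leaves
// this behavior undefined, because -1 cannot be represented in an unsigned char.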
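If the floating point number, once its fractional part is discarded, cannot be represented in the integer type, for example because it is too large, the behavior is undefined by the C standard. The values shown next are simply what one particular implementation produces.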
double dbl = 3.4E30;
unsigned long ul = dbl;
// ul is equal to 0
unsigned int ui = dbl;
// ui is equal to 0
unsigned char uc = dbl;
// uc is equal to 0
long li = dbl;
// li is equal to -9223372036854775808
int si = dbl;
// si is equal to -2147483648
signed char sc = dbl;
// sc is equal to 0
Converting from an integer type , to a floating point type
When converting an integer type , to a floating point type , the integer value can always be represented , but there might be a loss of precision .
int si = 16777216;
float fl = si;
// fl is equal to 16777216
si = 16777217;
fl = si;
// fl is equal to 16777216
When does the conversion occur ?
When a floating point type is involved, conversion occurs when a function call is made and the passed argument is of a different type than the function parameter, or when the return value of a function is of a different type than its return type.
It also happens when performing assignment and initialization of floating point variables. Finally, it occurs when using arithmetic operators, or when done explicitly, using an explicit cast such as (float) 1.0.
Rank of floating point types
The rank of a floating point type , is always higher than the rank of an integer type , so an integer type , is always converted to a floating point type .
As for the floating point types , they have the following ranks :
float < double < long double
If an arithmetic operation , involves two floating point types of different ranks , the floating point type with the lower rank , is converted to the floating point type with the higher rank .
The type of the result of an operation involving a floating point type , is the same type , as the one determined by the ranking algorithm .
Floating point types promotion
As with the integer type promotion , when performing arithmetic operations , a floating point type can be promoted , to a type with a higher precision and range .
This is not to be confused with how conversion happens, or with the rank in which an arithmetic operation involving a floating point type is to be performed: that rank is always the one of the higher-ranked operand.
The promotion rule is given by the macro FLT_EVAL_METHOD, defined in the header float.h.
If FLT_EVAL_METHOD is set to 0, then arithmetic operations are done in the type of the widest operand, so if both operands are float, the arithmetic operation is done using the float type, and no promotion occurs.
If FLT_EVAL_METHOD is set to 1, then arithmetic operations are performed by promoting the operands to long double if any operand is of the long double type; otherwise operands are promoted to the double type, even if both operands are of the float type.
If FLT_EVAL_METHOD is set to 2, then arithmetic operations are performed by promoting the operands to the long double type.
If FLT_EVAL_METHOD is set to -1, then the evaluation method is indeterminable.