Floating-point numbers are typically stored in one of the binary formats defined by the IEEE 754 standard. Each format divides the number into three fields:
- Sign Bit:
  - 1 bit that indicates the sign of the number (0 for positive, 1 for negative).
- Exponent:
  - A fixed number of bits (8 bits for single precision and 11 bits for double precision) that represent the exponent. The exponent is stored in a biased format: a fixed value (the bias, 127 for single precision and 1023 for double precision) is added to the actual exponent, so that both positive and negative exponents can be stored as an unsigned field.
- Mantissa (or Significand):
  - The remaining bits (23 bits for single precision and 52 bits for double precision) that represent the significant digits of the number. The mantissa is usually normalized, meaning the number is scaled so that its significand lies in the range [1, 2). The leading 1 is then implicit and is not stored, so only the fractional part occupies the mantissa bits.
Example of Storage:
For a single-precision floating-point number (32 bits total):
- 1 bit for the sign
- 8 bits for the exponent
- 23 bits for the mantissa
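As a rough illustration of this layout, the sketch below pulls the three fields out of a single-precision value using Python's standard `struct` module. The helper name `float32_fields` is made up for this example, and the input is rounded to single precision when it is packed.

```python
import struct

def float32_fields(x):
    """Split x (rounded to single precision) into sign, exponent, and mantissa fields."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, still biased
    mantissa = bits & 0x7FFFFF       # low 23 bits
    return sign, exponent, mantissa

sign, exponent, mantissa = float32_fields(-6.25)
# -6.25 = -1.5625 * 2^2, so sign = 1, exponent = 2 + 127 = 129, mantissa = 0x480000
print(f"{sign:01b} {exponent:08b} {mantissa:023b}")
```

The exponent printed here is the stored (biased) field, not the actual power of two.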
Representation:
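The value of a normalized floating-point number can be calculated using the formula:
\[ \text{Value} = (-1)^{\text{sign}} \times (1 + \text{mantissa}) \times 2^{(\text{exponent} - \text{bias})} \]
Here the mantissa bits are read as a binary fraction in the range [0, 1), and the exponent is the stored (biased) field. Exponent fields of all zeros or all ones are reserved for zero, subnormal numbers, infinities, and NaN, which do not follow this formula.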
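To make the formula concrete, here is a minimal sketch that evaluates it for single precision, assuming a bias of 127 and 23 mantissa bits. The function name `float32_value` is hypothetical, and the sketch deliberately rejects the reserved exponent fields (0 and 255) rather than handling special values.

```python
def float32_value(sign, exponent, mantissa):
    """Evaluate the formula for a normalized single-precision number."""
    bias = 127  # single-precision exponent bias
    if not 0 < exponent < 255:
        raise ValueError("exponent fields 0 and 255 encode special values, not handled here")
    # Read the 23 stored bits as a binary fraction in [0, 1).
    fraction = mantissa / 2**23
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - bias)

# sign = 1, exponent = 129, mantissa = 0x480000:
# -1 * (1 + 0.5625) * 2^(129 - 127) = -1.5625 * 4
print(float32_value(1, 129, 0x480000))  # -6.25
```

Every step here is exactly representable in double precision, so the result matches the stored single-precision value without rounding.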
This structure lets a computer approximate real numbers across a very wide range of magnitudes in a fixed number of bits, and lets hardware operate on them efficiently.
