"If a worker wants to do his job well, he must first sharpen his tools." - Confucius, "The Analects of Confucius. Lu Linggong"
Front page > Programming > What is the Optimal Base for Emulating Double-Precision Addition with Pairs of Floats?

What is the Optimal Base for Emulating Double-Precision Addition with Pairs of Floats?

Posted on 2025-02-25

Emulating Double-Precision Arithmetic with Pairs of Floats

On embedded systems without hardware support for double precision, some algorithms still need more precision than a single float provides. This article looks at emulating the "double" data type with a pair of "float" values. Such a pair carries roughly 48 significand bits (two 24-bit float significands) rather than the 53 of a true double, and its exponent range remains that of float.

Comparing two emulated doubles is a straightforward lexicographic comparison, high component first, provided both values are normalized so that the high component is the float nearest to the value of the whole pair. Addition is harder, because whatever the component-wise sums cannot hold has to be carried between the components. The underlying question is which base to use for that carry. FLT_MAX looks like a candidate, but it needs closer examination.
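The lexicographic comparison can be sketched in C as follows. The type name dfloat and the function dfloat_cmp are hypothetical names chosen for illustration, and the sketch assumes both operands are normalized (hi is the float nearest to hi + lo) and contain no NaNs.

```c
/* Hypothetical pair type: the represented value is hi + lo. */
typedef struct {
    float hi;  /* leading part, the float nearest to the full value */
    float lo;  /* trailing part, a small correction to hi */
} dfloat;

/* Returns -1, 0, or 1 when a is less than, equal to, or greater than b.
   Assumes normalized operands; a NaN in any component compares as "equal"
   here because every ordered comparison with NaN is false. */
static int dfloat_cmp(dfloat a, dfloat b)
{
    if (a.hi < b.hi) return -1;
    if (a.hi > b.hi) return 1;
    /* High parts are equal, so the low parts decide. */
    if (a.lo < b.lo) return -1;
    if (a.lo > b.lo) return 1;
    return 0;
}
```

Normalization matters here: without it, two pairs with different high components could represent the same value, and comparing the high components first would give the wrong answer.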

Emulating Addition

To emulate addition, adding the components pairwise is not enough: any carry between the low and high components must be accounted for as well. Whatever base is chosen has to make every such carry representable.

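One approach is to use the bound of the float data type, FLT_MAX, as the base, so that a carry out of the low components is added to the high components. The literal sum of the two bounds, FLT_MAX + (-FLT_MAX), is zero and cannot serve as a base. FLT_MAX itself is also problematic: under IEEE 754, a float sum that exceeds FLT_MAX becomes infinity instead of wrapping around, so the carry cannot be recovered from the result. The double-float papers listed under Resources for Further Study avoid a fixed base altogether and store in the low component the rounding error of the high component.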
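Before settling on FLT_MAX as a base, it helps to check how float addition behaves at that bound. The sketch below assumes IEEE 754 single precision with the default round-to-nearest mode; add_overflowed is a hypothetical helper written for this illustration.

```c
#include <float.h>

/* Stores a + b in *sum and returns 1 if the result overflowed to
   +infinity or -infinity, 0 otherwise. Infinity is detected by comparing
   against the largest finite float. */
static int add_overflowed(float a, float b, float *sum)
{
    *sum = a + b;
    return (*sum > FLT_MAX) || (*sum < -FLT_MAX);
}
```

With this helper, FLT_MAX + FLT_MAX reports an overflow and leaves infinity in the sum, so nothing of the wrapped value remains to carry. FLT_MAX + 1.0f reports no overflow and returns FLT_MAX unchanged, which means the added 1 is lost without any signal. FLT_MAX + (-FLT_MAX) is exactly zero.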

Detecting Carry-outs

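Floating-point addition exposes no carry flag, and an overflow in the low components produces infinity rather than a wrapped value, so no carry can be extracted from it. What can be recovered exactly is the rounding error of a float addition: for floats a and b, an error-free transformation such as Knuth's TwoSum computes the rounded sum s and an error term e such that a + b = s + e exactly, barring overflow. That error term plays the role of the carry. It is the part that the high-component sum could not hold, and it is folded into the low component. Subtraction needs no separate carry-down rule: negate both components of the second operand and add.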
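Carry handling between the components can be sketched with error-free transformations, the approach used in the double-float literature, where the rounding error of each float addition stands in for the carry. This is a minimal sketch, not a method given in this article. It assumes IEEE 754 single precision, round-to-nearest, and a compiler that evaluates float expressions in float precision without reassociation (no -ffast-math, FLT_EVAL_METHOD equal to 0). The names dfloat, two_sum, quick_two_sum, and dfloat_add are illustrative.

```c
/* Hypothetical pair type: the represented value is hi + lo. */
typedef struct {
    float hi;
    float lo;
} dfloat;

/* Knuth's TwoSum: returns hi = fl(a + b) and lo = the exact rounding
   error, so that a + b == hi + lo (barring overflow). */
static dfloat two_sum(float a, float b)
{
    float s  = a + b;
    float bv = s - a;   /* the part of b that was absorbed into s */
    float av = s - bv;  /* the part of a that was absorbed into s */
    dfloat r = { s, (a - av) + (b - bv) };
    return r;
}

/* Cheaper variant that is only valid when |a| >= |b|. */
static dfloat quick_two_sum(float a, float b)
{
    float s = a + b;
    dfloat r = { s, b - (s - a) };
    return r;
}

/* Adds two pairs. The error terms of the component sums are the
   "carries" that get folded back in and renormalized. */
static dfloat dfloat_add(dfloat a, dfloat b)
{
    dfloat s = two_sum(a.hi, b.hi);  /* high parts plus their error */
    dfloat t = two_sum(a.lo, b.lo);  /* low parts plus their error */
    s.lo += t.hi;
    s = quick_two_sum(s.hi, s.lo);   /* renormalize */
    s.lo += t.lo;
    return quick_two_sum(s.hi, s.lo);
}
```

As an example of what this buys, adding 1 to 16777216 (2^24) twice with plain floats leaves 16777216 unchanged both times, because 16777217 is not representable and the tie rounds to even. With dfloat_add, the first addition yields {16777216, 1} and the second yields {16777218, 0}.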

Resources for Further Study

Additional insights can be gained from research in the field of double-float techniques. Two notable papers are:

  • [Implementation of float-float operators on graphics hardware](https://hal.archives-ouvertes.fr/hal-00021443)
  • [Extended-Precision Floating-Point Numbers for GPU Computation](http://andrewthall.org/papers/df64_qf128.pdf)

These resources provide valuable information on implementing float-float operators and optimizing their performance.
