

Submitted by Style Pass, 2024-10-18 18:00:14

fas is floating point arithmetic for arbitrary mantissa and exponent types in modern header-only C++. It lets you construct various different float types using template parameters for the mantissa, exponent and base.

The constructed float types look and feel like a native float/double in arithmetic operations. Furthermore, all operations are performed on the stack and do not require any heap space.

There are specializations of std::numeric_limits available: fas::Float<std::int8_t, std::int8_t>::MAX() returns the same as std::numeric_limits<fas::Float<std::int8_t, std::int8_t>>::MAX(), which is approximately 2.16079e+40.
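The quoted maximum can be sanity-checked without the library. The sketch below assumes a value has the form mantissa × base^exponent, with base 2 and the maximum reached at the largest 8-bit signed mantissa and exponent. That layout is an assumption, and `toy_max` is a hypothetical helper, not part of fas.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, NOT the fas API. Assumes value = mantissa * base^exponent
// and that the largest value uses the largest mantissa and the largest exponent.
template <typename Mantissa, typename Exponent>
double toy_max(int base = 2) {
    const double m = static_cast<double>(std::numeric_limits<Mantissa>::max());
    const double e = static_cast<double>(std::numeric_limits<Exponent>::max());
    return m * std::pow(static_cast<double>(base), e);
}
```

Under these assumptions, `toy_max<std::int8_t, std::int8_t>()` evaluates 127 × 2^127, which is about 2.16079e+40 and matches the figure above.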

Choosing a different base makes it possible to represent specific fractions exactly. With a base of 7, for example, any fraction whose denominator is 7 (or a power of 7) is represented exactly. Compare this to the native double, which always uses base 2 and therefore cannot represent 1/7 exactly.
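To illustrate why the base matters, here is a minimal library-independent sketch. `ToyFloat`, `add_same_scale`, `normalize` and `double_holds_one_seventh_exactly` are hypothetical names invented for this example and do not reflect the fas API. The sketch only assumes that a value is stored as mantissa × base^exponent.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical toy type, NOT the fas API: value = mantissa * base^exponent.
struct ToyFloat {
    std::int64_t mantissa;
    int exponent;
    int base;
};

// Adds two values that share the same base and exponent.
// Only the integer mantissas are summed, so no rounding can occur.
ToyFloat add_same_scale(ToyFloat a, ToyFloat b) {
    return {a.mantissa + b.mantissa, a.exponent, a.base};
}

// Moves factors of the base from the mantissa into the exponent.
ToyFloat normalize(ToyFloat x) {
    while (x.mantissa != 0 && x.mantissa % x.base == 0) {
        x.mantissa /= x.base;
        ++x.exponent;
    }
    return x;
}

// Checks with integer arithmetic whether the double closest to 1/7
// is exactly 1/7. A double is m * 2^k for integers m and k, so this
// asks whether 7 * m equals a power of two.
bool double_holds_one_seventh_exactly() {
    int exp = 0;
    const double frac = std::frexp(1.0 / 7.0, &exp);  // frac in [0.5, 1)
    const auto m = static_cast<std::uint64_t>(std::ldexp(frac, 53));
    const int shift = 53 - exp;  // stored value equals m / 2^shift
    return 7 * m == (std::uint64_t{1} << shift);
}
```

In base 7, 1/7 is simply `ToyFloat{1, -1, 7}`. Adding it to itself seven times and normalizing gives mantissa 1 with exponent 0, that is exactly 1. The last function returns false, because 7 × m is a multiple of 7 and a power of two never is.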
