The saga of the Power ISA 128-bit long double


Our last post on MAME's failure to build on Power yielded some interesting information in the comments which, in my copious spare time, seemed worth researching further. But first a technical digression.

128-bit long double is currently implemented in gcc for Power ISA and PowerPC using "IBM extended doubles." This datatype is actually two 64-bit doubles paired together and specially handled in software, rather than a single 128-bit value. Because it is really a pair of 64-bit floats with altered handling, it improves precision (two 52-bit fractions, plus signs) but does not extend the range (as this NumPy bug demonstrates). In this scheme the effective exponent is still 11 bits, so the resulting value is not the same as the IEEE 754R 128-bit floating point datatype, which has a 15-bit exponent and a 112-bit mantissa plus sign.
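To make the precision-versus-range point concrete, here is a minimal Python sketch of the double-double idea. It is not gcc's implementation: the helper names two_sum and pair_value are made up for illustration, and the addition step is the classic error-free "two-sum" technique, which I am assuming is representative of how such pairs are built. It shows that a (high, low) pair can hold a value a single double cannot, while the largest magnitude is still capped by the high double's 11-bit exponent.

```python
import sys
from fractions import Fraction


def two_sum(a, b):
    """Error-free addition (Knuth's two-sum).

    Returns (s, e) where s is the rounded double sum and e is the rounding
    error, so that a + b == s + e exactly (barring overflow).
    """
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def pair_value(hi, lo):
    """Exact rational value represented by a (hi, lo) pair of doubles."""
    return Fraction(hi) + Fraction(lo)


if __name__ == "__main__":
    tiny = 2.0 ** -80

    # A single double silently drops the small term...
    print("single double keeps 2**-80:", 1.0 + tiny != 1.0)

    # ...but a pair of doubles keeps it in the low half.
    hi, lo = two_sum(1.0, tiny)
    print("pair keeps 2**-80:", pair_value(hi, lo) == 1 + Fraction(1, 2 ** 80))

    # The range is unchanged: the high half is an ordinary double, so the
    # pair overflows exactly where a double does.
    print("double max_exp:", sys.float_info.max_exp)
    print("2 * max overflows:", sys.float_info.max * 2 == float("inf"))
```

The low half only ever adds bits below the high half, which is why the extra storage buys precision and nothing else.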

The work to implement the true 128-bit type in gcc supersedes the extended double approach and offers better mathematical performance. On supported systems it thus eliminates the need to implement constant folding for IBM extended doubles, the crux of the g++ constexpr issue that screws up building MAME. What suitable 128-bit registers exist on Power? Why, AltiVec, of course. The new __float128 datatype can then use POWER9's native quad-precision support (or be emulated in software on POWER7 and POWER8). gcc currently implements this feature behind the -mfloat128 option, which requires VSX and has hardware acceleration on POWER9.
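For comparison with the paired-double scheme, the following Python sketch decodes the IEEE 128-bit layout (1 sign bit, 15 exponent bits, 112 fraction bits) from a raw integer bit pattern. It is purely illustrative and has nothing to do with how gcc or the hardware handles __float128; the function name decode_binary128 and the constants are my own, and infinities and NaNs are simply rejected to keep it short.

```python
import sys
from fractions import Fraction

EXP_BITS = 15                        # exponent field width
FRAC_BITS = 112                      # stored fraction width
EXP_MASK = (1 << EXP_BITS) - 1
FRAC_MASK = (1 << FRAC_BITS) - 1
BIAS = (1 << (EXP_BITS - 1)) - 1     # 16383


def decode_binary128(bits):
    """Decode a 128-bit integer bit pattern into an exact Fraction.

    Raises ValueError for out-of-range input, infinities and NaNs.
    """
    if not 0 <= bits < (1 << 128):
        raise ValueError("expected a 128-bit unsigned integer")
    sign = -1 if (bits >> 127) & 1 else 1
    exponent = (bits >> FRAC_BITS) & EXP_MASK
    fraction = Fraction(bits & FRAC_MASK, 1 << FRAC_BITS)
    if exponent == EXP_MASK:
        raise ValueError("infinity or NaN")
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed minimum exponent.
        return sign * fraction * Fraction(2) ** (1 - BIAS)
    # Normal number: implicit leading 1 in front of the stored fraction.
    return sign * (1 + fraction) * Fraction(2) ** (exponent - BIAS)


if __name__ == "__main__":
    one = 0x3FFF << FRAC_BITS
    largest = (0x7FFE << FRAC_BITS) | FRAC_MASK
    print("decoded 1.0:", decode_binary128(one))
    # The largest finite value dwarfs the largest double, which is the
    # range improvement the paired-double format cannot provide.
    print("exceeds double max:",
          decode_binary128(largest) > Fraction(sys.float_info.max))
```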

However, if you try to use true 128-bit floats as long doubles now, it won't work properly. Why? Because the OS (and in particular libc) also needs to be adjusted for the ABI change. For Fedora, the transition to true 128-bit long double is scheduled for F30, which will include backwards compatibility symbols as the original implementation of 128-bit long double did, and the problem should finally be put to rest. It is likely that other distributions supporting ppc64le are undertaking the same process (we run Fedora here at Floodgap-Talospace).

For big-endian ppc64, there doesn't seem to be an endian restriction on 128-bit float support, so it should work the same way on Talos II systems being run BE (as well as BE POWER7 and POWER8). However, for pre-VSX platforms (such as 32-bit PowerPC, G5, and POWER4 through POWER6), the best solution for compatibility may simply be to find a system that still uses 64-bit long double. By doing so you avoid the problem completely if you don't need the extra precision. Adélie Linux, as reported by its maintainer A. Wilcox, is apparently such a platform.
