Showing posts from June, 2021

PowerPC and the Western Digital My Book Live debacle


Users relying on the Western Digital My Book Live and My Book Duo NAS systems had an ugly surprise last week when their devices were abruptly and remotely reset to factory default, erasing all their data. A combination of multiple exploits and Western Digital commenting out a password check appears to be responsible not only for the injury of data loss, but also for the added insult of being infected with malware at the same time to join a botnet.

The interest to us here is that the WD My Book Live and Duo family are 32-bit PowerPC devices, more specifically the 800MHz Applied Micro APM82181, which is an enhanced 90nm PowerPC 440 core with additional DSP instructions called the PPC 464. The PowerPC 464FP used here includes a 7-stage pipeline and floating-point unit, and the APM82181 adds a DDR2 controller (256MB onboard) and 256K of RAM configurable as L2 cache. You can boot Gentoo and OpenWRT on it, all of which is unsurprising because the My Book Live basically runs Debian. Western Digital has not issued updates for this device since 2015 and many distros (including Debian, starting with stretch) have dropped 32-bit PowerPC support, though it is still supported in the kernel (except for the PowerPC 601) and these operating systems plus Void PPC and others still support the architecture generally.

The attack abuses a zero-day (CVE-2021-18472) to drop a malware executable named .nttpd,1-ppc-be-t1-z. This is a 32-bit PowerPC-compiled ELF binary and is part of the Linux.Ngioweb family, which in its most recent iteration "supports" 32 and 64-bit x86, MIPS, ARM (32/64) and PowerPC, and there are rumours it's been spotted on s390x (!) and Hitachi SuperH. There is no "preferred device" and the new presence of this malware on PowerPC hosts simply means the authors write good portable code and are expanding to more targets (we'd rather they were porting more generally useful applications, of course).
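If you find a suspicious binary on one of these boxes, the ELF header tells you which architecture it was built for. Here is a minimal sketch in Python that reads the class, byte order and machine fields from the first 20 bytes; the function name describe_elf and the table of names are made up for illustration, though the numeric e_machine values are the standard ones from the ELF specification.

```python
import struct

# Standard e_machine values for the architectures mentioned above.
ELF_MACHINES = {
    3: "x86",
    8: "MIPS",
    20: "PowerPC",
    21: "PowerPC64",
    22: "s390",
    40: "ARM",
    42: "SuperH",
    62: "x86-64",
    183: "AArch64",
}


def describe_elf(header: bytes) -> str:
    """Summarise class, byte order and machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unrecognised ELF class or byte order")
    bits = "32-bit" if ei_class == 1 else "64-bit"
    endian = "little-endian" if ei_data == 1 else "big-endian"
    order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields at offsets 16 and 18,
    # stored in the byte order announced by EI_DATA.
    _e_type, e_machine = struct.unpack_from(order + "HH", header, 16)
    machine = ELF_MACHINES.get(e_machine, f"unknown machine {e_machine}")
    return f"{bits} {endian} {machine}"
```

To use it, open the file in binary mode and pass in the result of read(20). A 32-bit big-endian PowerPC binary like the one described here would come back as "32-bit big-endian PowerPC".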

The upshot of all this is that platforms are only as good as their security. There's nothing about the vulnerability which is specific to PowerPC, merely to the spin of Debian they use and what they've layered on top of it. WD recommends disconnecting these NASes from the Internet, and technically as IoTs they probably shouldn't be out naked on a WAN in the first place, but a better idea is to put something on them that's actually supported and maintained. It's fortunate that these devices are "open enough" that you can do it. What about the systems or hardware that aren't?

How to make GNOME not suck on Fedora 34 ppc64le


Hats off to Daniel Kolesa, who noticed this initially. Today Floodgap Orbiting HQ crashed because of power company shenanigans, so since this T2 was down for the count anyway, I decided to do a little GNOME surgery and see if he was right.

For context, read our less than complimentary review of Fedora 34 (nothing personal, Dan!). Besides the usual wholesale breakage of add-ons with each GNOME update, GNOME 40 also broke colour profiles and, most importantly, had a dramatic reduction in graphics performance. Mucking with GPU performance settings, which seemed to work for other users on x86_64, didn't appear to make much difference in my case. Daniel's discovery, mentioned in the comments, was that one of the major underlying libraries, graphene, had support for gcc vector intrinsics but wasn't using them on anything but x86_64.

So, before I brought this Fedora 34 Talos II back to the desktop (remember that I boot to a text login), I downloaded the sources for the version of graphene (1.10) used in the Fedora 34 release and applied his patch. You'll need meson (at least 0.50.1), gobject-introspection-devel, gtk-doc, and gobject-2.0 (at least 2.30.0), and of course gcc, so make a quick dnf trip, unpack the source, apply his very trivial patch and run mkdir build; cd build; meson ..; ninja to kick it off. (I also threw a common_cflags += ['-O3', '-mcpu=power9'] in meson.build to juice it a little more.) I backed up the old /lib64/libgraphene-1.0.so.0.1000.6 (it says 1.0, but it's 1.10) and replaced it with the one you'll find in build/src, and then did a startx to bring up GNOME.

The result? Not only didn't it crash, but dang is it buttah. I didn't test Wayland, because Wayland, but I would be very surprised if it didn't improve as much as X11 did — desktop performance is now just about back to Fedora 33 level, with only rare stuttering, even when I turned all GNOME animations back on. This also explains why mucking with the GPU performance settings didn't yield much improvement, because the GPU wasn't (at least primarily) the problem. Nice to have all that VSX and VMX doing something useful!

If you are using F34, you should just go ahead and do this; I've recommended to Dan Horák that this should be adopted for the official package, but the performance regression for me was so bad that I'm delighted I don't have to wait. If you don't want to build it yourself, I uploaded a gzipped copy of the .so I'm using to the GitHub pull request, but make sure you back up your old file before you replace it, don't replace it while GNOME is running, and remember that you do so at your own risk (but you kept the old file so you can back it out, right?). The version I built is for POWER9, so don't run it on a POWER8. Naturally if you run Void PPC (because Daniel) you already have this change, but if you're on Fedora as I am, using F34 is suddenly a lot more pleasant.
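The two precautions above (check you're really on a POWER9, and keep the old file) are easy to script. What follows is only a sketch in Python, not anything I actually ran: the function names cpu_is_power9 and swap_library and the ".orig" and ".new" suffixes are invented for illustration, and it assumes /proc/cpuinfo on ppc64le has lines of the form "cpu : POWER9, altivec supported".

```python
import os
import shutil


def cpu_is_power9(cpuinfo_text: str) -> bool:
    """True if any 'cpu' line in /proc/cpuinfo-style text names a POWER9."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "cpu" and value.strip().upper().startswith("POWER9"):
            return True
    return False


def swap_library(new_lib: str, installed_lib: str, backup_suffix: str = ".orig") -> str:
    """Back up installed_lib once, then replace it with new_lib.

    Returns the path of the backup so it can be restored later.
    """
    backup = installed_lib + backup_suffix
    if not os.path.exists(backup):
        # Never overwrite an existing backup: it holds the original file.
        shutil.copy2(installed_lib, backup)
    staged = installed_lib + ".new"
    shutil.copy2(new_lib, staged)      # copy next to the target first...
    os.replace(staged, installed_lib)  # ...then rename over it in one step
    return backup
```

You would feed cpu_is_power9 the contents of /proc/cpuinfo and only call swap_library (as root, from a text login with GNOME not running) if it returns True, passing the .so you built and /lib64/libgraphene-1.0.so.0.1000.6. Backing it out is just copying the backup over the installed file again.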

Make water cooling great again


Although I'm sure there are one-offs, the last and most powerful Power architecture workstation to ship liquid-cooled from the factory was the Power Mac Quad G5. It's nice and quiet compared to the air-cooled G5s and allowed Apple to get that extra edge out of the PowerPC 970, though I did have to replace my cooling system a few years ago and you always have to watch for leaks. While I don't find this dual-8 Raptor Talos II to be particularly loud with the stock HSFs, it might be nice to get my 4-core HTPC Blackbird a little quieter when I'm watching a movie and I think that would be worth having to do a little service on it now and then.

While Raptor has no known plans to ship liquid-cooled machines, Vikings has your back (note that their store is down as of this writing for "spring cleaning"). The key with the IBM HSFs is that they get good, high-pressure contact between the heat spreader and the fan heatsink such that you can run 4 and 8-core parts without thermal compound or indium pads. While the custom mount Vikings developed doesn't achieve that level presently, with MX5 thermal compound they demonstrated over a 10 degree C reduction in core temperatures under load compared to the stock HSFs on a single 22-core POWER9. The custom CPU fitting then connects to an off-the-shelf Laing DC pump and a 120mm radiator.

There are of course problems to be solved before this becomes a workable product, even though this prototype is very promising. Vikings is still trying to figure out how much pressure should be applied by the CPU clamp; while the use of thermal compound allows a bit of wiggle room here, clamp down too much and you'll crack the chip but too little and it won't cool. (The IBM HSFs are very user-friendly in this regard: turn until it stops.) There are also concerns that the compression fittings may be too tight, plus a manufacturing issue with the mechanism's upper plate. For my money, since I already have one expensive liquid-cooled computer under this desk, I'd also want easy serviceability to drain fluids and replace tubing, and I'd want high quality components and fittings to reduce evaporative loss and the chance of any dreaded leaks. Like anything else, pay now for quality or pay later for damage.

Still, additional cooling options would be great for getting OpenPOWER machines in more places they haven't been before, and while the little 4-cores run very cool with just passive heatsinks, running dual-8s or higher core counts might really benefit from a liquid cooling system (especially you nuts out there trying to cram 18-core parts into Blackbirds). No ETA on a saleable product yet but I'm looking forward to seeing it develop.

Debian 10.10


Debian 10.10 is released, the latest stable version of what I suspect is the other of the two most common distros on OpenPOWER workstations. Like all stable Buster releases, it concentrates almost exclusively on critical fixes and security updates; in particular, the kernel remains at 4.19. Support for Buster is expected through 2022 and updated ISOs are now available.

Firefox 89 on POWER


Firefox 89 was released last week with much fanfare over its new interface, though being the curmudgeon I am I'm less enamoured of it. I like the improvements to menus and doorhangers but I'm a big user of compact tabs, which were deprecated, and even with compact mode surreptitiously enabled the tab bar is still about a third or so bigger than Firefox 88 (see screenshot). There do seem to be some other performance improvements, though, plus the usual lower-level changes, and WebRender is now on by default for all Linux configurations, including for you fools out there trying to run Nvidia GPUs.

The chief problem is that Fx89 may not compile correctly with certain versions of gcc 11 (see bugs 1710235 and 1713968). For Fedora users if you aren't on 11.1.1-3 (the current version as of this writing) you won't be able to compile the browser at all, and you may not be able to compile it fully even then without putting a # pragma GCC diagnostic ignored "-Wnonnull" at the top of js/src/builtin/streams/PipeToState.cpp (I still can't; see bug 1713968). gcc 10 is unaffected. I used the same .mozconfigs and PGO-LTO optimization patches as we used for Firefox 88. With those changes the browser runs well.
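The pragma workaround is a one-line edit, so it's easy to apply by script if you rebuild often. This is a small sketch in Python; the function name add_pragma is made up for illustration, and nothing here claims the build will then succeed (as noted, it still doesn't for me).

```python
PRAGMA = '#pragma GCC diagnostic ignored "-Wnonnull"'


def add_pragma(source: str, pragma: str = PRAGMA) -> str:
    """Return source with the pragma as its first line, unless it is already there."""
    if pragma in source:
        # Already patched: leave the text untouched so reruns are harmless.
        return source
    return pragma + "\n" + source
```

Read js/src/builtin/streams/PipeToState.cpp into a string, pass it through add_pragma and write the result back. Running it twice leaves the file unchanged after the first pass.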

While waiting for the updated gcc I decided to see if clang/clang++ could now build the browser completely on ppc64le (it couldn't before), even though gcc remains my preferred compiler as it generates higher performance objects. The answer is now it can and this time it did, merely by substituting clang for gcc in the .mozconfig, but even using the bfd linker it makes a defective Firefox that freezes or crashes outright on startup; it could not proceed to the second phase of PGO-LTO and the build system aborted with an opaque error -139. So much for that. For the time being I think I'd rather spend my free cycles on the OpenPOWER JavaScript JIT than figuring out why clang still sucks at this.

Some of you will also have noticed the Mac-style pulldown menus in the screenshot, even though this Talos II is running Fedora 34. This comes from firefox-appmenu, which since I build from source is trivial to patch in, and the Fildem global menu GNOME extension (additional tips) paired with my own custom gnome-shell theme. I don't relish adding another GNOME extension that Fedora 35 is certain to break, but it's kind of nice to engage my Mac mouse-le memory and it also gives me a little extra vertical room. You'll notice the window also lacks client-side decorations since I can just close the window with key combinations; this gives me a little extra horizontal tab room too. If you want that, don't apply this particular patch from the firefox-appmenu series and just use the other two .patches.

Progress on the OpenPOWER SpiderMonkey JIT


Progress!

% gdb --args obj/dist/bin/js --no-baseline --no-ion --no-native-regexp --blinterp-eager -e 'print("hello world")'
GNU gdb (GDB) Fedora 10.1-14.fc34
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from obj/dist/bin/js...
(gdb) run
Starting program: obj/dist/bin/js --no-baseline --no-ion --no-native-regexp --blinterp-eager -e print\(\"hello\ world\"\)
warning: Expected absolute pathname for libpthread in the inferior, but got .gnu_debugdata for /lib64/libpthread.so.0.
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
[New LWP 2797069]
[LWP 2797069 exited]
[New LWP 2797070]
[New LWP 2797071]
[New LWP 2797072]
[New LWP 2797073]
[New LWP 2797074]
[New LWP 2797075]
[New LWP 2797076]
[New LWP 2797077]
hello world
[LWP 2797072 exited]
[LWP 2797070 exited]
[LWP 2797074 exited]
[LWP 2797077 exited]
[LWP 2797073 exited]
[LWP 2797071 exited]
[LWP 2797076 exited]
[LWP 2797075 exited]
[Inferior 1 (process 2797041) exited normally]

This may not look like much, but it demonstrates that the current version of the OpenPOWER JavaScript JIT for Firefox can emit machine language instructions correctly (mostly — still more codegen bugs to shake out), handles the instruction cache correctly, handles ABI-compliant calls into the SpiderMonkey VM correctly (the IonMonkey JIT is not ABI-compliant except at those edges), and enters and exits routines without making a mess of the stack. Much of the code originates from TenFourFox's "IonPower" 32-bit PowerPC JIT, though obviously greatly expanded, and there is still ongoing work to make sure it is properly 64-bit aware and takes advantage of instructions available in later versions of the Power ISA. (No more spills to the stack to convert floating point, for example. Yay for VSX!)

Although it is only the lowest level of the JIT, what Mozilla calls the Baseline Interpreter, there is substantial code in common between the Baseline Interpreter and the second-stage Baseline Compiler. Because it has much less overhead compared to Baseline Compiler and to the full-fledged Ion JIT, the Baseline Interpreter can significantly improve page loads all by itself. In fact, my next step might be to get regular expressions and the OpenPOWER Baseline Interpreter to pass the test suite and then drag that into a current version of Firefox for continued work so that it can get banged on for reliability and improve performance for those people who want to build it (analogous to how we got PPCBC running first before full-fledged IonPower in TenFourFox). Eventually full Ion JIT and Wasm support should follow, though those both use rather different codepaths apart from the fundamental portions of the backend which still need to be shaped.

A big shout-out goes to Justin Hibbits, who took TenFourFox's code and merged it with the work I had initially done on JitPower way back in the Firefox 62 days but was never able to finish. With him having done most of the grunt work, I was able to get it to compile and then started attacking the various bugs in it.

Want to contribute? It's on GitHub. Tracking down bugs is labour-intensive, and involves a lot of emitting trap instructions and single-stepping in the debugger, but when you see those small steps add up into meaningful fixes (man, it was great to see those two words appear) it's really rewarding. I'm happy to give tips to anyone who wants to participate. Once it can pass the test suite at some JIT level, it will be time to forward-port it and if we can get our skates on it might even be possible to upstream it into the next Firefox ESR.

For better or worse, the Web is a runtime. Let's get OpenPOWER workstations running it better.