Posts

Latest Posts

Fedora 31 mini-review on the Blackbird and Talos II


As promised here's my periodic mini-review after upgrading both our Blackbird and Talos II systems to Fedora 31, the most current release, typed up in Firefox 70 running on Fedora 31 on my T2. Even though there are many of you who don't run Fedora on OpenPOWER, these reviews are still relevant because Red Hat does a lot of the work on the components you do use, and problems are likely to turn up here first. Much to my disappointment, one late breaking note is that 128-bit long double still isn't in Fedora ppc64le, and didn't make 31 either. I can't tell from the bug or the wiki page why the deadline keeps slipping.

I did the upgrade first on my home theater GPU-less 4-core Blackbird because it was already bitten by the librsvg2 issue and early reports indicate the updated LLVM-rustc pair in F31 fixed it. The steps are the same as I used for F29 and F30 except for changing the parameter to --releasever=31 (duh). A quick check demonstrated updating librsvg2 to the latest available for F30 didn't solve the problem, so I went on to downloading the packages for F31.

When I rebooted into the F31 installer, however, the projector freaked out and went into an endless loop of trying and failing to sync to the display. I don't know if it was unhappy with the video mode the installer set, but even the A/V receiver wouldn't pass through the HDMI video (the T2 did something similar which I'll note in a moment). I eventually had to pull up a second VTY and then and only then would the projector display anything. I then logged in as root and monitored the messages from dnf with periodic dnf system-upgrade log --number=-1 | tail -10 until the machine rebooted on its own.

Fortunately, F31 came right back up. I've done only minimal customization on the Blackbird, so pretty much everything transferred over unchanged, and no packages had to be dropped to do the installation. F31 comes with GNOME 3.34, which is alleged to have performance improvements, and actually I was very pleasantly impressed as you can see from the screen shots:

Video playback on this GPU-less Blackbird was a lot better in this release; in fact, Firefox 70 didn't drop any frames or audio at all (though I'm sure the rapidly improving VMX/VSX support has something to do with it ;). Although VLC on the unaccelerated Blackbird is still not perfect and playback was not completely smooth, pixel pushing was much improved in both DVD and Blu-ray playback and there were fewer dropouts with the TOSLINK surround sound (mplayer of course still played everything just fine). Unfortunately, I think the improvements are strictly in Mutter and GNOME itself, not llvmpipe, because Xonotic was still only ekeing out a bare 5fps at 1920x1080 as in F30.

As advertised, librsvg2 was working again. If you have exclude=librsvg2 in /etc/dnf/dnf.conf, you should remove it before you do the update. Every GNOME release has some vanity changes for no good reason and the new icons and minor UI tweaks seemed largely unnecessary but they weren't objectionable. On the apps side, GNOME Videos doesn't seem to grok the length of my AIFF music files correctly, though it does play them (MP3 was fine). GNOME Web was also working again after a long hiatus but it seemed to have minor glitches, and since there are people who use POWER9 now who are actually helping to maintain Firefox, you should just use Firefox.

Since I didn't find any obvious major regressions in my normal usage, the next step was to update the Talos II. The T2 does not have the WX7100 firmware in the BMC PNOR, so I expect to run the installer "blind," but interestingly my LCD would not sync to the display either just like the projector wouldn't. The LCD synced fine when I popped open a VTY, just as with the Blackbird, so I'm thinking there's something up with the installer's video mode. Otherwise, the install proceeded unattended and rebooted uneventfully.

As my daily driver the T2 is rather more customized than the Blackbird. It's pretty much a given that I'll lose some of my GNOME extensions in the upgrade or the custom "classic" OS X-like theme I use will have some odd breaking edge case, and that happened here as usual. In this case Dash to Dock was the casualty and the GNOME Extensions Manager refused to update it, requiring me to manually install it. Tweaks and Settings still have visual issues with my theme, but didn't seem worse, just annoying. The only thing installed that didn't transfer over were my custom Perl libraries which got eaten and needed to be reinstalled. I know, I know, I'm the last person on Earth who still likes Perl apparently.

On the T2 with its WX7100 workstation card, the graphical performance improvements were not as notable as with the Blackbird, but some things seemed better, and a few 3D games that chugged a bit on F30 seemed faster on F31. I'd still say performance was a net win, just smaller.

Both systems use X.Org, but I do try to at least test Wayland. My T2 is configured to come up in a text boot so that I have a console to fall back on, an artifact of originally being Fedora Server and converted to Workstation. I was able to start GNOME in Wayland from the command line with XDG_SESSION_TYPE=wayland exec dbus-run-session gnome-session (instead of startx). I would call it incrementally improved from before. Some apps (mostly games) still don't start, and some games that do start have odd aspect ratios, but more at least work. The issue with some apps, particularly XWayland ones, not obeying GNOME theming seems to be fixed, and while it didn't feel quite as snappy as X.Org it was still better than previous releases. However, my custom appmodmap tool for dynamically remapping the keyboard only works with things that run in XWayland, because it watches X events to know which window is up, and the GNOME Wayland compositor currently has no plans to offer this information. So back to X.Org.

However, the situation was even worse with the Blackbird. Since overall graphical performance seemed better I decided to push my luck and see how it worked in Wayland (which previously ran like treacle on Thorazine in January north of the Arctic Circle), but as soon as I switched to Wayland and rebooted, this time the Blackbird would not come up in a graphical boot at all. On several test boots after the kernel messages it immediately went to a grey screen with the mouse pointer and then froze hard, requiring me to power cycle it because I couldn't open any VTYs or get to the OS. Since this is a workstation installation rather than a server installation converted into workstation, I had to boot the Fedora rescue installer to fix /etc/gdm/custom.conf because I couldn't get the machine to come up otherwise. If you are installing F31 from scratch, you may want to make sure that WaylandEnable=false is uncommented before you try your installation out.

Overall Fedora 31 is both a good release and a bad omen. Performance (at least in GNOME under X.Org) is overall much improved, especially if you don't have a GPU, but it's still obviously better even if you do. Some bugs were fixed and packages installed uneventfully. There were the regular growing pains in GNOME, but I didn't lose anything irreplaceable, and other than the usual bumps one experiences with custom themes and extensions pretty much everything just worked.

But Wayland on ppc64le continues to be worrisome. I must concede that at least on the T2+WX7100 things has improved since F30, and since I freely admit I'm a Wayland sceptic those of you who are heavily invested in it probably don't care about my opinion. But overall it's still a step backwards because there are still things that won't run in it, a big part of my own personal workflow may never work with it, and on the GPU-less Blackbird beforehand I couldn't use it and now I can't even start the machine in it. Meanwhile, Red Hat's made some very public signals that Wayland is the future and X.Org will be going away. In their rush to do so not much attention is being paid to people using 2D framebuffers with Wayland, and this is a real problem because no currently available GPU is libre and not supporting the built-in BMC in every shipping Raptor system is a waste (not to mention requiring people to incur additional expense just to get something to work that was "already working"). If you want a truly blob-free system, right now you just plain can't use Wayland, and it doesn't seem like they care.

DD2.3 POWER9 steppings now available


Raptor now has SKUs for the Sforza DD2.3 POWER9 chips, which they're calling "POWER9 v2". Currently just the 4-core and 8-core are available, but the higher core counts are presumably soon to come. There is a slight price premium of around 15-20% for these over the DD2.2 CPUs, but they fix a number of errata including functional hardware watchpoints (no more YOLO mode) and add the new Ultravisor mode for enhanced security (which will be the subject of a future article). In addition, although TDP, clock speed and cache specifications are the same, improved Spectre v2 mitigations in this stepping (specifically count cache flushing with hardware assist) mean possible performance improvements particularly for branch-heavy workloads. Support for this feature should already be in current Linux kernels.

If you have a T2 family system, you can order these today, and the SKUs are reported as in-stock. They are drop-in replacements for all T2s and Blackbirds and because their TDPs are the same can use the same heat sinks and HSFs. Systems shipping now may still have DD2.2 chips in them, though Raptor says you can get a DD2.3 for a slight upcharge.

Talos II and Talos II Lite officially FSF Respects Your Freedom products


No one disputes the Free Software Federation practices what they preach, and no one disputes that their standards are strict. So hats off to Raptor, who today officially received FSF Respects Your Freedom designations for both the Talos II and T2 Lite (here's the official announcement).

The designation recognizes that the T2 and T2 family have full system schematics and source code available for the entire firmware stack from the BMC up, and no keys are needed to update or replace any firmware component unless you require your own. (The same applies to the Blackbird, too, of course; presumably its own FSF RYF certification is soon to follow.) Naturally the designation presupposes you are using a free distribution, as the FSF defines it.

The T2 family joins a relatively small number of complete systems that have RYF endorsements and given those systems' loadouts is easily the most powerful, at least of this writing. Not only is this a nice win for Raptor, who have made libre computing a cornerstone of their company, but it's also a great validation for OpenPOWER. A designation like this from the FSF, who stakes their entire reputation on libre computing, is no small matter no matter how you slice it. Congratulations!

FreeBSD 12.1 available


FreeBSD 12.1 is now available. This is largely a maintenance release. To the best of my knowledge this is the BSD with the best track record on OpenPOWER so far; it is otherwise a relatively straightforward 64-bit big-endian Power implementation. I'm still a NetBSD dweeb personally (on mac68k, macppc, cobalt and hpcsh) and I'm looking forward to someone porting it sooner or later, but if you want a BSD on your Blackbird or Talos II right now this is probably your best bet.

The installation directions for the Blackbird should work as is for the Talos II. However, if you've already got the ISO (not the .img) dd'ed to a USB stick, it seems to me that it should "just work" in Petitboot without all the goofing around at the BMC prompt (if you don't, though, then these instructions will allow you to bring the machine "up from nothing").

If you are already running FreeBSD, unfortunately it does not seem that the PowerPC port of FreeBSD supports freebsd-update(8) yet, though I imagine this is planned. FreeBSD 13-CURRENT boots and runs fine on the Raptor family as well, but no clear word on when that will reach release yet.

Fedora 31 available


Fedora 31 is now available, the next iteration of the somewhat bleeding edge of Red Hat (the totally bloody-all-over-the-floor edge is of course Rawhide). It is of particular interest to me personally since the Talos II I'm typing on is running Fedora 30, and it's a useful canary for future hiccups on Power ISA especially because Red Hat is an IBM thing now. Even if you don't use Fedora personally, its relatively rapid update schedule can help identify and fix architecture-specific issues well in advance in your own distro of choice.

F31 moves to GNOME 3.34 (presumably with performance improvements, so I look forward to seeing how this performs on my GPU-less Blackbird) and glibc 2.30. This last is particularly important to Power systems because it may finally mark the end of the 128-bit long double saga by transitioning to the new float ABI. Fedora is also encouraging the use of toolbox, a workspace container system; this too is supported on ppc64le, though I haven't messed with it much yet. Finally, based on this report, F31's use of LLVM 9 should also solve the codegen and faulty assertion issue plaguing librsvg2. Since I now have two POWER9 systems here, I'll do a test upgrade on the Blackbird and then the Talos II once the package mirrors have caught up, reporting back as in our prior reviews, but if these improvements in fact live up to the release notes this actually sounds like a really nice release for us especially.

In miscellaneous notes, F29 will be unsupported one month after this point, so make sure you're upgrading if you're still on that, and 32-bit i686 is no longer a thing on Fedora. (32-bit PowerPC was unsupported long ago in F22, just to desperately keep on topic.)

Firefox 70 on POWER


Firefox 70 is out and about. This is a very important release particularly for Power ISA because this includes a repaired 64-bit xpconnect and build system support for VMX and VSX (with VMX support in parts of the DOM and for libjpeg). VMX/VSX support is determined at runtime but I still advise if you build yourself to manually specify your CPU to the compiler (such as -mcpu=power9) to make sure everything is detected and better code can be generated. All these features work on both big and little endian configurations.

Fx70 is also the first release to officially enable the Quantum Render GPU-accelerated 2D compositor on all Windows-supported GPUs, which emerged from the Servo browser testbed as WebRender and has been gradually translated to Firefox. This is clearly the intended future of the browser, so we need to ensure it's operational on our platform.

AMD has been a supported GPU since Fx68 (Northern Islands, i.e., Radeon HD 6000 et al., and newer), so while Linux is not currently an officially supported Quantum Render target the WX 7100 sold with the Talos II should work. And, well, it does.

Performance is a bit sprightlier and I see better FPSes in demos, though our FPS rate is now increasingly JavaScript limited (yes, I know) as the rest of the rendering chain gets faster and faster. I have not encountered any stability or rendering issues with it so far. To enable WebRender, you need to enable hardware GPU acceleration in general and make sure that's working first; go to about:config, set layers.acceleration.force-enabled to true and restart the browser. I've been running with GPU acceleration myself for the past several releases, so I know it should work on at least the WX7100. Verify it's enabled by going to about:support and making sure that acceleration does not appear as "Blocked."

Once you have established GPU acceleration is enabled and operational, then go back to about:config, set gfx.webrender.all to true and restart the browser again. Go back to about:support; the window should look like the smaller one in the first screenshot. If sites go haywire, don't render right or seem to animate improperly, please flip those prefs back and compare so we can figure out why.

Northern Islands is a pretty low bar for WebRender and frankly if you're trying to run this on an even older AMD (or ATI??) GPU, you'll probably have lots of problems with almost certainly no benefit. Likewise, if you try to do this on Nvidia with nouveau, you're crazy. I don't see any reason why this wouldn't work on the *BSDs but I'd be interested to hear from anyone who has tried.

Meanwhile, more VMX and VSX improvements are in the pipeline and are certain to reach you faster with the increased release cadence in 2020. The .mozconfigs I personally use and support are unchanged from Firefox 67.

Is the warrant canary still warranted?


UPDATE: Raptor will keep the canary but reduce the frequency to every six months. There appears to be some significant cost to them, so this seems like a good compromise to me.

Somebody is actually watching Raptor's warrant canary, and mentioned it hasn't been updated in 6 months (as of this writing the last date is March 3, 2019). Although my usual tendency is to glance at it before installing a firmware update, 1.06 is over a year old, so I hadn't noticed myself.

Conceptually, the warrant canary helps to protect purchasers by acting as a negative indicator if they are under a gag order regarding a subpoena or other state actor legal action: if the canary disappears or isn't reupped, then caveat emptor. Raptor's response in the Twitter thread suggests that the failure to update was inadvertent and my gut impression is this is probably true, but the real question is how likely Talos or Blackbird owners are to be targets for state-level threats. We're using niche machines here but the OpenPOWER workstation userbase tends to be more cognizant of how it can be monitored, and if we weren't on watchlists before OpenPOWER started getting more popular, especially in certain countries the number of workstations may now be at a level where such concerns are no longer preposterous.

Raptor, in the same thread, is asking users to speak up about whether the warrant canary is still useful. (They mention a cost/benefit ratio; I'm interested to hear what the cost is. Is it time, money, both?) Lest one think a smaller company could be pushed around more easily, I don't think size is really a factor here; in fact, I'd argue that a bigger company is even less likely to care about such things because of increased bureaucracy and potentially competing internal priorities over government contracts. I agree their point we really should get comfortable with rolling our own firmware is very well taken, but by the same token it's not necessarily a small task for an individual to audit Raptor's tree either. Particularly for critical or time-sensitive updates we will still have some level of vendor dependency and it would be nice to have the canary in those circumstances when using a pre-built firmware package becomes necessary, so put my vote down as "please keep it." We're using these machines for a reason, and the more failsafes there are, the more we're better protected from Mayhem — like meow.

(*not sponsored or endorsed by Allstate)

Ubuntu 19.10 available


Ubuntu 19.10 is now available with the vaguely unwieldy name "EoanErmine" based on kernel 5.3 and GNOME 3.34. An interesting improvement in this release is their expanded cross-compilation toolchain allowing building for s390x, ppc64le, riscv and ARM targets, which hopefully will expand the number of ports and pre-compiled packages on this platform; another interesting one is experimental ZFS on root support. Although an official desktop release of Ubuntu for ppc64le still doesn't exist, the release notes do say that "[t]he ppc64el [sic] ... live-server ISO images are now considered production ready and are the preferred media to install Ubuntu Server on bare metal" (excellent!), so download the server ISO, and then for your workstation you can convert it to desktop Ubuntu.

librsvg2 issue on ppc64le


If you are using Fedora, keep an eye on bug 1756838 where an LLVM 8 codegen issue is suspected with ppc64le causing an apparently faulty assertion in librsvg2. Unfortunately, this library is heavily used by (at least) GNOME and Xfce, meaning the issue may well make your desktop environment unusable -- for example, my Blackbird with the faulty library couldn't open the Applications drawer without crashing gnome-shell. Unfortunately, reducing the codegen issue has not been trivial.

The faulty build is librsvg2-2.46.0-2. If you keep, or downgrade to, librsvg2-2.45.90-1, this version is unaffected because it was built with an earlier toolchain. At least for Fedora, there appear to be no ABI changes between 2.45.90 and 2.46.0 (thanks to Dan Horák for confirming this) and there are no known or at least visible security issues in the earlier version, so it is currently safe to stay there.

On Fedora, if you are on F30 and have not yet been affected, you may wish to consider putting exclude=librsvg2 into /etc/dnf/dnf.conf to inhibit updates to it until further notice. If you have been affected, you can attempt to downgrade to 2.45.90, though interestingly on my (unaffected) Talos II that has not updated to the bad version,

% strings /usr/lib64/librsvg-2.so.2.46.0 | fgrep 2.4 | grep fc
[...]
librsvg-2.so.2.46.0-2.45.90-1.fc30.ppc64le.debug

It is possible F31 may smooth this over with LLVM 9, which should arrive later this month, and doesn't appear to suffer from this problem.

This may not affect other distributions with older toolchains. If your distribution is also affected, please post in the comments.

CentOS 8 and CentOS Stream (The Freshmaker)


Yeah, okay, we've had a lot to say about Red Hat derivatives lately. On the heels of CentOS 7's latest service release now comes CentOS 8 in a new minty flavour CentOS Stream, "a midstream distribution that provides a cleared-path for participation in creating the next version of RHEL," rebranding the "classic" CentOS build from RHEL as CentOS Linux. Mentally translating, the intention appears to be as a staging area for updates from Fedora mainline to trickle into minor releases of RHEL (and thence to mainline CentOS), using CentOS Stream to more gradually introduce updates and incorporate user feedback in a rolling release fashion rather than the typical all-at-once version churn that previously resulted. You know, like chewy candy mints that make things fresher the moment you pop one in your mouth.

That said, mainline Fedora is plenty stable for (my) daily use on POWER9 and elsewhere (we're not talking bleeding-edge saddle sore Rawhide, kids), and Fedora will still be the ultimate upstream, so while I think this will help CentOS developers dogfood changes more gradually I'm having difficulty envisioning the small slice of conservative-but-not-that-conservative users this will appeal to as a daily driver. More likely people will simply regard it as the "public beta" channel for CentOS and RHEL, and I think that will be the actual role it serves regardless of the frilly language.

The CentOS Download site is not currently showing Power ISA (or other AltArch) builds for either CentOS 8 or CentOS Stream yet, but I expect these to emerge soon. It will be interesting to see if big-endian ppc64 is still supported when they do, but there should be POWER9 and "generic" little-endian builds at minimum.

Low-level change to Firefox 70 and ESR coming


If you are using Firefox on 64-bit Power, you'll want to know about bug 1576303 which will be landing soon on the beta and ESR68 trees to be incorporated into 70 and the next ESR respectively. This fixes a long-standing issue with intermittent and difficult to trace crashes (thanks to Ted Campbell at Mozilla for figuring out the root cause and Dan Horák for providing the hardware access) due to what in retrospect was a blatant violation of the ELF ABI in xpconnect, which glues JavaScript to native XPCOM. This needed several dodgy workarounds until we found the actual culprit.

The patch is well tested on multiple little-endian systems including this Talos II, but because it's an issue with register allocation in function calls the issue also theoretically affects big-endian Power even though we haven't seen any reports. I'm pretty sure the code I wrote will work for big-endian but none of my big-endian Power systems run mainline Firefox (and TenFourFox even on the G5 is 32-bit, where the problem isn't present). If you're using a big-endian system, you may want to pull a current release and make sure there is no regression in the browser with the changes; if there is and you can bisect to it, post in the bug so we can do a follow-up fix. On the other hand, if you're building from an old ESR such as 52 (the last non-Rust-required one), you may want to backport this fix because the problem has been there pretty much since it was first written.

Stuff like this actually proves Linus Torvalds' point that "as long as everybody does cross-development, the platform won't be all that stable." Linus was talking about ARM-based servers being undercut by a dearth of ARM-based PCs, but the point is also true here: 64-bit Power may do well in the data center but it was rarely used for workstations other than the Power Mac G5 and the small number of non-Apple PowerPC 970 towers, meaning this bug went undiscovered until people like us finally started dogfooding Power-based desktops again. (For that matter, the official PowerPC Mac OS X builds of Firefox were also always 32-bit, even on the G5, so no one would have noticed it there.) There's just no substitute for improving the quality and quantity of software for Power ISA like having one under your desk, and as the number of machines increases I expect we'll get more of these ugly corner bugs ironed out in other packages too.