Showing posts from 2024

Baseline JIT patches available for Firefox ESR128 on OpenPOWER


It's been a long hot summer at $DAYJOB and I haven't had time for much of anything, but I was granted some time this week to take care of an unrelated issue and seized the opportunity to get caught up.

The OpenPOWER Firefox JIT still crashes badly in Wasm and Ion for reasons I have yet to ascertain, but the Baseline Interpreter and Baseline Compiler stages of the JIT continue to work great and are significantly faster than the basic interpreter (even in a PGO-LTO build), so I did the needful and finally got them pulled up to the new Extended Support Release, which is Firefox 128.

I then spent the last two days bashing out crashes and bugs, including a regression from Firefox's new WebAssembly-based in-browser translation engine. The browser chrome now assumes that WebAssembly is always present, but on JIT-less tier-3 machines (or partially implemented JITs like ours, and possibly where Wasm is disabled in prefs) it isn't, so it hits an uncaught error which then blows up substantial portions of the browser UI like the stop-reload button and context menus. The Fedora official ppc64le build of Firefox 128.0.3 is affected as well; I filed bug 1912623 with a provisional fix. Separately, all JIT and JavaScript tests pass in multiple permutations of Baseline Interpreter and Baseline Compiler, single- and multi-threaded.

As a sign of confidence I've been dogfooding it for the last 24 hours with my typical massive number of tabs and add-ons and can't get it to crash anymore, so I'm typing this blog post in it and using it to upload its own changesets to Github. Grab the ESR source from Mozilla (either pull a tree with Mercurial or just download an archive) and apply the changesets in numerical order, though after bug 1912623 is fixed you won't need #823094. The necessary .mozconfig for building an LTO-PGO build, which is what I'm using, is also in that issue; it's pretty much the same as earlier ones except for --enable-jit.
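If you've saved the changesets into one directory, applying them in order can be scripted. This is only a sketch: the <number>.patch naming, the two directory arguments and the function names are assumptions for illustration (the actual attachments may be named differently), and it relies on GNU sort -V for the numeric ordering.

```shell
#!/bin/sh
# Hypothetical layout: each changeset saved as <number>.patch in one directory.

# Print the patch files in ascending numeric order, one per line.
list_changesets() {
    for f in "$1"/*.patch; do
        # Skip the unexpanded glob if the directory has no patches.
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort -V   # version sort puts 9.patch before 10.patch
}

# Apply each changeset to the source tree, stopping at the first failure.
apply_changesets() {
    patch_dir=$1
    src_tree=$2
    list_changesets "$patch_dir" | while IFS= read -r p; do
        echo "Applying $p"
        patch -d "$src_tree" -p1 < "$p" || exit 1
    done
}
```

Something like apply_changesets ~/esr128-patches ~/firefox-esr128 would then walk the directory; leave a changeset's file out of the directory if you no longer need it.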

Little-endian POWER9 remains the officially supported architecture. This version has not been tested on POWER8 or big-endian POWER9, though the JIT should still statically disable itself even if compiled with it on, so the browser should still otherwise work normally. If this is not the case, I consider that a bug, and will accept a fix (I don't have a POWER8 system here to test against). There are no Power10 specific instructions, but I don't see any reason why it wouldn't work on a Power10 machine or on a SolidSilicon S1 whenever we get one of those.

Comments always solicited, though backtraces and reliable STRs are needed to diagnose any bug, of course. Meanwhile I've got more work cut out for me but at least we're back in the saddle for another go.

Chromium Power ISA patches ... from Solid Silicon


It appears that some of the issues observed by me and others with Chromium on Fedora ppc64le may in fact be due to an incomplete patch set, which is now available on Solid Silicon's Gitlab. If your distro doesn't support this, you now have an upstream to point them at, or you can build your own. They include the Ungoogled changes as well, even though I retain my philosophical objections to Chromium, and still use Firefox personally (I've got to get back on the horse and resume maintaining my personal builds now that I've got Plasma 6 back running on Xorg again).

Oh, yeah, it really is that Solid Silicon. You can make your own speculations from the commit log, though regardless of whether Solid Silicon is truly a separate concern or a Raptor subsidiary, it wouldn't be surprising that Raptor resources are assisting since they've kind of bet the store on the S1.

Timothy Pearson's comments in the Electron Github suggest that Google has been pretty resistant to incorporating support for architectures outside of their core platforms. This is not a wholly unreasonable position on Google's part but it's not a particularly charitable one, and unlike Mozilla, the Chrome team doesn't really have the concept of a tier-3 build nor any motivation to. That kind of behaviour is all the more reason not to encourage browser monocultures because it's not just the layout engine that causes vendor lock-in. Fortunately V8, the JavaScript engine, is maintained separately, and reportedly has been more accommodating presumably because of things like Node.js on IBM hardware (even IBM i via PASE!).

Mozilla is much more accepting of this as long as regressions aren't introduced. This is why TenFourFox patches were largely not upstreamed, since they would potentially cause problems with Cocoa widgets in later versions of macOS, though I did upstream the patches that were generally applicable. The main reason I'm still maintaining the Firefox ppc64le JIT patches outside the tree is that I still can't solve these recent startup crashes deep within Wasm code, which largely limits me to Baseline Compiler and thus is not suitable for loading into the tree yet (we'd have to also upstream pref changes that would adversely affect tier-1 until this is fixed). I still intend to pull these patches up to the next ESR, especially since Github is glacially slow now without a JIT and it's affecting my personal ability to do other tasks. Maybe I should be working on something like rr for ppc64le at the same time because stepping through deeply layered code in gdb is a great way to go stark raving mad.

A RISC-V option for your Framework laptop (how about POWER next?)


Many of you have heard of the Framework laptop, a modular system that you can DIY from a mainboard and parts or purchase fully assembled. The designs are open-sourced on Github and Framework has actively been trying to develop an ecosystem around the product.

The part that's potentially most interesting is the mainboard. Framework actively advertises the notion that you can just replace components piecemeal to upgrade, including the logic board, yet keep the same display, port loadout, keyboard, battery and so on if they still work. You can even stick the old one in a case and use it for something else, which is not only environmentally conscious but very customer-friendly.

Now the first third-party Framework mainboard is coming, and it's not x86: it's RISC-V, and it fits in their 13" chassis. A RISC-V option is of course not new in portable computers; I reviewed the ClockworkPi RISC-V DevTerm a couple years ago, which can take either an RPi ARM compute module or an Allwinner D1 based on the 1GHz RV64IMAFDCVU XuanTie C906. However, the CPU on this board is more powerful than that: a StarFive JH7110 with four SiFive U74 cores. The new Framework mainboard is based on an existing DeepComputing laptop product called "Roma;" DeepComputing now sells a more advanced version in a laptop of their own based on the octocore SpacemiT K1. Combined with the generally well-regarded Framework loadout and creature comforts, this could definitely be a product to watch.

That said, much as I was disappointed with the performance of the RISC-V DevTerm, most people are going to be similarly unimpressed with the performance of this one. Phoronix's benchmarks placed it well below the Raspberry Pi 4 (and the Orange Pi 5 crushed it), and Framework is trying to set expectations low by saying, "The peripheral set and performance aren’t yet competitive with our Intel and AMD-powered Framework Laptop Mainboards." That would certainly be an understatement, and is yet another example of the self-licking RISC-V ice cream cone getting ahead of its skis on real-world throughput. Framework also apologetically notes that the board "has soldered memory and uses MicroSD cards and eMMC for storage, both of which are limitations of the processor." Still, it's (soon to be) available and functional, and it could be mounted in one of those small desktop cases, so if you want a sidecar RISC-V machine to play with you've got another option better than yet another SBC.

But more important than that: it proves that you can put really any architecture on such a board and take advantage of the Framework, uh, framework instead of reinventing the wheel completely. So, instead of these various attempts at building a PowerPC laptop, why isn't there a Power ISA Framework mainboard? Wouldn't that approach just make more sense?

A baby Power10, if you're desperate


Are you really desperate to have your own Power10 (libre issues notwithstanding) while we wait for S1? IBM historically releases "little" versions of their servers after the launch systems have exhausted their novelty and now it's time for this generation's. If you've got 2Us in your rack, a wad of money in your wallet and an IBM salesdroid in your Rolodex, in about a month the Power S1012 could be yours.

Based on the size of the board, no one would mistake this for a Blackbird, yet it's pretty much the IBM equivalent: a single socket supporting up to eight cores. It comes as either a rackmount or in IBM's mega-tower case with four RAM slots for up to 256GB of memory. Tape and RAID are options, and it boots Linux, AIX or IBM i. If you need more sockets, there's the S1022 with a second one in the same form factor, and if you need more capacity, the 4U S1014 has you covered — and is still tower-ready in the same way that Orson Welles was suit-ready.

IBM hasn't shown as much love for their baby towers recently, though. In fact, there wasn't an IBM 2U option at all in POWER9's generation (no doubt much to Raptor's relief); if you wanted Big Blue in a Littler Box, you had to buy the 4U S914 instead (or a leftover POWER8 S812). Also, it seems like the S1012 tower's power output is gimped somewhat: the spec sheet says the rackmount can put 240W through the single CPU socket but the tower manages "only" 195W, which limits your core count. In the glory days, though, we had things like this.

This is my long-trucking POWER6 p520, the 2U baby of the old POWER6 generation. You could get it with two sockets and the same CPUs as its larger siblings, and since the POWER6 was SMT-2, I've got four threads running on its single LPAR. It has RAID and an optical drive and 16GB of RAM, with more available if you were willing to do battle with IBM Capacity on Demand codes. All in all, not bad for 2009.

Of course, I'm being very facetious in this article, because naturally none of these towers are really workstation substitutes. The S1012 (and certainly the S1022) is undoubtedly as loud as the POWER6, and while the POWER6's back baffle reduces some of the noise, it correspondingly reduces ventilation. There's a reason, after all, that I gave the thing its own room with the other geriatric servers. Plus, IBM doesn't talk to us end users: you'll have to buy it through a VAR or authorized rep. That was why I said screw it to buying a brand-spanking new POWER7 back in the day and got the POWER6, because it was used, cheaper and actually available. Which reminds me — if you have to ask how much it is, you almost certainly can't afford it. Hope you've been saving your pennies for the S1.

Rocky Linux 9.4


Rocky Linux 9.4 is out, based on RHEL 9.4, but, you know, free. (Note that Rocky Linux 8.9 doesn't come in a ppc64le version, so Rocky 9.x is your only choice.) If you want the stability of RHEL but don't like the pricetag and don't need the support, here's one of your options. As is typical for such point releases, this one primarily refreshes included software along with security updates. Boot, minimal and DVD ISOs are available for download.

End of the road for PowerPC 40x in Linux


The original PowerPC 400-series embedded chips are no longer supported in the Linux kernel as of today. Despite their prior design wins in many set top boxes, service processors and network equipment, there are no known current consumers of the code and no maintainers. The change affects the 401, 403 and 405 but, in case you were worried, it is irrelevant to the embedded PowerPC 405 variant used as an on-chip controller for OpenPOWER, since it runs the Self-Boot Engine and not mainline Linux. It also does not extend to the 44x and above, like the Amiga clone Sam440ep and Sam460ex (AmigaOne 500) boards, which use the AMCC 440EP and 440-derived AMCC 460EX respectively and thus remain supported.

Fedora 40 mini-review on the Blackbird and Talos II (and a taste of Chromium)


This is Chromium running on GNOME in Xorg in Fedora 40 on the Talos II. I think it says it all, really.

Now, I won't mince words here: I don't like Chromium on philosophical grounds and you shouldn't expect me to be complimentary. But I salute the work that went into making it run. I'll have more to say about that later.

Meanwhile, it's that time again: in the same way I preface all these mini-reviews, Fedora was one of the first mainstream distributions to support POWER9 out of the box, it's still one of the top distributions OpenPOWER denizens use and its position closest to the bleeding, ragged edge is where we see problems emerge first and get fixed (hopefully) before they move further downstream. That's why it's worth caring about it even if you yourself don't run it. Also, as always, recall both my T2 and Blackbird are configured to come up in a text boot instead of gdm and I start the GUI manually from there. I always recommend a non-graphical boot as a recovery mechanism in case your graphics card gets whacked by something or other, and on Fedora this is easily done by ensuring the symlink /etc/systemd/system/default.target points to /lib/systemd/system/multi-user.target.
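To check which target a machine will boot into, a small helper along these lines should do. It's a sketch rather than anything Fedora ships: the function names are made up for illustration, and the only path it assumes is the symlink location given above.

```shell
#!/bin/sh
# Report which target a default.target symlink points at.
# With no argument, use the symlink location named in the text.
boot_target() {
    link=${1:-/etc/systemd/system/default.target}
    basename "$(readlink "$link")"
}

# Succeed only if the symlink selects the non-graphical (text) boot.
is_text_boot() {
    [ "$(boot_target "$1")" = "multi-user.target" ]
}

# To switch to a text boot (as root), either of these should work:
#   systemctl set-default multi-user.target
#   ln -sf /lib/systemd/system/multi-user.target /etc/systemd/system/default.target
```

Running is_text_boot && echo "text boot" before an upgrade is a cheap sanity check that you'll land at a login prompt rather than gdm.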

F40 is the first release since the 2.10 Petitboot PNOR firmware update and I had high hopes that it would fix the problem with stuck XFS logs sometimes making it puke, since it leapfrogs to a much more recent kernel. Both my GPU-less 4-core Blackbird and WX7100 dual-8 Talos II were already upgraded to the current PNOR before beginning and I recommend you do the same. Don't forget to save a copy of your BOOTKERNFW flash partition if your GPU requires it since this operation will erase it (you can flash it back when it's done).

dnf still!!!!! doesn't update grub's config (bug 1921479, showing messages like 0ed84c0-p94177c1: integer expression expected during the process), so the process remains largely unchanged from F38 and F39:

dnf upgrade --refresh # upgrade prior system and DNF
grub2-mkconfig -o /boot/grub2/grub.cfg # force grub to update
dnf install dnf-plugin-system-upgrade # install upgrade plugin if not already done
dnf system-upgrade download --refresh --releasever=40 # download F40 packages
dnf system-upgrade reboot # reboot into upgrader

I always do the Blackbird first as a checkpoint, and got this:

I'm not sure what the issue was, since the Blackbird mostly runs Workstation with only a few extra packages. This didn't happen on the T2. However, I crossed my fingers with --allowerasing and I was able to get it to download on the Blackbird and install.

I also should note that there was no installation screen on either the Blackbird or T2 this time around; for both systems I needed to log in as root on an alternate VTY (Ctrl-Alt-F2 or as appropriate) and run dnf system-upgrade log --number=-1 periodically to watch the updates. You can probably also still monitor it on the virtual TTY in the BMC web interface. Both systems then rebooted (fast reboot is disabled on both) and came up clean, so no XFS burp on the T2! One more grub2-mkconfig -o /boot/grub2/grub.cfg was needed to get Petitboot's menu looking right and the install was complete. I do note with approval that Fedora's boot from Petitboot to prompt was very quick this time around. Good work there.

Now, that desktop environment. I migrated to KDE from GNOME a few releases back after GNOME started messing with my themes, but KDE Plasma 6 in F40 is now Wayland-only; startplasma-x11 doesn't even exist. There is apparently an unofficial package to restore the X11 session but I haven't tried this yet due to a bigger problem I'll get to momentarily. On the GPU-less Blackbird this is a problem because Wayland remains limited to 1024x768 over the built-in HDMI output (Xorg can be coerced up to 1920x1200 with a modeline), so if you've decided to give up and embrace the KDE Wayland Wasteland, you either get to compute like it's 1999 or you get a GPU.

GNOME, on the other hand, does still work in Xorg and performs well on both the Blackbird and T2. Set your .xinitrc to

export XDG_SESSION_TYPE=x11
/usr/bin/gnome-session

and then the usual startx will bring it up from a text boot. Which brings us to this screenshot again:

Chromium is now officially available for the first time on ppc64le as a Fedora package. However, in Xorg it has many visual glitches, and this is true whether or not you have a GPU (this was taken on the Talos II, which has the Raptor BTO WX7100 workstation card).

The reason I entertained running it under Xorg at all was that Plasma 6 pretty much broke my custom theme completely, along with a lot of my applets, even though its Wayland compositor runs fine on the Talos II. (Start it from the text prompt simply with startplasma-wayland.) But under Wayland the application appears normally.

There are many problems with Chromium on ppc64le (big endian need not apply) and I suspect the major reason is that its JIT appears unfinished. In particular, it seems like most Wasm and certain other operations will make it trap, and as such it's not yet ready for prime time. I'm sure it will continue to improve and the porters are to be congratulated for their hard work on it, but I'll still be trying to get all the pieces in Firefox to go in the same direction, and once the next ESR (128) starts hitting the beta channel we'll at least have Baseline JIT acceleration available for it while I continue to struggle with Ion and Wasm.

Before that, though, I'm deciding what to do, whether to go back to GNOME or try to piece together my custom theme again in KDE. It'll need a fair bit of work. I guess this means it wasn't a good upgrade, though not because it doesn't work on OpenPOWER; it just wasn't a good upgrade, period. I certainly hope the churn will be less in F41.

Fedora 40


Fedora 40 is now out, the latest release of the distribution I personally use on my own Talos II and Blackbird systems. (This means that Fedora 38 will go EOL in about a month.) This release is presently based on kernel 6.8.7 and GNOME 46, but does not include the anticipated new Anaconda installer or DNF5 package manager updates. Also included are gcc 14.0, GNU binutils 2.41, glibc 2.39, gdb 14.1, Golang 1.22, LLVM 18, Ruby 3.3 and PHP 8.3.

Perhaps the biggest news for this release is that an official Chromium build is available once again for ppc64le while I still spin my wheels with the Firefox JIT (Wasm is now broken again and I have not been able to figure out why). I don't like Chromium for philosophical reasons but I'm sure it will make many of you happy.

That said, this release is probably more notorious for eliminating X11 support in the KDE Plasma 6 spin, which yours truly also uses. It will be interesting to see how well that works on the GPU-less Blackbird here since I haven't seen anything to suggest the issue with 1920x1080 through the onboard HDMI has been fixed, but the trusty old BTO WX7100 in the T2 should be fine in the Wayland Wasteland. Attempts to remove the X11 session from GNOME reportedly didn't land for this release either, so we'll see how that turns out too. I usually give the repos a few days to catch up before updating and then I'll post my usual mini-review (here's the one for F39).

OpenBSD 7.5


OpenBSD 7.5 is out with multiple kernel and SMP improvements (we love SMP improvements on our multicore beasts), more hardware support, and LibreSSL 3.9.0, OpenSSH 9.6/9.7, and LLVM-clang 16.0.6. The only headline improvement specific to Power ISA in the big-endian powerpc64 port is a smoother upgrade process, but all the other advancements are welcome too. Download from any of the many mirror sites.

Firefox 124 on POWER


Firefox 124 is out, featuring additional platform improvements and some other updates not highly relevant to us. This release needs an updated PGO-LTO patch and the .mozconfigs from Firefox 122.

Firefox 123 on POWER


Finally getting back towards something approaching current. Firefox 123 is out, adding platform improvements, off-main-thread canvas and the ability to report problematic sites. Or, I dunno, sites that work just fine but claim they don't, like PG&E, the soulless natural monopolist Abilisks of northern California. No particular reason. The other reported change was improved PGO on Apple silicon Macs and Android. How cute! Meanwhile, our own PGO-LTO patch got simpler and I was able to drop the other changes we needed for Python 3.12 on Fedora 39, which now builds with this smaller PGO-LTO patch and .mozconfigs from Firefox 122. Some of you reported crashes on Fx122 but I haven't observed any with that release or this one built from source. Fingers crossed.

Early Power11 signals in the kernel


A number of people have alerted me to some new activity around Power11 in the Linux kernel, such as this commit and a PVR (processor version register) value of 0x0F000007. It should be pointed out that all this is very preliminary work and is likely related to simulation testing; we don't even know for certain what node size it's going to be. It almost certainly does not mean such a CPU is imminent, nor does it tell us when one will arrive. Previous estimates had said 2024-5, but the smart money says no earlier than next calendar year and probably at the later end of that timeframe.
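For anyone wanting to poke at that value, here is a small sketch that splits a 32-bit PVR into its upper and lower 16-bit halves, which is how the kernel's PVR_VER and PVR_REV macros slice it. The function name is invented for illustration, and I'm assuming this particular value is meant to be read that way; the kernel may simply match it as a whole.

```shell
#!/bin/sh
# Split a 32-bit PVR into its two 16-bit halves
# (upper half = version field, lower half = revision field).
pvr_fields() {
    printf 'version=0x%04x revision=0x%04x\n' \
        $(( ($1 >> 16) & 0xFFFF )) \
        $(( $1 & 0xFFFF ))
}

pvr_fields 0x0F000007   # prints version=0x0f00 revision=0x0007
```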

That said, the reputed pressures around Power10 that caused closed IP to be incorporated are hopefully no longer as acute for Power11, and off-the-books discussions I've had suggest IBM internally acknowledges its strategic mistake. That would be good news for Power11, but it's not exactly clear what this means for Solid Silicon and the S1 because S1's entire value proposition is being Power10 without the crap. While S1 will certainly come out before Power11, we still don't know when, and if there's a short window between S1 and a fully open Power11 then S1 could go like Osborne.

"Short" here will be defined in terms of how much work it takes to adapt the Power11 reference system. IBM understandably always likes to sell its launch systems first and exclusively before the chips and designs trickle down. The Talos II and to a lesser extent the Blackbird are a relatively straightforward rework of Romulus (POWER9's reference), so one would think adapting Power11 would similarly require little adjustment, though Romulus used the ASPEED BMC and any Raptor Power11 would undoubtedly use (Ant)arctic Tern/Solid Silicon's X1. In contrast, there'd be a bit more work to port Rainier (Power10) to S1 since the RAM would be direct-attach instead of OMI and there may be differences to account for with PCIe, plus the BMC change. The last estimate we had for the S1 machines was late 2024; putting this all together and assuming that date is at all accurate, such a system may have a year or two on the market before Power11 exits its IBM-exclusive phase.

That could still be worth it, but all of this could be better answered if we had a little more insight into S1 and its progress, and I've still got my feelers out to talk to the Solid Silicon folks. You'll see it here first when I get a bite.

Firmware 2.10 available for Talos II and Blackbird


Raptor has released firmware updates for Talos II and Blackbird (version 2.10). I'm still between residences but I intend to install this myself on both my machines in the next couple days. The biggest update is that Skiroot makes a big jump to kernel 6.6 which hopefully should solve glitches like Petitboot pooping its pants on XFS volumes with stuck log entries, not that that's ever happened to me twice, and there is a tweak for sporadic crashes on systems with more than 8 cores. Officially this wasn't a supported configuration on the Blackbirds, but there are people who try, and it's definitely appreciated for T2 and T2 Lite. Hostboot, HCODE, Skiboot, Skiroot and Petitboot are also all pulled up to current, InfiniBand drivers are now live in Skiroot (and thus Petitboot), and the Hostboot runtime has been compressed to give you more headroom in the BOOTKERNFW partition.

An intriguing change for the future also in this release is to enable firmware component signature checks during IPL by default. But using what key, you ask? You didn't sign anything! The key is the insecure known key in the official firmware builds, which adds no security currently and doesn't look any different from before, but provides the framework for you signing it later. At that point you'd sign it with your own key and provide that; now everything is already set up, and the process should "just work" with fewer steps. This is a long-running entry I keep intending to write and this is a good excuse to do that in the near future.

ICYMI: Hugo Landau explains how the Broadcom BCM5719 was freed


In case you missed it, Hugo Landau in December appeared at the 37th Chaos Communication Congress (37C3) to talk about how the Broadcom BCM5719 was freed in our favourite OpenPOWER systems. Sure, he's got lots of information on his blog, and you can look at the firmware written to his spec by Evan Lojewski, but there's nothing like hearing him explain the process of how he got it all open (and your jaw dropping to hear that the firmware never checks the signature). It's a good hero story but also reinforces the standard principles of how to make hardware your own, including hardware not particularly amenable to subversion. And Broadcom's a good example of that, by golly. Thanks to Jeremy Rand for the tip.

Firefox 122 on POWER


Right now during our relocation I'm not always in the same ZIP code as my T2, but we've still got to keep it up to date. To that end Firefox 122 is out with some UI improvements and new Web platform support.

A number of changes have occurred between Fx121 and Fx122 which improve our situation in OpenPOWER world, most notably that we no longer need to drag our WebRTC build changes around (and/or you can remove --disable-webrtc in your .mozconfig). However, on Fedora I needed to add ac_add_options --with-libclang-path=/usr/lib64 to my .mozconfigs (or ./mach build would fail during configuration because Rust bindgen could not find libclang.so), and I also needed to effectively fix bug 1865993 to get PGO builds to work again on Python 3.12, which Fedora 39 ships with. You may not need to do either of these things depending on your distro. There are separate weird glitches due to certain other components being deprecated in Python 3.12 that do not otherwise affect the build.
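If you're not sure where your distro keeps libclang.so, a quick search like the following may help work out what to pass to --with-libclang-path. It's a sketch only: the function name is made up, and apart from /usr/lib64 the candidate directories in the usage comment are guesses rather than known locations.

```shell
#!/bin/sh
# Print the first directory among the arguments that contains libclang.so*,
# or return non-zero if none of them does.
find_libclang_dir() {
    for d in "$@"; do
        for f in "$d"/libclang.so*; do
            if [ -e "$f" ]; then
                printf '%s\n' "$d"
                return 0
            fi
        done
    done
    return 1
}

# Usage (candidate directories other than /usr/lib64 are assumptions):
#   dir=$(find_libclang_dir /usr/lib64 /usr/lib /usr/local/lib) &&
#       echo "ac_add_options --with-libclang-path=$dir"
```

If it finds nothing, you probably need your distro's clang library package installed before configuring.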

To that end, here is the updated PGO-LTO patch I'm using, as well as the current .mozconfigs:

Optimized

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24" # or as you like
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9 -fpermissive"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options --with-libclang-path=/usr/lib64
ac_add_options MOZ_PGO=1

export GN=/home/censored/bin/gn # if you haz
export RUSTC_OPT_LEVEL=2

Debug

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24" # or as you like
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9 -fpermissive -DXXH_NO_INLINE_HINTS=1"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options --with-libclang-path=/usr/lib64

export GN=/home/censored/bin/gn # if you haz