Posts

Latest Posts

Firefox 123 on POWER


Finally getting back towards something approaching current. Firefox 123 is out, adding platform improvements, off-main-thread canvas and the ability to report problematic sites. Or, I dunno, sites that work just fine but claim they don't, like PG&E, the soulless natural monopolist Abilisks of northern California. No particular reason. The other reported improvement was PGO optimization improvements on Apple silicon Macs and Android. How cute! Meanwhile, our own PGO-LTO patch got simpler and I was able to drop the other changes we needed for Python 3.12 on Fedora 39, which now builds with this smaller PGO-LTO patch and .mozconfigs from Firefox 122. Some of you reported crashes on Fx122 but I haven't observed any with that release or this one built from source. Fingers crossed.

Early Power11 signals in the kernel


A number of people have alerted me to some new activity around Power11 in the Linux kernel, such as this commit and a PVR (processor version register) value of 0x0F000007. It should be pointed out that all this is very preliminary work and is likely related to simulation testing; we don't even know for certain what node size it's going to be. It almost certainly does not mean such a CPU is imminent, nor does this tell us when it is. Previous estimates had said 2024-5, but the smart money says no earlier than next calendar year and probably at the later end of that timeframe.

That said, the reputed pressures around Power10 that caused closed IP to be incorporated are hopefully no longer as acute for Power11, and off-the-books discussions I've had suggest IBM internally acknowledges its strategic mistake. That would be good news for Power11, but it's not exactly clear what this means for Solid Silicon and the S1 because S1's entire value proposition is being Power10 without the crap. While S1 will certainly come out before Power11, we still don't know when, and if there's a short window between S1 and a fully open Power11 then S1 could go like Osborne.

"Short" here will be defined in terms of how much work it takes to adapt the Power11 reference system. IBM understandably always likes to sell its launch systems first and exclusively before the chips and designs trickle down. The Talos II and to a lesser extent the Blackbird are a relatively straightforward rework of Romulus (POWER9's reference), so one would think adapting Power11 would similarly require little adjustment, though Romulus used the ASPEED BMC and any Raptor Power11 would undoubtedly use (Ant)arctic Tern/Solid Silicon's X1. In contrast, there'd be a bit more work to port Rainier (Power10) to S1 since the RAM would be direct-attach instead of OMI and there may be differences to account for with PCIe, plus the BMC change. The last estimate we had for the S1 machines was late 2024; putting this all together and assuming that date is at all accurate, such a system may have a year or two on the market before Power11 exits its IBM-exclusive phase.

That could still be worth it, but all of this could be better answered if we had a little more insight into S1 and its progress, and I've still got my feelers out to talk to the Solid Silicon folks. You'll see it here first when I get a bite.

Firmware 2.10 available for Talos II and Blackbird


Raptor has released firmware updates for Talos II and Blackbird (version 2.10). I'm still between residences but I intend to install this myself on both my machines in the next couple days. The biggest update is that Skiroot makes a big jump to kernel 6.6 which hopefully should solve glitches like Petitboot pooping its pants on XFS volumes with stuck log entries, not that that's ever happened to me twice, and there is a tweak for sporadic crashes on systems with more than 8 cores. Officially this wasn't a supported configuration on the Blackbirds, but there are people who try, and it's definitely appreciated for T2 and T2 Lite. Hostboot, HCODE, Skiboot, Skiroot and Petitboot are also all pulled up to current, InfiniBand drivers are now live in Skiroot (and thus Petitboot), and the Hostboot runtime has been compressed to give you more headroom in the BOOTKERNFW partition.

An intriguing change for the future also in this release is to enable firmware component signature checks during IPL by default. But using what key, you ask? You didn't sign anything! The key is the insecure known key in the official firmware builds, which adds no security currently and doesn't look any different from before, but provides the framework for you signing it later. At that point you'd sign it with your own key and provide that; now everything is already set up, and the process should "just work" with fewer steps. This is a long-running entry I keep intending to write and this is a good excuse to do that in the near future.

ICYMI: Hugo Landau explains how the Broadcom BCM5719 was freed


In case you missed it, Hugo Landau in December appeared at the 37th Chaos Communication Congress (37C3) to talk about how the Broadcom BCM5719 was freed in our favourite OpenPOWER systems. Sure, he's got lots of information on his blog, and you can look at the firmware written to his spec by Evan Lojewski, but there's nothing like hearing him explain the process of how he got it all open (and your jaw dropping to hear that the firmware never checks the signature). It's a good hero story but also reinforces the standard principles of how to make hardware your own, including hardware not particularly amenable to subversion. And Broadcom's a good example of that, by golly. Thanks to Jeremy Rand for the tip.

Firefox 122 on POWER


Right now during our relocation I'm not always in the same ZIP code as my T2, but we've still got to keep it up to date. To that end Firefox 122 is out with some UI improvements and new Web platform support.

A number of changes have occurred between Fx121 and Fx122 which improve our situation in OpenPOWER world, most notably being we no longer need to drag our WebRTC build changes around (and/or you can remove --disable-webrtc in your .mozconfig). However, on Fedora I needed to add ac_add_options --with-libclang-path=/usr/lib64 to my .mozconfigs (or ./mach build would fail during configuration because Rust bindgen could not find libclang.so), and I also needed to effectively fix bug 1865993 to get PGO builds to work again on Python 3.12, which Fedora 39 ships with. You may not need to do either of these things depending on your distro. There are separate weird glitches due to certain other components being deprecated in Python 3.12 that do not otherwise affect the build.

To that end, here is the updated PGO-LTO patch I'm using, as well as the current .mozconfigs:

Optimized

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24" # or as you like
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9 -fpermissive"
ac_add_options --enable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options --with-libclang-path=/usr/lib64
ac_add_options MOZ_PGO=1

export GN=/home/censored/bin/gn # if you haz
export RUSTC_OPT_LEVEL=2

Debug

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24" # or as you like
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9 -fpermissive -DXXH_NO_INLINE_HINTS=1"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd
ac_add_options --without-wasm-sandboxed-libraries
ac_add_options --with-libclang-path=/usr/lib64

export GN=/home/censored/bin/gn # if you haz

Fedora 39 mini-review on the Blackbird and Talos II (and other woes)


Merry Christmas and Happy Holidays: what a long strange trip it's been trying to get to this point, between trying to get a place to live at the new $DAYJOB and fix the Blackbird, which seemed to have gotten its BMC setting scrambled, then get it and the Talos II upgraded to Fedora 39 so that I can get back to work on the Firefox JIT. Here we are finally, just in time to write up our usual mini-review (see what I wrote for Fedora 38). As I always say in these mini-reviews, Fedora was one of the first mainstream distributions to support POWER9 out of the box, it's still one of the top distributions OpenPOWER denizens use and its position closest to the bleeding, ragged edge is where we see problems emerge first and get fixed (hopefully) before they move further downstream. That's why it's worth caring about it even if you yourself don't run it.

Also, as usual, recall both my T2 and Blackbird are configured to come up in a text boot instead of gdm and I start KDE manually from there. I still test GNOME on both systems, but my primary desktop environment has been KDE Plasma for several versions now, and I always recommend a non-graphical boot as a recovery mechanism in case your graphics card gets whacked by something or other. On Fedora this is easily done by ensuring the symlink /etc/systemd/system/default.target points to /lib/systemd/system/multi-user.target.

Unfortunately, dnf kernel updates still! don't seem to properly update grub's config (basically bug 1921479, showing messages like 0ed84c0-p94177c1: integer expression expected during the process), so the process remains largely unchanged from F38:

dnf upgrade --refresh # upgrade prior system and DNF
grub2-mkconfig -o /boot/grub2/grub.cfg # force grub to update
dnf install dnf-plugin-system-upgrade # install upgrade plugin if not already done
dnf system-upgrade download --refresh --releasever=39 # download F39 packages
dnf system-upgrade reboot # reboot into upgrader

I do the Blackbird first as a check, since it can afford to be down waiting for updates, but I can't have that happening with the Talos II since it's my primary workstation. Unfortunately, while the packages downloaded okay, the actual upgrade process didn't do anything. After a couple failed attempts where it rebooted back into 38 seemingly unchanged, I watched it like a hawk and observed the following:

The installer was saying that the Fedora 39 GPG signatures weren't valid "yet" and so nothing that was downloaded could be verified. This turns out to be the same basic issue that bedevils Raspberry Pi 4 and earlier models that lack a realtime clock (bug 2242759), but the Blackbird has an RTC, so what gives?

I checked the CMOS battery coin cell and got a full 3.0 volts, and couldn't find anything otherwise wrong with the installer itself. The answer was labouriously going through the logs and finding that the system time was about two years in the past, confirmed in the Petitboot shell. This gets fixed in a normal boot by chrony synching up with NTP, but that apparently doesn't happen with the installer. Worse, it looked like everything had somehow gotten scrambled in the Blackbird's BMC because I couldn't log on with the admin password and fix the time (either from the SSH server or the web interface). Time to crack the case and connect to the BMC's serial port.

Connect your terminal at 115200bps, 8-N-1 to the inside 9-pin serial port headers. We're basically following these instructions to reset the BMC's internal persistent storage, which will zap everything including the BMC password (if you're an old Mac user like me, think of this as the ASPEED equivalent of "zapping PRAM"). However, a wrinkle here is that the system can reboot multiple times, so it's important to change the bootargs as many times as it resets back to U-Boot until the kernel actually comes up.

Here's what you might see in your terminal program. Make sure the terminal program is running and the serial port is connected before you apply power to the system to start up the BMC, or you won't be early enough to talk to U-Boot.

DRAM Init-V12-DDR4
0abc1-4Gb-Done
Read margin-DL:0.3725/DH:0.3803 CK (min:0.30)


U-Boot 2016.07 (Feb 19 2020 - 11:51:39 +0000)

       Watchdog enabled
DRAM:  496 MiB
Flash: 32 MiB
In:    serial
Out:   serial
Err:   serial
Net:   aspeednic#0
Hit any key to stop autoboot:  0 
ast# setenv bootargs console=ttyS4,115200n8 root=/dev/ram overlay-filesystem-in-ram rw
ast# boot
## Loading kernel from FIT Image at 20080000 ...
   Using 'conf@aspeed-bmc-opp-blackbird.dtb' configuration
   Trying 'kernel@1' kernel subimage
     Description:  Linux kernel
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x20080128
     Data Size:    2656176 Bytes = 2.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x80001000
     Entry Point:  0x80001000
     Hash algo:    sha1
     Hash value:   1815ece74a2e27241c471b9ed87885071dd9e143
   Verifying Hash Integrity ... sha1+ OK
## Loading ramdisk from FIT Image at 20080000 ...
   Using 'conf@aspeed-bmc-opp-blackbird.dtb' configuration
   Trying 'ramdisk@1' ramdisk subimage
     Description:  obmc-phosphor-initramfs
     Type:         RAMDisk Image
     Compression:  lzma compressed
     Data Start:   0x20310f88
     Data Size:    1583941 Bytes = 1.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    sha1
     Hash value:   55e0853d6ad703d5ea225837d85223a73f7cf3a4
   Verifying Hash Integrity ... sha1
DRAM Init-V12-DDR4
0abc1-4Gb-Done
Read margin-DL:0.3745/DH:0.3784 CK (min:0.30)


U-Boot 2016.07 (Feb 19 2020 - 11:51:39 +0000)

       Watchdog enabled
DRAM:  496 MiB
Flash: 32 MiB
In:    serial
Out:   serial
Err:   serial
Net:   aspeednic#0
Hit any key to stop autoboot:  0 
ast# setenv bootargs console=ttyS4,115200n8 root=/dev/ram overlay-filesystem-in-ram rw
ast# boot
## Loading kernel from FIT Image at 20080000 ...
   Using 'conf@aspeed-bmc-opp-blackbird.dtb' configuration
   Trying 'kernel@1' kernel subimage
     Description:  Linux kernel
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x20080128
     Data Size:    2656176 Bytes = 2.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x80001000
     Entry Point:  0x80001000
     Hash algo:    sha1
     Hash value:   1815ece74a2e27241c471b9ed87885071dd9e143
   Verifying Hash Integrity ... sha1+ OK
## Loading ramdisk from FIT Image at 20080000 ...
   Using 'conf@aspeed-bmc-opp-blackbird.dtb' configuration
   Trying 'ramdisk@1' ramdisk subimage
     Description:  obmc-phosphor-initramfs
     Type:         RAMDisk Image
     Compression:  lzma compressed
     Data Start:   0x20310f88
     Data Size:    1583941 Bytes = 1.5 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: unavailable
     Entry Point:  unavailable
     Hash algo:    sha1
     Hash value:   55e0853d6ad703d5ea225837d85223a73f7cf3a4
   Verifying Hash Integrity ... sha1+ OK
## Loading fdt from FIT Image at 20080000 ...
   Using 'conf@aspeed-bmc-opp-blackbird.dtb' configuration
   Trying 'fdt@aspeed-bmc-opp-blackbird.dtb' fdt subimage
     Description:  Flattened Device Tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x203089e8
     Data Size:    34013 Bytes = 33.2 KiB
     Architecture: ARM
     Hash algo:    sha1
     Hash value:   f53d8ad3a7c573a4903f910fd124507c73f0bbfb
   Verifying Hash Integrity ... sha1+ OK
   Booting using the fdt blob at 0x203089e8
   Loading Kernel Image ... OK
   Loading Ramdisk to 9ea16000, end 9eb98b45 ... OK
   Loading Device Tree to 9ea0a000, end 9ea154dc ... OK

Starting kernel ...

[...]

Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) 0.1.0 blackbird ttyS4

blackbird login: root
Password: 0penBmc
root@blackbird:~# flash_eraseall /dev/mtd/rwfs
Erasing 64 Kibyte @ 400000 - 100% complete.
root@blackbird:~# reboot

Notice that it fell back to U-Boot once before actually going into the OpenBMC kernel. You need to set the bootargs both times (because the watchdog is aggressive and will reboot after even a brief period of inactivity, I recommand cutting and pasting that setenv command instead of typing it). Once that's done, the default 0penBmc password will work, and we can reset the persistent storage. You may need to powercycle the BMC after this too as the manual reboot may not be enough.

With that corrected, I was able to log into the web interface and set the time manually, though I also ensured the time owner was Host (i.e., not the BMC) so that hopefully the system can deal with this itself on startup again. Petitboot confirmed the time was correct and the install succeeded. On my Blackbird I got the usual graphical progress bar on the ASPEED BMC framebuffer; on my T2 with a BTO AMD WX7100, as long as you ensure the kernel is manually selected through Petitboot on startup, you'll still get to see the install log live as text. If you forget to manually select the kernel and the system comes up to an apparently black screen, you can either monitor on the serial port, or from a connected system viewing the serial console over the BMC's web server, or by logging into another VTY with CTRL-ALT-F2 or as appropriate as root and periodically issuing dnf system-upgrade log --number=-1 to watch log updates.

The update did not cause a stuck XFS log entry on the Blackbird, but after the reboot I still had to do one more grub2-mkconfig -o /boot/grub2/grub.cfg and a restart to ensure the right kernel and version were being used. Currently the kernel version as of this writing is 6.6.7.

Fedora 39 is apparently the last hurrah for the X11 session of KDE Plasma, largely because the Fedora KDE SIG doesn't want to deal with it. While Plasma 6 still has a legacy X11 session, KDE as provided in the Fedora spin won't have it — Wayland Wasteland or bust. (In particular, this will obsolete both kwin-x11, presumably including /usr/bin/kwin_x11 and /usr/bin/startplasma-x11, and plasma-workspace-wayland.) Naturally this directly affects me personally, so let's start with the system with the worst Wayland support, our stripped-down Blackbird. All it has is the ASPEED BMC framebuffer connected over HDMI.

(GNOME Wayland started manually with XDG_SESSION_TYPE=wayland /usr/libexec/gnome-session-binary --builtin)
(KDE Plasma 5 Wayland started manually with /usr/bin/startplasma-wayland)

Both GNOME 45 (GNOME's own X11 session support removal is further behind, and doesn't appear imminent) and Plasma 5 in F39 are still limited to 1024x768 with the on-board HDMI. This is a terrible limitation that yet again remains unaddressed, especially for those people trying to run their systems completely blobless, and also affects Arctic Tern which uses the same IT66121FN HDMI transceiver PHY. While performance remains stably improved over the horrid morass it used to be, further fixes appear to be stuck on a plateau.

The good news is, a lot of the graphical glitches that plagued F38 and earlier in GNOME on the AST framebuffer (both Wayland and X11) appear to be corrected in GNOME 45. Window decorations and window movement seem to work properly again and performance is good enough. This places me in the unusual situation of recommending GNOME to blobless or no-GPU OpenPOWER users, even though it's not a particularly lightweight desktop environment, simply because you'll still be able to run it under X11 at least for a little while and you can add an X11 modeline for higher resolution — as I've done. Otherwise, start budgeting for a video card.
Plasma 5, of course, continues to work fine without a GPU in X11. Enjoy that while it lasts.

Next was the Talos II, which has a custom KDE theme and a lot more packages. Its clock is fine, so that wasn't the problem, but the install left a stuck log entry on the XFS root again and Petitboot duly puked. I haven't (quite, anyway) gotten to the point of replacing the XFS root with ext4, but I now have a script to automatically do the volume group gyrations and XFS repair on the Blackbird instead of typing in commands, and for future upgrades I'll be doing ipmitool chassis bootdev safe (at the suggestion of a commenter) before running the installer so that if I can't scan for devices, I can at least restart, immediately jump into the Petitboot shell and do repairs if necessary. Note that turning off the automatic device scan can impair autobooting, such as me remotely bringing up the system from the BMC after a power failure, so I've set it back to normal with ipmitool chassis bootdev none after the upgrade and repair were successful.

So far F39 is running fine on the T2 as well, with only a couple old compat packages that didn't make the jump (meanwhile, for those of you hanging on to F37, that support is already over). It's F40 where the problems are expected to start for me with KDE, which as of this writing is scheduled for mid-April 2024. As usual no one cares, least of all the Wayland community, which has simply fallen back on strong-arm tactics to drive adoption and functionality be damned. I'll have to evaluate my choices then based on my app mix and how well Wayland-only KDE can handle them, and it won't be just OpenPOWER users like me in that boat.