Latest Posts

Progress on the Firefox ppc64le JIT


A picture is worth a thousand Wasm opcodes. This is further along than we've gotten on earlier drafts. More soon.

Partial ppc64le JIT available again for Firefox 115ESR


I've been rehabilitating the old ppc64le JIT against more current Firefoxes, and a set of patches you can apply to the current 115ESR is now available. This does not yet include support for Ion or Wasm; the first still has some regressions, and the second has multiple outright crashes. Still, even with just the Baseline Interpreter and Baseline Compiler it is several times faster on benchmarks than the interpreter-only 115. I've also included the relevant LTO-PGO and WebRTC patches so you can just apply the changesets in numerical order and build. The patches and the needed .mozconfigs for building either an optimized browser or a debug JS shell (should you want to poke around) are in this GitHub issue.

While this passes everything that is expected to pass, you may still experience issues using it, and you should not consider it supported. Always back up your profile first. But it's now an option for those of you who were using the previous set of patches against 91ESR.

ppc64le JIT now officially landing (again) in DOSBox Staging


Waaaay back when, I wrote up a basic dynamic recompilation JIT for vanilla DOSBox (the most well-known of the DOS-specific emulators, if you've been under a rock for a while), which increases performance in x86 protected mode by as much as several times. This was an unofficial patch and I just kept it out of the tree, since the 32-bit PowerPC JIT it was based on wasn't part of it either.

Well, little did I know, but the patch got picked up as part of the DOSBox Staging spin six months later and apparently ran fine until an upstream commit broke it. I never noticed because I was happily using my old build, but Trung Lê did and reported it. So I fixed it and added proper support for 4K or 64K page sizes, and it was committed to the source tree today as part of 0.81. If you can't wait, build from source today, or wait for your package manager to pick it up whenever 0.81 gets formally released.

Firefox 117 on POWER


Now that the Talos II is upgraded and tuned up, it's back to development work, starting with (after a TenFourFox patch dump) Firefox 117. Maybe it's just me, but it seems subjectively zippier than 116, even accounting for the cruft that builds up during long browser sessions, and there are some notable developer-facing improvements. As usual, for the long-playing bug 1775202, either put --disable-webrtc in your .mozconfig if you don't need WebRTC, or tweak third_party/libwebrtc/moz.build with the patch from Fx116. The browser otherwise builds and works with a tweaked PGO-LTO patch and the .mozconfigs from Firefox 105.

It's a plug and pray night


The Talos II got an upgrade today but not without a lot of messing around. Some of you have noted I've said little about further Firefox JIT updates and that's because of 1) $DAYJOB and 2) running dangerously low on space on the 1TB Samsung 960 EVO NVMe SSD mounted on /home, so having lots more source code sitting around wasn't happening.

Well, as of Monday I'm now officially between jobs (don't worry, I'll be getting a paycheck again in October) and I finally got the T2 to cooperate with a new 2TB Samsung 980 PRO. At the same time I also replaced the Raptor BTO Marvell 88SE9235 SATA card with a JMicron JMB582 card. It's two ports instead of four, but I'm only using it for the two optical drives, and it seems to be much more reliable than the Marvell which would sometimes come up with drives and other times stall out.

But getting them to all work together was unexpectedly tricky. Here's what's in there now.

% lspci
0000:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 7100]
0000:01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
0005:02:00.0 Multimedia video controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0030:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963
0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:01:00.0 SATA controller: JMicron Technology Corp. JMB58x AHCI SATA controller
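
To see at a glance which host bridges actually have something behind them, a listing like this can be grouped by PCI domain (the first four hex digits of each address). The following is a minimal Python sketch under stated assumptions: `group_by_domain`, `ADDRESS` and `SAMPLE` are names made up for illustration, the sample lines are copied from the listing above, and it relies only on the observation that every domain in that listing contains exactly one host bridge entry.

```python
import re
from collections import defaultdict

# A few lines copied from the listing above. On a live system you could feed
# in the output of `lspci -D` instead (-D always prints the domain prefix).
SAMPLE = """\
0000:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0000:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon Pro WX 7100]
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:01:00.0 SATA controller: JMicron Technology Corp. JMB58x AHCI SATA controller
"""

# domain:bus:device.function followed by the device description
ADDRESS = re.compile(r"^([0-9a-f]{4}):[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\s+(.*)$")


def group_by_domain(listing):
    """Map each PCI domain to the devices in it, leaving out the host bridge."""
    domains = defaultdict(list)
    for line in listing.splitlines():
        match = ADDRESS.match(line.strip())
        if match is None:
            continue
        domain, description = match.groups()
        devices = domains[domain]  # creates the entry even if nothing is attached
        if "Host Bridge" not in description:
            devices.append(description)
    return dict(domains)


if __name__ == "__main__":
    for domain, devices in sorted(group_by_domain(SAMPLE).items()):
        print(domain, devices if devices else "(nothing behind this bridge)")
```

A domain that maps to an empty list is a host bridge with nothing detected behind it, which is what a card that fails to enumerate looks like from this side.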

Originally, the NVMe drive I bought was a 2TB WD Black SN850X. This worked great in an external USB3 enclosure. I rsynced /home to it and put it into the PCIe carrier and restarted, and it failed to show up in Petitboot, lspci or Fedora. I tried a different passive adaptor on the off-chance that had something to do with it and moved it around the available slots, but nothing made it work. Later I found an old post on the Raptor forums reporting a similar problem, and I wasn't willing to buy one of those pricey switched multi-M.2 cards just to try it.

Since the boot drive is (still) a Raptor BTO Samsung 960 and the drive I was replacing was also a Samsung, I decided the cheaper option would be to just buy another Samsung, though the TLC flash in the 980 PRO makes it more like a 980 EVO in my book. Anyway, I left everything copying overnight, came back this morning, pulled the PCIe carrier, swapped the SSD sticks and fired it back up, and Petitboot wouldn't see it either. (I had installed the JMicron card at the same time and it did show up, but that wasn't too helpful just then.) At this point two drives acting exactly the same way, including one that would be very likely to work, made me suspicious this was a configuration problem.

This is a fully-populated dual-8 T2, so both CPUs and all five PCIe slots are live. At this point, other than the BMC, Ethernet and USB, the only device on the first CPU's slots was the Raptor BTO AMD Radeon Pro WX7100 workstation card in the x16 slot (the x8 is open); the original NVMe SSDs and the Marvell SATA card occupied the three lower slots (x16, x16, x8) handled by the second CPU.

I started off by pulling the SATA card completely and just leaving the two SSDs and the WX7100. Sforza POWER9s support three PCIe controllers (PECs), 48 PCIe lanes and six PCIe host bridges (PHBs) per processor module. In theory this looks like three x16 slots per CPU, but in practice PEC1 on each CPU is always bifurcated x8x8 and PEC2 is optionally trifurcatable x8x4x4. Plus, there are on-board resources also competing for those lanes, so some of the T2 slots are necessarily bifurcated and others aren't. The x8 slot on CPU0 is actually a bifurcated PEC1, with x4 allocated to the Microsemi PM8068 BTO option (whether present or not), and its phantom PEC2 is split between the BMC, the Broadcom Ethernet controllers and the TI USB 3.0 host controller. On CPU1, slot three is PEC2 x16, slot four is PEC0 x16 (never bifurcated on either CPU), and slot five is PEC1 x8, with x4 allocated to OCuLink.
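
The lane arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch that transcribes the numbers and slot assignments from the text; the names `PEC_SPLITS`, `SLOTS`, `lanes_per_cpu`, `max_phbs_per_cpu` and `slots_on` are invented for illustration and don't correspond to any firmware or kernel interface.

```python
# Possible lane splits for each PCIe controller (PEC) on one Sforza POWER9,
# transcribed from the description above. Each tuple is one way to carve the
# controller into host bridges (PHBs).
PEC_SPLITS = {
    "PEC0": [(16,)],             # never bifurcated
    "PEC1": [(8, 8)],            # always bifurcated
    "PEC2": [(16,), (8, 4, 4)],  # optionally trifurcated
}

# Talos II slot map as described in the text:
# slot number -> (CPU, PEC, slot width in lanes).
SLOTS = {
    1: ("CPU0", "PEC1", 8),
    2: ("CPU0", "PEC0", 16),
    3: ("CPU1", "PEC2", 16),
    4: ("CPU1", "PEC0", 16),
    5: ("CPU1", "PEC1", 8),
}


def lanes_per_cpu(splits):
    """Total lanes, taking the widest configuration of each PEC."""
    return sum(max(sum(option) for option in options) for options in splits.values())


def max_phbs_per_cpu(splits):
    """Most host bridges you can get, taking the finest split of each PEC."""
    return sum(max(len(option) for option in options) for options in splits.values())


def slots_on(cpu):
    """Slot numbers wired to the given CPU."""
    return sorted(n for n, (owner, _pec, _width) in SLOTS.items() if owner == cpu)


if __name__ == "__main__":
    print(lanes_per_cpu(PEC_SPLITS), "lanes,", max_phbs_per_cpu(PEC_SPLITS), "PHBs per CPU")
    print("CPU0 slots:", slots_on("CPU0"), "CPU1 slots:", slots_on("CPU1"))
```

Three x16 controllers give the 48 lanes, and the finest splits (one, two and three bridges respectively) give the six PHBs. The slot map leaves out the on-board consumers (PM8068, OCuLink, BMC, Ethernet, USB) that take the remaining lanes.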

The old 960 EVO and the unresponsive new 980 were originally in slot 5 (CPU1, PEC1) and the boot 960 in slot 4 (CPU1, PEC0). Moving the new 980 into slot 1 (CPU0, PEC1) finally allowed it to coexist and be mountable with the boot 960, so next I put the JMicron SATA card into the newly available slot 5 and restarted ... and the x16 video card in slot 2 (CPU0, PEC0) failed to come up. Petitboot was fine on serial.

Figuring I had exceeded some maximum on CPU0 somehow, I moved the WX7100 to the open x16 on slot 3 (CPU1, PEC2). Not only did I still get a black screen, but I also got the dreaded PHB Freeze/Fence detected ! on that slot in the Hostboot output, which usually means the system planar is not happy with you now.

I returned to the immediately preceding configuration, returning the video card to CPU0 PEC0 in slot 2. Putting the 960 into slot 5 and the JMicron card into slot 4 also failed, so within those constraints the only thing I hadn't tried yet was putting the JMicron SATA card in slot 3 instead of slot 5. This seemed technically disgusting since I was wasting an entire x16 slot on a miserable little x1 card, and of course it worked.

The JMicron has a bright blue LED, so with the final configuration and the existing LEDs on the board I suppose I could slap a Honda hood ornament on the front and go street racing. This configuration leaves slot 5 unoccupied, which is an x8, so in the end it's no worse than what I started with (slot 1 was originally free).

I'm not sure what the moral of the story is here. It's possible the Marvell was misbehaving because of conflicts too, but it never failed to show up in lspci even though the drives connected to it sometimes did, whereas the new NVMe drive didn't even show up as a device, let alone a mountable filesystem. It also seems like the Radeon doesn't like being in a bifurcatable slot, though to test this I'd have to try it in slot 4. On the other hand, slot 5 should have been exactly the same as slot 1, yet neither the new 980 nor the SN850X would work in slot 5, nor would the JMicron card. Perhaps the SN850X would work fine in slot 1 as well. If I'm inside the case again to do something else, I should test some or all of these theories.

One thing that is worth remembering is that a PCIe device that initially fails to work or be recognized when installed may simply be picky about where you put it depending on what other devices are present. Unfortunately that means a whole lot more trial and error when you have multiple devices, and tonight I'm not interested in pushing my luck further. Once I've built the next Firefox (article to follow), then it's time to get back to work with a terabyte more space to expand into. And that'll hold a lot of source trees.

Linux 6.5


Linux 6.5 is released, including deprecation of the old SLAB allocator, faster PCIe waits (especially notable for us when things like SATA controllers start timing out), faster parallel direct I/O on ext4, and improvements to the workqueue. There's not a lot notable for Power ISA, though ELFv2 is now the default for 64-bit big-endian kernel builds, and if you're running Power10 this release adds support for the DEXCR SPR (Dynamic Execution Control Register) which helps to reduce speculative execution risk. Expect to see 6.5 in bleeding-edge distros like Fedora soon (and almost certainly in Fedora 39).