Posts

Tonight's game on OpenPOWER: Death Rally


Let's go for another shooter, but this time a top-down one. Essentially a murderous Super Sprint, Finnish developer Remedy's first game was the exuberantly zippy Death Rally, a decidedly socially hostile top-down racer with machine guns, bombs, spikes, sabotage, splattable spectators and cold hard cash. You can pick up missions for extra money (assassinations, drug running, the usual) or just exterminate your competition for a big bonus and plow the proceeds into a better chassis, a faster engine and better tires and armour. Apogee picked it up and added Duke Nukem for extra competition along with Remedy's cadre of obviously named parody racers (just try to guess who Bogus Bill is).

While Death Rally plays well in DOSBox, and even got an official freeware Windows port, there was never a Mac version (I had to play it on my beat-up old 486) and it was never open-sourced. A few attempts have been made to decompile it but the best of these is dRally, which is 64-bit compatible and compiles out of the box (just make -j24 or as you like for your number of cores) with SDL2 on Fedora 35. The old IPX multiplayer mode isn't supported but everything else mostly works.

Once it's compiled, you'll need a copy of the game assets, of course. I have the GT Interactive retail CD; I have not tried to extract it from the Steam version, 3D Realms no longer sells it directly, and I can't find it on GoG, so let's assume you've got a disc too. For that, just copy over all the *.BPA files (make sure they remain uppercase!) and the CINEM subdirectory from the CD. Create a file called CDROM.INI that simply has one line, ./CINEM (to tell it where the movies are). Start the game in that directory (I made a directory called assets that has everything in it and just symlinked ../drally_linux for convenience).

There are two bugs I ran across, neither serious. The DOS game alternates between 640x480 and a letterboxed (in SDL) 320x200 mode; the former is used for menus and the latter for videos and actual gameplay. On my Talos II in Fedora 35 with a WX7100 there's a lot of garbage in the letterboxed area which can be improved, though not completely eliminated, with this patch. Also, at least with my GT retail copy, if you let the game sit at the menu too long it will eventually go through its brag screens and crash when it hits one that doesn't seem to be in the archive. This patch wallpapers the problem.

Otherwise, I've been gunning down other cars all afternoon, and I haven't had to fire up DOSBox to do it. Get ready to go!

Attack of the zombie PowerVR GPUs


Like rotted silicon corpses, several new recent GPU cards based on Imagination Technologies' old school PowerVR IP have emerged in the Chinese market, including the Innosilicon Fantasy One series and most recently the Moore Threads MUSA family. Don't kid yourself either about these being particularly more open than AMD and Nvidia offerings, though support for the Rogue GPUs is landing in kernel and in Mesa3D; while there is fairly good documentation on the GPU's ISA the firmware is still likely to be binary blobs (though this is no worse than the alternatives). Moreover, their performance even from the OEM specs just isn't competitive with current cards. That said, any GPU at all beats the flat framebuffer of the ASTs in our systems, and these chips may move more slowly enough to be more stable in the kernel compared to the occasional churn with amdgpu and friends. Assuming it can be flexible on page size, another alternative GPU architecture, even a basic one, would be welcome (at least until the Microwatt-based Libre-SoC is a viable option). We'll be watching that space.

Firefox 99 on POWER


Firefox 99 is out. The major change here is that the Linux sandbox has been strengthened to eliminate direct access to X11 (which is important because many of us do not live in the Wayland Wasteland). Note that the sandbox apparently doesn't work currently on ppc64le; this is something I intend to look at later when I'm done with the JIT unless someone™ gets to it first.

Unfortunately, Fx99 does not build from source on ppc64 or ppc64le and I was too busy on the JIT to do my usual smoke tests early. The offender is bug 1758610 but the patches do not apply cleanly to 99, so I have provided a consolidated diff for your convenience. You will also need a tweaked PGO-LTO patch; with those applied the .mozconfigs from Firefox 95 will work.

All three stages of the JIT (Baseline Interpreter, Baseline Compiler and Ion, as well as Wasm) now function and pass tests on POWER9 except for a couple depending on memory layout oddities; that last unexpected test failure took me almost a week and a half to find. (TenFourFox users will be happy because a couple of these bugs exist in TenFourFox and I'll generate patches to fix them there for source builders.) However, it isn't shippable because when I built a browser with it there were regressions compared to Baseline Compiler alone (Google Maps fonts look fuzzier in Ion, the asm.js DOSBOX dies with a weird out of range error, etc.). The shippable state is that Ion should be a strict superset of Baseline Compiler: it may not make everything work, but anything that worked in Baseline Compiler should also work with Ion enabled, just faster. These problems don't seem to have coverage in the test suite. You can build the browser and try it for yourself from the new branch, but make sure that you set the Ion options in about:config back to true. Keep in mind that this is 97.0a1, so it has some unrelated bugs, and you shouldn't use it with your current Firefox profile.

Smoking out these failures is going to be a much harder lift because debugging a JIT in a heavily multi-threaded browser is a nightmare, especially on a non-tier-1 platform with more limited tooling; a lot of the old options to disable Electrolysis and Fission seem to be ignored in current releases. With that in mind and with the clock counting down to Firefox 102 (the next ESR) I've decided to press on and pull down Firefox 101 as a new branch, drag the JIT current with mozilla-central and see if any of the added tests can shed light on the regressions. A potential explanation is that we could have some edges where 32-bit integer quantities still have the upper word set (64-bit PowerPC doesn't really have comprehensive 32-bit math and clearing the upper bits all the time is wasteful, so there are kludges in a few places to intercept those high-word-set registers where they matter), but changing these assumptions would be major surgery, and aside from the delays may not actually be what's wrong: after all, it doesn't seem to break Baseline. One other option is to deliberately gimp the JIT so that Ion is never activated and submit it as such to Mozilla to make the ESR, and we'd have to do this soon, but indefinitely emasculating our Ion implementation would really suck to me personally and may not pass code review. I'm sure I've made stupid and/or subtle errors somewhere and a lot of code just isn't covered by the test suite (we have tons of controlled crashes, asserts and traps in untested paths, and none of it is triggered in the test suite), so I really need more eyes on the code to see what I can't.

Asahi Linux gives hope for weird page sizes yet


Asahi Linux, a new distribution for Apple silicon-based machines (which sounds like some Star Trek silicon-based lifeform), has made an official alpha release. Although we've got a couple M1 systems here at Floodgap Orbiting Headquarters, that's not really the reason we care about it here. The reason is that Asahi Linux is currently a 16K-page system.

Recall from our previous article that many Power ISA distros, most notably Fedora, default to a 64K memory page for performance reasons rather than the more typical 4K page seen in most operating systems and on most x86_64 machines. (Void and others, on the other hand, are 4K.) This was also true of aarch64, which Fedora and RHEL also used 64K pages for, but Fedora moved to 4K pages on that architecture to cater to small devices like the Raspberry Pi and issues with the Continguous Memory Allocator.

However, Asahi Linux's choice of a 16K page size is even more unusual and not all aarch64 implementations support it, though that's irrelevant on the specific machines they cater to. From the docs, it was chosen because it "both performs better, and is required with our kernel branch right now in order to work properly with the M1’s IOMMUs." Like 64K pages this breaks software that assumes a 4K page size. Most notably for Asahi this breaks Chromium (trying to find sympathy, still trying, still trying, sympathy failed) but also jemalloc and a number of other things.

The fact that a more high-profile architecture also relies on an atypical page size is good news for us, because while there are plenty of projects that couldn't care less about ppc64le, those same projects are much more likely to pay attention to the same problem on an M1 Mac. If the fixes they upstream aren't lazy and support all of the page sizes aarch64 does (4K, 16K and 64K), then this in turn will help 64K-page OpenPOWER systems like the one I'm typing this on. To the extent that happens, we should thank the hackers on Asahi Linux, because they will have made it possible.

The first production RISC-V portable?


Yes, there have been various one-off RISC-V kinda-portables, just like there have been various one-off RISC-V kinda-workstations. But it was inevitable a production "RISC-V PC" would become available, and now there's a new first, a RISC-V version of the Clockwork DevTerm portable.

The DevTerm series is clearly inspired by the Kyotronic 85 family, which many people will remember in the form of the TRS-80 Model 100. This warms my heart because an NEC PC-8201A (the other white meat) was my one and only portable computer for a month when I was overseas in Malaysia and Singapore in 2000. (It acquitted itself fabulously, for the record.) The keyboard is unfortunately smaller (one review of the ARM version compared it to an HP 200LX, which is a small keyboard indeed) and it's not touch-enabled, but the machine is very modular and apparently all that was necessary to go R5 was a redo of the OS and a new CPU-memory card (which you can buy separately for anything that will accept a Raspberry Pi compute module form factor). The CPU appears to be an Allwinner D1 containing a OpenXuantie C906, a 1GHz RV64IMAFDC(V)U single core with 1GB of memory onboard. There is no GPU, just a framebuffer.

Bluntly, these are underwhelming specs (the specs for the HiFive Unmatched didn't get my sizzle zizzled much either), and even the lowliest 4-core POWER9 Blackbird or T2 Lite would smash a HiFive into itty bitty little RISC bits. Against this, well, I think my iBook G4 wouldn't have much to worry about. But there's no portable option for OpenPOWER yet, and while a Microwatt or Libre-SoC system might get there one day — perhaps as a compatible compute module, even — this is here today, and could be a worthy mobile complement to an OpenPOWER desktop workstation. I haven't reviewed their Github exhaustively, but the C906 seems to be open-source. Plus, even though the specs are lower than their other ARM-based offerings, it's also the cheapest (the set is $239, but the CPU-memory card is just $29). This is convenient and inexpensive enough that I've squeezed a bit out of our budget to order one and I'll do a review here when it arrives. I'm unquestionably a Power ISA bigot at heart, but that doesn't mean I'm not Five-curious.

Firefox 98 on POWER


Firefox 98 is released, with a new faster downloads flow (very welcome), better event debugging, and several pre-release HTML features that are now official. One thing that hasn't gotten a lot of airplay is navigator.registerProtocolHandler() now allows registration for the FTP family of protocols. I already use this for OverbiteWX and OverbiteNX to restore Gopher support in Firefox; I look forward to someone bolting back on FTP support in the future. It builds out of the box on OpenPOWER using the .mozconfigs and LTO-PGO patch from Firefox 95.

On the JIT front the Ion-enabled (third stage compiler) OpenPOWER JIT gets about 2/3rds of the way through the JIT conformance test suite. Right now I'm investigating a Ion crash in the FASTA portion of SunSpider which I can't yet determine is either an i-cache problem or a bad jump (the OpenPOWER Baseline Compiler naturally runs it fine). We need to make Firefox 102 before it merges to beta on May 26 to ride the trains and get the JIT into the next Extended Support Release; this is also important for Thunderbird, which, speaking as a heavy user of it, probably needs JIT acceleration even more than Firefox. This timeframe is not impossible and it'll get finished "sometime" but making 102 is going to be a little tight with what needs doing. The biggest need is for people to help smoke out those last failures and find fixes. You can help.