Showing posts from 2021

Introducing Kestrel, part II: it's a soft-BMC

Well, geez, guys, why didn't you just say so in the first place? Kestrel is a "soft" BMC replacement, meaning you can devise your own Baseboard Management Controller/service processor to bring an OpenPOWER system up from absolutely nothing but a Lattice ECP5. Now, that's cool!

The underpinnings are strongly based on the Microwatt soft core, which makes Kestrel a true OpenPOWER processor itself, not ARM like the ASPEED BMC on shipping Raptor systems. Kestrel is not currently far enough along to bring up the On-Chip Controllers on the POWER9 (PowerPC 4xx-like cores), but this appears to merely be a matter of adding more IPMI command support. It is, however, enough to kick off the POWER9's Self-Boot Engines and go into Hostboot, so the basics absolutely work.

Right now I think this system is a little raw for general usage, and the soldering requirement on a several thousand dollar board that's badly backordered is not appealing. The whole dev stack is also intended for Raptor systems, though to be honest if you care about Kestrel you undoubtedly already own one. But Raptor is to be commended for making a shippable product out of Microwatt and making it truly open, as one would expect from them. What I'm interested to see, however, is whether future Raptor systems have Kestrels on board instead of the ASPEED. That would be really impressive in terms of owner control and would make the current valiant but vain efforts to neuter x86 firmware look even more pathetic.

There is no CentOS 8, there is only Stream and also this limited no-cost developer thing

Enough sting came from Red Hat's decision to jettison CentOS that Red Hat has backpedaled ... sort of. Instead of reviving CentOS in its prior RHEL-for-RHEL form, however, you'll just get Red Hat Enterprise Linux itself for no money out of pocket, provided you run it on a "small production workload" (16 systems or fewer, up from the prior single machine limitation) and sign up for the newly expanded Red Hat Developer Subscription program. Development teams can also take advantage of the free tier for cloud work. Either way, this "freemium" RHEL becomes available on February 1.

Is that enough? It's not ideal, but it is RHEL and you get it for $0 with a promise you won't become sales call fodder. However, Ars Technica is reporting that annual renewal is required, which doesn't sound that great for people who desire RHEL for long-term purposes (i.e., just about everyone who would use RHEL over CentOS Stream). On the other hand, while CloudLinux's free version now has its new name AlmaLinux, there's no official word yet on whether OpenPOWER is supported, and Rocky Linux's initial Q2 release will only be ARM and x86. If you're an OpenPOWER shop but the lesser stability guarantees of CentOS Stream really won't work for you, then freemium RHEL may be your only choice as long as your operation is small enough, at least for right now.

Introducing Kestrel

The new product tease we reported on from Raptor has a name: Kestrel. We theorized it was a FlexVer peripheral, but Raptor says it isn't, though it says "it's one of the most critical components of one." The image posted on Twitter (reproduced here) requires connection to the LPC, I2C (both platform control and AVSBus) and FSI signals, the latter of which will require either soldering or voltage conversion. It seems strange that this wouldn't simply be included on the board. Another curiosity is that the image also notes it has not yet been tested on the flagship Talos II (or presumably the T2 Lite), just the Blackbird, though from reading what little is available I don't see why it wouldn't work.

So, after all that, what does it do? The connection to those buses suggests some sort of low-level system monitor. If you really want to get down to the lowest level of what your system is executing, this is probably the device you want on one of the few systems that lets you do it as a supported feature. Here's a schematic of what the POWER9 is doing on bootup (what IBM docs call IPL, or the Initial Program Load):

The FSI connection is the biggest key here, which is the OpenPOWER Flexible Support Interface. This interface is active very early in standby mode, with clock signals available shortly after the machine is connected to power and the BMC is coming up, before even "Step 0" on this flowchart. Among many other things, the FSI triggers when the POWER9 Self-Boot Engine starts executing from its fused-in OTPROM, which contains the first instructions the POWER9 executes, and the BMC uses the FSI to determine which side of the SEEPROM the main CPUs should boot from. The FSI and LPC connections also allow monitoring the PNOR and other low-level traffic. With all these busses running through the Kestrel, you should be able to follow the BMC and main CPUs all the way from standby to the end of IPL.

What we still don't know is if it will let you actually manipulate these signals, which could be a very powerful tool. Yes, you could potentially wreak havoc on your machine but I think soldering connections wrong would have a similar effect, so why not give us the keys to the store? Even if it's merely as a monitor, however, it could certainly be a way to have confidence a machine has not been tampered with and for hardware designers to understand better how present-day OpenPOWER systems operate.

The first Raptor tease of 2021

If you're wondering what Raptor has in the works next, it's not a new Condor: it's apparently something a little smaller. The company is teasing a new device "for anyone interested in FPGA, open HDL, open FPGA tooling, low level IBM OpenPOWER POWER9 initialization, and minimal roots of trust." This doesn't sound like a computer; it may well be a peripheral for the FlexVer connector. If such a device could help improve the currently arduous secure boot procedure, this would be another big step forward in owner control and security.

Fedora 3-4K?

As previously mentioned, Fedora is the standard distribution we run right now on our two Raptor systems at Orbiting Floodgap HQ and, being one of the first distros to work more or less out of the box on POWER9 systems, is probably one of the most frequently deployed distributions on OpenPOWER workstations (Debian and Void PowerPC, of course, being in the mix there too).

Besides the big-endian and little-endian splits, the 64K vs 4K page size variation is now another notable fault line. Much software assumes the dominant little-endian byte ordering, just as much other software, often the same software, assumes the dominant 4K page size, despite significant performance reasons for 64K pages on some platforms and not just on ppc64/ppc64le. Void is prominently 4K, for example, but Fedora uses a 64K page kernel as shipped, meaning significant software like the Wine-QEMU fusion Hangover can't currently run on Fedora as presently configured.
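If you're not sure which side of this fault line a given machine sits on, you can simply ask the running kernel. Here's a minimal Python sketch; the function name is made up for illustration, and it assumes a Unix-like system where os.sysconf knows about SC_PAGE_SIZE (Linux does).

```python
import os


def page_size_bytes() -> int:
    """Return the page size the running kernel reports, in bytes.

    Hypothetical helper name; os.sysconf("SC_PAGE_SIZE") is the standard
    POSIX query and is available on Linux.
    """
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    size = page_size_bytes()
    # Report both the raw value and the familiar "4K"/"64K" shorthand.
    print(f"{size} bytes ({size // 1024}K pages)")
```

Going by the above, a stock Fedora ppc64le kernel should report 65536 here and Void 4096, but check it on your own system rather than taking my word for it.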

What about making Fedora Workstation 4K? There are clear advantages to keeping Server on a 64K page size, especially since those systems are likely to have the large memory volumes where the performance benefits of big pages would be most visible, but there are increasing compatibility disadvantages of having workstations on 64K. In a proposal mailing list thread, Daniel Pocock points out that 4K is needed both for the open-source Nvidia Nouveau driver as well as for the AMD RX 5700 (and probably other similar cards based on the same GPU generation), and Peter Robinson notes that the aarch64 spin went 4K for other reasons particularly acute on RPis and similar devices (though it remains 64K on RHEL 7 and 8). And all of this is independent of the compatibility issues that are already well-known and some non-trivial to fix. Is it time for a 4K spin of Fedora 34?

Compiling a 4K kernel is as "simple" as setting CONFIG_PPC_64K_PAGES=n, but there is also the confidence and regression testing needed to prove that a longstanding 64K platform doesn't itself have unknown assumptions going the other way. On top of that is the risk of marginalizing 64K page systems when (arguably unlike big vs little endian) performance differences of substance may exist, and of what amounts to fragmenting an already comparatively niche architecture. (As observed by another developer in the thread, "We have avoided doing so for much larger target markets than the power [sic] workstation market.") It may also be possible to solve this unofficially in a relatively maintainable fashion with a Copr kernel package, though I didn't see an existing one and I don't have much personal experience with this (perhaps someone who knows of such a package or the process can chime in).
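For what it's worth, the page size on powerpc is a Kconfig choice, so turning off CONFIG_PPC_64K_PAGES in practice means CONFIG_PPC_4K_PAGES gets selected instead. The rough Python sketch below reads a kernel .config and reports which one is enabled. The two option names are the real Kconfig symbols; the helper name and the sample fragments are invented for illustration and are not copied from any actual Fedora or Void config.

```python
import re

# Real Kconfig symbols from the powerpc "Page size" choice, mapped to bytes.
PAGE_OPTIONS = {
    "CONFIG_PPC_4K_PAGES": 4096,
    "CONFIG_PPC_64K_PAGES": 65536,
}


def configured_page_size(config_text: str):
    """Return the page size in bytes enabled in a kernel .config, or None.

    Hypothetical helper: it only looks for lines of the form OPTION=y and
    ignores "# OPTION is not set" comments.
    """
    for line in config_text.splitlines():
        match = re.fullmatch(r"(CONFIG_\w+)=y", line.strip())
        if match and match.group(1) in PAGE_OPTIONS:
            return PAGE_OPTIONS[match.group(1)]
    return None


# Illustrative fragments only (assumed, not taken from a shipping kernel).
SAMPLE_64K = """\
# CONFIG_PPC_4K_PAGES is not set
CONFIG_PPC_64K_PAGES=y
"""

SAMPLE_4K = """\
CONFIG_PPC_4K_PAGES=y
# CONFIG_PPC_64K_PAGES is not set
"""

if __name__ == "__main__":
    print(configured_page_size(SAMPLE_64K))
    print(configured_page_size(SAMPLE_4K))
```

To try it against a real system, you would feed it the text of your installed kernel's config file (I'm assuming the usual /boot/config-<version> location; adjust for your distro).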

Still, part of the justification for ppc64le on a long-standing big-endian platform was to meet existing software where it was and 4K pages may force the same change. If a 4K Fedora Workstation existed, I'd certainly use it: I have no especial loyalty to Fedora, but it's what I'm used to, and I'd rather stick with it right now than deal with an inconvenient migration ... at least until something vital comes along that I need to run.