
And now a bird that's not a Raptor: the H3 Falcon II

A post on the OpenPOWER blog caught my eye about the H3 Falcon II. This is not Raptor hardware; in fact, it's technically not even a computer, but rather a big rackmountable box of up to 16 GPUs. The Falcon II is PCIe 4.0 based and supports up to 31.5 GB/s bandwidth on each of its x16 GPU slots, with four host interconnects that can share them on demand. Obviously a lot of deep learning and other GPU-heavy workloads could put that much power to use, which hopefully can be dynamically allocated as processing needs require.

POWER9 systems are well positioned to take advantage of this kind of hardware due to their prodigious I/O capacity and full support for PCIe 4.0, but although the machine is shown with Tesla V100 GPUs, their PCIe version currently "just" supports 3.0. AMD does have PCIe 4.0 GPUs, the data center-oriented Radeon Instinct MI60 and MI50, but let's not forget the Tesla has one other trick up its sleeve: NVLink 2.0, providing up to 150 GB/s bandwidth each way, which is directly supported by the POWER9 as well. However, the Falcon II doesn't seem to offer NVLink.
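For the curious, the 31.5 GB/s figure falls straight out of the PCIe signalling rates. Here is a back-of-the-envelope sketch in Python, assuming the published per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for 4.0) and 128b/130b line encoding, and ignoring packet and protocol overhead above that. The function and constant names are just for illustration.

```python
# Theoretical one-direction PCIe bandwidth, ignoring packet/protocol overhead.
# Per-lane rates and 128b/130b encoding are the published PCIe 3.0/4.0 figures;
# the names below are illustrative only.

GT_PER_S = {"3.0": 8.0, "4.0": 16.0}  # gigatransfers per second, per lane
ENCODING = 128 / 130                   # 128b/130b line code efficiency


def pcie_gb_per_s(gen: str, lanes: int = 16) -> float:
    """Return usable GB/s in one direction for a link of the given width."""
    gigabits_per_s = GT_PER_S[gen] * ENCODING * lanes
    return gigabits_per_s / 8  # convert bits to bytes


for gen in ("3.0", "4.0"):
    print(f"PCIe {gen} x16: {pcie_gb_per_s(gen):.1f} GB/s")
```

By the same arithmetic a PCIe 3.0 card like the current V100 tops out at about half the slot's capacity, which is why the box won't reach its potential until 4.0 GPUs are common.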

The Falcon II is definitely an interesting unit, and for POWER9-based datacenters looking at really heavily compute-bound learning tasks it could be a more economical way of sharing lots of powerful GPUs between multiple nodes. It's not likely to achieve its fullest potential until PCIe 4.0 GPUs are more common and it lacks the flat-out crushing bandwidth of NVLink 2.0, but so far NVLink isn't shareable in the way this is and the Falcon II's NVMe-to-GPU link is truly innovative. That said, if you're in the kind of AI stratosphere where you would actually be cramming 16 $10,000 GPUs into one box, economy is probably not your most pressing concern.

A couple followups to the Blackbird semi-review

Martin Kukač asked a few good clarifying questions about the Blackbird review; you can see his queries and my comments. I think it's worth clarifying that while I don't think the naked single-4 is the best Blackbird experience, especially where media is concerned, it's definitely useable as is (and certainly far faster than my long-suffering Quad G5).

For proof, Phoronix now has Blackbird benchmarks of that same basic single-4 (16 thread) system. They also tried to measure the performance impact of the Spectre mitigations, which Raptor enables on every machine they ship (both kernel and user-level), and discovered that while there is an impact on POWER9 of about 8% it is still rather less than the effect of the mitigations on Intel chips (about 18%, not counting microarchitectural mitigations required to deal with ZombieLoad et al). There's a little weirdness in the numbers in that sometimes the mitigated throughput was slightly better performing, possibly some unaccounted-for statistical artifact, but it's good to see that the price paid for better security on our systems is rather less than on x86_64. That said, if you really want to wring out that extra 8%, you can configure the protection level in the BMC based on your anticipated risk.

Finally, a big thumbs up to Red Hat, who fixed this ppc64le-specific bug in LibreOffice I reported via Dan Horák in record time.

A semi-review of the Raptor Blackbird: POWER9 on the cheap(er)

(*A big thanks to Tim Pearson at Raptor for his help with some of the technical questions, though I jealously guard my editorial integrity: this review was neither written nor approved by Raptor, and this machine was bought as a retail item without commercial consideration or discount.)

Much has been made of the Raptor Talos II's purchase price, and it has occasionally been mocked. That price is hardly alone in the market, and I think you get what you pay for, but it's still admittedly eye-watering. (I'm saving pennies for a spare system and upgrading to a dual-8 instead of this dual-4, but that probably won't happen until I get my tax refund next year and I'm fortunate to make a decent living in a field largely unrelated to computing.) There are the usual folks who miss the point by stating that ARM and x86 machines are cheaper, and then there are the people who will miss the boat by waiting around for RISC-V, but while I think the people in the first category have their priorities mixed up they're also not wrong. The reason simply is that there are so many of those machines bought, sold and manufactured. No other architectures have the economy of scale of x86_64 and ARM, and therefore, no other architecture will have their price advantages for the foreseeable future. Boutique systems cost boutique bucks; the classic example is the PowerPC Amiga systems, which make it even worse by running embedded CPUs. If price or (perceived) value for dollar is your biggest concern, stop reading right now, because nothing in this review will convince you otherwise. Just don't ever complain someday when you don't have choices in computing architectures, because you did have a choice, and you chose cheap.

The point of the Blackbird is for people who either (like me) don't feel like feeding the Chipzilla monoculture further, or (like many others) prefer an alternative computing architecture that's open and fully auditable, and would prefer a smaller capital outlay. Maybe you're curious but you're not ready to make POWER9 your daily driver. Maybe you'd like to have one around as an "option" for playing around with. Maybe you have a need or interest to port software. Maybe some of your tasks would benefit from its auditability, but you don't need a full T2 for them. Or, more simply, maybe you see the value in the platform but this is all you can afford.

Now, finally, there's an existing lower-cost option. Not low cost, but lower cost: the price just plain won't be competitive with commodity systems and you shouldn't buy it with that expectation. I call this article a "semi-review" because it's not (primarily) about performance testing, benchmarks or other forms of male body part measurement; you can get that at Phoronix, and they have a Blackbird of their own they're testing. Instead, I'm here to ask two questions: how cheap can you get for a decent POWER9 system? And how good will that low-cost system be? There's no spoiler alert in saying I think this system is solid and a good value for the right kind of buyer, because really, what are you reading this blog for? I'll tell you what I paid, I'll tell you what I got for it, I'll tell you the ups and downs, and then I'll let you decide.

As with the Talos II, the Blackbird is a fully open-source, auditable POWER9 system from the firmware up, but the biggest advantage of Blackbird over the T2, even more so than its price, is its size. The T2 is a hulking EATX monster and even the cut-down Lite is in the same form factor, but Blackbird is a lithe micro-ATX system and will fit in pretty much any compliant case. Cutting it down to size has some substantial impacts, though: there's a single socket only, and because only one CPU can be installed and the CPU handles directly-attached RAM and PCIe, you have fewer memory channels and PCIe lanes. That means a smaller RAM ceiling (just two DIMM slots for 256GB, vs 2TB in a loaded T2), fewer PCIe slots (one x16 and one x8 versus three x16 and two x8), and of course fewer hardware threads. The 3.2/3.8GHz Sforza POWER9 sold for the Blackbird is the same CPU as in the T2, so that means SMT-4 and a maximum of 88 threads with a 22-core part, but you'll only have one of them. Plus, this smaller board also has weaker power regulation, so anything with more than 8 cores, such as an 18-core or that 22-core beast, can't run at its fullest performance (if at all).
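The thread counts quoted in this review are simply cores times SMT ways times sockets. A trivial sketch, assuming SMT-4 throughout (the function name is only for illustration):

```python
SMT_WAYS = 4  # the Sforza POWER9 parts discussed here run SMT-4


def hardware_threads(cores_per_socket: int, sockets: int = 1) -> int:
    """Hardware threads visible to the OS for a given core/socket count."""
    return cores_per_socket * sockets * SMT_WAYS


print("Blackbird single-4: ", hardware_threads(4))
print("Blackbird single-22:", hardware_threads(22))
print("Talos II dual-4:    ", hardware_threads(4, sockets=2))
```

So the low-end Blackbird has exactly half the threads of a dual-4 Talos II built from the same 4-core part.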

That said, the Blackbird makes up for the limited expansion options with basic creature comforts on board, including USB 3.0, SATA, Ethernet, HDMI video out using the ASpeed BMC as a 2D framebuffer, and 5.1 audio over analogue, S/PDIF and HDMI. (These devices are all blob-free except the NIC, by the way, and that last deficiency is already being worked on.) I decided in order to keep the cost really low that I'd just use the 2D BMC graphics and the on-board audio, a SATA SSD instead of NVMe like in this T2, and go with only 16GB of RAM.

I also thought about where to put it. We actually do have a need for a home theatre PC, and my wife is curious about Linux, so I decided to configure it that way and connect it into our existing DLP projection system. There's no Ethernet drop in the home theatre yet, so this machine will be wireless. Let's go shopping!

As of this writing the low-end 4-core (16 thread) Blackbird bundle starts at $1280 (all prices in US dollars). This includes the CPU and an I/O plate but does not include a heatsink or heatsink-fan (HSF) assembly, which is extra and required, and frankly Raptor should just build that into the price. I was fortunate that I got in on the Thanksgiving Black Friday special and got one for $1000. With the 2U heatsink, installation tool and shipping it came almost exactly to $1280 out of my pocket, so let's add another $280 to the current price for what you're likely to pay and budget $1560. I don't claim the prices below to be particular bargains; they're just what seemed decent on Amazon and your money/mileage may vary. Prices rounded up to the nearest whole dollar.

Blackbird Bundle (4-way CPU, heatsink, I/O plate and install tool, shipped to your door)
Micron MTA18ASF2G72PDZ-2G6D1 16GB DDR4-2666 ECC RDIMM
Seasonic Focus 650W 120mm Modular ATX PSU (admittedly overkill)
SilverStone ML03B mATX Case
Samsung 860 EVO 500GB SATA III SSD
LG 14x BD-RW
Logitech K400 Wireless Combo Keyboard + Touchpad
Vonets VAP11G-300 WiFi Bridge
Arctic F8 80mm PWM case fans x 2
SSD bracket
Various random cables from my bag of crap
cheap as

That's a grand total of $2071 (I later spent an additional $22 which I'll explain in a minute) for what I would consider to be a basic system, not absolute barebones, but not generously configured either. Given that the major cost is the board itself, no matter what deals you find on similar items, you're largely guaranteed to be paying in the ballpark of two grand for a comparable loadout without a GPU.
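To make the arithmetic explicit, here is a small Python sketch of how the bundle budget is derived and how much of the total it represents. One assumption to flag loudly: it takes the $1560 bundle budget to be the figure counted inside the $2071 total, which the numbers are consistent with but which isn't spelled out line by line here.

```python
# All figures in US dollars, taken from the text above.
bundle_list_price = 1280   # current price of the 4-core bundle
sale_price = 1000          # Black Friday price actually paid for the bundle
out_of_pocket = 1280       # sale price + 2U heatsink + install tool + shipping

extras = out_of_pocket - sale_price        # heatsink, tool and shipping
bundle_budget = bundle_list_price + extras

grand_total = 2071
# ASSUMPTION: the grand total includes the bundle at its budgeted price.
other_parts = grand_total - bundle_budget
board_share = bundle_budget / grand_total

print(f"extras: ${extras}, bundle budget: ${bundle_budget}")
print(f"everything else: ${other_parts} ({1 - board_share:.0%} of the total)")
```

Under that assumption the board and CPU dominate the bill, which is why bargain hunting on the remaining parts barely moves the final number.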

One delivery exception and useless mailbox staff member later, I have a nice big box marked Fragile from the Raptor folks. If the NSA TAO got into this, it wasn't obvious from the outside.

Inside the big box was the 2U heatsink, the CPU (smaller white box), a 5/32" ballhead driver for cinching down the heatsink because I couldn't find where the other one had gone, and of course the Blackbird. I have serial #75, which seems high for an early order and I'm sad I missed one of the serial number 1 certificates.

Let's open up the Blackbird's box:

The DVD includes schematics, the manual, firmware builds and source code. There's also a paper info sheet, the test readout, the I/O plate, a couple cool stickers (hey, can we buy more of these OpenPOWER ones?) and, there it is, the motherboard. Other than the fact it's brand-spanking new, plus some additional markings and a couple of now-populated pin headers, at cursory glance I didn't see many changes in this Rev 1.01 board from the Rev 1.00 board I saw at SCaLE 17x, showing how mature the design was already by that stage.

The other very important thing in the box is a piece of receipt paper with red and black dot matrix printing containing the BMC factory root password. It is not 0penBmc as the manual states, and in fact nothing other than the slip of paper itself even references its existence. If you toss it out carelessly as I almost did, you won't be able to get into the BMC with anything short of a serial cable to its headers on the board. I applaud Raptor for their level of security, but it would have been nice to have been warned to watch for it, and the password on my unit had a capital O instead of a zero but the font doesn't differentiate them! (By the way, duh, I've changed the password. Don't bother writing in that you can see it.)

One item to note in particular on the board is the bank of status LEDs, at the top left in this picture's orientation. If the case front panel does not have sufficient LEDs to display their state, and the one I bought doesn't, you may want to position this LED bank in such a way that you can see them easily. On my system they are visible through the side ventilation holes.

Let's pop the motherboard in the case, which I already prepared with the SSD, optical drive and PSU. We'll then remove (carefully) the plastic cap on the socket:

The CPU socket is a ZIF-type affair and has alignment notches so you can't put the CPU in wrong. It just drops in (though don't literally drop it, those delicate pins will bend).

The 2U heatsink is shown here. It just clears the top of the mATX case I was using, but it is a passive cooler, not an active one. The manual mentioned an indium pad but that didn't sound necessary for the 4-core, and indeed neither the 4 nor the 8 requires one. The copper base is shown with a couple small voids, but these don't seem to affect its ability to cool the chip. You don't need, and in fact should not use, thermal compound, though I did polish the heat spreader with a microfibre cloth before installing the heatsink to remove fingerprints.

The side clasps grip the heatsink and should be level. Then insert your 5/32" ballhead in the top and turn clockwise. It requires a little bit of torque, but it's absolutely obvious when it's cinched down all the way because unless you're Ronda Rousey it won't turn any more.

At this point I also installed the two 80mm case fans and connected one to each of the two 4-pin PWM fan zones (if you have the heatsink-fan assembly, you would connect that instead to the zone closest to the CPU). For an HTPC we would definitely want as silent a system as possible, but Raptor doesn't recommend passive cooling of the rest of the components even with the 4-core because the fans will try to throttle down when they can and there may not be sufficient airflow. More about this in a moment.

One last gotcha is that you would think with one stick of RAM, it would go in slot A1. You would be wrong; the manual says it performs better in B1. But the single stick does work in A1. I won't tell you how I know, I just sense these things.

Anyway, let's put the lid on the case and install it in the home theatre rack. I put it on the bottom since it clears nicely there.

Connected to wall power, it took about a minute to bring the BMC up (until the rear status lights stopped pulsing). Now would be a good time to get in and change the OpenBMC default password provided on the slip of paper in the box. OpenBMC is accessible from the network on port 3, which is the one directly on top of the rear USB ports. Unfortunately, if you use a USB WiFi dongle, those ports will be powered down when the machine is, so there's no way to access it unless you set up some miniature Ethernet hardline network or plug into the serial port on the board. I suspected this would be an issue, hence the Vonets WiFi bridge which connects via Ethernet to the Blackbird and is USB-powered so I can power it independently from a wallwart. Because the BMC gets its address over DHCP, there may be a lag before it requests a lease and it may appear at varying addresses (I eventually tired of this and hardcoded a rule for its MAC address in the WiFi router). If your system will be hard-wired to Ethernet, though, there should be no problem. Note that the Vonets devices are no panacea either, because they need separate configuration for the access point and password (I plugged it into my iBook G4 and used TenFourFox to set it up).
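If you end up waiting on the BMC to show up on the network, something along these lines can tell you when it has finally come up. This is only a sketch: it assumes the BMC answers on the usual SSH (22) or HTTPS (443) ports, which you should verify against your own firmware, and the function names are invented for illustration.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_bmc(host: str, ports=(22, 443), attempts: int = 30,
                 delay: float = 5.0) -> bool:
    """Poll until any of the given ports answers, or give up after `attempts`.

    ASSUMPTION: ports 22 and 443 are where the BMC listens; adjust as needed.
    """
    for _ in range(attempts):
        if any(port_open(host, p) for p in ports):
            return True
        time.sleep(delay)
    return False
```

Usage would be something like `wait_for_bmc("192.168.1.50")`, where that address is a made-up placeholder for whatever your router has reserved for the BMC's MAC address.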

Once the BMC is ready, a quick press of the power button, and we have liftoff! Much as with the T2, the fans whir at full RPM on startup and then kick down at IPL as the temperature permits. Connected directly to the BMC's HDMI port, we get a graphical startup which is really quite cool. It shows all the steps, even some spurious meaningless errors which you can safely ignore. This image was displayed on the wall by my DLP projector, but the blinds were up, so apologies for the extra light.

And, about a minute and change later, here's our friend Petitboot:

Let's install the operating system. I don't know if I'll leave it on there and I'll probably experiment with some other OS choices, but I decided to install Fedora 30 on the Blackbird for review purposes so that I can compare the overall feel of the machine with my daily driver T2. (My T2 is a dual-4 with the BTO WX 7100 workstation card, 32GB of RAM and 1.5TB of NVMe flash.) So let's do that. It will also make a nice little stress test to see how it manages its own cooling.

I left it downloading Fedora Workstation and went to dinner, and came back about two hours later with the install complete but the fans now running full blast. That was not an encouraging sign.

However, that was not the biggest problem. The biggest problem was that the machine was almost unusable: besides sluggish animations, the mouse pointer skipped and the keyboard even stuttered keys, making entering passwords in GNOME a tedious and deliberate affair. This caused some, let's say, consternation on my part until it dawned on me I had not set this system up exactly like my full T2. Yes, it didn't have a discrete GPU, but it also was a direct install of Fedora Workstation rather than an install of Fedora Server that was turned into Workstation. This install was graphical turtles all the way down. That meant ... Wayland.

I opened up another VTY to tweak settings because typing into gnome-terminal was painful and error-prone, and a quick ps did indeed demonstrate the machine was in Wayland mode, which is the present default for Fedora Workstation on a graphical boot. That explained it: my T2 uses Xorg because it has a text boot and I run startx manually. I changed /etc/gdm/custom.conf to disable Wayland and restarted, this time into Xorg, and while animations were still not smooth they were much better than before. Best of all was that the keyboard and mouse pad were now working properly. If you don't have a GPU, and possibly even if you do, don't run Wayland on this machine.
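For reference, the documented GDM switch for this is `WaylandEnable=false` in the `[daemon]` section of /etc/gdm/custom.conf, and I'm assuming here that this is the setting in question. Hand-editing the file is the normal route; the Python sketch below (function name invented for illustration) just shows the same change as a text transformation, handling the commented-out line that stock files tend to ship with.

```python
def disable_wayland(conf_text: str) -> str:
    """Return conf_text with WaylandEnable=false set in the [daemon] section."""
    out = []
    in_daemon = False
    done = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [daemon] without having seen the key: add it first.
            if in_daemon and not done:
                out.append("WaylandEnable=false")
                done = True
            in_daemon = stripped == "[daemon]"
        elif in_daemon and stripped.lstrip("#").strip().startswith("WaylandEnable"):
            # Replace an existing (possibly commented-out) setting.
            if not done:
                out.append("WaylandEnable=false")
                done = True
            continue
        out.append(line)
    if not done:
        # No key was written: append it, creating [daemon] if it was absent.
        if not in_daemon:
            out.append("[daemon]")
        out.append("WaylandEnable=false")
    return "\n".join(out) + "\n"
```

You would read the file, pass its contents through the function, write the result back as root, and then restart GDM or reboot so the login screen comes up on Xorg.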

Unfortunately, even with that sorted I couldn't increase the screen resolution to 1920x1080 (it was stuck at 1024x768), and audio wasn't playing through HDMI. Those could be addressed later but meanwhile I didn't want my new funbox to cook itself. Reported temperatures at idle looked like this:

Admittedly the location of the system is somewhat thermally constrained. The AV receiver doesn't run particularly hot and it's not flush on the top of the Blackbird, but mATX cases are cramped when they're loaded and the top vent is under the receiver (the side vents are where the fan mount points are). Coming from a system like the Quad G5 where temperatures over 70°C can be a sign of impending cooling system failure, this seemed worrisome. I asked Tim Pearson at Raptor about this and actually the POWER9 has a very wide temperature tolerance: 84°C as shown here falls well within its operating range and the thermal cutoff is, in his words, "well north of 100°C." This is reassuring but I would have preferred not to have a home theatre system that can also pop the popcorn while playing the movie, so I ordered a couple more fans and some splitters to take up the other two mount points ($22).

While waiting for those to arrive, the next order of business was the video, which was still stuck at 1024x768. Firefox from the Fedora repos worked fine for YouTube but only in 4:3.

After attempting to connect it directly to the DLP projector instead of through the AV receiver and getting nowhere, I started looking to see what the maximum resolution of the AST2500 BMC actually is, and bumbled into this Raptor wiki entry about getting Xorg to display in 1920x1080. Apparently you'll need to manually specify the settings for the time being because there isn't upstream support for the IT66121FN HDMI transceiver. Once this is done, though, it works:
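The wiki's exact settings aren't reproduced here, but manually specifying a mode for Xorg generally means supplying a Modeline. As a hedged illustration, this Python sketch builds one from the standard CEA-861 timing for 1080p at 60Hz; whether the wiki uses precisely these numbers is an assumption on my part, and the class and function names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Horizontal and vertical timing values, in pixels and lines."""
    hdisp: int
    hsync_start: int
    hsync_end: int
    htotal: int
    vdisp: int
    vsync_start: int
    vsync_end: int
    vtotal: int


# Standard CEA-861 timing for 1920x1080 progressive at 60 Hz.
CEA_1080P60 = Timing(1920, 2008, 2052, 2200, 1080, 1084, 1089, 1125)


def pixel_clock_mhz(t: Timing, refresh_hz: float) -> float:
    """Pixel clock is total pixels per frame times frames per second."""
    return t.htotal * t.vtotal * refresh_hz / 1e6


def modeline(name: str, t: Timing, refresh_hz: float) -> str:
    """Format an xorg.conf Modeline with positive sync polarities."""
    clock = pixel_clock_mhz(t, refresh_hz)
    return (f'Modeline "{name}" {clock:.2f} '
            f"{t.hdisp} {t.hsync_start} {t.hsync_end} {t.htotal} "
            f"{t.vdisp} {t.vsync_start} {t.vsync_end} {t.vtotal} "
            "+hsync +vsync")


print(modeline("1920x1080", CEA_1080P60, 60))
```

The resulting line goes in the Monitor section of an xorg.conf snippet, with the mode name then referenced from the Screen section.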

HD playback from YouTube and Vimeo seems similar to the T2 in Firefox, maybe an occasional keyframe seek or dip in frame rate because of the smaller number of threads, but throughput is more than enough to be serviceable.

However, I couldn't say the same for VLC, which seemed strangely fill-rate limited trying to play commercial optical media. Playing both Atomic Blonde from DVD and Terminator 2 from Blu-ray in full screen 1080p generated roughly the same percentage of dropped frames in VLC's own statistics (around 6-8%). Disabling or changing interlacing settings didn't make a difference; turning down postprocessing, explicitly disabling hardware acceleration or trying different software framebuffer options for the video didn't help either. When reduced to a half-size window, however, no frames were dropped from either disc with VLC's default settings, suggesting that pushing pixels, not decoding, was hobbling playback.
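To put rough numbers on the pixel-pushing theory, here is a sketch comparing the raw framebuffer traffic for a full-screen window against a half-size one. It assumes a 32-bit framebuffer, film-rate 24fps playback, and that "half-size" means half the width and half the height; none of those were measured, so treat the output as illustrative only.

```python
BYTES_PER_PIXEL = 4  # ASSUMPTION: 32-bit framebuffer


def fill_rate_mb_s(width: int, height: int, fps: float) -> float:
    """Megabytes per second needed to repaint every pixel of every frame."""
    return width * height * BYTES_PER_PIXEL * fps / 1e6


full = fill_rate_mb_s(1920, 1080, 24)   # full-screen 1080p
half = fill_rate_mb_s(960, 540, 24)     # half width, half height

print(f"full screen: {full:.0f} MB/s")
print(f"half size:   {half:.0f} MB/s ({full / half:.0f}x less)")
```

Halving both dimensions cuts the pixels to paint by a factor of four while the decoder still does the same work, which fits the observation that the dropped frames vanished in the smaller window.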

This may have something to do with the fact that software LLVMpipe rendering at this resolution is a cruel, cruel joke. Xonotic, which runs at a ripping pace on my T2 with the WX 7100 card, is hobbled to a pathetic 5-10fps in software even with all the settings dialed down:

I spent a level or two getting whacked by the bots because running around and firing was a stuttery affair. Don't expect to game super hard on this either without a GPU unless you're talking about software-rendered classics. Unfortunately, it seems 1080p movie playback, at least on VLC, has similar limitations. Although mplayer doesn't seem to have any problems with full-screen scaling (I used mplayer -fs -x 1920 -y 1080 -zoom -ao alsa -afm hwac3), you have to know which title you want and it isn't as convenient or polished. I don't know why Firefox seemed to work okay and VLC didn't but I can live with that because streaming media will be this machine's primary task anyway.

Meanwhile, we still have the audio problem. My AV receiver does not have analogue 5.1 inputs, and what good is a home theatre without surround sound? The Blackbird does also offer S/PDIF, and my AV receiver has an input for that via TOSLINK, but as plain PCM it only comes through in stereo. Tim suggested modifying /usr/bin/ on the BMC side to enable S/PDIF over HDMI, and provided a prototype script to do so. I'll post a partially working version as a gist, but my projector occasionally came up with a black screen from it and resolutions under 1920x1080 had a weird extraneous two pixels added, so I ended up reverting it and going back to TOSLINK. Per Tim, a reclocking step is apparently needed, which hopefully will arrive in a future firmware release.

It turns out surround sound over S/PDIF is a perennial problem on Linux. The solution for me was creating a 5.1 lossy profile for libasound from this blog entry for Debian Wheezy, which more or less "just worked" on Fedora except I had to restart the machine to get it to stick. In pavucontrol I made sure that the S/PDIF profile was configured to stream AC-3 (Dolby Digital), DTS and MPEG, and having done so VLC was now able to play 5.1 surround from both Terminator 2 and Atomic Blonde with default settings. Even this didn't work absolutely flawlessly: if the audio stream was interrupted for some reason then the AV receiver went haywire and just played an annoying buzzing noise. But at least if you don't have analogue inputs on your receiver, you can still use a TOSLINK cable to get lossy 5.1.

A number of people have requested some idea of the system's power usage, so here are some rough unprofessionally obtained numbers. Remember, this is a low-end 4-core system with one RAM stick, no GPU, no PCIe cards and an SSD, so you should expect these numbers to reflect the Blackbird's minimum power usage (your loadout will not use less, and may use more). The Kill-A-Watt measured just over 3 watts with the Blackbird on standby plugged into the wall; powering it on, BMC bring-up and IPL topped out at around 60-80W, booting Fedora peaked at 127W, and idling in GNOME measured about 65W (I told you that 650W PSU I bought was overkill). Stressing the system vainly trying to play Xonotic only showed around 105W. The highest power draw I ever saw out of the Kill-A-Watt was 131W.
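Just to quantify how much headroom that PSU has, here are the same readings as a small Python sketch. The wattages are the ones quoted above (standby rounded down to 3W, bring-up taken at its 80W upper bound); the dictionary keys and function name are only for illustration.

```python
PSU_RATED_W = 650

# Kill-A-Watt readings from the text, in watts.
readings_w = {
    "standby": 3,
    "BMC bring-up and IPL": 80,
    "booting Fedora": 127,
    "idle in GNOME": 65,
    "Xonotic": 105,
    "highest seen": 131,
}


def psu_load_percent(draw_w: float, rated_w: float = PSU_RATED_W) -> float:
    """Wall draw as a percentage of the PSU's rated output."""
    return 100 * draw_w / rated_w


for label, watts in readings_w.items():
    print(f"{label:22s} {watts:4d} W  {psu_load_percent(watts):5.1f}% of rating")
```

Strictly speaking a Kill-A-Watt measures draw at the wall, which includes the PSU's own conversion losses, so the real load on the supply is a bit lower still.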

I installed the two new fans when they arrived and placed two fans (again, all Arctic F8 PWMs) using a splitter on each of the two zones for the full four this case will accommodate. With two fans on the CPU zone in the same environment, the cores could now be seen to cool down at idle, which wasn't happening before. Interestingly, the board didn't seem to be using the case fan zone much even though all four fans did indeed spool up at IPL as expected. After a few minutes letting it just sit there at the GNOME desktop, the cores dropped to 58°C and even the CPU fans gradually spun down to minimum. This wasn't silent, to be sure, though with a movie playing you'd never notice it. That said, when I made the machine work I could still get some 85°C-ish peaks out of it, but nothing close to the thermal cutoff Tim mentioned.

In the five days I've had this so far, a couple of other trivial annoyances did pop up, though these are possibly unique to my specific circumstances. The first is that resolution switching periodically seemed to unsettle the projector, requiring some HDMI hot-replugging or even once or twice a full power down to get it to see anything. This could well be the projector, and may have absolutely nothing to do with the Blackbird, but it was still obnoxious and never happened before with the Blu-ray 3D player or the AV receiver's on-screen display. So far I can't discern an obvious pattern and this may disappear as kernel AST2500 support improves. Also, I completely cut power at the surge protector switch to protect the equipment when turning off the home theatre, but that means every time I fire it up and want to use the Blackbird I have to wait for the BMC to come back up again and then go through the startup sequence before I can even boot Fedora (about two to three minutes to a login prompt). Yes, this happens with a T2 as well, but in my case the T2 I'm typing this on is my daily driver and so it's always running anyway.

It's time to sum up, so let's answer the first of our two main questions: how cheap can you get for a decent POWER9 system? I think this is a decent POWER9 system and should meet many basic use cases as configured, but I think I've also demonstrated that it's at or near the functional minimum, and owing to the cost distribution in its particular bill of materials I don't think you can get a lot cheaper. Given that the board and CPU are about three-quarters of the total damages, there's not a whole lot more to economize on: you could cut down the RAM to 8GB or get a smaller PSU but you'd probably barely save $100, and even the SATA SSD, though not luxuriously large, wasn't all that expensive compared to a spinning disk.

In that case, let's answer the second, thornier question: how good will that low-cost system be? I'll be painfully frank and say I probably had unrealistic expectations when I chose to try to make this into an HTPC. Linux itself isn't a great choice for that, especially on a non-x86 platform. DVD playback on Linux is pretty much solved, but much commercial Blu-ray media is right out for lack of decryption keys, and because this isn't x86 or ARM, so is most DRM-ed content that depends on closed-source black box binaries. While 5.1 mostly works over digital, the most trouble-free means of connecting it will be analogue, and I know I'm not the only person for whom that's not an option with their existing setup. The slow bring-up if you don't leave it on standby power all the time isn't a dealbreaker but is definitely annoying (I think shortening the startup time should be looked at for future lower-end designs), and at least for the time being the lack of a GPU in this loadout appears to limit the HD media options I can play comfortably. This box suffices fine for playing YouTube and Vimeo, which fortuitously will be the majority of its current tasks, and will be a nice training wheels system for my wife to experiment with Linux, work in LibreOffice and do other family computer tasks. If you choose something less fatty than GNOME you can probably wring a little more UI juice out of it, but I didn't notice much difference from my regular T2 in basic usage. More than that isn't really in the cards, and at least as-is, I felt I had to compromise a bit too much to make it into a credible media system.

As a deskside developer box, simple server or buildbot, though, even this relatively austere loadout is acceptable. The lack of a GPU isn't a major problem for such pursuits and 16GB of RAM is sufficient for even many intermediate tasks. With 16 threads and extrapolating from my dual-4 T2 which builds an optimized Firefox at -j24 in a little over a half hour, I can reasonably expect build times to be about double, which is good enough as a sidecar. It would be more than enough to serve data and media, even if (especially since) it's not playing it. In fact, because it's a secondary machine for me, I can run it full tilt since desktop responsiveness is comparatively unimportant. Plus, it's small and I don't care if I botch the installed OS, so this will likely be my test-bed machine for other distributions and hopefully the BSDs. If you want one purely as a secondary system or utility machine, then you're probably in the best position to make the most of this low-end config.
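The "about double" estimate is plain thread-count scaling. A minimal sketch, assuming build time scales inversely with hardware threads (real builds have serial phases and I/O waits, so this is optimistic at best) and using 30 minutes as a round stand-in for "a little over a half hour":

```python
def estimated_build_minutes(ref_minutes: float, ref_threads: int,
                            new_threads: int) -> float:
    """Scale a reference build time by the ratio of hardware threads.

    ASSUMPTION: perfectly linear scaling with thread count.
    """
    return ref_minutes * ref_threads / new_threads


T2_THREADS = 32         # dual-4 Talos II at SMT-4
BLACKBIRD_THREADS = 16  # single-4 Blackbird at SMT-4

estimate = estimated_build_minutes(30, T2_THREADS, BLACKBIRD_THREADS)
print(f"Estimated Firefox build on the Blackbird: about {estimate:.0f} minutes")
```

Since the T2 build runs at -j24 rather than saturating all 32 threads, the true penalty on the smaller machine could differ from this in either direction.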

But it seems to me from my cursory observations of people waiting for their Blackbirds that they want this as a lower-cost way into the POWER9 ecosystem and to see how they like it as a primary system. In that case, let's just say it: you need to spend more than what I've spent here. I think based on this testing that to get a good experience of what POWER9 is and what it can do, you really need eight cores (currently, add about $330) and a GPU (you decide, though I stick with BTO choices since they're likely to be tested more, so that AMD WX 7100 will set you back around $500 right now). I don't think there are enough threads for heavier usage with the 4-core, and gaming, media and desktop choices will expand dramatically with a discrete GPU rather than overworking poor old LLVMpipe. For that matter, 32GB wouldn't be a bad idea either. Due to what I consider the thermal constraints of a larger spec like that, though (and given how constrained I found this machine to be with just 4 cores and no GPU), I would be leery of trying to get that all into an mATX case. I think you'd find it running louder and hotter than most people would like, and after my experiences so far, I wouldn't put a single-8 plus GPU in anything smaller than a regular ATX mini-tower.

With that done and dusted, now you're looking at a slightly larger footprint and a budget of about $3000 for what I would consider a system I could live with day-to-day. It won't be the same as my big T2, which because of its dual sockets has much more expansion capability (everything is NVMe and I still have room for the GPU, a RAID or SATA card, and FireWire), but I think the experience is comparable and it's still less than half of what a similarly equipped full T2 will cost. Just keep in mind if you find you like the POWER, and you might want a dual-18 or even a dual-22 someday, you won't get that on a Blackbird — you won't even get the fullest benefit of one of them. If you want dual GPUs, or lots of NVMe storage, forget it. Heck, if you just want a dual-8, forget it. What, you expected Raptor to cannibalize their own pro market? For someone like me, the big T2 would always have been my preferred option because it gives me the biggest bang and the most room to grow, but I've clearly got more money than sense and I already knew I wanted something with that level of power and flexibility. Even if they'd sold the Blackbird back then I'd still have bought the T2. But if there's no way you can spend that much today, even if you know you might in the future, then the Blackbird is what you should buy.
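One way to arrive at that ballpark from the figures already given in this review is sketched below. Which items are counted toward the "about $3000" is my assumption, and the sum deliberately leaves out the extra RAM and the larger case, which aren't priced here.

```python
# US dollars, all taken from earlier in the review.
base_build = 2071         # the system as originally configured
extra_fans = 22           # two more fans plus splitters
eight_core_upgrade = 330  # stepping up from the 4-core to the 8-core
gpu = 500                 # AMD WX 7100, approximate

# ASSUMPTION: these four items are what the estimate is built from.
upgraded_total = base_build + extra_fans + eight_core_upgrade + gpu

print(f"Upgraded build: ${upgraded_total} before extra RAM and a bigger case")
```

That leaves only a little room under $3000 for the move to 32GB and an ATX mini-tower, so "about" is doing some work in that figure.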

Don't get me wrong: even considering it didn't turn out as I planned, overall I like the Blackbird a lot, and while not a revolutionary design I think it's a strong, gutsy choice by Raptor to put their efforts into this sort of lower-end machine. It's clearly captured a lot of people's interest and I think Raptor will sell quite a few of them, which is great for the ecosystem and for the chances of having robust libre options on the desktop. I think it's certainly possible to get a lower-cost rock-bottom configuration like this to cover many people's uses, especially as a secondary computer, but I think you'll need to be a little more generous if you want this as your OpenPOWER workstation. Budget in the 8-core and a GPU and do it right, and you'll end up with a better priced option good enough to be your daily driver, yet running a performant architecture that's more open, less creepy, comparably specced and actually exists.

PowerPC to the people.

Flying the Blackbird

The Blackbird is now loaded into my home theatre rack. Just a couple tastes before the "full mini-review" later this week once I've ironed out a few other things.

The Mac Pro makes quesadillas again

It seems appropriate on the verge of the Blackbird's release that Apple would release the new "throwback" Mac Pro, back in a new smaller cheesegrater case that looks like a Power Mac G5 with an eating disorder. The "basic" eight-core system with 32GB of RAM, 256GB SSD and a Radeon Pro 580X will set you back $6000 plus tax and shipping. Funny, my eight-core Talos II with 32GB of RAM, a 500GB NVMe SSD and an AMD WX 7100 card shipped to my door cost me around $7300. Which would you rather buy? Which one would be a computer you actually, you know, own and can trust from the silicon up? Meanwhile, which one's OS is slowly moving to merge with its mobile version? Which one makes security choices for you? Do you want Raptor's T2 ... or Apple's?

Don't get me wrong: this is heaps better than the trash can, and might make a liveable system if I were still in the market for a Mac. But if you're actually considering buying one of these yet you think the Talos II is overpriced, I'm not sure what to tell you, plus you really have no excuse not to consider the Blackbird which can also offer a comparable loadout likely for less. My Blackbird arrives today after an untimely delivery exception, so I'm hoping to finish the review this week.

Void Comes Alive CD!

You don't have to be a racially confused puppet(*) to run Void Linux now; you just need a supported Power CPU and one of the brand-spanking-new Power ISA Void live ISOs with their interactive installer. As reported on Twitter (thanks reader Karl S), now that 32- and 64-bit efforts are getting unified you can run Void on pretty much any mainstream Power CPU from a G3 to a POWER9 and you can now bootstrap the system directly from the Live image instead of having to install something else first. Interestingly, while the 32-bit build will run without AltiVec, the 64-bit builds require it, meaning the only 64-bit system supported earlier than POWER6 is the G5 (POWER5 isn't). ISOs are available for any combination of 32 vs. 64 bits, big vs. little endian (on 64-bit) and musl vs. glibc. Package availability may temporarily vary based on your choice, though the goal is parity between all of them. Support for some of the unusual variant and console CPUs may come in the future.

Incidentally, got shipping notification for my Blackbird today!

(*apologies to Arrested Development fans)

Fedora 30 mini-review on the Talos II

It's upgrade time! We're Fedora users here at Floodgap Talospace because Fedora was one of the earliest distros to "just work" on the Talos II (we've run it since F28). This is not intended as a general review of Fedora 30, just of the things that are likely to matter to Talos users. There are relatively few things in F30 that are unique to Power ISA or the Talos, particularly since 128-bit long double slipped to F31, but it's still an important update all the same.

Again, for those of you unfamiliar with Fedora, it is essentially the upstream for Red Hat (or Red Hat is its downstream, depending on your point of reference). It tends to incorporate new changes earlier than many distributions and is notable for having no LTS branch per se, since RHEL would in effect serve that role. Major releases come out roughly every six months and are maintained on an N+2 system; Fedora 28 becomes unsupported about one month after F30's release, which means this week. Fedora 31 is expected around the end of October 2019.

Although you can update through GNOME Software, I prefer to do it from a text boot at the command line to eliminate any variables. The steps are the same as for F29. Virtually everything should be available from the mirrors by now for ppc64le; I have a large number of packages installed and the only thing that didn't appear in the repos was my old Perl-Qt4 bindings (this required adding --allowerasing to dnf system-upgrade download --releasever=30 to remove rather than vainly attempt to update it). Remember that if you don't have your GPU's firmware loaded into Petitboot and you have the VGA disabled, you will not see the graphical installer. While the installation will proceed anyway, you will only be able to monitor it by pressing CTRL-ALT-F2 for an alternate console, logging in as root (other UIDs are locked out) and periodically issuing

dnf system-upgrade log --number=-1

which displays the log so far. The system will automatically reboot upon completion.
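For reference, here is a minimal sketch of the whole sequence as a shell function. It assumes the standard dnf system-upgrade procedure from Fedora's own documentation; the text above only says the steps match F29, so treat the exact order as an assumption rather than a transcript of what was typed. RUN and upgrade_to_f30 are names made up for illustration.

```shell
#!/bin/sh
# Sketch only: the usual Fedora release upgrade, wrapped in a function so
# that nothing runs until you call it. Run as root from a text console.
# RUN is a hypothetical dry-run switch: RUN=echo prints each command instead.
RUN="${RUN:-}"

upgrade_to_f30() {
    # Bring the currently installed release fully up to date first.
    $RUN dnf upgrade --refresh || return 1
    # The system-upgrade subcommand is provided by this plugin package.
    $RUN dnf install dnf-plugin-system-upgrade || return 1
    # --allowerasing lets dnf remove packages it cannot update
    # (such as the Perl-Qt4 bindings mentioned above).
    $RUN dnf system-upgrade download --releasever=30 --allowerasing || return 1
    # Reboot into the offline upgrade; the machine reboots again when done.
    $RUN dnf system-upgrade reboot
}
```

Calling it with RUN=echo set first is a cheap way to double-check the commands before committing to them for real.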

One improvement this time around was the black screen lockup didn't occur after the reboot as it did in F29; the system went straight through to the login prompt unattended.

As is usual for Fedora releases F30 includes an upgrade to GNOME, this time to version 3.32. This review isn't primarily about GNOME, which people have a love-hate relationship with, and I suspect the problems in this version aren't unique to the T2 even though I don't run Fedora on anything else. But this release does seem less polished and/or more troublesome than the GNOME update in F29. On the first start of GNOME, the Dock wasn't populated, items in Applications didn't show up and some extensions didn't load. Fortunately the second start fixed most of it. However, GNOME 3.32 also broke the Mac-alike theme I use, which now has weird proportions in gnome-terminal and gaps in Settings (and of course no one spends any time on documenting exactly what CSS theme selectors go with what widgets or I'd fix it myself). In addition, the UX changes are of dubious merit, particularly the new standard app icons which to my sensibilities are garish and ugly; the icons in 3.30 weren't particularly wonderful either but they were certainly easier on my eyes. Finally, GNOME Web just sits there and spins until you make the window close, so yeah, guess it'll just be Firefox now. However, I do suspect this is a Talos-specific problem; it also occurred in F29.

Ordinarily I use X for windowing (a convenient startx brings me in from the text console). Since I hadn't really tried Wayland on Fedora on the T2 before, I started it up via dbus-run-session -- gnome-shell --display-server --wayland. It did seem to work fine for Firefox, GNOME Terminal and xterm, but oddly the Settings app wouldn't start, some games refused to run and window theming was inconsistent between Wayland-ready and XWayland apps. I didn't notice much of an advantage to it in terms of speed or stability, so while it's close to being a functional replacement I don't think it's quite ready for primetime. At least not for my usage, anyway.

Deepin is officially supported in F30 and is available for ppc64le from the standard Fedora repository (dnf install deepin-desktop), though GNOME hasn't made me hate it enough yet to try to hate something else.

Overall, F30 is some steps forward and some steps back, but if you're on Fedora you know the treadmill never stops. Fortunately, given that IBM owns Red Hat and Fedora is Red Hat's upstream, we can expect that Power ISA will be a well-supported platform on both Fedora and RHEL for the foreseeable future, and folks like Dan H who are more deeply embedded in the Fedora developer ecosystem help to maintain release quality. I look forward to F31 for further improvements but in the meantime F30 is still a solid release on the T2 and there's no good reason not to use it.

OpenSUSE updated to 15.1

OpenSUSE is updated to version 15.1. Among other updates are an improved graphics stack with backported later driver support, improvements to the YaST installation and configuration tool, and additional desktop refinements. OpenSUSE should "just work" on the Talos II; the ppc64le installation is available from the download Ports tab. See the release notes for more information on installation and upgrades.

Firefox 67 on POWER

Firefox 67 has been released and to my relief (though I now build smoketest builds of Firefox on ppc64le approximately weekly to find such problems) mostly builds uneventfully. It has a number of nice features, including enhanced content blocking, improved full keyboard accessibility and various performance improvements. The marquee GPU-accelerated WebRender isn't on most Linux systems yet but that's coming soon, hopefully. I haven't experimented with it yet myself but the existing GPU acceleration works fine on this Talos II with the BTO AMD WX7100 card (set layers.acceleration.force-enabled to true in about:config).
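If you'd rather not click through about:config on every fresh profile, the same preference can be set from a user.js file in the profile directory, which Firefox reads at startup. This is only a sketch: the function name and the example path are made up for illustration, and your profile directory will be named differently.

```shell
#!/bin/sh
# force_layers_accel PROFILE_DIR: make sure user.js in the given Firefox
# profile directory sets layers.acceleration.force-enabled to true.
# The function name is hypothetical, not part of Firefox.
force_layers_accel() {
    prefs_file="$1/user.js"
    pref_line='user_pref("layers.acceleration.force-enabled", true);'
    # Append only if that exact line is not already present
    # (-F fixed string, -x whole-line match).
    if ! grep -qxF "$pref_line" "$prefs_file" 2>/dev/null; then
        printf '%s\n' "$pref_line" >> "$prefs_file"
    fi
}

# Example (the profile directory name here is a placeholder):
# force_layers_accel "$HOME/.mozilla/firefox/xxxxxxxx.default"
```

Running it twice is harmless, since the line is only appended when it is missing.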

That brings up the first catch, because I did say mostly uneventfully: changes to profile handling. If you build from mozilla-release as I do and recommend, you will end up with a "nightly release" version (assuming you don't pass --enable-release, which I advise you don't pass right now). Starting with Fx67, nightlies from any tree will try to create a new profile separate from your previous profile, but the old one remains intact. You can explicitly select it from the Profile Manager (pass -P), or, if you know already which profile you want to use, you can specify it with -p (on my system the default profile is unimaginatively called default, ergo, -p default).

The second catch is one I haven't figured out the cause of, whether it's a kernel or a Firefox bug, but not infrequently it will throw warnings that look like this in dmesg (this is on a 5.0.x kernel):

[337262.237052] ida_free called for id=170 which is not allocated.
[337262.237089] WARNING: CPU: 6 PID: 12276 at lib/idr.c:519 ida_free+0x114/0x1e0

If you are on a distribution where kernel warnings get converted into notifications (like the Fedora machine I'm typing on), this can be rather obnoxious. If you are badly afflicted, you can temporarily turn them off with these instructions. I haven't found the root cause for it yet and it's hardly a great hardship, but it didn't occur in Firefox 66.
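To see how badly you're afflicted, you can count the warnings in the kernel log. A minimal sketch, assuming your messages use the same wording as the two lines above; count_ida_warnings is a made-up name, and on some systems reading dmesg requires root.

```shell
#!/bin/sh
# count_ida_warnings: read kernel log text on standard input and print how
# many "ida_free called for id=..." lines it contains.
# The function name is hypothetical.
count_ida_warnings() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # exit status clean when the count is zero.
    grep -c 'ida_free called for id=' || true
}

# Example: dmesg | count_ida_warnings
```

Only the "ida_free called" line is counted, so each warning shows up once even though the kernel logs a WARNING line and a backtrace along with it.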

As far as the Firefox JIT for POWER9, I'm still plugging along, but other than a minor pull request to the documentation it's still 100% yours truly working on it. Of the remaining pieces the macro assembler is about 2/3rds written, leaving the low level assembler after that, and then trying to make it build. However, I'm also in the midst of a systems update for TenFourFox, which I still have a commitment to maintain in the short term, so any help will get it in your hands faster. Hopefully the commits make it clear how I'm translating the MIPS backend into POWER9, using all that 3.0B goodness (population count instructions! trailing zero count instructions! load PC in one instruction! it's an assembly language candy store!).

It's been a little while since I posted the .mozconfigs I use, so rather than direct you to old entries I'll just reproduce them here. Note that MOZ_PGO and MOZ_LTO don't seem to properly work and may generate defective binaries, thus their absence, and I explicitly pass --disable-release to an opt build because of various minor problems which hopefully we'll eventually smoke out. Adjust the number of cores as you like; this is a dual-4 system, so with 32 threads available I reserve 8 to let me still play Descent II during build runs. :)


# Debug build
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd

export GN=/usr/bin/gn # if you have it


# Optimized build
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9"
ac_add_options --disable-release
ac_add_options --enable-linker=bfd

export GN=/usr/bin/gn # if you have it
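The -j24 in both configurations is just the thread count minus the threads held back for other things: 2 sockets x 4 cores x 4 SMT threads = 32, less 8. Here is a small sketch for working out the equivalent number on your own machine, assuming SMT4 is enabled so that nproc reports every hardware thread. build_jobs is a made-up helper name.

```shell
#!/bin/sh
# build_jobs TOTAL RESERVE: print TOTAL minus RESERVE, but never less than 1,
# for use as the number after -j in MOZ_MAKE_FLAGS.
# The function name is hypothetical.
build_jobs() {
    n=$(( $1 - $2 ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example: on the dual-4 described above, nproc reports 32 with SMT4 on, so
# build_jobs "$(nproc)" 8
# gives the 24 used in the configurations.
```

The floor of 1 is there so that a small machine with a large reserve still gets a working -j value.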

What's missing in this picture

I've got the case (an inexpensive mATX Silverstone SST-ML03B), I've got the memory (16GB). The PSU, optical drive, wireless keyboard and WiFi should arrive next week. Now, what am I missing? Think think think!

Since the whole idea is a POWER9 system for the more price-sensitive, the trimmings cost about $500 on Amazon (minus tax and shipping) and could probably be found elsewhere for less. I also got in on the 4-core $999 Blackbird bundle special price, so with the 2U HSF and tooling that was $1090 before tax and shipping (now it would be roughly $1380) for a base outlay of about $1600. This is a nice attempt at a barebones 8-core for $1950, also apparently minus tax/shipping. Yes, I know you can get an Intel system for less, so don't even bother posting that. If price is your highest priority, you already know you're in the wrong place, but at least now price can still be a priority for what is a decent libre system regardless.

Obviously the aim for us here in the Floodgap household is to use it as an HTPC and that's how I'll be reviewing it. If you just want it as a workstation or to jam in a closet as a low-end server, you can almost certainly cut this parts list further.

ZombieLoad does not affect POWER9

If it's Tuesday, there must be yet another speculative execution attack debuting with a funny name and this Tuesday's entry is ZombieLoad. ZombieLoad works on the same conceptual basis of observable speculation flaws to exfiltrate data but implements it with a new class of Intel-specific side-channel attacks utilizing a technique the investigators termed MDS, or microarchitectural data sampling. While Spectre and Meltdown attack at the cache level, ZombieLoad targets Intel HyperThreading (HT), the company's implementation of simultaneous multithreading, by trying to snoop on the processor's line fill buffers (LFBs) used to load the L1 cache itself. In this case, side-channel leakages of data are possible if the malicious process triggers certain specific and ultimately invalid loads from memory -- hence the nickname -- that require microcode assistance from the CPU; these have side-effects on the LFBs which can be observed by methods similar to Spectre by other processes sharing the same CPU core. Other internal buffers of potential value can also be sussed out by related MDS-style techniques.

Because of the limited bandwidth of the LFBs and the effectively streaming nature of the technique, an attacking process can't select arbitrary addresses and therefore can't easily read arbitrary memory. Nevertheless, targeting easily recognizable kinds of data can still make the attack feasible, even against kernelspace. For example, since URLs can be picked out of memory, this apparent proof of concept shows a separate process running on the same CPU victimizing Firefox to extract the URL as the user types it in. As the user types, the values of the individual keystrokes go through the LFB to the L1 cache, allowing the malicious process to observe the changes and extract characters. By its nature there is much less data available to the attacking process but that also means there is less data to scan, making real-time attacks like this more feasible combined with other attacks or social engineering.

However, ZombieLoad is pretty much irrelevant against POWER9 because the LFBs it attempts to monitor are specific to Intel's implementation of HyperThreading (which is true for really any SMT implementation other than Intel's; the authors of the attack say they even tried on other SMT CPUs without success, almost certainly AMD, though it is not stated for certain that they tested on Power ISA). Even for unpatched Intel machines the actual risk from this (or even most speculative execution attacks, to be sure) is probably limited because it requires running a malicious process to do the snooping and such processes almost certainly have other, more reliable ways of pwning such machines. The decision to patch may simply come down to how much risk you're willing to tolerate: nearly every Intel chip since 2011 is apparently vulnerable and the performance impact of fixing ZombieLoad varies anywhere from Intel's rosy estimate of 3-9% to up to 40% if HT must be disabled completely.