Talospace

Showing posts from 2019

Posted by ClassicHasClass on December 21, 2019

Updates to Alpine Linux and SUSE

Alpine Linux has been updated to 3.11.0. One of the musl-based releases, 3.11 updates to Linux 5.4.5, musl libc 1.1.24, gcc 9.2.0, LLVM 9.0.0 and Busybox 1.31.1. The release also features "initial GNOME and KDE support": remember, one of its selling points is its size, achieved partially by not shipping with any desktop environment, though I imagine quite a few use Xfce. The release also adds Vulkan support, and Rust on all supported architectures except s390x. Multiple install options are available for ppc64le.

Also recently updated is SUSE Linux Enterprise 12 Service Pack 5. POWER9 has been supported on SUSE since 12 SP3, and should boot on a vanilla PowerNV system like the Raptor family (if you are running SUSE or its free tier, openSUSE Leap, on Raptor hardware, post in the comments). SUSE Linux Enterprise is available (both 12 SP5 and 15 SP1) with a 60-day free trial.

Posted by ClassicHasClass on December 07, 2019

Firefox 71 on POWER

Firefox 71 is out. It's not a major system upgrade, but it has some nice milestones such as improved developer tools, support for Media Session and native MP3 decoding. It pretty much works as is on Power ISA and I've noticed no new issues with it so far. (UPDATE: See comments. Apparently the extensions I'm using are unaffected by bug 1601424. However, this looks like a general Linux issue on all architectures.) The configurations I am using are unchanged from Firefox 67.

This is the last of the 6-week sprints, moving to a 4-week cadence for Firefox 72. As a result I will be doing smoke test builds about every 10-14 days to ensure early regressions on Power are intercepted. Unfortunately I have not had time to do much more work on the JIT because of the holidays, family responsibilities and $DAYJOB. I won't be offended at all if someone beats me to the punch especially as I'm starting to see WebAssembly becoming a hard dependency even for some add-ons (without the JIT there is no support for wasm).

Posted by ClassicHasClass on December 07, 2019

AMD Navi support coming to OpenPOWER

Most of us (including yours truly) are using Polaris or Vega AMD GPUs, and the BTO WX7100 option Raptor offers is Polaris, but Raptor has issued kernel patches to enable the latest Navi GPUs on OpenPOWER. This issue was traced back to code essentially locking Navi support to x86, and the new patches have been confirmed working on Void Linux's Power ISA port and Fedora 31. To make the most of these changes, you should also upgrade to at least Mesa 19.3 and LLVM 9.0.1 when these become finalized to avoid various other cross-platform issues with Navi. Still, more hardware support is good support, and the changes are straightforward enough that they should get accepted into the kernel relatively soon.

Posted by ClassicHasClass on November 27, 2019

How far we've come in Power ISA

We haven't done a lot of chip pr0n lately, so let's do a little as a fun aside. Recently I bought a naked 601 (in this case a 601+, the second revision) since I figured it would look nice in my computer room along with my POWER9 die, shown at right in their display boxes.

Just how does that old 601, the very first PowerPC, compare with the POWER9? With painstaking precision propping my Pixel 3 on the end of a cereal box for height and eyeballing the distances with a Home Depot yardstick, I lined up both the 601 and the POWER9 die at the same scale, shown here.

The 601+ at left was manufactured on a 0.5 micron CMOS process in 4 layers, measuring 74 square millimetres. This is a die shrink of the original, which was manufactured at 0.6 microns in 4 layers and measured 121 sq mm. Both the 601 and 601+ have a single 32-kilobyte unified L1 cache, which was dispensed with for the 603, and 2.8 million transistors. At 80MHz, the original 601 consumed 8 watts; the 601+ die shrink reduced it to 4 watts at 100MHz.

The POWER9 die at right, on the other hand, is rather larger at 693.37 sq mm. This is almost six times larger than the original 601 and about 9.4 times larger than the 601+, but with multiple cores and L1, L2 and L3 cache all crammed onboard in 8 billion transistors, almost 3,000 times more. Shipping POWER9 chips are built on a 14nm FinFET on SOI process in 17 layers and consume between 90 and 190W depending on core count at clock speeds from 2.25GHz to 3.8GHz.
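If you want to check those ratios yourself, here is a quick Python sketch using only the figures quoted above. The dictionary and variable names are mine and purely illustrative.

```python
# Die areas (square millimetres) and transistor counts as quoted in this post.
DIE_AREA_MM2 = {"601": 121.0, "601+": 74.0, "POWER9": 693.37}
TRANSISTORS = {"601": 2.8e6, "POWER9": 8e9}

# How many times larger the POWER9 die is than each 601 variant.
area_vs_601 = DIE_AREA_MM2["POWER9"] / DIE_AREA_MM2["601"]
area_vs_601_plus = DIE_AREA_MM2["POWER9"] / DIE_AREA_MM2["601+"]

# How many times more transistors the POWER9 has than the 601.
transistor_ratio = TRANSISTORS["POWER9"] / TRANSISTORS["601"]

print(f"POWER9 vs 601 die area:  {area_vs_601:.2f}x")
print(f"POWER9 vs 601+ die area: {area_vs_601_plus:.2f}x")
print(f"POWER9 vs 601 transistors: {transistor_ratio:.0f}x")
```

The area grew by less than an order of magnitude while the transistor count grew by more than three, which is the process shrink from 0.6 microns to 14nm doing the heavy lifting.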

In this oblique view you can see the functional units a bit better. The amount of power in modern Power ISA chips would have been absolutely unthinkable in 1992, particularly at modern process sizes. I, for one, welcome our future POWER10 overlords.

Finally, I've been meaning to post this picture for a while that Mikey Neuling sent, with Anton Blanchard and Hugh Blemings showing off Microwatt running on an FPGA board (and Hugh's well-traveled Blackbird demo board) at OpenPOWER Europe in October. Now that's a mighty fine logo. The person who came up with that should be congratulated.

If you're States-side, happy Thanksgiving, and if you're not, eat some turkey anyway because it's better than that burger you were going to have.

Posted by ClassicHasClass on November 24, 2019

MIPS Open Initiative isn't anymore

Less than a year after it launched, the MIPS Open Initiative has apparently terminated. Despite the happy talk still up on the main site, and no sign of public notification, Hackster is reporting that Wave Computing has terminated the offering with an extract from a legal letter stating it became effective on November 14. The complete letter is posted here. Indeed, the new sign up page simply 404s, though as of this writing the GNU tools downloads are still up.

It's debatable how open MIPS Open Initiative really was, compared to, say, OpenPOWER (or RISC-V or OpenSPARC). MIPS Open had a more legally problematic licensing structure called "open use" which only allowed registered participants to develop under the MIPS Open Architecture 1.0 license, requiring prior approval and developers under the MOA-1 license "to provide upon request by MIPS or its authorized MIPS Open Verification Partner, information demonstrating that such implementation in its current format is a MIPS Open CERTIFIED Independent Core." No community governance existed and anecdotal reports circulated that gaining access to critical developer resources was arduous and bureaucratic. For OpenPOWER, however, all an implementor has to do is meet the specification and obey the trademark requirements. More importantly, now that OpenPOWER is part of the Linux Foundation, IBM can't backtrack even if they wanted to (and it's pretty clear they're happy with the response to OpenPOWER so far). And the free VHDL implementation Microwatt is just a GitHub clone away.

Meanwhile, clearly Wave doesn't want people to realize what's happened, given how delayed the news has been and how little announcement they themselves have made. Even less clear is what prompted the withdrawal, but the most likely theory is Wave is planning to put MIPS up for sale. Funded as an AI startup and lacking customers, the last thing Wave's backers want is IP they can't unload for dollars, which may explain why Wave CEO Art Swift was pushed out after four months. The question is whether anyone is commercially interested anymore.

Posted by ClassicHasClass on November 19, 2019

Bettering the BMC and bring-up

Pretty much every computer these days has a service processor of some sort for bringing up the system and the main CPUs, just sometimes under different names (the Intel Management Engine could be considered a form of one, albeit with a lot more black boxes tacked on). So do POWER9 systems from the smallest Blackbird to the biggest IBM E980, and for many of these systems that service processor is the BMC, or the baseboard management controller. (Systems capable of PowerVM use one or more PowerPC (405?)-based Flexible Service Processors, or FSPs, which are accessed over ASMI. This includes big Power boxes like the E980 and even some z/Machines but also many pre-OpenPOWER systems like the 8203-E4A POWER6 that runs Floodgap. Since this is not particularly relevant to OpenPOWER, I won't talk about it further in this article.)

BMCs are present on multiple systems from multiple manufacturers, including many POWER8s and, most relevant to us, the POWER9 Romulus reference platform on which the Raptor Talos II family (including the Blackbird) is based. Most of these have AST BMCs, typically either the ASpeed AST2400 or 2500; all shipping Raptor systems use the AST2500. This is an 800MHz ARM11 CPU (the ARM926EJ is the older AST2400's core) with a secondary 200MHz ColdFire core, a built-in 2D framebuffer, USB and dual GigE MACs. Among its many tasks it controls the power rails, provides firmware stored in the PNOR flash for the main system CPUs, provides services to other autonomous subsystems over IPMI, receives temperature data from the main system CPUs' On-Chip Controller (OCC) and manages fan and environmental controls during normal operation. During bringup it is the main processing element before the primary CPU(s) are enabled; it is what signals the Power cores to execute from their burned-in OTPROM which then transfers control to the self-boot engine (SBE) firmware SEEPROM.

Besides the fact that the BMC has overwhelming ability to affect virtually all system components, including while the system is running, the BMC also directly influences how quickly the system can be brought up. To improve the security and auditability of BMC-based systems, the BMC in just about every OpenPOWER machine runs OpenBMC (not to be confused with the identically-named and similarly functioning Facebook OpenBMC), a small open-source Linux distribution tailored to its unique tasks.

The open availability of both the OpenPOWER firmware and OpenBMC itself is what really makes our systems truly ours from the firmware up. You can download and audit these critical pieces, including Raptor's Talos OpenBMC, and you are encouraged (and actively supported) to build and install your own. The problem, however, is that OpenBMC is relatively slow to get the system going when power is applied and the machine can't be started until it does. On a server which is normally up this is generally unimportant; my POWER6 gets rebooted pretty much only when the backup power fails. Similarly, this T2 is usually running all the time. My Blackbird, on the other hand, boots when the projector is turned on and gets shut down when I'm not in the home theatre room. With over two minutes to get from turning on the power strip to a Fedora login and almost a full minute of that just to get the ability to start main power, this is a major drag and harms the ability to dogfood POWER9 in smaller applications. There is also the small but non-zero risk that a power failure during access to the flash could brick the system. The longer the bring-up time, the longer that potential window of vulnerability.

Fortunately, it looks like further advancements are now finally making a dent in the BMC bring-up delay. Almost 25% of the boot time on a Witherspoon (AC922) system was shaved off by converting the mapper service from Python to C++, and further savings were realized with straightforward wins such as eliminating other older Python components and adjusting the priority of the system service. Another big winner was apparently moving to dbus-broker, which is D-Bus compatible but higher performance. With all of these the OpenBMC bring-up on their Witherspoon box has been reduced substantially and the upcoming AST2600 is reportedly three to four times faster.

This is a nice improvement even if it's probably most of the low-hanging fruit, and the OpenBMC team should get a solid thumbs up for the work here. I look forward to this appearing in a future firmware update for the T2 family. However, OpenBMC start time is only one (albeit significant) piece of the startup puzzle: once main power is on, from the time the Blackbird boot screen appears (i.e., IPLs Hostboot) through Skiboot to the Petitboot menu the delay is still a hair over one minute, which compared to other platforms still seems way too long. Much of the time spent seems to be in Hostboot before Skiboot even gets initialized, but even Skiboot adds some overhead. Again, if you're like me and this is your primary computer, you won't deal with this often. But there are lots of Blackbirds and T2 Lites out there which are sidecars and while this is an obvious first world problem it's still a usability penalty to be paid.

None of this is a crippling fault with the platform, but particularly for the workstation market many of us are in, it's suboptimal. Therefore, continued improvement in basics like these makes the liveability of OpenPOWER on the desktop even better than it already is. And these improvements in OpenBMC hopefully should be just the beginning.

Posted by ClassicHasClass on November 16, 2019

Debian 10.2 available

Debian and Fedora (and their various downstreams and derivatives) are probably the top two Linux distributions on Raptor hardware, and now Debian is updated to 10.2. This is a maintenance release primarily addressing security updates and some critical fixes. The ppc64el images are already available for download.

Posted by ClassicHasClass on November 11, 2019

Fedora 31 mini-review on the Blackbird and Talos II

As promised here's my periodic mini-review after upgrading both our Blackbird and Talos II systems to Fedora 31, the most current release, typed up in Firefox 70 running on Fedora 31 on my T2. Even though there are many of you who don't run Fedora on OpenPOWER, these reviews are still relevant because Red Hat does a lot of the work on the components you do use, and problems are likely to turn up here first. Much to my disappointment, one late breaking note is that 128-bit long double still isn't in Fedora ppc64le, and didn't make 31 either. I can't tell from the bug or the wiki page why the deadline keeps slipping.

I did the upgrade first on my home theater GPU-less 4-core Blackbird because it was already bitten by the librsvg2 issue and early reports indicate the updated LLVM-rustc pair in F31 fixed it. The steps are the same as I used for F29 and F30 except for changing the parameter to --releasever=31 (duh). A quick check demonstrated updating librsvg2 to the latest available for F30 didn't solve the problem, so I went on to downloading the packages for F31.

When I rebooted into the F31 installer, however, the projector freaked out and went into an endless loop of trying and failing to sync to the display. I don't know if it was unhappy with the video mode the installer set, but even the A/V receiver wouldn't pass through the HDMI video (the T2 did something similar which I'll note in a moment). I eventually had to pull up a second VTY and then and only then would the projector display anything. I then logged in as root and monitored the messages from dnf with periodic dnf system-upgrade log --number=-1 | tail -10 until the machine rebooted on its own.

Fortunately, F31 came right back up. I've done only minimal customization on the Blackbird, so pretty much everything transferred over unchanged, and no packages had to be dropped to do the installation. F31 comes with GNOME 3.34, which is alleged to have performance improvements, and actually I was very pleasantly impressed as you can see from the screen shots:

Video playback on this GPU-less Blackbird was a lot better in this release; in fact, Firefox 70 didn't drop any frames or audio at all (though I'm sure the rapidly improving VMX/VSX support has something to do with it ;). Although VLC on the unaccelerated Blackbird is still not perfect and playback was not completely smooth, pixel pushing was much improved in both DVD and Blu-ray playback and there were fewer dropouts with the TOSLINK surround sound (mplayer of course still played everything just fine). Unfortunately, I think the improvements are strictly in Mutter and GNOME itself, not llvmpipe, because Xonotic was still only eking out a bare 5fps at 1920x1080 as in F30.

As advertised, librsvg2 was working again. If you have exclude=librsvg2 in /etc/dnf/dnf.conf, you should remove it before you do the update. Every GNOME release has some vanity changes for no good reason and the new icons and minor UI tweaks seemed largely unnecessary but they weren't objectionable. On the apps side, GNOME Videos doesn't seem to grok the length of my AIFF music files correctly, though it does play them (MP3 was fine). GNOME Web was also working again after a long hiatus but it seemed to have minor glitches, and since there are people who use POWER9 now who are actually helping to maintain Firefox, you should just use Firefox.
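For the exclude= cleanup, here is a minimal Python sketch. It works on the text of the config file rather than touching /etc/dnf/dnf.conf directly (you need root for that anyway). The function name drop_exclude is hypothetical, and it assumes the packages on an exclude= (or excludepkgs=) line are separated by spaces or commas.

```python
def drop_exclude(conf_text: str, package: str) -> str:
    """Return conf_text with `package` removed from any exclude line.

    Assumption: entries are separated by spaces and/or commas.
    Commented-out lines (starting with '#') are left untouched.
    """
    out = []
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() in ("exclude", "excludepkgs"):
            kept = [p for p in value.replace(",", " ").split() if p != package]
            if not kept:
                continue  # nothing left to exclude, so drop the whole line
            line = f"{key.strip()}={' '.join(kept)}"
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative input only; your real dnf.conf will have other settings.
sample = "[main]\ngpgcheck=1\nexclude=librsvg2 kernel*\n"
cleaned = drop_exclude(sample, "librsvg2")
print(cleaned)
```

To use it for real, read the file, pass the text through drop_exclude, check the result by eye, and only then write it back as root. A one-line edit in your favourite editor does the same job.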

Since I didn't find any obvious major regressions in my normal usage, the next step was to update the Talos II. The T2 does not have the WX7100 firmware in the BMC PNOR, so I expected to run the installer "blind," but interestingly my LCD would not sync to the display either just like the projector wouldn't. The LCD synced fine when I popped open a VTY, just as with the Blackbird, so I'm thinking there's something up with the installer's video mode. Otherwise, the install proceeded unattended and rebooted uneventfully.

As my daily driver the T2 is rather more customized than the Blackbird. It's pretty much a given that I'll lose some of my GNOME extensions in the upgrade or the custom "classic" OS X-like theme I use will have some odd breaking edge case, and that happened here as usual. In this case Dash to Dock was the casualty and the GNOME Extensions Manager refused to update it, requiring me to manually install it. Tweaks and Settings still have visual issues with my theme, but didn't seem worse, just annoying. The only things installed that didn't transfer over were my custom Perl libraries which got eaten and needed to be reinstalled. I know, I know, I'm the last person on Earth who still likes Perl apparently.

On the T2 with its WX7100 workstation card, the graphical performance improvements were not as notable as with the Blackbird, but some things seemed better, and a few 3D games that chugged a bit on F30 seemed faster on F31. I'd still say performance was a net win, just smaller.

Both systems use X.Org, but I do try to at least test Wayland. My T2 is configured to come up in a text boot so that I have a console to fall back on, an artifact of originally being Fedora Server and converted to Workstation. I was able to start GNOME in Wayland from the command line with XDG_SESSION_TYPE=wayland exec dbus-run-session gnome-session (instead of startx). I would call it incrementally improved from before. Some apps (mostly games) still don't start, and some games that do start have odd aspect ratios, but more at least work. The issue with some apps, particularly XWayland ones, not obeying GNOME theming seems to be fixed, and while it didn't feel quite as snappy as X.Org it was still better than previous releases. However, my custom appmodmap tool for dynamically remapping the keyboard only works with things that run in XWayland, because it watches X events to know which window is up, and the GNOME Wayland compositor currently has no plans to offer this information. So back to X.Org.

However, the situation was even worse with the Blackbird. Since overall graphical performance seemed better I decided to push my luck and see how it worked in Wayland (which previously ran like treacle on Thorazine in January north of the Arctic Circle), but as soon as I switched to Wayland and rebooted, this time the Blackbird would not come up in a graphical boot at all. On several test boots after the kernel messages it immediately went to a grey screen with the mouse pointer and then froze hard, requiring me to power cycle it because I couldn't open any VTYs or get to the OS. Since this is a workstation installation rather than a server installation converted into workstation, I had to boot the Fedora rescue installer to fix /etc/gdm/custom.conf because I couldn't get the machine to come up otherwise. If you are installing F31 from scratch, you may want to make sure that WaylandEnable=false is uncommented before you try your installation out.
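The custom.conf fix can be sketched in Python as below. This is only an illustration operating on the file's text: force_xorg is a name I made up, and the sample input assumes the stock layout where a commented-out WaylandEnable=false line sits under the [daemon] section.

```python
def force_xorg(conf_text: str) -> str:
    """Return GDM config text with WaylandEnable=false active.

    An existing WaylandEnable line (commented out or not) is replaced;
    otherwise the setting is added under [daemon], creating the section
    if it is missing.
    """
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        # Strip any leading '#', spaces and tabs, then ignore spaces around '='.
        bare = line.lstrip("# \t").replace(" ", "")
        if bare.startswith("WaylandEnable="):
            lines[i] = "WaylandEnable=false"
            break
    else:
        if "[daemon]" in lines:
            lines.insert(lines.index("[daemon]") + 1, "WaylandEnable=false")
        else:
            lines += ["[daemon]", "WaylandEnable=false"]
    return "\n".join(lines) + "\n"


# Assumed stock layout, for illustration only.
SAMPLE = (
    "[daemon]\n"
    "# Uncomment the line below to force the login screen to use Xorg\n"
    "#WaylandEnable=false\n"
    "\n"
    "[security]\n"
)
fixed = force_xorg(SAMPLE)
print(fixed)
```

Running it twice gives the same result, so it is safe to apply to a file that has already been fixed. From a rescue environment you would apply the same change to /etc/gdm/custom.conf on the mounted root filesystem.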

Overall Fedora 31 is both a good release and a bad omen. Performance (at least in GNOME under X.Org) is overall much improved, especially if you don't have a GPU, but it's still obviously better even if you do. Some bugs were fixed and packages installed uneventfully. There were the regular growing pains in GNOME, but I didn't lose anything irreplaceable, and other than the usual bumps one experiences with custom themes and extensions pretty much everything just worked.

But Wayland on ppc64le continues to be worrisome. I must concede that at least on the T2+WX7100 things have improved since F30, and since I freely admit I'm a Wayland sceptic those of you who are heavily invested in it probably don't care about my opinion. But overall it's still a step backwards because there are still things that won't run in it, a big part of my own personal workflow may never work with it, and on the GPU-less Blackbird beforehand I couldn't use it and now I can't even start the machine in it. Meanwhile, Red Hat's made some very public signals that Wayland is the future and X.Org will be going away. In their rush to do so not much attention is being paid to people using 2D framebuffers with Wayland, and this is a real problem because no currently available GPU is libre and not supporting the built-in BMC in every shipping Raptor system is a waste (not to mention requiring people to incur additional expense just to get something to work that was "already working"). If you want a truly blob-free system, right now you just plain can't use Wayland, and it doesn't seem like they care.

Posted by ClassicHasClass on November 07, 2019

DD2.3 POWER9 steppings now available

Raptor now has SKUs for the Sforza DD2.3 POWER9 chips, which they're calling "POWER9 v2". Currently just the 4-core and 8-core are available, but the higher core counts are presumably soon to come. There is a slight price premium of around 15-20% for these over the DD2.2 CPUs, but they fix a number of errata including functional hardware watchpoints (no more YOLO mode) and add the new Ultravisor mode for enhanced security (which will be the subject of a future article). In addition, although TDP, clock speed and cache specifications are the same, improved Spectre v2 mitigations in this stepping (specifically count cache flushing with hardware assist) mean possible performance improvements particularly for branch-heavy workloads. Support for this feature should already be in current Linux kernels.

If you have a T2 family system, you can order these today, and the SKUs are reported as in-stock. They are drop-in replacements for all T2s and Blackbirds and because their TDPs are the same can use the same heat sinks and HSFs. Systems shipping now may still have DD2.2 chips in them, though Raptor says you can get a DD2.3 for a slight upcharge.

Posted by ClassicHasClass on November 07, 2019

Talos II and Talos II Lite officially FSF Respects Your Freedom products

No one disputes the Free Software Foundation practices what they preach, and no one disputes that their standards are strict. So hats off to Raptor, who today officially received FSF Respects Your Freedom designations for both the Talos II and T2 Lite (here's the official announcement).

The designation recognizes that the T2 and T2 Lite have full system schematics and source code available for the entire firmware stack from the BMC up, and no keys are needed to update or replace any firmware component unless you require your own. (The same applies to the Blackbird, too, of course; presumably its own FSF RYF certification is soon to follow.) Naturally the designation presupposes you are using a free distribution, as the FSF defines it.

The T2 family joins a relatively small number of complete systems that have RYF endorsements and given those systems' loadouts is easily the most powerful, at least as of this writing. Not only is this a nice win for Raptor, who have made libre computing a cornerstone of their company, but it's also a great validation for OpenPOWER. A designation like this from the FSF, who stakes their entire reputation on libre computing, is no small matter no matter how you slice it. Congratulations!

Posted by ClassicHasClass on November 05, 2019

FreeBSD 12.1 available

FreeBSD 12.1 is now available. This is largely a maintenance release. To the best of my knowledge this is the BSD with the best track record on OpenPOWER so far; it is otherwise a relatively straightforward 64-bit big-endian Power implementation. I'm still a NetBSD dweeb personally (on mac68k, macppc, cobalt and hpcsh) and I'm looking forward to someone porting it sooner or later, but if you want a BSD on your Blackbird or Talos II right now this is probably your best bet.

The installation directions for the Blackbird should work as is for the Talos II. However, if you've already got the ISO (not the .img) dd'ed to a USB stick, it seems to me that it should "just work" in Petitboot without all the goofing around at the BMC prompt (if you don't, though, then these instructions will allow you to bring the machine "up from nothing").

If you are already running FreeBSD, unfortunately it does not seem that the PowerPC port of FreeBSD supports freebsd-update(8) yet, though I imagine this is planned. FreeBSD 13-CURRENT boots and runs fine on the Raptor family as well, but no clear word on when that will reach release yet.

Posted by ClassicHasClass on October 30, 2019

Fedora 31 available

Fedora 31 is now available, the next iteration of the somewhat bleeding edge of Red Hat (the totally bloody-all-over-the-floor edge is of course Rawhide). It is of particular interest to me personally since the Talos II I'm typing on is running Fedora 30, and it's a useful canary for future hiccups on Power ISA especially because Red Hat is an IBM thing now. Even if you don't use Fedora personally, its relatively rapid update schedule can help identify and fix architecture-specific issues well in advance in your own distro of choice.

F31 moves to GNOME 3.34 (presumably with performance improvements, so I look forward to seeing how this performs on my GPU-less Blackbird) and glibc 2.30. This last is particularly important to Power systems because it may finally mark the end of the 128-bit long double saga by transitioning to the new float ABI. Fedora is also encouraging the use of toolbox, a workspace container system; this too is supported on ppc64le, though I haven't messed with it much yet. Finally, based on this report, F31's use of LLVM 9 should also solve the codegen and faulty assertion issue plaguing librsvg2. Since I now have two POWER9 systems here, I'll do a test upgrade on the Blackbird and then the Talos II once the package mirrors have caught up, reporting back as in our prior reviews, but if these improvements in fact live up to the release notes this actually sounds like a really nice release for us especially.

In miscellaneous notes, F29 will be unsupported one month after this point, so make sure you're upgrading if you're still on that, and 32-bit i686 is no longer a thing on Fedora. (32-bit PowerPC was unsupported long ago in F22, just to desperately keep on topic.)

Posted by ClassicHasClass on October 23, 2019

Firefox 70 on POWER

Firefox 70 is out and about. This is a very important release particularly for Power ISA because this includes a repaired 64-bit xpconnect and build system support for VMX and VSX (with VMX support in parts of the DOM and for libjpeg). VMX/VSX support is determined at runtime but I still advise if you build yourself to manually specify your CPU to the compiler (such as -mcpu=power9) to make sure everything is detected and better code can be generated. All these features work on both big and little endian configurations.

Fx70 is also the first release to officially enable the Quantum Render GPU-accelerated 2D compositor on all Windows-supported GPUs, which emerged from the Servo browser testbed as WebRender and has been gradually translated to Firefox. This is clearly the intended future of the browser, so we need to ensure it's operational on our platform.

AMD has been a supported GPU since Fx68 (Northern Islands, i.e., Radeon HD 6000 et al., and newer), so while Linux is not currently an officially supported Quantum Render target the WX 7100 sold with the Talos II should work. And, well, it does.

Performance is a bit sprightlier and I see better FPSes in demos, though our FPS rate is now increasingly JavaScript limited (yes, I know) as the rest of the rendering chain gets faster and faster. I have not encountered any stability or rendering issues with it so far. To enable WebRender, you need to enable hardware GPU acceleration in general and make sure that's working first; go to about:config, set layers.acceleration.force-enabled to true and restart the browser. I've been running with GPU acceleration myself for the past several releases, so I know it should work on at least the WX7100. Verify it's enabled by going to about:support and making sure that acceleration does not appear as "Blocked."

Once you have established GPU acceleration is enabled and operational, then go back to about:config, set gfx.webrender.all to true and restart the browser again. Go back to about:support; the window should look like the smaller one in the first screenshot. If sites go haywire, don't render right or seem to animate improperly, please flip those prefs back and compare so we can figure out why.

Northern Islands is a pretty low bar for WebRender and frankly if you're trying to run this on an even older AMD (or ATI??) GPU, you'll probably have lots of problems with almost certainly no benefit. Likewise, if you try to do this on Nvidia with nouveau, you're crazy. I don't see any reason why this wouldn't work on the *BSDs but I'd be interested to hear from anyone who has tried.

Meanwhile, more VMX and VSX improvements are in the pipeline and are certain to reach you faster with the increased release cadence in 2020. The .mozconfigs I personally use and support are unchanged from Firefox 67.

Posted by ClassicHasClass on October 20, 2019

Is the warrant canary still warranted?

UPDATE: Raptor will keep the canary but reduce the frequency to every six months. There appears to be some significant cost to them, so this seems like a good compromise to me.

Somebody is actually watching Raptor's warrant canary, and mentioned it hasn't been updated in 6 months (as of this writing the last date is March 3, 2019). Although my usual tendency is to glance at it before installing a firmware update, 1.06 is over a year old, so I hadn't noticed myself.

Conceptually, the warrant canary helps to protect purchasers by acting as a negative indicator if they are under a gag order regarding a subpoena or other state actor legal action: if the canary disappears or isn't reupped, then caveat emptor. Raptor's response in the Twitter thread suggests that the failure to update was inadvertent and my gut impression is this is probably true, but the real question is how likely Talos or Blackbird owners are to be targets for state-level threats. We're using niche machines here but the OpenPOWER workstation userbase tends to be more cognizant of how it can be monitored, and if we weren't on watchlists before OpenPOWER started getting more popular, especially in certain countries the number of workstations may now be at a level where such concerns are no longer preposterous.
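The "isn't reupped" half of that check is just date arithmetic, which this small Python sketch illustrates. The function name canary_is_stale and the roughly six-month threshold are my own assumptions for the example and are not anything Raptor publishes; the two dates are the canary's last date and this post's date.

```python
from datetime import date

def canary_is_stale(last_update: date, today: date, max_age_days: int = 183) -> bool:
    """True if the canary has gone longer than max_age_days without an update.

    The 183-day default (about six months) is an assumed threshold.
    """
    return (today - last_update).days > max_age_days


last_canary = date(2019, 3, 3)    # last date shown on the canary
post_date = date(2019, 10, 20)    # date of this post
age_days = (post_date - last_canary).days

print(age_days, canary_is_stale(last_canary, post_date))
```

A stale date only tells you to start asking questions, which is exactly what happened here. It says nothing by itself about why the update was missed.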

Raptor, in the same thread, is asking users to speak up about whether the warrant canary is still useful. (They mention a cost/benefit ratio; I'm interested to hear what the cost is. Is it time, money, both?) Lest one think a smaller company could be pushed around more easily, I don't think size is really a factor here; in fact, I'd argue that a bigger company is even less likely to care about such things because of increased bureaucracy and potentially competing internal priorities over government contracts. I agree that their point that we really should get comfortable with rolling our own firmware is very well taken, but by the same token it's not necessarily a small task for an individual to audit Raptor's tree either. Particularly for critical or time-sensitive updates we will still have some level of vendor dependency and it would be nice to have the canary in those circumstances when using a pre-built firmware package becomes necessary, so put my vote down as "please keep it." We're using these machines for a reason, and the more failsafes there are, the better we're protected from Mayhem, like meow.

(*not sponsored or endorsed by Allstate)

Posted by ClassicHasClass on October 18, 2019

Ubuntu 19.10 available

Ubuntu 19.10 is now available with the vaguely unwieldy name "Eoan Ermine," based on kernel 5.3 and GNOME 3.34. An interesting improvement in this release is their expanded cross-compilation toolchain allowing building for s390x, ppc64le, RISC-V and ARM targets, which hopefully will expand the number of ports and pre-compiled packages on this platform; another interesting one is experimental ZFS on root support. Although an official desktop release of Ubuntu for ppc64le still doesn't exist, the release notes do say that "[t]he ppc64el [sic] ... live-server ISO images are now considered production ready and are the preferred media to install Ubuntu Server on bare metal" (excellent!), so download the server ISO, and then for your workstation you can convert it to desktop Ubuntu.

Posted by ClassicHasClass on October 10, 2019

librsvg2 issue on ppc64le

If you are using Fedora, keep an eye on bug 1756838 where an LLVM 8 codegen issue is suspected with ppc64le causing an apparently faulty assertion in librsvg2. Unfortunately, this library is heavily used by (at least) GNOME and Xfce, meaning the issue may well make your desktop environment unusable -- for example, my Blackbird with the faulty library couldn't open the Applications drawer without crashing gnome-shell. Unfortunately, reducing the codegen issue has not been trivial.

The faulty build is librsvg2-2.46.0-2. If you keep, or downgrade to, librsvg2-2.45.90-1, this version is unaffected because it was built with an earlier toolchain. At least for Fedora, there appear to be no ABI changes between 2.45.90 and 2.46.0 (thanks to Dan Horák for confirming this) and there are no known or at least visible security issues in the earlier version, so it is currently safe to stay there.

On Fedora, if you are on F30 and have not yet been affected, you may wish to consider putting exclude=librsvg2 into /etc/dnf/dnf.conf to inhibit updates to it until further notice. If you have been affected, you can attempt to downgrade to 2.45.90. Interestingly, on my (unaffected) Talos II, which has not updated to the bad version, the library file is already named for 2.46.0 even though it comes from the 2.45.90-1 package:

% strings /usr/lib64/librsvg-2.so.2.46.0 | fgrep 2.4 | grep fc
[...]
librsvg-2.so.2.46.0-2.45.90-1.fc30.ppc64le.debug
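As a rough sketch of the exclude step above, the following shell function appends the line to a dnf configuration file only if that exact line is not already there. The add_exclude name and the scratch-file demonstration are my own, purely for illustration; it also assumes the file contains only a [main] section, as a stock Fedora dnf.conf does, so that appending at the end is sufficient. To apply it for real you would run it as root against /etc/dnf/dnf.conf.

```shell
#!/bin/sh
# Hypothetical helper: add "exclude=<pkg>" to a dnf config file once.
# Assumes the file contains only a [main] section, so appending is enough.
# Only an exact "exclude=<pkg>" line is detected as already present.
add_exclude() {
    conf=$1
    pkg=$2
    if grep -q "^exclude=$pkg\$" "$conf" 2>/dev/null; then
        echo "$pkg is already excluded in $conf"
    else
        printf 'exclude=%s\n' "$pkg" >> "$conf"
        echo "added exclude=$pkg to $conf"
    fi
}

# Demonstrate on a scratch copy rather than the live /etc/dnf/dnf.conf.
scratch=$(mktemp)
printf '[main]\ngpgcheck=1\n' > "$scratch"
add_exclude "$scratch" librsvg2
add_exclude "$scratch" librsvg2   # second call leaves the file unchanged
cat "$scratch"
rm -f "$scratch"
```

Remember to remove the line again once a fixed build is available, or the package will never update.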

It is possible F31, which should arrive later this month, may smooth this over with LLVM 9, which doesn't appear to suffer from this problem.

This may not affect other distributions with older toolchains. If your distribution is also affected, please post in the comments.

Posted by ClassicHasClass on September 24, 2019

CentOS 8 and CentOS Stream (The Freshmaker)

Yeah, okay, we've had a lot to say about Red Hat derivatives lately. On the heels of CentOS 7's latest service release now comes CentOS 8 in a new ~~minty~~ flavour CentOS Stream, "a midstream distribution that provides a cleared-path for participation in creating the next version of RHEL," rebranding the "classic" CentOS build from RHEL as CentOS Linux. Mentally translating, the intention appears to be as a staging area for updates from Fedora mainline to trickle into minor releases of RHEL (and thence to mainline CentOS), using CentOS Stream to more gradually introduce updates and incorporate user feedback in a rolling release fashion rather than the typical all-at-once version churn that previously resulted. You know, like chewy candy mints that make things fresher the moment you pop one in your mouth.

That said, mainline Fedora is plenty stable for (my) daily use on POWER9 and elsewhere (we're not talking bleeding-edge saddle sore Rawhide, kids), and Fedora will still be the ultimate upstream, so while I think this will help CentOS developers dogfood changes more gradually I'm having difficulty envisioning the small slice of conservative-but-not-that-conservative users this will appeal to as a daily driver. More likely people will simply regard it as the "public beta" channel for CentOS and RHEL, and I think that will be the actual role it serves regardless of the frilly language.

The CentOS Download site is not yet showing Power ISA (or other AltArch) builds for either CentOS 8 or CentOS Stream, but I expect these to emerge soon. It will be interesting to see if big-endian ppc64 is still supported when they do, but there should be POWER9 and "generic" little-endian builds at minimum.

Posted by ClassicHasClass on September 22, 2019

Low-level change to Firefox 70 and ESR coming

If you are using Firefox on 64-bit Power, you'll want to know about bug 1576303 which will be landing soon on the beta and ESR68 trees to be incorporated into 70 and the next ESR respectively. This fixes a long-standing issue with intermittent and difficult to trace crashes (thanks to Ted Campbell at Mozilla for figuring out the root cause and Dan Horák for providing the hardware access) due to what in retrospect was a blatant violation of the ELF ABI in xpconnect, which glues JavaScript to native XPCOM. This needed several dodgy workarounds until we found the actual culprit.

The patch is well tested on multiple little-endian systems including this Talos II, but because it's an issue with register allocation in function calls the issue also theoretically affects big-endian Power even though we haven't seen any reports. I'm pretty sure the code I wrote will work for big-endian but none of my big-endian Power systems run mainline Firefox (and TenFourFox even on the G5 is 32-bit, where the problem isn't present). If you're using a big-endian system, you may want to pull a current release and make sure there is no regression in the browser with the changes; if there is and you can bisect to it, post in the bug so we can do a follow-up fix. On the other hand, if you're building from an old ESR such as 52 (the last non-Rust-required one), you may want to backport this fix because the problem has been there pretty much since it was first written.

Stuff like this actually proves Linus Torvalds' point that "as long as everybody does cross-development, the platform won't be all that stable." Linus was talking about ARM-based servers being undercut by a dearth of ARM-based PCs, but the point is also true here: 64-bit Power may do well in the data center but it was rarely used for workstations other than the Power Mac G5 and the small number of non-Apple PowerPC 970 towers, meaning this bug went undiscovered until people like us finally started dogfooding Power-based desktops again. (For that matter, the official PowerPC Mac OS X builds of Firefox were also always 32-bit, even on the G5, so no one would have noticed it there.) There's just no substitute for improving the quality and quantity of software for Power ISA like having one under your desk, and as the number of machines increases I expect we'll get more of these ugly corner bugs ironed out in other packages too.

Posted by ClassicHasClass on September 18, 2019

CentOS 7-1908 available

CentOS 7-1908 is now available; this is a maintenance release with multiple updated components derived from Red Hat Enterprise Linux 7.7. Particularly interesting is that there are no fewer than three Power ISA downloads available: one for big-endian ppc64 (though POWER7 and up only: sorry G5 owners), one for ppc64le and a special build for POWER9 (which appears to also be little-endian), each with its own Everything, NetInstall and Minimal flavours.

Posted by ClassicHasClass on September 16, 2019

Linux 5.3 for POWER, and ppc64le gets a Fedora Desktop

Linus always says that no Linux release is a feature release and numbers are purely bookkeeping instead of goalposts, but Linux 5.3 has landed. There are many changes for the x86 side of the fence that I won't mention here, but in platform-agnostic changes, 5.3 adds support for the AMD Navi GPU in amdgpu, allows loading of xz-compressed firmware files, further improves the situation with process ID reuse with additional expansions to pidfd (including polling support), refines the scheduler by supporting clamped processor clock ranges, and supports 0.0.0.0/8 as a valid IPv4 range, allowing another 16 million IPv4 addresses while IPv6 continues to not set the world on fire.
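The 16 million figure follows directly from the prefix length: a /8 leaves 32 - 8 = 24 bits for addresses within the block. A quick shell sketch of that arithmetic (the cidr_size name is just for illustration):

```shell
#!/bin/sh
# Number of addresses in an IPv4 block with the given prefix length.
cidr_size() {
    echo $(( 1 << (32 - $1) ))
}

cidr_size 8    # size of 0.0.0.0/8, i.e. 2^24
```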

Power ISA-specific changes in this release are relatively few but still noteworthy. Besides support for LZMA and LZO-compressed uImages, there is now Power ISA support for HAVE_ARCH_HUGE_VMAP, which enables (as the name would suggest) huge virtual memory mappings. With additional code in a future kernel, this should facilitate upcoming performance improvements. There is also additional /proc support for getting statistics on how virtual CPUs are dispatched to physical cores by systems using the Power hypervisor.

Meanwhile, this won't make much difference to people like me who have been using Fedora for a while, but if you want to experiment with other distros on your POWER9 system, Fedora is working on Live and Workstation ISOs for ppc64le. Currently this is Rawhide only (which is what will become F32) and you can of course already install from Server and switch to the Workstation flavour, or install over the network. However, it's just another positive indicator that IBM's purchase of Red Hat will continue facilitating improvements in Linux in general and Fedora/RHEL support for OpenPOWER in particular, especially as the installed base of POWER9 workstations like our T2s and Blackbirds continues to grow in numbers. In fact, although we don't have statistics, it's still quite possible (counting box for box) that there are now more discrete POWER9 workstations in operation out there than there are servers.

Posted by ClassicHasClass on September 14, 2019

A beginner's guide to hacking Microwatt

Many improvements have occurred in Microwatt, the little VHDL Power ISA softcore, so far the easiest way — particularly for us hobbyists — of getting an OpenPOWER core in hardware you can play with. (The logo is not an official logo for Microwatt, but I figured it would be fun to try my hand at one in Krita.) Even though it still has many known and acknowledged deficiencies it's actually pretty easy to get it up and running in simulation, and easier still on POWER9 hardware where the toolchain is already ready to go.

I'm no VHDL genius personally, but this seemed like as good a time as any to learn. ghdl is available for most distros, though Fedora 30 and earlier curiously lack it for ppc64le; fortunately, Dan Horák's builds work fine. So let's get the basics up. If you're on F30 as I am, install ghdl from his repo first. These URLs may vary; they are what was current at the time of this article.

% sudo dnf install https://copr-be.cloud.fedoraproject.org/results/sharkcz/danny/fedora-30-ppc64le/01028671-ghdl/ghdl-grt-0.37dev-1.20190820gitf977ba0.fc30.ppc64le.rpm
[...]
% sudo dnf install https://copr-be.cloud.fedoraproject.org/results/sharkcz/danny/fedora-30-ppc64le/01028671-ghdl/ghdl-0.37dev-1.20190820gitf977ba0.fc30.ppc64le.rpm
[...]

For Debian and other distros, install from your package manager as appropriate.

Next, let's install Microwatt and MicroPython and make sure all that works. This is essentially the same demo Anton showed at the OpenPOWER summit. If you are doing this on an inferior x86_64 system (or at least something that isn't POWER8 or POWER9), you will need to have a Power ISA C cross-compilation toolchain installed to properly build MicroPython. Adjust the make -jXX to your number of threads. This sequence of commands will end up with microwatt/ and micropython/ installed in separate directories at the same filesystem depth (in my case, ~/src). Keep it this way because we will be adding one more project at the end.

% git clone git://github.com/antonblanchard/microwatt.git
Cloning into 'microwatt'...
[...]
Resolving deltas: 100% (818/818), done.
% cd microwatt
% make -j24
ghdl -a --std=08 decode_types.vhdl
[...]
% cd ..
% git clone git://github.com/mikey/micropython.git
Cloning into 'micropython'...
[...]
Resolving deltas: 100% (52248/52248), done.
% cd micropython
% git checkout powerpc
Already on 'powerpc'
Your branch is up to date with 'origin/powerpc'.
% cd ports/powerpc
% make -j24
mkdir -p build/genhdr
[...]
MISC freezing bytecode
CC build/_frozen_mpy.c
LINK build/firmware.elf
[...]
% cd ../../../microwatt
% ln -s ../micropython/ports/powerpc/build/firmware.bin simple_ram_behavioural.bin
% ./core_tb > /dev/null
MicroPython v1.11-320-g7747411e9 on 2019-09-14; bare-metal with POWERPC
Type "help()" for more information.
>>> 1+2
3

The simulation is rather slow, made worse by all the copious debugging output (which here is sent to the bitbucket), but it does work as advertised. To make core_tb stop, you will probably need to kill it from another terminal session, depending on your shell. (I had to.)

Let's now turn to adding instructions to Microwatt. Having to manually kill the simulation is annoying, and it would be nice if the simulation could gracefully halt under program control, so we'll implement a wait instruction as an educational example. This instruction is new in ISA 3.0B; the ISA book explains that it "causes instruction fetching and execution to be suspended. Instruction fetching and execution are resumed when the events specified by the WC field [the wait condition, its sole constant parameter] occur." Strictly speaking, the stop instruction would probably have the most authentic semantics ("The thread is placed into power-saving mode and execution is stopped."), but for obvious reasons this is a privileged instruction, because it would completely halt that hardware thread until a system reset or other system-level event. Also, it doesn't take any parameters, so it's not as nice an illustration.

wait's sole supported WC field code is 0b00; this causes the instruction to "[r]esume instruction fetching and execution when an exception, an event-based branch exception, or a platform notify occurs." In practical circumstances, if you execute wait from a userspace program, these events happen all the time and the instruction seems like a no-op.

% more test.c
#include <stdio.h>
int main(int argc, char** argv) {
    __asm__("wait 0\n");
    fprintf(stderr, "ok\n");
    return 0;
}
% gcc -o test test.c
% ./test
ok

However, on a little core doing nothing else, it well might be a terminal instruction sequence, so since we can run it from userspace anyway let's go ahead and implement a hal-fassed version of it which will cause the simulation to conclude gracefully. This is the diff that does so, applied against ab34c483. Let's analyze it piece by piece.

First, let us note that there's already code in Microwatt for an ungraceful exit, such as when you execute an undefined instruction; this terminates with an error. We could simply use that, but I'd prefer to do something cleaner, so we'll define a new signal for halting.

Next, we will define the opcode format in the instruction decoder. Conveniently, the instructions td and tdi (trap doubleword and trap doubleword immediate, respectively) have a similar encoding where their common constant argument — the "TO" or trap operation bits — occupies the same bit field. (Note that td et al. allow five bits here but wait only takes the two least significant bits with the other three reserved. We will handwave this away since they are invariably encoded as zero.) To get these bits decoded for us, we specify that the first constant argument is encoded as TOO. You can see other encodings for registers and immediates in the surrounding templates.

Next, we tell Microwatt how to identify the opcode. The bit fields for the opcode pieces are simply cribbed from the ISA book.
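To make the bit fields concrete, here is a small shell sketch that assembles the 32-bit instruction word for wait by hand. It is not taken from the Microwatt diff. The field positions are assumptions from my reading of the ISA 3.0B X-form layout (primary opcode 31 in bits 0-5, WC in bits 9-10, extended opcode 30 in bits 21-30, using IBM's numbering where bit 0 is the most significant), so check them against the ISA book before relying on them; earlier ISA versions used a different extended opcode for wait.

```shell
#!/bin/sh
# Assemble "wait WC" as a 32-bit word. The field values are assumptions
# based on ISA 3.0B as described above, not copied from Microwatt's decoder.
PRIMARY=31      # bits 0-5   -> shift left by 26
XO=30           # bits 21-30 -> shift left by 1
encode_wait() {
    wc=$(( $1 & 3 ))    # WC is two bits wide (bits 9-10 -> shift left by 21)
    printf '%08x\n' $(( (PRIMARY << 26) | (wc << 21) | (XO << 1) ))
}

encode_wait 0
encode_wait 3
```

Since the TO field of td and tdi spans bits 6-10 and WC sits in the low two of those bits, a decoder that extracts the five-bit field sees the WC value directly as long as the three reserved bits are zero.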

Next, we add the actual symbols for the instruction and the operation, thus linking them up with the decoder.

Then, we write the operation's logic. For illustrative purposes, since only 0b00 is allowed and other bit combinations are reserved, we will have the simulation assert and ungracefully terminate on other values using the existing code. Otherwise, we set the halted signal.

Finally, we write the code to actually gracefully halt when the halted signal appears, using the built-in VHDL test bench function stop() (coincidentally named, as it happens).

With this patch applied, rebuild Microwatt with a make. To test it, we'll need something that actually executes this instruction, so let's make a simple "hello world" type example using pieces from MicroPython and Microwatt's own built-in "hello world." A small assembly language stub (in both of these examples, head.S) acts as a trampoline into whatever our main() is, detecting if we are running it within QEMU or from the VHDL test bench. However, we won't have a libc and we'll need routines to actually send and receive data with the "serial console" presented by the core. We also need a couple hints for the linker to make a binary we can actually run in the simulator.

I've compiled all of these pieces into a GitHub project "Microhello," which you can use as a scaffold for your own programs to run on the core. I've tried to make it a little more modularized than the Microwatt "Hello World" example as well. Clone it at the same depth as microwatt/ and micropython/, then do make runrun to replace the symbolic link to the MicroPython binary with Microhello:

% git clone git://github.com/classilla/microhello.git
[...]
% cd microhello
% make runrun
cc -I. -g -Wall -std=c99 -msoft-float -mno-string -mno-multiple -mno-vsx -mno-altivec -mlittle-endian -fno-stack-protector -mstrict-align -ffreestanding -Os -fdata-sections -ffunction-sections -c -o build/main.o main.c
cc -I. -g -Wall -std=c99 -msoft-float -mno-string -mno-multiple -mno-vsx -mno-altivec -mlittle-endian -fno-stack-protector -mstrict-align -ffreestanding -Os -fdata-sections -ffunction-sections -c -o build/uart_core.o uart_core.c
cc -I. -g -Wall -std=c99 -msoft-float -mno-string -mno-multiple -mno-vsx -mno-altivec -mlittle-endian -fno-stack-protector -mstrict-align -ffreestanding -Os -fdata-sections -ffunction-sections -c -o build/string.o string.c
cc head.S -c -o build/head.o
ld -N -T powerpc.lds -o build/firmware.elf build/main.o build/uart_core.o build/string.o build/head.o powerpc.lds
size build/firmware.elf
   text   data    bss    dec    hex filename
   6508      0     24   6532   1984 build/firmware.elf
objcopy -O binary build/firmware.elf build/firmware.bin
( cd ../microwatt && rm -f simple_ram_behavioural.bin )
/usr/bin/make run
make[1]: Entering directory '/home/censored/src/microhello'
( cd ../microwatt && \
ln -s ../microhello/build/firmware.bin simple_ram_behavioural.bin && \
./core_tb > /dev/null )
PowerPC to the People

We neatly came to a halt. Yay!

The serial console library is in uart_core.c and a basic implementation of puts() (and strlen()) is in string.c. The main() is very simple. Minus the comments, here is main.c in its entirety:

#include "uart_core.h"
#include "string.h"
int main(int argc, char** argv) {
    uart_init_ppc(argc);
    puts("PowerPC to the People");
    __asm__("wait 0\n");
    return 0;
}

The trampoline uses the start of execution to determine what mode to initialize the serial console in, passing that to main() in r3, which in the Power ABI is the first argument to the function (argc). We then puts() the string and execute a wait 0 to terminate. Easy.

To prove the argument is being evaluated, change the instruction to wait 3 and re-run with make runrun. Notice how it terminates:

PowerPC to the People
make[1]: *** [Makefile:16: run] Error 1

If you run ./core_tb (in the microwatt/ directory) without sending the output to /dev/null, you will see the message from our implementation in the log with the invalid wait condition.

Lastly, if you remove the wait instruction entirely and re-run with make runrun, then the test bench will loop forever echoing our string repeatedly, bouncing in and out of our code on the trampoline, until you kill it.

Microwatt is fun, simple, easy to experiment with and a great way to better understand what Power ISA does under the hood. While its performance is no barnburner, as a pedagogical aid it's a great little proof of concept, and it can certainly be the basis for something bigger. In a future article we'll actually synthesize this core and do a little more with it in actual hardware.

Posted by ClassicHasClass on September 06, 2019

Firefox 69 on POWER

A brief note to say so far no major issues with Firefox 69 on Power ISA and this post is being made from it on my T2. (We're still dealing with bug 1576303 for Firefox 70, however.) As with Fx68, the working build configurations for ppc64le are unchanged from Fx67.

Posted by ClassicHasClass on August 30, 2019

The VMX eagle is landing in Firefox 70 (plus: which core should open the door?)

Hugo Landau has an interesting take on the new open-OpenPOWER world. He points out, correctly, that Power ISA is a big win for open architectures because it has maturity in both the embedded and server spaces, but he'd like to see an actual production core opened as well (Microwatt is a lovely MVP and a great proof of concept but it is clearly for experimentation, not for production).

His suggestion is a softcore version of the PPC 405. PowerPC 4xx is a very common embedded CPU family indeed (the POWER8 OCC even has one inside of it), and in the Power.org days IBM was even willing to make it available to academia and researchers. He also suggests open-sourcing Mambo, IBM's currently proprietary simulator.

Open-sourcing Mambo is especially appealing to me trying to do simulation work of my own and not being able to do it on a POWER9! (It claims there is a POWER9 version for Debian, but the install directions and download area strictly show x86_64.) I also think there would be little non-IBM IP to stand in the way of doing so. On the other hand, although opening up the 405 would be admirable, I'm not sure how much it would accomplish in practice: it's 32-bit, not 64-bit; it's strictly big-endian (I like big-endian personally and three of the five systems on this KVM are big-endian, but we all know where the market's going and OpenPOWER in particular emphasizes LE); and it lacks VMX, a/k/a AltiVec. That brings us to Firefox.

In the TenFourFox world myself and several contributors did a fair bit of work on AltiVec acceleration to beef up performance on G4 and G5 systems. (Editorial note: I only use the term AltiVec for Apple systems and chips made by Motorola/Freescale, since Apple used both Motorola/Freescale and IBM parts, and Motorola/Freescale (now NXP) owned the trademark. IBM never owned nor licensed this trademark and always called it VMX, so in OpenPOWER, it's VMX. For that matter neither did P.A. Semi, so the PA6T has VMX too. Even with the G5, although its vector unit was popularly called AltiVec, IBM never officially referred to it by that name.) There are also many opportunities for VMX acceleration in mainline Firefox; depending on your compiler settings, these might get silently enabled already (such as qcms). libpng even has support for VSX. However, many in-tree components either never had the build-system glue written to turn on VMX support (libjpeg, libpng) or they're based on custom SIMD code Mozilla wrote that has no Power ISA equivalent.

For Firefox 70, build system support for VMX, VSX and VSX-3 compiler flags plus runtime detection is now available, written by yours truly, along with the first of the TenFourFox patches I updated and upstreamed to mainline Firefox (this one for fast scanning of text fragments for wide characters). I'm also hoping the libjpeg VMX enablement patch lands in time for merge with several more VMX patches to come. My work on the Firefox Power JIT is somewhat slowed by my continuing responsibilities to TenFourFox and compiler issues such as bug 1576303, which is why I wanted to get a couple quick wins with VMX stuff I already had on the shelf.

Allow me to close the loop on our core digression, though. In bug 817058 I'm asked a question by one of the Mozilla devs: can they just assume every Power chip someone is running Firefox on would support VMX? The answer, even for 64-bit Power, is no, because of poor choices like the AmigaOne X5000 running the QorIQ P5020 which has no SIMD. However, Rust supports compiling for SIMD and Power is a supported architecture, which means Rust supports VMX too, and Mozilla would be foolish not to take advantage of that. Assuming SIMD features are "just present" will become increasingly common and that means that continuing to run parts that don't have VMX (let alone VSX) will become an even bigger losing game on the desktop than it already was. Rather than the 405 I'd personally like to see something like the G5 itself be made openly available: it's POWER4, so it's 64-bit and largely upwardly compatible but wouldn't be a commercially competitive product at the high tiers IBM cares about, it has a VMX unit but it's IBM's (co-developed from the G4/7400), and it's fairly well-understood. Downclock it to reduce power consumption and it could even be a credible upper-end embedded chip. The only thing it lacks is a true little-endian mode.

More on Firefox 69 when it is officially released next week.

Posted by ClassicHasClass on August 28, 2019

Support from a silicon turnip

Raptor is asking users to do a brief checklist before asking for support to help streamline problem determination. I think this just makes good sense, but although it's reasonable to assume most users would have another system around that can talk to the BMC (I use my Quad G5 for this), it might be nice in a future firmware version to have some sort of confidence testing. I'm not sure how that would look necessarily in implementation but I know when I was trying to determine why my kernel was freaking out that eliminating hardware as a cause, however unlikely, would have been helpful.

Ordinarily this would merit merely a brief informational item, except let's consider it in the context of my earlier underdeveloped pontification: Raptor must now have enough of an installed base that streamlining support is now necessary. I'm an early adopter; my Talos II is serial #12 and my Blackbird is serial #75. Back in those distant bygone days of 2018 with the early firmware that ran like a wind tunnel, I pretty much conversed with support directly over E-mail (I suspect it was Tim himself) and handled everything that way, but that clearly wouldn't scale beyond a certain number of even technically adept users. (I did comment at the time that it was the best support I'd ever had with any computer system and I still think so.)

We don't know how many Talos-family systems are out there, and Raptor is not a public company, so sales figures are kept close to the vest. I don't really begrudge them this, either, because pro-Power bigots like me would still use the platform even if we were the only ones out there and haters gonna hate whether there's 10 or 10 million. (However, if people want to post serial numbers in the comments, we can find the highest one and make an educated guess.) I think we can safely assume the support volume is not being driven by poor quality, so if support volume has increased to a critical mass where changes must be made, then that must be due to enough machines out there actually being used. And as I said in the prior article, moving enough machines is the only sane way to get the cost down. I stand by my musings that a good second workstation-market supplier could have advantages for both volume and market stability, but we also don't want some knockoff company that isn't beholden to open libre computing principles sucking the life from this segment with a race to the bottom.

Having said that: I hear things and people tell me stuff, and while I'm sworn to secrecy right now, I am permitted to say obliquely that a promising development in moving more machines is afoot. I think that's as much as I should say on the subject but if it pans out, I think all of us in the OpenPOWER world will be very, very pleased.

Posted by ClassicHasClass on August 25, 2019

Blood from a silicon turnip

Now that we all have the hangover from hell after the big OpenPOWER-is-open party and are sitting around nursing headaches and sipping raw eggs from brandy snifters, let's talk about squeezing blood out of silicon turnips.

In general my cursory view of the Internet demonstrates two, maybe three, reactions to the OpenPOWER announcement:

"Hey, cool!" (or, less commonly but frequently enough to be obnoxious, "Didn't PowerPC die years ago?")

and

"It's too expensive."

Uniformly these two statements are being said by individual developers talking about getting one of their own systems, at least publicly, anyway (enterprise customers may also be complaining but I haven't seen very much in the places that are publicly visible). For the sake of the discussion let's ignore both the fact that people who skimp on privacy and owner control for a cheaper system are slowly boiling themselves alive in their own cauldrons, and the fact that you can go get a (often substantially) cheaper Intel or AMD system and have similar performance if not better because the CPU optimizations already exist.

The problem really isn't the CPUs. You could cheap out and buy some lower binned consumer part, and I'm sure some of you are very happy with those, but realistically POWER9 is meant to compete against server-grade tiers. At the Cascade Lake level, Intel's most similar 16-thread part is the 8-core Xeon Silver 4209T, with 11MB of L3 and clocked from 2.2 to 3.2GHz for $500 MSRP, or you can go Coffee Lake and get its 8-core/16-thread part, clocked between 3.7 and 5.0GHz and with 16MB of L3 as the E-2288G for about $540 MSRP as of this writing, though the E-2288G also has a GPU. AMD has a 16-thread Rome Epyc (the 7232P) with clocks from 3.1 to 3.2GHz and 32MB of L3 for about $450 MSRP. I think we can agreeably stipulate that both of those are ballpark comparable with a Sforza 4-core POWER9, also 16 threads, with 40MB L3 (10MB per core, unpaired); Raptor is the only retail source for this right now and they sell a 3.2-3.8GHz clocked part (CP9M01) for about $440.

As for a 32-thread Xeon, Intel doesn't sell a 32-thread Coffee Lake. You'll have to buy Cascade Lake, and your closest option is the 16-core/32-thread Xeon Silver 4216, also 2.1-3.2GHz, with 22MB of L3 for $1000. AMD offers the 16-core/32-thread 7302P, 3-3.3GHz and 128MB of L3, for $825 MSRP. Raptor, again, is the only retail source for the 8-core/32-thread POWER9 and they sell a 3.45-3.8GHz clocked part with 80MB L3 (CP9M02) for $690. In fact, let's be ridiculous and comparison-price the 22-core, 88-thread monster. Raptor sells this 2.75-3.8GHz part with 220MB L3 (CP9M08) for $2800. Coffee Lake, sir? Sorry, sir. Intel does list a Cascade Lake Xeon Platinum 9242 with 48 cores, 96 threads, 71.5MB of L3 and clocks from 2.3 to 3.8GHz, but the MSRP for such systems is atrociously high (estimated north of $25,000). The closest Epyc is probably the 48-core/96-thread 2.2-3.3GHz 7552 with 192MB L3; even that will set you back $4025.
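One way to put the list prices above on a common footing is dollars per hardware thread. The sketch below does that in plain shell using only the figures quoted in this post (the Xeon Platinum is left out because its price is just an estimate). The per_thread helper is my own and truncates to whole cents rather than rounding. Per-thread cost ignores clock speed, cache and per-thread performance, so treat it as a rough yardstick and nothing more.

```shell
#!/bin/sh
# Dollars per hardware thread, truncated (not rounded) to whole cents.
per_thread() {
    p=$1
    t=$2
    cents=$(( p * 100 / t ))
    printf '%d.%02d\n' $(( cents / 100 )) $(( cents % 100 ))
}

# Entries are name:price:threads, using the prices quoted in the text.
for part in \
    "Xeon Silver 4209T:500:16" "Xeon E-2288G:540:16" \
    "Epyc 7232P:450:16" "POWER9 CP9M01:440:16" \
    "Xeon Silver 4216:1000:32" "Epyc 7302P:825:32" \
    "POWER9 CP9M02:690:32" "Epyc 7552:4025:96" \
    "POWER9 CP9M08:2800:88"
do
    name=${part%%:*}        # text before the first colon
    rest=${part#*:}         # price:threads
    price=${rest%%:*}
    threads=${rest#*:}
    printf '%-20s $%s per thread\n' "$name" "$(per_thread "$price" "$threads")"
done
```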

Not only can we conclude that POWER9 CPUs are reasonably priced, but I think there's also a credible argument that they're competitively priced. There's a reason for this: Raptor doesn't make them. They're shipped in from IBM's supply chain (presumably from GlobalFoundries) and IBM not only has them made in volume, but higher-cored parts where not all the cores are working can be binned lower for this market and increase overall yield, thus improving the economy of scale.

All right, so what about the logic boards? A quick survey of LGA 1151 (Coffee Lake) server-grade boards on Newegg averaged around $250 and LGA 3647 (Cascade Lake Gold/Silver) around $400, with varying numbers of expansion and RAM slots, though I have no idea what a BGA 5903 board for that Xeon Platinum part would run. SP3-socket boards for the Epyc look comparable. Meanwhile, the cheapest Raptor motherboard (as an item) is the basic Blackbird starting at $1100. Is this justified?

As it happens, we actually do have other PowerPC small-volume systems to compare against. They're called Amigas, or at least the AmigaOne. Even as an Amigaphile I have never been shy about voicing my displeasure with their running embedded parts as entire systems and the P5020 they're using in the current X5000 is basically at a G5 level of performance (until you factor in its loss of AltiVec, and then the G5 stomps it on such tasks), but they're out there and you can buy one. At £1800 from AmigaKit, that's about US$2200 right now prior to Brexit, plus shipping. It includes the case, Radeon GPU, 2GB RAM, optical and spinning disk, CPU and board. Ignoring the obvious performance differences, I paid about $2100 for my 4-core Blackbird system with everything there minus the GPU, but more RAM and an SSD. (By the way, the parents were visiting not too long ago and we watched Glove and Boots videos from YouTube on the home theatre with it. Worked fine. I may not install a GPU in it after all.)

We can extrapolate prosumer pricing too. While I couldn't find my Quad G5's original sales receipt, I seem to recall that I paid around $3600 in 2006 for it (4 cores, no SMT), 4GB RAM, a hard disk and an ATI 7800GT video card. That's about $4500 in current money for a system that was not massively high volume, but not particularly niche, and could be readily bought at the consumer level. IBM provided the chips for that too. Currently the Talos II with a single-4 (16 threads), 16GB of RAM, 500GB NVMe and a WX7100 is selling for $6500, despite selling in much smaller volumes than the Quad.

This is to say that Raptor's pricing is by no means out of whack for boutique low-volume sales, and again, arguably even competitive. Let's remember that first and foremost Raptor is a small company. Many of the people coming new to the platform don't remember the original POWER8 Talos crowdsourcing attempt, but I do, because I was one of the people who had my money in. They needed about $3.7 million to do the job and unsurprisingly that ran aground as you might recall, but Raptor refunded people's money and this went a long way toward establishing their trustworthiness. As such, I imagine there was no small amount of internal investment required to launch the Talos II (which I was delighted to preorder as soon as I could do so). Even though the T2 (and the T2 Lite and Blackbird, by extension) is strongly based on the Romulus reference platform, that doesn't mean there wasn't any R&D required on their part, and there are still manufacturing, QA and support costs as well as the need to actually turn a profit. I mean, seriously, some of you actually seem to expect Raptor to sell these things at a loss. How long do you think they'd stay in business?

Now, with all that said, none of you who have bought one (or several) of these systems will need any convincing that the price is worth it, and it won't convince those of you who have heard these arguments before and discount them. This is a fair criticism because frankly there's no getting around the sticker price, even if I think I've made the case that Raptor cannot easily make it cheaper. So how will they ever get cheaper?

Raptor has a more or less natural monopoly on the OpenPOWER workstation market. Don't get me wrong: I am not accusing them of gouging. As monopolies go, this is about as benign as you can get because not only are they good stewards of the ecosystem but frankly they were simply the first ones in the pool. Look at the OpenPOWER membership list. Do you see anyone else catering to workstation users? (I nearly choked on my Mr Pibb when they talked about Raptor's "low end" systems at the Summit. This is, of course, purely by comparison.) There are some people running some of the Tyan POWER8 systems as workstations but they are clearly not designed as such, and the effect is not unlike running an Xserve G5 instead of the regular Power Mac tower. My POWER6 may be a "tower" system but I sure wouldn't want it under the desk. No one else makes OpenPOWER workstations. No one is even talking about it.

Raptor management may not like me saying this, but this is an independent blog, and if the OpenPOWER workstation market is going to grow and stabilize then there's going to have to be someone else. It's not a situation like the Mac clones where all Umax and Power Computing did was eat Apple's lunch (which is why Steve Jobs canned the whole thing), because Apple was big enough to saturate the market such that anyone who wanted a Mac had one and thus all the clones did was steal sales. By contrast Raptor is not big enough to saturate the OpenPOWER workstation market because they can't move enough units: there is pent-up demand waiting for the price to come down, and they can get backordered even on the systems that people do purchase. Yes, I'm hopeful that an open ISA will lead to new and more exciting chip designs, but as far as the actual cost of the chips themselves, the "big reveal" probably changes the retail cost of the actual CPUs very little if any because they were never priced out of the market to begin with. Where we need improvement is in the cost of the actual systems so that people can get them and there can be more of them. And Raptor cannot do this by themselves.

I like Raptor because I like their people, I like their products and I like the way they do business. If someone else entered this space I would probably still buy from them. But someone else in that space also means new ways of looking at the market, presumably newer niches to distinguish themselves, and hopefully more investor interest in the sector to increase the available capital needed to enable volume production and sales in a way that would actually then start lowering prices.

Plus, more players in the workstation market also means market resiliency. Raptor seems to be a stable company, but what if they weren't? What would we do if they had to close their doors? CPU manufacturers back in the dark ages (around 1978) had to have second sources to get design wins. We need second sources to make the market survive a loss of the primary manufacturer, however unlikely that would be, because their exit with no replacement would doom the Talos family to being a modern Power Mac G5: a dead end.

In the meantime, we want Raptor to do well and their success to attract other players to this market, because right now they're the only (though best) game in town. The cost for the CPU is reasonable, and criticism of their logic board pricing is unjustified, especially as you reach higher core counts (the same Talos board takes a single-4 or a dual-22). If people believe that a non-x86 system is valuable to have, if people (also) believe that supporting an open architecture is valuable to do, and if people (also) believe that a truly owner-controlled workstation is valuable to use, then people need to understand where the market is now and put their money where their mouths are. Best of all, you're not buying an underpowered conversation piece; you're getting a competitive system you can actually use. I don't see how we get much more blood out of that silicon turnip otherwise.

Posted by ClassicHasClass on August 20, 2019

Day 2 keynote and OpenPOWER blows the doors off: royalty-free, open soft-core (RISC-V sweating gallons)

Holy monkeys of Mars. What a morning at the OpenPOWER Summit Keynote (Day 2)! I swear I'm not paid to write this stuff except for the trivial pittance from ads that goes to maintain the domain name (I'm writing this on my lunch break!). I'm just an old-timer Power ISA bigot who's finally seeing the faith pay off. And boy howdy did it.

Let's hit the big news right now. A reasonable criticism I hear of the OpenPOWER movement is that the ISA isn't, or at least wasn't (oops, spoiler), the open part. This is something that RISC-V in particular could claim superiority on. Somebody at IBM was listening, because today Ken King, general manager of OpenPOWER at IBM, announced "we are licensing [the ISA] to the OpenPOWER Foundation so that anyone can implement on top of it royalty-free with patent rights" (emphasis mine). That's a quote right off the livestream. ISA changes will be "done through the community" with "an open governance model" and a majority vote for ISA expansions and changes.

Let me spell out what this means: you, yes, you, can go out and make your own Power ISA chip and not have to pay IBM. OpenPOWER is now truly open.

The other surprise wasn't OpenCAPI; the announcement that it and the Open Memory Interface are moving into the OpenCAPI Consortium is welcome, but expected. The other big news is that the OpenPOWER Foundation is moving into the Linux Foundation. There were already close ties between them before but now the OpenPOWER Foundation will be a component of it, albeit still with its own board, governance structure and decision making.

This announcement was definitely not all talk, because they also introduced Microwatt: a Power ISA soft core. Yes! You can drop it in your design as soon as they upload it!

Anton Blanchard from IBM OzLabs in Canberra announced this one, which was actually demonstrated at the show. Now, this is a very basic core: it's single issue in-order (so your old clamshell blueberry iBook will thrash this), and it doesn't even have hardware divide or cache support yet, though this is planned. In fact, the gcc they used was even hacked to not issue divide instructions. But the darn thing actually works. Here's the super-polished block diagram:

MicroPython is provided, so you can drop this into your design and then talk to it. Here it is in the simulator (which took a couple seconds to compute the answer):

On real hardware it is definitely quicker. Here's the core running on an old Xilinx Artix-7 he found doing nothing in the office computing the Fibonacci sequence:

Xilinx was on stage as a sort of sponsor thing, naturally, so they also gave Anton an Alveo to try this on. They crammed forty cores onto it, and then made it say "Hello World" over and over, because that's exactly what I would do with an expensive programmable piece of hardware. (This is where the name "Microwatt" is kind of crummy, because saying "40 microwatts on an Alveo" sounds like a power consumption benchmark.)

The repo as of this writing is not yet live on Github, but should be within the next day or so.
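
For a sense of what living without a hardware divide means, here is a minimal sketch of unsigned division done with nothing but shifts, compares and subtracts (the classic restoring method). To be clear, this is only an illustration: I don't know what the hacked gcc or Microwatt's software actually does in place of a divide instruction, and soft_divmod and its bits parameter are names invented for this example.

```python
def soft_divmod(dividend, divisor, bits=64):
    """Unsigned restoring division using only shifts, compares and subtracts.

    Hypothetical illustration, not Microwatt's or gcc's actual routine.
    Returns (quotient, remainder) for operands that fit in `bits` bits.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    limit = 1 << bits
    if not (0 <= dividend < limit) or not (0 < divisor < limit):
        raise ValueError("operands must be unsigned and fit in the given width")

    quotient = 0
    remainder = 0
    # Walk the dividend from its most significant bit down to bit 0.
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << i
    return quotient, remainder


print(soft_divmod(100, 7))
```

That's one loop iteration per bit of the operand, which gives you an idea of why a real divide unit is on the to-do list.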

I'm giving Anton a hard time here because his segment actually was the part of today's keynote that impressed me most. Microwatt is real and tangible and you can work on it, and it can scale from hobbyist to enterprise. This is what really put the "open" into OpenPOWER and I was so delighted to see it run.

I will say I see perhaps a little worry from IBM that RISC-V is going to steal the initiative and momentum, and this move (and the open soft core) is their attempt to recapture the vanguard. RISC-V people should actually be happy about this move: at minimum it means they're being taken seriously at the corporate level, it gets more people thinking about open architectures, and the more truly open architectures out there, the more viable and expected the concept becomes. OpenPOWER is the biggest fish in this sea and (with my bias showing) the most powerful, the most ready for migration and the most well-rounded of all of them, but with more water in the pool everyone can swim farther.

After all of that the rest of it was comparatively pedestrian. Red Hat was also there; Michael Cunningham gave a speech which was largely corporate happy talk, but I think he meant it, and I'm hopeful the big blue and little red merger will generate something of the same rich burgundy shade of my SGI Indigo2. Facebook was there too but their presentation was cloyingly light on tech and heavy on smarm, and I think Facebook is ruining the Internet and the psyche of all who touch it, so that's all I'm going to say about that.

The panel at the end was asked to react to the news, which was a little silly, because what else were they going to say? On stage were Derek Chiou, partner system architect at Microsoft and associate professor at UT-Austin; Alan Clark, CTO for SUSE; Tim Pearson, CTO for Raptor; Bapi Vinnakota, engineer from Netronome; Steve Hebert, CEO for Nimbix and Peter Rutten, research director within IDC's Enterprise Infrastructure Practice. They all thought it was cool, because it is cool.

Microsoft was an interesting choice, but Dr. Chiou was complimentary, saying, "we're very supportive of the open source ... Microsoft sees that's where things are going." He also observed, to my interest, that "the interconnect is more important than the ISA." I'm not sure how true that is but I do agree with him that the ability to openly connect is certainly something that's been overlooked, and we need open tooling to make all of this possible. However, the best panel quote was this one, name censored to protect the innocent: "I'm a pretty incompetent developer, so ... [pauses] Python." Yep. Python definitely is the language of incompetent developers. :D (Hey, I got honourable mention in the obfuscated Perl contest one year! I couldn't resist.)

Tim put it best, though, when he said that "it's going to allow people to trust their computers again." That's why we're using OpenPOWER hardware in the first place. Mendy Furmanek, president of the OpenPOWER Foundation, closed up and said that "Christmas has to end sometime," but we got a whopper of a present today. The party's about to get started and IBM deserves all the credit for a move that really is courageous.

Read yesterday's Day 1 coverage for more if you haven't already.