Posts

Showing posts from September, 2020

FreeBSD swings both ways


They say there's an xkcd for everything, but me, I say it's Friends GIFs. Anyway, hat tip to developer Piotr Kubaj who reports that, if you don't like big endian and cannot lie, FreeBSD's got you covered with a new little endian ppc64le port to complement the existing (and by now practically mature) big endian ppc64 flavour.

Raptor themselves actually give material support to the project by providing a remote instance for development, powering a build server that continuously runs poudriere bulk -a to test ports. Plus, looking in the source tree, the commits to add little-endian support are all tagged as "Sponsored by: Tag1 Consulting, Inc." This company apparently has OpenPOWER alumni from the Oregon State University Open Source Lab (.pdf). It's nice to see the cross-pollination at work!

Although there are no .iso images yet, they should start appearing with the -CURRENT snapshots next week. Note that official ports support doesn't exist yet either, so you'll need to compile packages on your own for the moment, and there are other minor to moderate deficiencies relative to the big-endian port which are still being rectified. Still, choice is a good thing, especially since per Piotr there are no plans to decommission the big-endian port and both will coexist. How's that for playing on both teams?

Firefox 81 on POWER


Firefox 81 is released. In addition to new themes of dubious colour coordination, media controls now move to keyboards and supported headsets, the built-in JavaScript PDF viewer now supports forms (if we ever get a JIT going this will work a lot better), and there are relatively few developer-relevant changes.

This release heralds the first official change in our standard POWER9 .mozconfig since Fx67. Link-time optimization continues to work well (and in 81 the LTO-enhanced build I'm using now benches about 6% faster than standard -O3 -mcpu=power9), so I'm now making it a standard part of my regular builds with a minor tweak we have to make due to bug 1644409. Build time still about doubles on this dual-8 Talos II and it peaks out at almost 84% of its 64GB RAM during LTO, but the result is worth it.

Unfortunately PGO (profile-guided optimization) still doesn't work right, probably due to bug 1601903. The build system does appear to generate a profile properly, i.e., a controlled browser instance pops up, runs some JavaScript code, does some browser operations and so forth, and I see gcc created .gcda files with all the proper count information, but then the build system can't seem to find them to actually tune the executable. This needs a little more hacking which I might work on as I have free time™. I'd also like to eliminate ac_add_options --disable-release as I suspect it is no longer necessary but I need to do some more thorough testing first.
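As a rough way to double-check the "profile was generated" half of that observation, here is a minimal sketch that walks a build directory and counts the .gcda files gcc left behind. The directory layout and file names in the demo are hypothetical stand-ins, and treating zero-byte files as suspect is only an assumed heuristic; it does not diagnose why the build system fails to pick the profiles up.

```python
import os
import tempfile

def find_gcda(objdir):
    """Return a sorted list of .gcda paths found anywhere under objdir."""
    hits = []
    for root, _dirs, files in os.walk(objdir):
        for name in files:
            if name.endswith(".gcda"):
                hits.append(os.path.join(root, name))
    return sorted(hits)

def summarize(paths):
    """Count profile files and how many of them are zero bytes long."""
    empty = sum(1 for p in paths if os.path.getsize(p) == 0)
    return {"total": len(paths), "empty": empty}

# Demo against a throwaway tree standing in for a real object directory.
# The file names below are made up for illustration.
with tempfile.TemporaryDirectory() as demo:
    src = os.path.join(demo, "js", "src")
    os.makedirs(src)
    with open(os.path.join(src, "jsapi.gcda"), "wb") as f:
        f.write(b"\x00" * 16)                              # non-empty profile
    open(os.path.join(src, "empty.gcda"), "wb").close()    # empty profile
    open(os.path.join(src, "jsapi.o"), "wb").close()       # not a profile
    demo_summary = summarize(find_gcda(demo))

print(demo_summary)
```

Pointing `find_gcda` at a real object directory after the profiling run would show where the profiles actually landed, which is the first thing to compare against wherever the second build pass is looking.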

In any event, reliable LTO at least with the current Fedora 32 toolchain is still continuous progress. I've heard concerns that some distributions are not making functional builds of Firefox for ppc64le (let alone ppc64, which has its own problems), though Fedora is not one of them. Still, if you have issues with your distribution's build and you are not able to build it for yourself, if there is interest I may put up a repo or a download spot for the binaries I use since I consider them reliable. Without further ado, here are the current .mozconfigs that I attest as functional.

Optimized Configuration

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9"
ac_add_options --disable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full

#export GN=/uncomment/and/set/path/if/you/haz
export RUSTC_OPT_LEVEL=2

Debug Configuration

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9"
ac_add_options --enable-debug
ac_add_options --disable-release
ac_add_options --enable-linker=bfd

#export GN=/uncomment/and/set/path/if/you/haz
export RUSTC_OPT_LEVEL=0
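
The two configurations differ in only a handful of lines. A small sketch that makes the difference explicit, treating each mozconfig as a set of non-comment lines (the helper name `settings` is just for illustration):

```python
OPTIMIZED = """\
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9"
ac_add_options --disable-release
ac_add_options --enable-linker=bfd
ac_add_options --enable-lto=full

#export GN=/uncomment/and/set/path/if/you/haz
export RUSTC_OPT_LEVEL=2
"""

DEBUG = """\
export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9"
ac_add_options --enable-debug
ac_add_options --disable-release
ac_add_options --enable-linker=bfd

#export GN=/uncomment/and/set/path/if/you/haz
export RUSTC_OPT_LEVEL=0
"""

def settings(text):
    """Return the set of non-blank, non-comment lines in a mozconfig."""
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }

# Lines present in one configuration but not the other.
only_optimized = sorted(settings(OPTIMIZED) - settings(DEBUG))
only_debug = sorted(settings(DEBUG) - settings(OPTIMIZED))

print("Only in optimized:")
for line in only_optimized:
    print("  " + line)
print("Only in debug:")
for line in only_debug:
    print("  " + line)
```

The differences come down to the optimization level, LTO, the debug switch and the Rust optimization level; everything else is shared.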

The first production RISC-V workstation?


No, not the RiscPC, a RISC-V PC. And, not counting the various one-offs, it appears to be the very first production RISC-V workstation available. SiFive is announcing the RISC-V PC at the Linley Group Fall Virtual Processor Conference, based on the Freedom U740 ("FU740") to be introduced at the same time next month.

Precious few details are available, such as loadout, options, availability and most of all cost, but when has that stopped us from idly speculating before, eh? It is virtually certain the machine will be composed largely of off-the-shelf components other than the CPU, which is the real mystery of interest. The FU740 appears to be an evolution of the FU540, which is a 64-bit 1.5GHz+ part with four U54 "big" cores combined with one S1-series "little" core and 2MB of L2 cache on a 28nm process. Plainly, neither of these cores is even remotely in the ballpark with OpenPOWER: SiFive quotes CoreMark/MHz scores of 3.01 for both the U54 and S54, whereas the POWER9 easily achieves over 160. While the FU740 will almost certainly be faster due to its probable basis on the U74, it is difficult to imagine that the performance gulf will be narrowed significantly (the U74 edges up to around 5). You should not buy one and expect it to compare favourably with x86 or a Raptor system.
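
For a sense of scale, here is a back-of-the-envelope sketch that turns the quoted SiFive CoreMark/MHz figures into per-core estimates. It assumes the FU740 runs at the same 1.5GHz quoted for the FU540 (unconfirmed) and uses 5.0 as a stand-in for "around 5"; treat the results as illustrative, not measured.

```python
def coremark_estimate(per_mhz, mhz):
    """Scale a CoreMark/MHz figure to an estimated per-core CoreMark score."""
    return per_mhz * mhz

U54_PER_MHZ = 3.01   # quoted by SiFive for the U54
U74_PER_MHZ = 5.0    # assumption: stand-in for "around 5"
CLOCK_MHZ = 1500     # 1.5GHz; assumed unchanged for the FU740

u54_score = coremark_estimate(U54_PER_MHZ, CLOCK_MHZ)
u74_score = coremark_estimate(U74_PER_MHZ, CLOCK_MHZ)
uplift = u74_score / u54_score   # per-core gain at equal clocks

print(f"U54 estimate: {u54_score:.0f}")
print(f"U74 estimate: {u74_score:.0f}")
print(f"Uplift: {uplift:.2f}x")
```

At equal clocks the uplift is simply the ratio of the two CoreMark/MHz figures, roughly two-thirds more per core, which is a real improvement but not the kind that closes a large gap.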

On the other hand, there's a good chance this will be another truly open system based on the fact that the Freedom E300 and U500 series are open source under the Apache license. While some parts of SiFive are proprietary, this line is not, and we presume that the U700 series will be likewise. RISC-V still lacks firm specs for vector and bit manipulation instructions, and this certainly hurts them for desktop and mobile applications, but this is a known deficiency and is being worked on. Assuming no shenanigans with the firmware, there's encouraging potential even in this early form.

I'm unambiguously on Team Power because of my long history with the architecture, but this blog is certainly interested in all kinds of free vendor-unencumbered computing, and this machine may well represent another such system. And it's newsworthy as the first RISC-V system that's at least workstation form factor even if its likely performance doesn't currently make it a credible daily driver. But maybe that's not the point: the point is to get developers on the architecture in a way that's bigger than an evaluation board (cf. Linus Torvalds and ARM), meaning it doesn't have to be their only daily driver; it just has to "be there" so people think about it. More on cost and specs and "how open is it" when we actually see it in October.

Moar OpenPOWER cores plz


More news from virtual OpenPOWER Summit 2020: I mentioned it would be interesting to see what other cores would pop up on the OpenPOWER GitHub and indeed, following on from the PowerPC A2I, comes another A2 variant, the PowerPC A2O.

Announced today by IBM and released under the standard OpenPOWER license, the A2O is an evolved 64-bit PowerPC A2 compliant with ISA 2.07, comparable to POWER8 (the A2I was 2.06) under the embedded-focused Book III-E, and can run either big or little endian. At 45nm it was intended for 3GHz+ speeds; at 7nm it is expected to achieve 4.2GHz speeds at 0.85W, or 3GHz at 0.25W. Unlike the strictly in-order and slightly more power-thrifty A2I, the A2O is out-of-order and prioritizes single-threaded performance, but it's only SMT-2 versus the A2I, which is SMT-4. Even this is theoretical, however, because the documentation notes that only single-thread generation has been attempted so far. Each core has an AXU similar to the A2I that appears to offer FPU operations in the Verilog code, plus a branch unit, FXUs for simple and complex integer operations respectively, and a load/store unit. There also appears to be a basic MMU, though the core allows running without one, relying entirely on ERATs, but unfortunately I couldn't find a vector unit (the A2I as released didn't come with one either).
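
The two quoted 7nm operating points imply quite different efficiency. A quick sketch of the arithmetic, using clock per watt as a crude proxy (an assumption for illustration; it says nothing about actual work done per cycle):

```python
def ghz_per_watt(ghz, watts):
    """Clock frequency delivered per watt at a given operating point."""
    return ghz / watts

# Operating points quoted for the A2O at 7nm.
fast = ghz_per_watt(4.2, 0.85)     # 4.2GHz at 0.85W
thrifty = ghz_per_watt(3.0, 0.25)  # 3GHz at 0.25W

clock_cost = 1 - 3.0 / 4.2             # fraction of clock given up
power_saving = 1 - 0.25 / 0.85         # fraction of power saved
efficiency_gain = thrifty / fast       # ratio of GHz/W between the points

print(f"4.2GHz point: {fast:.2f} GHz/W")
print(f"3.0GHz point: {thrifty:.2f} GHz/W")
print(f"Clock given up: {clock_cost:.0%}, power saved: {power_saving:.0%}")
print(f"Efficiency gain: {efficiency_gain:.2f}x")
```

Giving up a bit under a third of the clock saves roughly 70% of the power, so the lower operating point delivers well over twice the clock per watt.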

IBM casts the A2O as being more appropriate for artificial intelligence, autonomous driving and security, whereas the A2I was meant for streaming, network processing and data analysis. I'm not sure I believe either of those claims, but despite apparently being just an evolutionary improvement over the A2I, I think the A2O is more promising, especially for smaller-scale systems. By being 2.07-compliant it's already almost a mainline POWER8, and the interest that has bubbled up around A2I should find even more to like in A2O. Adding a radix MMU implementation and vector operations wouldn't be trivial, and even this single-thread implementation has high FPGA utilization, but I think this would be a better basis than A2I for that hypothetical OpenPOWER developer board everybody seems to want or even a mythical modern PowerPC laptop. Like A2I, A2O still doesn't replace Microwatt, which is much better documented, better supported, can actually boot a Linux kernel, and if for no other purpose than pedagogy is a far more purposeful model for OpenPOWER systems. That said, A2O's very presence is yet another choice and yet another great reason to be on board with OpenPOWER.

IBM open-sources PowerAI as OpenCE


News from today's COVID-19 socially distanced virtual OpenPOWER Summit: IBM announced the open-sourcing of their PowerAI package today as OpenCE, the Open Cognitive Environment for deep learning and machine learning applications. The code should build on any Linux-based OpenPOWER system, including Raptor-family workstations and servers, and the GitHub repository contains everything needed to build TensorFlow, PyTorch, XGBoost and related projects and dependencies. If building binaries from scratch leaves you cold waiting for the goodies, Oregon State University simultaneously announced plans to offer pre-built ppc64le binaries for each upcoming tagged release both with and without CUDA support. Unfortunately, not everything is open: you'll still need to register and download a separate blob from Nvidia if you intend to use CUDA, even though it can reportedly be downloaded at no cost afterwards, and if you do you'll naturally be limited to Nvidia GPUs (which you can't use for 3D acceleration on OpenPOWER currently due to the lack of a working open-source driver). Still, here's a high-power option for your machines coming from someone who knows how to optimize for the platform, and Raptor's PowerAI-specific SKU is a turnkey package configured expressly for that purpose (and it's even in stock). Perhaps OpenCE is something they could preinstall for even greater value now that it's available.

Microwatt floats


When we last visited Microwatt, the little synthesizable OpenPOWER core that could, we looked at how you could hack instructions in. Or, you can sit back and wait for the PRs from IBM, including now a simple FPU. While this pull request describes its performance in modest terms, impressively it operates exactly the same (and even authentically "fails" the same tests in the same fashion) as the FPU in the POWER9. There is still no (full) supervisor mode, and no vector unit, but Microwatt is now advanced enough to boot a Linux kernel. The possibility of a single-board Microwatt-based system (and fully reprogrammable, too) gets closer every day.