The VMX eagle is landing in Firefox 70 (plus: which core should open the door?)


Hugo Landau has an interesting take on the new open-OpenPOWER world. He points out, correctly, that Power ISA is a big win for open architectures because it has maturity in both the embedded and server spaces, but he'd like to see an actual production core opened as well (Microwatt is a lovely MVP and a great proof of concept but it is clearly for experimentation, not for production).

His suggestion is a softcore version of the PPC 405. PowerPC 4xx is a very common embedded CPU family indeed (the POWER8 OCC even has one inside of it), and in the Power.org days IBM was even willing to make it available to academia and researchers. He also suggests open-sourcing Mambo, IBM's currently proprietary simulator.

Open-sourcing Mambo is especially appealing to me, since I'm trying to do simulation work of my own and can't do it on a POWER9! (There is supposedly a POWER9 version for Debian, but the install directions and download area strictly show x86_64.) I also think there would be little non-IBM IP to stand in the way of doing so. On the other hand, although opening up the 405 would be admirable, I'm not sure how much it would accomplish in practice: it's 32-bit, not 64-bit; it's strictly big-endian (I like big-endian personally and three of the five systems on this KVM are big-endian, but we all know where the market's going and OpenPOWER in particular emphasizes LE); and it lacks VMX, a/k/a AltiVec. That brings us to Firefox.

In the TenFourFox world several contributors and I did a fair bit of work on AltiVec acceleration to beef up performance on G4 and G5 systems. (Editorial note: I only use the term AltiVec for Apple systems and chips made by Motorola/Freescale, since Apple used both Motorola/Freescale and IBM parts, and Motorola/Freescale (now NXP) owned the trademark. IBM neither owned nor licensed this trademark and always called it VMX, so in OpenPOWER, it's VMX. For that matter neither did P.A. Semi, so the PA6T has VMX too. Even with the G5, although its vector unit was popularly called AltiVec, IBM never officially referred to it by that name.) There are also many opportunities for VMX acceleration in mainline Firefox; depending on your compiler settings, some of these might already get silently enabled (such as qcms). libpng even has support for VSX. However, many in-tree components either never had the build-system glue written to turn on VMX support (libjpeg, libpng) or they're based on custom SIMD code Mozilla wrote that has no Power ISA equivalent.
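
To see what a given set of compiler flags has quietly turned on, one option is to check the compiler's predefined macros. The sketch below is illustrative only: `compiled_simd_level` is a hypothetical helper, and treating `__VSX__` together with `_ARCH_PWR9` as "VSX-3" is my assumption about how a POWER9 target shows up, not something taken from the Firefox tree.

```cpp
#include <string>

// Hypothetical helper: reports which Power SIMD level the compiler was
// told it may assume for this translation unit. On a non-Power host, or
// with no vector flags, it falls through to "scalar".
std::string compiled_simd_level() {
#if defined(__VSX__) && defined(_ARCH_PWR9)
    return "VSX-3";   // assumption: VSX on a POWER9 (ISA 3.0) target
#elif defined(__VSX__)
    return "VSX";     // e.g. built with -mvsx
#elif defined(__ALTIVEC__)
    return "VMX";     // e.g. built with -maltivec
#else
    return "scalar";
#endif
}
```

Note that this only says what the compiler was allowed to assume at build time; it says nothing about the chip the binary eventually runs on.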

For Firefox 70, build system support for VMX, VSX and VSX-3 compiler flags plus runtime detection is now available, written by yours truly, along with the first of the TenFourFox patches I updated and upstreamed to mainline Firefox (this one for fast scanning of text fragments for wide characters). I'm also hoping the libjpeg VMX enablement patch lands in time for merge with several more VMX patches to come. My work on the Firefox Power JIT is somewhat slowed by my continuing responsibilities to TenFourFox and compiler issues such as bug 1576303, which is why I wanted to get a couple quick wins with VMX stuff I already had on the shelf.
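
As a rough idea of what "scanning for wide characters" means, here is a portable sketch, not the upstreamed patch itself. It looks for the first UTF-16 code unit above 0xFF by testing four code units per 64-bit load; a VMX version would apply the same mask test to eight code units per 128-bit vector. The name `first_non_8bit` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns the index of the first UTF-16 code unit greater than 0xFF,
// or len if every code unit fits in 8 bits.
std::size_t first_non_8bit(const char16_t* s, std::size_t len) {
    // The high byte of each 16-bit lane. Because each lane is loaded in
    // native byte order, the same mask works on big- and little-endian.
    constexpr std::uint64_t kHighBytes = 0xFF00FF00FF00FF00ULL;

    std::size_t i = 0;
    // Coarse pass: examine four code units at a time.
    for (; i + 4 <= len; i += 4) {
        std::uint64_t chunk;
        std::memcpy(&chunk, s + i, sizeof chunk);  // alignment-safe load
        if (chunk & kHighBytes) {
            break;  // something in this chunk is wide; pinpoint it below
        }
    }
    // Fine pass: locate the exact code unit, and handle the tail.
    for (; i < len; ++i) {
        if (s[i] > 0xFF) {
            return i;
        }
    }
    return len;
}
```

For example, `first_non_8bit(u"abc\u0100def", 7)` returns 3, while a pure Latin-1 string returns its own length.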

Allow me to close the loop on our core digression, though. In bug 817058 I'm asked a question by one of the Mozilla devs: can they just assume every Power chip someone is running Firefox on would support VMX? The answer, even for 64-bit Power, is no, because of poor choices like the AmigaOne X5000, which runs the QorIQ P5020, a chip with no SIMD at all. However, Rust supports compiling for SIMD and Power is a supported architecture, which means Rust supports VMX too, and Mozilla would be foolish not to take advantage of that. Assuming SIMD features are "just present" will become increasingly common, and that means that continuing to run parts that don't have VMX (let alone VSX) will become an even bigger losing game on the desktop than it already was. Rather than the 405 I'd personally like to see something like the G5 itself be made openly available: it's derived from POWER4, so it's 64-bit and largely upwardly compatible but wouldn't be a commercially competitive product at the high tiers IBM cares about; it has a VMX unit, but it's IBM's (co-developed from the G4/7400); and it's fairly well-understood. Downclock it to reduce power consumption and it could even be a credible upper-end embedded chip. The only thing it lacks is a true little-endian mode.
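
Since VMX can't simply be assumed, code has to ask at runtime. The following is a minimal sketch of how that could look on Linux using the ELF auxiliary vector; it is not the detection code that landed in Firefox. The bit values are copied from the Linux kernel's powerpc `cputable.h` so the sketch compiles on any host; the `PowerSimd` struct, the function names, and equating "VSX-3" with the ISA 3.0 bit are my own assumptions for illustration.

```cpp
#if defined(__linux__) && defined(__powerpc__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP, AT_HWCAP2
#endif

// Values of the Linux powerpc hwcap bits, duplicated here (hypothetical
// names) so that this file also builds on non-Power hosts.
constexpr unsigned long kHwcapAltivec  = 0x10000000UL;  // PPC_FEATURE_HAS_ALTIVEC
constexpr unsigned long kHwcapVsx      = 0x00000080UL;  // PPC_FEATURE_HAS_VSX
constexpr unsigned long kHwcap2Arch300 = 0x00800000UL;  // PPC_FEATURE2_ARCH_3_00

struct PowerSimd {
    bool vmx;
    bool vsx;
    bool vsx3;  // assumption: VSX on an ISA 3.0 (POWER9-class) part
};

// Pure decoding step, kept separate so it can be exercised anywhere.
// Each level is only reported if the level below it is present.
PowerSimd decode_hwcap(unsigned long hwcap, unsigned long hwcap2) {
    PowerSimd f;
    f.vmx  = (hwcap & kHwcapAltivec) != 0;
    f.vsx  = f.vmx && (hwcap & kHwcapVsx) != 0;
    f.vsx3 = f.vsx && (hwcap2 & kHwcap2Arch300) != 0;
    return f;
}

// Queries the running system. Anywhere other than Linux on Power this
// reports no SIMD support, which is the safe default.
PowerSimd detect_power_simd() {
#if defined(__linux__) && defined(__powerpc__)
    return decode_hwcap(getauxval(AT_HWCAP), getauxval(AT_HWCAP2));
#else
    return decode_hwcap(0, 0);
#endif
}
```

A caller would run `detect_power_simd()` once at startup and select the scalar or vector routine accordingly. A chip like the P5020 would report all three flags as false and stay on the scalar path.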

More on Firefox 69 when it is officially released next week.

Comments

  1. The AmigaOne X5000 is quite literally the most ridiculous and useless paperweight we have had in forever, on top of being stupidly overpriced. Firefox definitely shouldn't be held back by lesser machines like that one. To truly move forward, it has to be both SIMD-aware and, if possible, 64-bit at all times (sorry, G4s), be it BE or LE.

    As for VSX, how much exactly could it benefit Firefox compared to, say, VMX? Regardless, I hope all future PowerPC processors will also come equipped with it, and that it becomes a standard part of the architecture.

    1. VSX does indeed not add as much to VMX as VMX added to scalar code. But it's not nothing, and it does fill certain holes.

  2. VMX as a requirement is probably inevitable, similar to how the SSE2 requirement made V8 / WebEngine unusable on Athlons. And that SSE2 requirement was introduced three years ago.

    I'd still love to see runtime support for the chips that don't have it, but of course, that isn't necessary.

    Also, OpenPOWER cares about both endians: https://twitter.com/hughhalf/status/1027109176563064833 so Firefox should, too. I say this as the owner of a Talos that has never run LE code bare-metal (other than petitboot).

    1. I'm sure they do care about it in the sense that they recognize the history and the install base, but I don't see BE being pushed in the same way LE is (and certainly not by distros). I don't see any moves to resurrect BE with Red Hat 8, for example; you might expect this if IBM's policy differed from RH's now that RH is an IBM component. That said, my observation probably reflects our respective differences in perception to some degree.

  3. The 405 doesn’t implement its own FPU. Wouldn’t that be an issue?

    Though I personally think 32-bit PowerPC is still viable for desktop use. Of course it's low-end, and nowadays anything that can’t address more than 4GB is seen as stupid, but in several respects these chips aren’t as crippled (and never have been) as i686/i386 is in comparison to its 64-bit counterpart. But well, I guess I should move on to bigger, stronger, faster..
