Showing posts from May, 2019

Firefox 67 on POWER


Firefox 67 has been released and to my relief (though I now build smoketest builds of Firefox on ppc64le approximately weekly to find such problems) mostly builds uneventfully. It has a number of nice features, including enhanced content blocking, improved full keyboard accessibility and various performance improvements. The marquee GPU-accelerated WebRender isn't on most Linux systems yet but that's coming soon, hopefully. I haven't experimented with it yet myself but the existing GPU acceleration works fine on this Talos II with the BTO AMD WX7100 card (set layers.acceleration.force-enabled to true in about:config).

That brings up the first catch, because I did say mostly uneventfully: changes to profile handling. If you build from mozilla-release, as I do and recommend, you will end up with a "nightly release" version (assuming you don't pass --enable-release, which I advise against right now). Starting with Fx67, nightlies from any tree will try to create a new profile separate from your previous one, though the old one remains intact. You can explicitly select it from the Profile Manager (pass -P), or, if you already know which profile you want to use, you can specify it with -p (on my system the default profile is unimaginatively called default, ergo, -p default).

I haven't figured out the cause of the second catch, or whether it's a kernel or a Firefox bug, but every so often the system throws warnings that look like this in dmesg (this is on a 5.0.x kernel):

[337262.237052] ida_free called for id=170 which is not allocated.
[337262.237089] WARNING: CPU: 6 PID: 12276 at lib/idr.c:519 ida_free+0x114/0x1e0

If you are on a distribution where kernel warnings get converted into notifications (like the Fedora machine I'm typing on), this can be rather obnoxious. If you are badly afflicted, you can temporarily turn them off with these instructions. I haven't found the root cause for it yet and it's hardly a great hardship, but it didn't occur in Firefox 66.
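To get a feel for how often the warning fires, a minimal sketch like the following counts matching lines in kernel log text. The function name count_ida_warnings is made up for illustration, and the pattern is simply taken from the sample message above, so it assumes the wording is the same on your kernel.

```shell
#!/bin/sh
# count_ida_warnings (hypothetical helper): count ida_free warnings in
# kernel log text supplied on standard input.
count_ida_warnings() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps that case from looking like a failure.
    grep -c 'ida_free called for id=' || true
}

# Typical use (reading the kernel log may require root on some systems):
#   dmesg | count_ida_warnings
```

Only the "ida_free called for id=" line is counted, so the accompanying WARNING line for the same event is not counted twice.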

As for the Firefox JIT for POWER9, I'm still plugging along, but other than a minor pull request to the documentation it's still 100% yours truly working on it. Of the remaining pieces the macro assembler is about two-thirds written, leaving the low level assembler after that, and then trying to make it build. However, I'm also in the midst of a systems update for TenFourFox, which I still have a commitment to maintain in the short term, so any help will get it in your hands faster. Hopefully the commits make it clear how I'm translating the MIPS backend into POWER9, using all that 3.0B goodness (population count instructions! trailing zero count instructions! load PC in one instruction! it's an assembly language candy store!).

It's been a little while since I posted the .mozconfigs I use, so rather than direct you to old entries I'll just reproduce them here. Note that MOZ_PGO and MOZ_LTO don't seem to work properly and may generate defective binaries, thus their absence, and I explicitly pass --disable-release to an opt build because of various minor problems which hopefully we'll eventually smoke out. Adjust the number of cores as you like; this is a dual-4 system (two 4-core CPUs with SMT-4), so with 32 threads available I reserve 8 to let me still play Descent II during build runs. :)

Debug

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-Og -mcpu=power9"
ac_add_options --enable-debug
ac_add_options --enable-linker=bfd

export GN=/usr/bin/gn # if you have it
export RUSTC_OPT_LEVEL=0

Optimized

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++

mk_add_options MOZ_MAKE_FLAGS="-j24"
ac_add_options --enable-application=browser
ac_add_options --enable-optimize="-O3 -mcpu=power9"
ac_add_options --disable-release
ac_add_options --enable-linker=bfd

export GN=/usr/bin/gn # if you have it
export RUSTC_OPT_LEVEL=2
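
The -j24 above is just 32 threads minus the 8 held back. As a rough sketch for other machines, the arithmetic can be wrapped in a small shell function; build_jobs is a hypothetical name, and the reserve of 8 is only the figure used on this system, not a recommendation.

```shell
#!/bin/sh
# build_jobs TOTAL RESERVE (hypothetical helper): print TOTAL minus RESERVE,
# but never less than 1 so make always gets at least one job.
build_jobs() {
    total=$1
    reserve=$2
    jobs=$((total - reserve))
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}

# Typical use, assuming nproc from GNU coreutils is available to report
# the number of hardware threads:
#   echo "mk_add_options MOZ_MAKE_FLAGS=\"-j$(build_jobs "$(nproc)" 8)\""
```

On the 32-thread machine described here, build_jobs 32 8 gives the 24 used in both configurations.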

What's missing in this picture


I've got the case (an inexpensive mATX Silverstone SST-ML03B), I've got the memory (16GB). The PSU, optical drive, wireless keyboard and WiFi should arrive next week. Now, what am I missing? Think think think!

Since the whole idea is a POWER9 system for the more price-sensitive, the trimmings cost about $500 on Amazon (minus tax and shipping) and could probably be found elsewhere for less. I also got in on the 4-core $999 Blackbird bundle special price, so with the 2U HSF and tooling that was $1090 before tax and shipping (now it would be roughly $1380) for a base outlay of about $1600. This is a nice attempt at a barebones 8-core for $1950, also apparently minus tax/shipping. Yes, I know you can get an Intel system for less, so don't even bother posting that. If price is your highest priority, you already know you're in the wrong place, but at least now price can still be a priority for what is a decent libre system regardless.

Obviously the aim for us here in the Floodgap household is to use it as an HTPC and that's how I'll be reviewing it. If you just want it as a workstation or to jam in a closet as a low-end server, you can almost certainly cut this parts list further.

ZombieLoad does not affect POWER9


If it's Tuesday, there must be yet another speculative execution attack debuting with a funny name and this Tuesday's entry is ZombieLoad. ZombieLoad works on the same conceptual basis of observable speculation flaws to exfiltrate data but implements it with a new class of Intel-specific side-channel attacks utilizing a technique the investigators termed MDS, or microarchitectural data sampling. While Spectre and Meltdown attack at the cache level, ZombieLoad targets Intel HyperThreading (HT), the company's implementation of simultaneous multithreading, by trying to snoop on the processor's line fill buffers (LFBs) used to load the L1 cache itself. In this case, side-channel leakages of data are possible if the malicious process triggers certain specific and ultimately invalid loads from memory -- hence the nickname -- that require microcode assistance from the CPU; these have side-effects on the LFBs which can be observed by methods similar to Spectre by other processes sharing the same CPU core. Other internal buffers of potential value can also be sussed out by related MDS-style techniques.

Because of the limited bandwidth of the LFBs and the effectively streaming nature of the technique, an attacking process can't select arbitrary addresses and therefore can't easily read arbitrary memory. Nevertheless, targeting easily recognizable kinds of data can still make the attack feasible, even against kernelspace. For example, since URLs can be picked out of memory, this apparent proof of concept shows a separate process running on the same CPU victimizing Firefox to extract the URL as the user types it in. As the user types, the values of the individual keystrokes go through the LFB to the L1 cache, allowing the malicious process to observe the changes and extract characters. By its nature there is much less data available to the attacking process but that also means there is less data to scan, making real-time attacks like this more feasible combined with other attacks or social engineering.

However, ZombieLoad is pretty much irrelevant against POWER9 because the LFBs it attempts to monitor are specific to Intel's implementation of HyperThreading (the same goes for really any SMT implementation other than Intel's; the authors of the attack say they even tried other SMT CPUs without success, almost certainly AMD, though they don't state for certain whether they tested on Power ISA). Even for unpatched Intel machines the actual risk from this (or even most speculative execution attacks, to be sure) is probably limited because it requires running a malicious process to do the snooping and such processes almost certainly have other, more reliable ways of pwning such machines. The decision to patch may simply come down to how much risk you're willing to tolerate: nearly every Intel chip since 2011 is apparently vulnerable and the performance impact of fixing ZombieLoad varies anywhere from Intel's rosy estimate of 3-9% to up to 40% if HT must be disabled completely.
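
If you'd rather ask the machine than take anyone's word for it, Linux kernels that carry the MDS mitigation patches report their view under /sys/devices/system/cpu/vulnerabilities. The sketch below assumes your kernel is new enough to have the mds entry; mds_status is a made-up helper name. A POWER9 box would be expected to report that it is not affected, but that is an assumption about your kernel and was not verified for this post.

```shell
#!/bin/sh
# mds_status [FILE] (hypothetical helper): print the kernel's MDS report,
# or "unknown" if the file is absent (for example, on an older kernel).
mds_status() {
    file=${1:-/sys/devices/system/cpu/vulnerabilities/mds}
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

# Typical use:
#   mds_status
```

The optional argument exists only so the function can be pointed at a different file for testing.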

Blackbird shipments start next week


The release of the Blackbird firmware indicates shipping is imminent and Raptor confirms on Twitter that shipments should start next week. Time to get those mATX cases!

Red Hat Enterprise Linux 8.0


The newly Big Blue Red Hat released Red Hat Enterprise Linux (RHEL) 8.0 today, based on Fedora 28. Because it's based on F28, this release of RHEL should "just work" on the Talos II (F28 was the first Fedora to support it), and mostly whatever applies to F28 applies to RHEL 8, including GNOME 3.28, Wayland-by-default and other changes. Although both big and little endian are apparently supported on the trial evaluation images, the documentation says only little-endian is supported, so it's possible not all parts of the site are in sync yet. Still, if you like your Linux corporate (and paying for it), now that Red Hat is an IBM product it's probably as corporate as you can get.

DAWR YOLO mode coming to kernel 5.2


Thanks to Michael Neuling at OzLabs, who gave me the heads up and wrote up the patch. One of my pain points doing development on the POWER9 is that hardware watchpoints are disabled at the kernel level. This is because the CPU will checkstop if a watchpoint is set on cache-inhibited memory such as devices, and a checkstop will invariably bring the system down. The formal name for the special purpose register governing this feature (recall that Power ISA has three classes of registers, i.e., general purpose, floating point and special purpose) is the Data Access Watchpoint Register, or DAWR. There is no software workaround for this problem, and because a malicious local user could bring the system down without privileges by managing to provoke such a situation, setting such watchpoints via the DAWR is therefore currently disabled for safety. Unfortunately, software watchpoints are sometimes hundreds of times slower than hardware watchpoints, and for certain debugging tasks (such as JIT code generation) hardware watchpoints are just about indispensable.

IBM notes this issue as an erratum which implies they see it as a defect and therefore suggests it will be fixed in hardware in the future (it does not affect POWER8). Until then, Michael's patch enables "DAWR YOLO mode" for those of us (like me) who are single users on a workstation who know what we're doing, need hardware watchpoints to debug our software before the heat death of the universe, and accept the risk of system crashes. It creates a debugfs switch at /sys/kernel/debug/powerpc/dawr_enable_dangerous that enables the superuser to (mostly) freely turn access to the DAWR off and on; see the patch for more details. Fortunately this change has been finally queued for kernel version 5.2, which means I hopefully won't have to screw around with a custom kernel for much longer and is very good news for other developers in the same boat. Thanks, Michael!
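
For the curious, flipping the switch might look something like the sketch below. It assumes a kernel with the patch applied, debugfs mounted at /sys/kernel/debug, and that the file takes Y and N like other debugfs booleans; dawr_set is a hypothetical helper, and the patch itself remains the authority on the details.

```shell
#!/bin/sh
# Location of the switch, as named in the post.
DAWR_SWITCH=/sys/kernel/debug/powerpc/dawr_enable_dangerous

# dawr_set Y|N [FILE] (hypothetical helper): write the value to the switch
# and print the state read back. Assumes Y/N are the accepted values.
dawr_set() {
    value=$1
    file=${2:-$DAWR_SWITCH}
    case $value in
        Y|N) ;;
        *) echo "usage: dawr_set Y|N [file]" >&2; return 2 ;;
    esac
    if [ ! -w "$file" ]; then
        echo "cannot write $file (not root, debugfs not mounted, or kernel lacks the patch?)" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$file"
    cat "$file"
}

# Typical use, as root, on a machine where you accept the crash risk:
#   dawr_set Y    # allow hardware watchpoints via the DAWR
#   dawr_set N    # turn them back off
```

The optional file argument is there so the function can be exercised against an ordinary file without touching the real switch.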

A friendly FPGA reminder


Raptor is seeking beta testers for the upcoming 2.00 Talos II firmware, including a tease for users of the built-in VGA port. This brings up a question: flashing the PNOR and BMC is relatively straightforward (I've done it from my Quad G5), but flashing the FPGA requires a programmer and not all of us are handy with those -- or even have one. Do we need to do that too to try the beta?

Fortunately, Raptor's great support staff responded to my query and said (emphasis mine), "The FPGA firmware is largely independent of the BMC and PNOR. FPGA updates are only released to improve compatibility with PSUs and chassis components encountered in the wild, and at no point will a BMC nor PNOR update be released that is incompatible with the earliest FPGA revisions." That's very reassuring since I have an early T2 that is basically on the same FPGA flash it came from the factory with.

If that's the case, then, when should you update the FPGA? Raptor Support answered that too: "The only time you need to upgrade the FPGA is when you need functionality a new FPGA release provides, for example to activate the VGA disable jumper or to allow the system to boot with a different, previously problematic PSU."

Looks like I've got something to try over the weekend.