Fedora 34 mini-review on the Blackbird and Talos II (it sucks)


Once again it's time to upgrade Floodgap's stable of Raptor systems to the latest release of Fedora, which is up to version 34 (see our prior review of Fedora 33). You may not run Fedora yourself, but the fact that it does run is important, because it tends to be well ahead of most distros, so many problems are identified and fixed in it before they reach other, less advanced ones. And boy howdy, are there problems this time. I'm going to get it over with and tl;dr myself right now: if you use GNOME as your desktop environment and you haven't upgraded yet, DON'T. F34 and in particular GNOME 40 are half-baked, and the problems don't seem specific to OpenPOWER and the hard work of folks like Dan Horák; these issues are more generalized. There is always that sense of dread over what's going to break during the update, and while I'm finally typing in Firefox on this updated Talos II, it took me hours to get everything glued back together, and the desktop performance problems in particular are cramping my ability to use the system well. Fedora 33 will still be supported until a month after F35 comes out; it may be worth sticking with F33 for a couple of months while the GNOME team works on the remaining performance issues.

The problems started from the very beginning, even before actually updating. I do my updates on the Blackbird first to shake out any major problems before doing the same to my daily driver T2. As I explained previously, neither the Blackbird nor the T2 uses gdm; they both boot to a text prompt, and we jump to GNOME with startx (or XDG_SESSION_TYPE=wayland dbus-run-session gnome-session if we want to explore the Wayland Wasteland). I do the upgrade at the text prompt so that there is minimal chance of interference. Our usual MO to update Fedora is, as root,

dnf upgrade --refresh # upgrade prior system and DNF
dnf install dnf-plugin-system-upgrade # install upgrade plugin if not already done
dnf system-upgrade download --refresh --releasever=34 # download F34 packages
dnf system-upgrade reboot # reboot into upgrader

If you do this with F34, however, you get a number of downgrades (unavoidable, apparently), missing groups and an instant conflict with iptables when you try to download the packages.

dnf suggests adding --best --allowerasing to deal with that. It doesn't work, and neither does adding --skip-broken. The non-obvious solution is dnf system-upgrade download --refresh --releasever=34 --allowerasing, and just ignoring the duff package.
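
The workaround can be collected into a small POSIX shell sketch. The function names and the RUN preview variable are made up for illustration; only the dnf commands themselves come from the steps above. Setting RUN=echo prints each command instead of running it, which is a cheap way to check the flags before committing to anything.

```shell
# Hypothetical helpers wrapping the dnf commands from the text.
# RUN is an illustrative variable: set RUN=echo to preview the commands
# instead of executing them. Without RUN, everything must run as root.

# Bring the current release and DNF itself up to date, then make sure
# the system-upgrade plugin is present.
f34_prepare() {
    ${RUN:-} dnf upgrade --refresh &&
    ${RUN:-} dnf install dnf-plugin-system-upgrade
}

# Download the new release's packages. --allowerasing on its own is the
# combination reported above to get past the iptables conflict.
f34_download() {
    releasever="${1:-34}"
    ${RUN:-} dnf system-upgrade download --refresh \
        --releasever="$releasever" --allowerasing
}

# Reboot into the offline upgrader.
f34_reboot() {
    ${RUN:-} dnf system-upgrade reboot
}
```

A dry run would be RUN=echo followed by f34_prepare, f34_download 34 and f34_reboot; unset RUN (as root) to do it for real.
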
The Blackbird does not have a GPU; all video output is on the ASPEED BMC (using the Blackbird's HDMI port). Ordinarily I would select the new kernel from Petitboot when it restarts after the final command above to see a text log of the installation, but this time we got an actual graphical install screen.

After the installation completed, the machine rebooted uneventfully and came up to the text prompt. I entered startx as usual and ...

At this point GNOME just plain hung up. There was no mouse pointer, though pressing ENTER on the keyboard triggered the button and put it back to the text prompt. Nothing unusual was in the Xorg logs, and journalctl -e showed only what seemed like a non-fatal glitch (Window manager warning: Unsupported session type). Well, maybe the time for the Wayland Wasteland was now. I did an exec bash (gnome-session doesn't properly handle using another shell, or you get weird errors like Unknown option: -l because it tries to be cute with the options to exec) and XDG_SESSION_TYPE=wayland dbus-run-session gnome-session, and Wayland does start.

However, it still doesn't support 1920x1080 on the Blackbird on-board HDMI, just 1024x768. It also seemed a little sluggish with the mouse. I exited it and tried to start gnome-session --debug --failsafe but it wouldn't initialize.

It then dawned on me that I was setting XDG_SESSION_TYPE manually for Wayland; I previously left it unset for X11. Setting XDG_SESSION_TYPE to x11 finally brought up GNOME 40 in X with a full 1080p display.

I put that into my .cshrc and that was one problem solved. The Applications drawer seemed a little slower to come up, though I have a very vanilla installation on this Blackbird on purpose and few apps are loaded, so I didn't try scrolling through the list or running lots of applications at once. (More on that in a moment.)
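
For a csh-family login shell, the persistent form of that setting is a setenv XDG_SESSION_TYPE x11 line in .cshrc. For one-off launches, a wrapper along the following lines might do; it is only a sketch, the function name and the RUN preview variable are hypothetical, and it goes through env so the variable assignment does not depend on the calling shell's syntax. The exec bash caveat mentioned earlier may still apply to the Wayland case.

```shell
# Hypothetical launcher: start GNOME with the session type set explicitly.
# RUN is an illustrative variable: set RUN=echo to preview the command.
start_gnome() {
    case "${1:-x11}" in
        x11)
            # X11 session through startx, as described in the text.
            ${RUN:-} env XDG_SESSION_TYPE=x11 startx
            ;;
        wayland)
            # Wayland session, using the command line quoted in the text.
            ${RUN:-} env XDG_SESSION_TYPE=wayland dbus-run-session gnome-session
            ;;
        *)
            echo "usage: start_gnome [x11|wayland]" >&2
            return 2
            ;;
    esac
}
```
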

Just to see if anything shook out subsequently, I ran dnf upgrade again. This time the missing iptables compatibility packages came up.

That solves that mystery, so just ignore iptables during the initial download and the next time you run dnf after Fedora has been upgraded, it will clean up and install the right components. This whole sordid affair now shows up in the Release Notes.

Upgrading the Talos II is usually a much more complex undertaking anyway because I have custom GNOME themes and extensions installed on it and I always expect there will be some bustage. I don't like it, mind you, but I expect it. Armed with what I had learned from the Blackbird, I installed the packages on the T2 (some other groups also had "no match," though all of my optionally installed packages could and did upgrade) and rebooted.

Unlike the Blackbird, however, the installer still came up in a text screen as in prior upgrades when I selected that kernel from the Petitboot menu.
This machine has the BTO AMD WX7100 workstation card and does not use the ASPEED BMC framebuffer. If you don't select the kernel from the menu and just let the default go, you will get the usual black screen again, and as in prior versions you'll have to pick another VTY with CTRL-ALT-F2 or something, log in as root and periodically issue dnf system-upgrade log --number=-1 to watch.
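
If retyping the log command gets old, the polling could be scripted from that second VTY. This is a rough sketch under my own assumptions: the function name, its two arguments and the RUN preview variable are all hypothetical, and the only real command in it is the dnf invocation quoted above.

```shell
# Hypothetical helper: re-run the upgrade log command every few seconds.
# Usage: watch_upgrade_log [interval_seconds] [rounds]
# rounds=0 (the default) loops until interrupted with CTRL-C.
# RUN is an illustrative variable: set RUN=echo to preview the command.
watch_upgrade_log() {
    interval="${1:-30}"
    rounds="${2:-0}"
    n=0
    while [ "$rounds" -eq 0 ] || [ "$n" -lt "$rounds" ]; do
        ${RUN:-} dnf system-upgrade log --number=-1
        sleep "$interval"
        n=$((n + 1))
    done
}
```

Run as root, watch_upgrade_log 60 would show the log once a minute until interrupted.
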

I rebooted and started X (with XDG_SESSION_TYPE=x11), and GNOME came up, but it looked a little ... off.

If you noticed the weird pink-purple tint, you win the prize. However, my second monitor seemed to have a normal display (so did the Blackbird), and the difference is that my main display is colour-managed. When I selected the default profile, the tint went away but my colours weren't, you know, just right. I spent a few hours regenerating the profile with my Pantone huey manually with dispcal, but the same thing happened with the new profile.

The problem is the new colour transform matrix (CTM) support; the prior profile obviously worked fine in 3.38 but isn't compatible with 40. The proper way to solve this would be by letting GNOME make a new colour profile for you from the Settings app and it even allegedly supports the Pantone huey and other colourimeters. However, it has never (to my knowledge) worked properly on OpenPOWER (it crashes), so I've never been able to do this myself. Instead, my current solution is to just temporarily disable CTM with

xrandr --output DisplayPort-0 --set CTM 0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1

(that's 0, 1, seven zeroes, 1, seven more zeroes, and 1). Adjust DisplayPort-0 to where your colour-managed display is connected. Note that every time you (re)start GNOME or its shell, it will forget this setting and you'll have to enter it again. It would be nice if the colour manager could work with OpenPOWER, but CTM should have never broken working profiles in the first place.
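
Because the setting is lost on every shell restart, it may be convenient to wrap it in a function. The sketch below rebuilds the same 18-value string rather than hard-coding it, on the assumption (mine, not something stated above) that the property holds a 3x3 matrix with each entry split into two 32-bit words, low word first, so that the pair 0,1 encodes 1.0 and 0,0 encodes zero. The function names and the RUN preview variable are hypothetical.

```shell
# Build the identity matrix in the 18-value form used by the xrandr
# command in the text: 3x3 entries, two comma-separated words per entry.
# Assumption: "0,1" is 1.0 and "0,0" is 0.0 (low word first).
identity_ctm() {
    out=""
    for row in 0 1 2; do
        for col in 0 1 2; do
            if [ "$row" -eq "$col" ]; then
                pair="0,1"      # diagonal entry
            else
                pair="0,0"      # off-diagonal entry
            fi
            out="${out:+$out,}$pair"
        done
    done
    printf '%s\n' "$out"
}

# Hypothetical wrapper: reset CTM on one output (default DisplayPort-0).
# RUN is an illustrative variable: set RUN=echo to preview the command.
reset_ctm() {
    output="${1:-DisplayPort-0}"
    ${RUN:-} xrandr --output "$output" --set CTM "$(identity_ctm)"
}
```

Calling reset_ctm with no argument should issue the same xrandr command as above; pass a different output name if the colour-managed display is connected elsewhere.
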

However, that all got solved later, because an even more pressing concern popped up first: the UI was slow as molasses. GNOME 40 defaults to the Activities overview on startup with nothing running. It takes literally several seconds to move from one page of activities/apps to the next. Several seconds. This problem is not unique to OpenPOWER, and occurs on both Wayland and Xorg, but a general fix is apparently months away.

The performance problems are not X11-specific. In fact, Wayland is even worse, because the mouse stutters even just moving it around. This is the first time Wayland is actually worse on the system with the GPU (the T2) rather than the system without one (the Blackbird), though I hardly consider this regression a positive development.

What am I doing about it? Well, what can I do about it, short of trying to fix it myself? GNOME is the default environment for Fedora Desktop, and while I could switch to KDE or Xfce (and I might!), these are serious regressions that are hitting a decent proportion of users and were even evident during the beta phase. Did QA just fall asleep or something? To top it off, even if it were working well, whose freaking bright idea was it to make you go to the upper left corner to click Activities, then back to the bar to click the Show Applications button, just to pull up what you have installed? I've started using the Applications menu that Fedora includes by default; at least that doesn't take a Presidential administration or two or wild sweeping mouse gestures just to show you a list of apps, even though it's still noticeably slower than 3.38.

The slowdowns are entirely specific to GNOME. Once you actually get an app started, like Firefox or a game, display speed is fine, so the problem clearly isn't pushing pixels; it's something higher level in GNOME. Switching the CPU frequency governor on every core to performance made at most minimal difference. Similarly, (as root) echo high > /sys/class/drm/card0/device/power_dpm_force_performance_level instead of auto made things a little better, but there is still no excuse for how bad it is generally. About the only thing that made more difference than that was simply turning animations off altogether in GNOME Tweaks. Nothing was smooth anymore, but it was about twice as fast at doing anything, so that's how I'm limping along for the time being.
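
For anyone who wants to try the same knobs, here is a hedged sketch of all three. It assumes the CPU setting mentioned above means the cpufreq scaling governor exposed under /sys/devices/system/cpu, and that the GNOME Tweaks animation toggle corresponds to the org.gnome.desktop.interface enable-animations key. The function names and the optional path arguments (there so the functions can be pointed at a scratch directory) are hypothetical.

```shell
# Write a governor name to every CPU's scaling_governor node (as root).
# Second argument overrides the sysfs root; assumed layout is
# <root>/cpuN/cpufreq/scaling_governor.
set_cpu_governor() {
    governor="$1"
    cpu_root="${2:-/sys/devices/system/cpu}"
    for node in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$node" ] || continue      # skip if the glob matched nothing
        printf '%s\n' "$governor" > "$node"
    done
}

# Force the GPU performance level (as root). Only the two values
# mentioned in the text are accepted here.
set_gpu_perf_level() {
    level="$1"
    node="${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
    case "$level" in
        auto|high)
            printf '%s\n' "$level" > "$node"
            ;;
        *)
            echo "expected auto or high, got: $level" >&2
            return 2
            ;;
    esac
}

# Turn GNOME animations off for the current user (not as root).
disable_gnome_animations() {
    gsettings set org.gnome.desktop.interface enable-animations false
}
```

None of this persists across a reboot except the gsettings change, so the first two would need to be reapplied after each boot.
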

With those significant problems on deck, the usual turmoil with custom themes and extensions is actually anticlimactic. I had to make some tweaks to my custom Tiger-like GNOME shell extension to fix the panel height and a weird glitch with slightly thicker border lines on the edges of the panel, which you can see in the screenshot below. Quite a few extensions could not automatically update to GNOME 40, either.

I've become irritated enough by this that I actually did set disable-extension-version-validation to true in dconf-editor, which made a couple start working immediately, including my beloved Argos custom script driver. For the others I downloaded the most current version of the shell system monitor and this fork of Dash-to-Dock, and manually installed them in ~/.local/share/gnome-shell/extensions/ (you may need to reset the GNOME shell with Alt+F2 and r to get gnome-extensions enable to actually see their UUIDs). A few I should have dispensed with earlier: No Topleft Hot Corner can now be simply replaced by gsettings set org.gnome.desktop.interface enable-hot-corners false, and AlternateTab's switcher behaviour now can be rigged manually from GNOME Settings.
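
The manual install step can be sketched in shell. This assumes the extension has already been unpacked into a directory containing its metadata.json, and that the target directory name must match the uuid field in that file. The function names are hypothetical, the sed expression is a crude stand-in for a real JSON parser, and honouring XDG_DATA_HOME (falling back to ~/.local/share) is my assumption about where GNOME looks.

```shell
# Pull the "uuid" value out of an extension's metadata.json.
# Crude text match: assumes the key and value sit on one line.
extension_uuid() {
    sed -n 's/.*"uuid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" |
        head -n 1
}

# Copy an unpacked extension into the per-user extensions directory,
# under a subdirectory named after its UUID, and print that UUID.
install_extension_dir() {
    src="$1"
    uuid="$(extension_uuid "$src/metadata.json")"
    if [ -z "$uuid" ]; then
        echo "no uuid found in $src/metadata.json" >&2
        return 1
    fi
    dest="${XDG_DATA_HOME:-$HOME/.local/share}/gnome-shell/extensions/$uuid"
    mkdir -p "$dest" && cp -R "$src"/. "$dest"/ && printf '%s\n' "$uuid"
}

# Related settings from the text, for reference:
#   gsettings set org.gnome.shell disable-extension-version-validation true
#   gsettings set org.gnome.desktop.interface enable-hot-corners false
```

After restarting the shell, the printed UUID is what gnome-extensions enable expects as its argument.
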

I'm now more or less back where I started from, but working with apps is much less fluid and the desktop experience is undeniably inferior to prior releases, and I can't believe no one thought to blow the whistle during the test phase.

If you use Fedora purely as a command-line server, other than the initial hiccups with downloading packages, it seems to work. If you use KDE or Xfce or anything other than GNOME as your desktop, you're probably okay with F34 too, though I didn't test those (I may later). But if you use the default GNOME on Fedora, especially if you use Wayland, think twice about this update before installing it while you've still got some time with F33. Part of riding the bleeding edge is drawing blood now and then, but F34's wounds seem much more self-inflicted than usual. This is the worst Fedora update since I started using it in F28 and I'm not exaggerating in the slightest.

Comments

  1. Thanks for spending time writing this awesome mini-review. Fedora 34 is indeed rougher than Fedora 33, but having said that, it was bumpy for me initially when I upgraded from 32. I reckon that given a few months everything will be back in orbit.

    I don’t think there are enough resources / help with QA. With x86_64, the test week usually gets plenty of help from the community, but I have not seen any community QA/feedback from the OpenPOWER community.

    Replies
    1. Oh, I don't doubt there aren't enough QA resources for OpenPOWER, but most of these problems seem to be in the x86_64 version as well (the bug reports I linked are from PC users). Why not wait a release for GNOME to settle? They do that with other things.

    2. Hi guys, I can confirm everything. I only updated one of my 2 hard disks, and obviously I won't update the other. Unfortunately I also noticed another serious flaw, which I will write up on the forum soon: for those who have a dedicated sound card like my RME, the volume no longer works in the GNOME environment. If you use XFCE it works normally as always, but if you use GNOME the volume stays at 0, and if you try to turn it up it goes back to 0 by itself ... Incredible, this Fedora 34 is truly a total disappointment ...

  2. Wow, I had no clue the GNOME side of things was such a disaster. It's been pretty smooth sailing on Plasma—there's the usual package upgrade strangeness (there's another one involving rdma-core on x86 installs), but those are generally documented on the Common Bugs wiki page for new releases (https://fedoraproject.org/wiki/Common_F34_bugs#Upgrade_issues). I do have an as-yet-unsolved issue with sddm, but that is specific to a single machine. Beyond that, I didn't encounter any major bugs, just the number of changed things that one expects from such an update. (Caveat: this is largely my experience from x86, as I don't run Fedora as a host on POWER.)

    I'm also kind of impressed by the number of extensions you're using to whip GNOME into shape. With the direction GNOME has been heading for the past several years, I'm surprised one still has these sorts of interface tweaks to eke out functionality from the DE.

    Replies
    1. Well, about half of the extensions are built in to Fedora (which still doesn't say much for GNOME, but at least *I* didn't install them, and Fedora keeps those up to date). And a couple I was able to jettison just by twiddling other settings. I really am thinking about switching to something else. GNOME is not a lot of fun right now.

  3. https://github.com/ebassi/graphene/pull/233 fixes gnome performance

    Replies
    1. Well, that's kind of a big omission. Nice find. I hope they accept the PR.

    2. (Still, it doesn't fully explain all the other regressions, including those seen on x86_64. But it's a good mitigation and puts us on a much more even footing.)

    3. shrug, currently running gnome 40 on void/ppc64le-musl and not experiencing any issues (i've already incorporated the fix downstream for us)

      using dual 4K screens so my setup is probably one of the heavier ones
