
I have noticed strange behaviour on some RHEL 9 servers: GRUB does not boot the latest kernel, despite the fact that

  1. it is supposed to do so automatically whenever a new kernel is installed, and

  2. grubby --default-kernel reports that the default kernel is the latest one.

In this example, the server automatically boots kernel 5.14.0-362, although 5.14.0-427 and 5.14.0-503 are also installed.

[myrhel9host root ~]# uname -a
Linux myrhel9host 5.14.0-362.13.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 24 01:57:57 EST 2023 x86_64 x86_64 x86_64 GNU/Linux
[myrhel9host root ~]# grubby --default-kernel
/boot/vmlinuz-5.14.0-503.34.1.el9_5.x86_64

This only happens on some servers, although they are all configured identically and have the same updates installed.
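To spot affected servers quickly, the comparison above can be scripted. This is only a minimal sketch: the function names kernel_release_from_path and compare_kernels are made up for illustration, and it assumes grubby --default-kernel prints a path of the form /boot/vmlinuz-<release>, as in the output shown earlier.

```shell
#!/bin/sh
# Sketch: report whether the running kernel matches the default boot entry.
# Function names are hypothetical helpers, not part of grubby or GRUB.

# Strip everything up to and including "vmlinuz-" from a kernel image
# path, leaving a release string in the same form as `uname -r`.
kernel_release_from_path() {
    path="$1"
    printf '%s\n' "${path##*/vmlinuz-}"
}

# Compare two release strings and print a one-line verdict.
# Returns 0 when they match, 1 otherwise.
compare_kernels() {
    running="$1"
    default="$2"
    if [ "$running" = "$default" ]; then
        echo "OK: running default kernel $running"
    else
        echo "MISMATCH: running $running, default is $default"
        return 1
    fi
}

# Only query the live system when grubby is available.
if command -v grubby >/dev/null 2>&1; then
    compare_kernels "$(uname -r)" \
        "$(kernel_release_from_path "$(grubby --default-kernel)")" || true
fi
```

On the server shown above this would print a MISMATCH line, since uname reports 5.14.0-362.13.1.el9_3.x86_64 while grubby reports the 5.14.0-503.34.1.el9_5.x86_64 image.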

As a workaround, I have:

  1. uninstalled the two oldest kernels (kernel-core packages)
  2. deleted their GRUB entries in /boot/loader/entries/
  3. rebuilt the GRUB configuration with grub2-mkconfig -o /boot/grub2/grub.cfg
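Before deleting anything, it may help to see which kernel image each boot entry in /boot/loader/entries/ points to. The following is a read-only sketch, not part of the workaround itself. The function name list_bls_kernels is hypothetical; it assumes each entry file carries a "linux" line naming the kernel image (as Boot Loader Specification entries do) and that sort supports -V (GNU coreutils).

```shell
#!/bin/sh
# Sketch: list the kernel image referenced by each boot entry, oldest first.
# list_bls_kernels is a hypothetical helper name.
# $1: entries directory (defaults to /boot/loader/entries)
list_bls_kernels() {
    dir="${1:-/boot/loader/entries}"
    for entry in "$dir"/*.conf; do
        # Skip the unexpanded glob when the directory has no entries.
        [ -e "$entry" ] || continue
        # The "linux" key holds the kernel image path for the entry.
        awk '$1 == "linux" { print $2 }' "$entry"
    done | sort -V
}
```

Running list_bls_kernels with no argument reads the live entries; the last line printed should be the newest kernel.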

However, the problem is going to reappear at the next kernel update, so I'd like to find the reason for this bug and - most importantly - a permanent solution for what seems to be a recurrent problem on RHEL 7/8/9 and Fedora servers (see below).


Here is an example of this problem; however, I've checked and on my systems /etc/default/grub has the correct option GRUB_DEFAULT=saved.
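With GRUB_DEFAULT=saved, GRUB takes the entry to boot from the saved_entry variable in its environment block, so that value is worth inspecting as well. A small sketch, assuming grub2-editenv list prints key=value lines that include saved_entry; the helper name saved_entry_from_env is made up.

```shell
#!/bin/sh
# Sketch: extract saved_entry from `grub2-editenv list` output on stdin.
# saved_entry_from_env is a hypothetical helper name.
saved_entry_from_env() {
    sed -n 's/^saved_entry=//p'
}

# Only query the live system when grub2-editenv is available.
if command -v grub2-editenv >/dev/null 2>&1; then
    grub2-editenv list | saved_entry_from_env
fi
```

If the printed entry names an older kernel than the one grubby reports, the two tools may be reading different copies of the environment block.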


This solution from the Red Hat KB says that it might be due to /etc/sysconfig/kernel containing the entry

DEFAULTKERNEL=kernel-uek 

instead of (for RHEL 7 machines)

DEFAULTKERNEL=kernel 

or (as seen on my RHEL 9 machines)

DEFAULTKERNEL=kernel-core 

However, this is not the case on my servers. Also, we have vanilla RHEL (not Oracle Linux) installations, so it's unlikely that this setting is going to point to the Unbreakable Enterprise Kernel.
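Checking this value across many servers can be scripted. A minimal sketch follows; read_setting is a hypothetical helper, and it assumes the file uses plain KEY=value lines, optionally quoted.

```shell
#!/bin/sh
# Sketch: print the value of KEY from a KEY=value style config file.
# read_setting is a hypothetical helper name.
# $1: key name, $2: file path
read_setting() {
    key="$1"
    file="$2"
    # Keep uncommented KEY=... lines, take the last one (later lines win
    # when the file is sourced), and strip optional quotes.
    sed -n "s/^[[:space:]]*${key}=//p" "$file" | tail -n 1 | tr -d "\"'"
}
```

For example, read_setting DEFAULTKERNEL /etc/sysconfig/kernel should print kernel-core on a stock RHEL 9 machine.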


This Reddit post (thanks to @Chester_Gillon) points to the release notes for RHEL 9.3, which mention new bootloader behaviour that could be the cause of this problem.

To disable the new behaviour, the procedure is to change this line in /etc/default/grub

GRUB_ENABLE_BLSCFG=true 

to

GRUB_ENABLE_BLSCFG=false 

On all servers the option is set to GRUB_ENABLE_BLSCFG=true. However, as said, only some of them do not boot to the latest kernel.

In any case, I've modified the setting GRUB_ENABLE_BLSCFG=true to GRUB_ENABLE_BLSCFG=false on a machine, then installed the new kernel. This did not solve the problem; the server still booted on an old kernel.


Here is the same problem on Fedora, where it was caused by a missing /etc/sysconfig/kernel file. However, the file exists on my systems.

  • Is this a regression? Commented Jun 4 at 14:59
  • @SirMuffington What do you mean? Commented Jun 4 at 15:22
  • Out of curiosity, what happens if you set GRUB_DEFAULT=0? That used to tell GRUB to boot the first (newest) entry. Commented Jun 6 at 7:21
  • @tukan The first entry is not the newest kernel. GRUB effectively boots to its first entry. Commented Jun 6 at 7:25
  • I see. Is there a reason why the newest kernel is not the first entry? Commented Jun 6 at 7:26

1 Answer


It turns out that a change in RHEL 9.3 was related to UEFI, and /boot/efi was not mounted on our servers. This was probably the cause of the problem (files in that partition were several years old).

Running these commands solved the issue:

mount /dev/sda1 /boot/efi
dnf reinstall grub2-efi-x64 grub2-common shim-x64
grub2-editenv - unset menu_auto_hide
init 6

The relevant entry in /etc/fstab is:

/dev/sda1 /boot/efi vfat defaults 0 1 

After that, the system booted up correctly to the latest kernel.
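Since an unmounted /boot/efi gives no obvious error, a check like the following could be run on the other servers. It is a sketch only: is_mounted is a hypothetical helper that looks for the mount point in /proc/mounts (or in another mount table passed as the second argument).

```shell
#!/bin/sh
# Sketch: check whether a path is currently a mount point.
# is_mounted is a hypothetical helper name.
# $1: mount point, $2: mount table file (defaults to /proc/mounts)
is_mounted() {
    mount_point="$1"
    table="${2:-/proc/mounts}"
    # Field 2 of each mount table line is the mount point.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$table"
}

if is_mounted /boot/efi; then
    echo "/boot/efi is mounted"
else
    echo "/boot/efi is NOT mounted" >&2
fi
```

This only matters on machines that boot via UEFI; on BIOS systems /boot/efi is normally absent.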
