In a new white paper, Vendor Kernels, Bugs and Stability, the infrastructure software and Rocky Linux company CIQ presents a compelling argument that Linux vendor kernels are plagued with security vulnerabilities due to the flawed engineering processes that backport fixes.
While this may shock some, it's an open secret in the Linux community. As Greg Kroah-Hartman, Linux stable kernel maintainer and a prominent member of the kernel security team, recently said, to be secure, you should always use the latest long-term stable (LTS) kernel. The key word here is "latest." It's not enough to use an LTS kernel. You must use the most up-to-date release of it to be as secure as possible.
Unfortunately, almost no one does that. As Google Linux kernel engineer Kees Cook explained, "So what is a vendor to do? The answer is simple, if painful: Continuously update to the latest kernel release, either major or stable."
Why? As Kroah-Hartman explained, "Any bug has the potential of being a security issue at the kernel level."
Jonathan Corbet, Linux kernel developer and LWN editor-in-chief, agreed: "In the kernel, just about any bug, if you're clever enough, can be exploitable to compromise the system. The kernel is in a unique spot in the system ... it turns a lot of ordinary bugs into vulnerabilities."
CIQ engineers Ronnie Sahlberg, Jonathan Maple, and Jeremy Allison have put hard numbers behind this position. Their paper shows that, with current engineering practices, almost all vendor kernels are inherently insecure and that securing those kernels is impossible.
That's because Linux vendor kernels have been created by taking a snapshot of a specific Linux release and then backporting selected fixes as changes occur in the upstream git tree. This method, designed in an era when out-of-tree device drivers were prevalent, aims to enhance stability and security by selecting changes to backport. This paper examines how this works in practice by analyzing the change rate and bug count in Red Hat Enterprise Linux (RHEL) 8.8, kernel version 4.18.0-477.27.1, comparing it to upstream kernels from kernel.org.
Although the programmers examined RHEL 8.8 specifically, this is a general problem. They would have found the same results if they had examined SUSE, Ubuntu, or Debian Linux. Rolling-release Linux distros such as Arch, Gentoo, and openSUSE Tumbleweed constantly release the latest updates, but they're seldom used in businesses.
Their analysis of the RHEL 8.8 kernel reveals 111,750 individual commits in the change log. This data, while not detailing the content or size of the commits, provides a general understanding of the backporting process. Initially, there was a steady rate of backporting, but this decreased around November 2021 and again significantly in November 2022, corresponding with the release of RHEL 8.5 and RHEL 8.7, respectively. This pattern, the authors believe, reflects a shift toward more conservative backporting to enhance stability as the major release cycle progresses.
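To make the idea of a changing backport rate concrete, here is a minimal sketch that buckets commit dates by month. It is not CIQ's tooling. The function name `commits_per_month` and the sample dates are hypothetical, and it assumes the commit dates have already been extracted from a change log.

```python
from collections import Counter
from datetime import date


def commits_per_month(commit_dates):
    """Count commits per (year, month) bucket, sorted chronologically."""
    counts = Counter((d.year, d.month) for d in commit_dates)
    return dict(sorted(counts.items()))


# Hypothetical sample dates for illustration only; not taken from the RHEL change log.
sample_dates = [
    date(2021, 10, 3),
    date(2021, 10, 19),
    date(2021, 11, 2),
    date(2022, 11, 7),
]

monthly = commits_per_month(sample_dates)
for (year, month), n in monthly.items():
    print(f"{year}-{month:02d}: {n}")
```

Plotting these monthly counts over a release's lifetime is one simple way to see whether backporting slows down as the release ages.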
Their examination found 5,034 unfixed bugs in RHEL 8.6; 4,767 unfixed bugs in RHEL 8.7; and 4,594 unfixed bugs in RHEL 8.8.
These figures represent known bugs with upstream fixes that have not been backported to RHEL. The earlier cessation of backporting in RHEL 8.6 and 8.7 has led to more unfixed bugs compared to RHEL 8.8. Red Hat's practice of not publishing the complete source code changes adds complexity, resulting in possible false positives and negatives in the data CIQ had to work with. Despite these limitations, CIQ reports that manual checks suggest a high accuracy in identifying missing fixes.
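One way to think about "known bugs with upstream fixes that have not been backported" is as a set comparison. The sketch below is an assumption about how such a check could work, not a description of CIQ's actual method. It relies on the upstream kernel convention of a `Fixes:` tag in commit messages. The function name `missing_fixes`, the data layout, and the commit hashes are all hypothetical.

```python
import re

# Upstream kernel commits conventionally mark bug fixes with a line such as:
#   Fixes: 1234567890ab ("subsystem: subject of the buggy commit")
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{8,40})", re.MULTILINE)


def missing_fixes(upstream_commits, vendor_commits):
    """Return upstream fix commits that the vendor kernel lacks.

    upstream_commits: dict mapping full commit hash -> commit message.
    vendor_commits: set of upstream commit hashes known to be in the vendor kernel.
    A fix counts as missing when the commit it fixes is in the vendor kernel
    but the fix itself is not.
    """
    missing = []
    for sha, message in upstream_commits.items():
        if sha in vendor_commits:
            continue  # fix already backported
        for target in FIXES_RE.findall(message):
            # Fixes: tags use abbreviated hashes, so match by prefix.
            if any(v.startswith(target) for v in vendor_commits):
                missing.append(sha)
                break
    return sorted(missing)


# Hypothetical hashes and messages, for illustration only.
upstream = {
    "aaaaaaaaaaaa1111": "net: add feature X",
    "bbbbbbbbbbbb2222": 'net: fix leak in feature X\n\nFixes: aaaaaaaaaaaa ("net: add feature X")',
    "cccccccccccc3333": 'fs: fix crash\n\nFixes: dddddddddddd ("fs: rework")',
}
vendor = {"aaaaaaaaaaaa1111"}

result = missing_fixes(upstream, vendor)
print(result)
```

In this toy data only the second commit is reported, because the vendor kernel contains the commit it fixes. The third is skipped, since the buggy commit it targets was never in the vendor kernel. A real analysis would be less clean, for the reason noted above: without complete source changes from the vendor, matching is imperfect and produces false positives and negatives.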
Contrary to the assumption that bugs are quickly fixed upstream, many persist for extended periods before resolution. This delay impacts kernel quality, as the slowing backporting process results in an increasing number of known, unfixed bugs, which undermines kernel stability and security over time.
Since the Linux kernel developers took over managing Linux's Common Vulnerabilities and Exposures (CVEs), they have reported 270 new CVEs in March 2024 and 342 in April 2024. These have already been fixed in the stable Linux kernel git branch.
Still, the sheer numbers underscore the importance of using stable upstream kernels for enhanced security. The volume of new CVEs and the lack of an embargo period for fixes necessitate a proactive approach from organizations in evaluating and addressing these vulnerabilities.
Besides, although RHEL 8.8 hasn't been actively developed since late 2022, about 10% of all newly discovered bugs still affect it. RHEL 8.8's last major set of bug fixes came in May 2023. The same is true of other, older (but still supported) enterprise Linux distros. More troubling still, according to CIQ: "Some of the missing fixes we examined are explicitly disclosed as being exploitable from user space."
Therefore, the CIQ team concluded the traditional vendor kernel model, characterized by selective backporting, is flawed. The growing number of known, unfixed bugs suggests that vendor kernels are less secure than upstream stable kernels. The team advocates for a shift toward using stable kernel branches from kernel.org for better security and bug management.
According to the authors, "this creates a strong incentive" for security-conscious customers to adopt stable kernels over vendor-specific ones. They assert, "We believe that the only realistic way for a customer to know they run a kernel that is as secure as possible is to switch to a stable kernel branch."
This paper is not a critique of the dedicated Linux vendor kernel engineers. Instead, it's an invitation for the industry to rally behind kernel.org stable kernels as the optimal long-term solution. Such a shift would allow engineers to focus more on fixing customer-specific bugs and enhancing features rather than the labor-intensive backporting process.
Therefore, they have four critical conclusions:
So, will vendors do this? For all the good security reasons to move to upstream stable kernels, there are counter-arguments, which boil down to this: If you're always upgrading to the most recent kernel, you may also run into stability problems. A program that works just fine with the 4.18.0-477.27.1 kernel might not work with 4.18.0-477.27.1.el8_8. Of course, in that specific case, the newer kernel fixed an important security bug.
It all comes down to a delicate balancing act between security and stability. Some top Linux kernel developers and CIQ are coming down on the side of security. We'll see what the rest of the Linux vendor community has to say.