Monday, 31 March 2014

Live Kernel Patching Solutions

A large public cloud is quite a complex thing to manage, even with toolsets such as Openstack to deploy and manage it.  You often have many customers with compute instances on a single server so security is a high priority.  This also introduces an interesting problem, when you need to deploy a Linux Kernel security patch to the host (sometimes thousands of hosts) how do you do it with the least disruption?

One of the currently most accepted methods is to suspend the instances on a box, deploy the patch, reboot the host and resume the instances.  Whilst this works in many cases you have customers who have had their instances down for X minutes (even if it is planned and the customer is notified it can be inconvenient) and instance kernels that suddenly think "my clock just jumped, WTF just happened?".  This in-turn can cause problems for long running clients talking to servers as well as any time-sensitive applications.  Long running clients will often think they are still connected to the server when they really are not (there are ways around this with TCP Keepalive).

There are a couple of old solutions to this and a couple of new ones and as part of my work for HP's Advanced Technology Group I will be taking a deep dive into them in the coming weeks.  For now here is a quick summary of what is around:


This is probably the oldest technology on the list.  It doesn't quite fall under the "Live Kernel Patching" umbrella but is close enough that it warrants a mention.  It works by basically ejecting the current kernel and userspace and starting a new kernel.  It effectively reboots the machine without a POST and BIOS/EFI initialisation.  Today this only really shaves a few seconds off the boot time and can leave hardware in an inconsistent state.  Booting a machine using the BIOS/EFI sets the hardware up in an initialised state, with kexec the hardware could be in the middle of reading/writing data at the point the new kernel is loaded causing all sorts of issues.

Whilst this solution is very interesting, I personally would not recommend using it as during a mass deployment you are likely to see failures.  More information on kexec can be found on the Wikipedia entry.


Ksplice was really the first toolset to implement Live Kernel Patching.  It was created by several MIT students who subsequently spun-off a company from it which supplied patches on a subscription model.  In 2011 this company was acquired by Oracle and since then there have been no more official Open Source releases of the technology.  Github trees which are updated to work with current kernels still exist.

The toolset works by taking a kernel patch and converting it into a module which will apply the changes to functions to the kernel without requiring a reboot.  It also supports changes to the data structures with some additional developer code.  It does temporarily pause the kernel whilst the patch is being applied (a very quick process), but this is far better than rebooting and should mean that the instances do not need suspending.


Both Red Hat and SUSE realised that a more modern Open Source solution is needed for the problem, and whilst SUSE announce their solution (kGraft) first, Red Hat's kpatch solution was the first to show actual code to the solution.

Red Hat's kpatch solution gives you a toolset which creates a binary diff of a kernel object file before and after a patch has been applied.  It then turns this into a kernel module which can be applied to any machine with the same kernel (along with kpatch's loaded module).  Like Ksplice, it does need to pause the kernel whilst patching the functions.  It also as-yet does not support changes to data structures.

It is still very early days for this solution but development is has been progressing rapidly.  I believe the intention is to create a toolset that will take a unified diff file and turn that into a kpatch module automatically.

For more information on this solution take a look at their blog post and Github repository.


SUSE Labs announced kGraft earlier this year but only very recently produced code to show their solution.

From the documentation I've seen so far their solution appears to work in a similar way to Red Hat's but they have the unique feature which gives the ability for the patch to be applied to the kernel without pausing it.  Both the old and the replacement functions can exist at the same time, old executions will finish using the old function and new executions will use the new function.

This solution seems to have gone down the route of bundling their code on top of a Linux kernel git tree which means it took an entire night for me to download the git history.  I'm looking forward to digging through the code of this solution to see how it works.

The git tree for this can be found in the kgraft kernel repository (make sure to checkout the origin/kgraft branch after you have cloned it) and SUSE's site on the technology can be found here.


All three of the above solutions are very interesting.  Combined with a deployment technology such as Salt or Ansible it could mean the end to maintenance downtime for cloud compute instances.  As soon as I have done more research on the technologies I will be writing more details and hopefully even contributing where possible.

No comments:

Post a Comment