NetBSD, OpenBSD improve kernel security, randomly

The folks at NetBSD have released their first cut of code to implement kernel ASLR – Address Space Layout Randomisation – for 64-bit AMD processors. The KASLR release randomises where the NetBSD kernel loads in memory, giving the kernel the same security protections that ASLR gives applications. Randomising code's memory …

Jim Mitchell
Unhappy

Kernel ASLR does its thing at boot. From then on the kernel address/layout is static. How often do you reboot your BSD/Linux devices?

With this current implementation, one leak of a kernel address to the attacker and they can figure out the rest. "Minimum Viable Product" box checked, hope the improvements continue.

alain williams
Silver badge

it is a good start ...

Yes, there's more to do, but this follows the open-source philosophy: release early. They can get the next step done in the coming months, then release that. Eventually they'll have something that will please even you!

Charles 9
Silver badge

I don't know if you really can perform a live relocation of something as heavily used as the kernel. It's especially tricky to move live code while it's being executed (which, for the kernel, is practically all the time).

Nick Stallman

I think the point is that, typically, everyone's computer would put it in exactly the same location, making attacks against multiple computers trivial.

With ASLR, a buffer overrun or similar attack faces a different layout on each computer, so an attacker first has to find the target addresses, which makes it a lot harder.

It's not about having code jumping around constantly on a single PC.

foxyshadis

It's pretty trivial to live relocate as long as certain conditions are accounted for, as hinted in the article: Turn entry points into mere trampolines to the real code. When you're ready to cycle the code location, copy the code to the new location, rewrite the trampoline, and tear down the old code when you're sure no one is executing it anymore. Code's changed and no caller knows the difference, just like a stable API/ABI.

Charles 9
Silver badge

But it'll be the next logical step. As ASLR becomes commonplace, malware will take it slower, spying on systems to learn the critical locations before striking. The logical way to beat that is a moving target.

Charles 9
Silver badge

So noted, but let me phrase it another way: how do you lock a door that never has the opportunity to close? How do you relocate code that's constantly in use?

Hans 1
Silver badge
Happy

Unix != Windows ... UNIX does not have silly Windows file-locking issues, so, as another poster put it, you copy, then update the kernel interfaces to point to the new kernel, and finally dump the old one. You then do that every 10-15 minutes ... note, the size of the beast is ~1 MB, and you can compile your own to make it smaller ... you could move it every minute without noticeable lag ...

Fullmetal5

That's not how KASLR works in any implementation and for good reason.

Before the kernel begins executing, it's totally feasible to relocate it, but once the kernel is running it can't be moved without breaking every pointer it holds to itself.

Also think of the performance hit. Even if the kernel is only a couple of megabytes, with drivers added that's still a large amount of data to move every so often.

Also how often would you move the kernel around?

Even if you solve all of those problems, it doesn't help, because all an attacker has to do is use whatever address-leak exploit they were using in the first place later in the exploit chain, so the address is still correct when the exploit actually uses it.

KASLR is just supposed to make the address unpredictable without leaking it somehow, not to prevent leaks from permanently disclosing the kernel's location.

patrickstar

File locking has nothing to do with a running kernel. Kernel code is typically not pageable, or at least not much of it.

For the record, Windows has had kernel ASLR for quite some time now. I think since 8.

Relocating a running kernel, while probably technically possible, is tricky to say the least. And it would take quite some time and be a huge performance hit if done regularly, since at the very least you need to stop all CPUs. You can't let them keep executing in userland and get any meaningful work done, since any page fault or other interrupt can't be handled until the relocation is complete.

And this is just for moving the code - if you want to randomize data as well you're in for a lot of pain.

There wouldn't really be any security benefit over just randomising things at boot. Leaking kernel addresses and then exploiting a bug takes several orders of magnitude less time than the interval at which you could ever perform a total relocation of the kernel.

Martin an gof
Silver badge

Turn entry points into mere trampolines to the real code.

Forgive me for being thick, but hasn't that been "a thing" for decades? Even the BBC Micro did this (established "indirection" locations which could point to a routine anywhere in local, or even remote - e.g. second processor - memory), and I got the impression it was established practice back then.

Yes, yes, I know it's a distant, distant relative of the randomisation being talked about here, just brought back memories :-)

M.

patrickstar

It's well known that KASLR isn't a very effective defense, mainly because of the ubiquity of potential address leaks in kernels. However, having it is certainly a lot better than not having it.

handleoclast
Silver badge

Trampoline of fail

@foxyshadis

I don't see trampolines helping. What's the obvious point to attack? The trampoline.

So even if you trampoline the trampolines, all you've done is given an attacker the easy way in.

Hint: randomising where you leave a spare front door key may help, but not if you always leave a note under the door mat saying where you've hidden the key today.

Christian Berger
Silver badge

Well yes...

... but how often do you share in-RAM kernel images with other systems?

Besides, if the attacker guesses wrong, you'll get a reboot.

Anonymous Coward
Anonymous Coward

"a huge performance hit"

Moving a few megabytes of memory and updating some pointers? You do realise that modern CPUs run at multiple GHz? And move memory at tens of GB/second? Stop talking bollocks. Grab the kernel mutex, move everything, and 0.001 seconds later it's done; release it. You won't even notice.

patrickstar

There isn't a single "kernel mutex" in any modern SMP kernel. Linux was pretty late to the game in abolishing 'the big kernel lock' and that work started in the early 2000s.

More to the point - if you are in the middle of shuffling the kernel around, that includes the code that tests locks/mutexes itself...

You need to, at the very least, stop all running CPUs and disable all interrupts - interrupt/trap handlers and the code they depend on are, after all, part of the kernel and among the things that will be moved around.

The entire system is now frozen. Good luck if you were doing anything remotely real-time-like. "Why is my video/audio playback skipping at regular intervals?"

Then you have to actually perform the relocation, update page tables, descriptor tables/trap handlers, and any pointers to code. Such as the method/function pointers that are very common in pretty much all kernels except perhaps the very smallest. Plus various things I forgot.

(Sidenote: How would you even solve the latter problem? Compile-time annotations and a compiler plugin? Add a "relocation callback" to all drivers and anything else?)

Some time later, when all the things are done, you can reenable interrupts and restart CPUs again.

Note that nothing of this has anything to do with bulk memory copy speeds, since changing the address mapping of the kernel has nothing (or at least very little) to do with actually moving it around in memory. The Linux kernel is always located at pretty much the same places in physical memory.

The fact that you think it does makes me imagine that your last experience with OS kernels was before the era of MMUs and paging...

Memory access speed does come into play - but it's for updating individual pointers and offsets in memory.

And all this for accomplishing exactly nothing since the attacker will have leaked the relevant address(es) and exploited the kernel sometime between your regular system-wide nap-times. Not to mention that all of this only affects code addresses and not other kernel memory.

FIA

"a huge performance hit"

Moving a few megabytes of memory and updating some pointers? You do realise that modern CPUs run at multiple GHz? And move memory at tens of GB/second? Stop talking bollocks. Grab the kernel mutex, move everything, and 0.001 seconds later it's done; release it. You won't even notice.

If you're ever inclined, boot into VGA or a VESA mode on a modern machine and drag a few windows around. See how much you don't notice moving a few megabytes around. It's not as quick as you think.

Robert Grant

Re: it is a good start ...

Early? Wasn't this the thing OpenBSD had in 2003, or was that something else?

Kiwi
Silver badge
Holmes

It's pretty trivial to live relocate as long as certain conditions are accounted for, as hinted in the article

This reminds me of "live patching" to some degree, which must have some of the same code involved. Or is someone going to try and argue that the old code and the new code physically occupy exactly the same space? No? Then what is the mechanism to call the new code instead of the old?

patrickstar

Generally speaking live patching is done by putting the new code somewhere else entirely and inserting a jump to it at the start of the code to be replaced. So basically you insert a trampoline where one didn't exist before.

If you have some degree of cooperation from the code to be patched (eg hotpatching in Windows), this can be done even though others may be executing the code at the moment. However, if you're going to remove the old code, you need to prevent execution of it regardless (like stopping all CPUs and resuming their execution at a known location).

Kiwi
Silver badge
Flame

However, if you're going to remove the old code, you need to prevent execution of it regardless (like stopping all CPUs and resuming their execution at a known location).

Have taken a bit more of a look into it, and the CPU doesn't get stopped, just pointers changed (simplified version). Looking at kgraft only for this, I haven't delved much deeper or into any other such products.

From https://www.linux.com/news/suse-enterprise-linux-live-kernel-patching-open-source-kgraft

"...Imagine, if you will, applying a kernel patch to your production servers in the middle of the day, during peak transaction periods, and not … missing … a beat...."

and

The gist of kGraft is this:

  • kGraft locates differences between the running kernel and the patch.
  • It creates replacement functions based on those differences.
  • It loads and links the patched functions.
  • It redirects code execution to patched functions.

and also

"...Businesses with massive server deployments that demand 24/7/365 uptime are ripe for Live Kernel Patching. As well, this technology is perfectly suited for big data. Why? When you’re looking at terabytes of in-memory data that will take hours to reload on reboot ─ you need Live Kernel Patching to ensure those security patches (patches that can range from a mere two lines of code to thousands) can be loaded without having to give the dreadful command to “shut it down”. This can be a real game-changer when the bureaucratic red tape of rebooting can delay the process days, weeks, and even months (or send the CEO, COO, and CFO into fits of apoplectic shock)..."

From https://www.linuxfoundation.org/blog/suse-labs-director-talks-live-kernel-patching-with-kgraft/

First, a patch module that contains all the new functions and some initialization code that registers with the kGraft code in kernel is loaded. Since it contains the new functions as regular code, the kernel module loader links them to any functions they may be calling inside the kernel.

Then, kGraft uses an ftrace-like approach to replace existing functions with their fixed instances by inserting a long jump at the beginning of each function that needs to be replaced. Ftrace uses a clever method based on inserting a breakpoint opcode (INT3) into the patched code first, only then replacing the rest of the bytes by the jump address and removing the breakpoint and replacing it with long jump opcode. Inter-processor non-maskable interrupts are used throughout the process to flush speculative decoding queues of other CPUs in the system. This allows switching to the new function without ever stopping the kernel, not even for a very short moment. The interruptions by IPI NMIs can be measured in microseconds.

So doesn't seem to stop the CPUs at all, not in the way you(?) were mentioning earlier where video playback would have noticeable pauses.

Admittedly, patching is not the same as moving portions of the kernel every x minutes, but from these texts the same processes could be applied to shifting kernel code to a new location without noticeable impact on performance. Hell, my 6+ year old machine runs a couple of VMs and still has more than enough grunt to have games running in the background while I watch a movie via Kodi. If moving thousands of lines' worth of code is "measured in microseconds", then the impact of moving significant chunks on a fairly regular basis would be unnoticeable by most (at least until one of those glitches occurs where a pointer isn't updated correctly - but if kGraft is as good as advertised and acceptable to its target markets, I seriously doubt that would happen often).

I understand the issue of locations being "leaked" - after all if you don't know where to find a bit of code it's a little hard to call it when needed, but this could help mitigate some attacks in that the attacker would always have to be keeping track of where stuff is moving to. Maybe not that big a hurdle but one that could have an impact, further making buffer over runs harder to work (just because you found the location a minute ago doesn't mean it'll be there, though there is a good chance it still will be unless you're changing locations every couple of seconds which might be a tad excessive!)

[El Reg please for fucks sake GET RID OF THAT FUCKING CLODFOOL STUPIDITY! Fuck that shit is so fucking annoying and useless! IT takes longer to get through that fucking mess of stupidity than it does to write a post like this, even with a 5minute full-foam rant at the end! You're a tech site, you should be better than this shit!]

--> Icon Please oh please let me meet the people behind clodfool in a dark alley!

patrickstar

kgraft and similar schemes don't actually delete the old code. To do that, you will need to stop CPUs to make sure they aren't in the middle of executing the code you are deleting. Just like I described.

The scheme it's describing (INT3 yadda yadda) is so it can insert the jumps into code that's not been specifically prepared for it. Just like I described...

(Hmm - noticing a pattern here. Might have something to do with the fact that I have actually implemented a lot of this in other contexts...)

There's a bit of discussion regarding hotpatching in Windows here and how they solved that same issue, in that case by having the code be prepared for it beforehand: https://blogs.msdn.microsoft.com/oldnewthing/20110921-00/?p=9583/

kgraft and similar schemes are making far smaller changes than moving the entire damn kernel would be. Even if it would be nuking old code etc, it's making tiny changes to individual functions, which typically are not interrupt handlers or any of the other tricky cases. It's simply not comparable.

If you're gonna MOVE the kernel code, you need to have all CPUs stop at a safe point.

This would typically mean that you have to stop them one by one as they reach it, not just suddenly stop them all at once.

You'll either have to re-route IRQs from the CPUs one by one as they are being stopped or live with a potentially large random delay in serving IRQs from the CPUs stopped before the others.

Then once they're stopped, or sit spinning at some code you won't be removing, with interrupts disabled, you have a lot of work to do.

For one, you need to locate all function pointers in memory and change them. How would you even do that? You'd need some sort of managed memory scheme to know what's a relevant pointer and what isn't, similar to a mark/sweep GC with exact tracing. At the very least, you need full cooperation from ... everything in the kernel. Including all drivers. Good luck with that.

If you want to move the static data as well, you're in for exponentially more fun. You might as well stick a compacting mark/sweep GC in the kernel and be done with it.

How long would all of this take? The answer is most likely "too long" in a lot of cases. At the very least it's utterly unpredictable beforehand.

And for all this you will have accomplished... nothing. Maybe you have prevented some cache/paging side channel attack or something that takes longer to execute than the interval between two "total kernel re-mixes", but that's not what actual kernel exploits use in the wild.

Not to mention that none of this exercise helps you move anything else than the kernel code and possibly static data. You have just moved the kernel code, which is good for stopping things like ROP (eg to disable SMEP), but there are lots of other attack vectors you haven't affected at all.

So to sum up:

How much work would be needed to do it? A lot.

Would the resulting impact on performance, responsiveness and timing cause a problem for anyone? Yes, for a lot of users, and the exact impact would be dependent on the specific combination of kernel, drivers, workload, etc.

How much would it improve security? Very close to nothing.

Kiwi
Silver badge

kgraft and similar schemes don't actually delete the old code. To do that, you will need to stop CPUs to make sure they aren't in the middle of executing the code you are deleting. Just like I described.

So what you're saying is all these systems that've been running for perhaps years with quite a few kernel patches - they all still have the old code from v1.0 sitting there somewhere, never deleted from memory? That seems at odds with the text.

(Hmm - noticing a pattern here. Might have something to do with the fact that I have actually implemented a lot of this in other contexts...)

I'll try to take your word for it - but what you describe and what the authors describe seems to be at odds :)

kgraft and similar schemes are making far smaller changes than moving the entire damn kernel would be.

But you're the only person talking about moving the entire kernel. And the texts do reference moving "thousands of lines" of code as well. It may not always be a huge amount, but the principles are still the same.

If you have the kernel in an area of RAM, and you patch a module so that it becomes, say, 1 byte longer than the original (or a kilobyte, or a terabyte - use whatever number gets you to see the point), then you either need to move everything after that point, or move the module to a larger block of RAM.

If you can move a module to another area of RAM to accommodate an increase in its size in a running kernel without noticeable performance impact, then you can move a module for any other reason without noticeable performance impact. If not, can you explain how moving the module for patching is safe and done in "microseconds" without stopping processing, but moving the same module for other reasons requires the stuff you describe? What is the difference other than the reason for the move?

If you're gonna MOVE the kernel code, you need to have all CPUs stop at a safe point.

This would typically mean that you have to stop them one by one as they reach it, not just suddenly stop them all at once.

Again, the texts from others say otherwise. These are people I know for a fact have implemented it.

With updating files in Linux, the system works by leaving the old data in place until it is no longer being used - new instances of the program(etc) use the new code, old instances use the old code. Why would this need to be any different?

How long would all of this take? The answer is most likely "too long" in a lot of cases. At the very least it's utterly unpredictable beforehand.

"Microseconds", according the texts I've read (we're talking moving portions of the kernel, not the whole lot)

So to sum up:

Would the resulting impact on performance, responsiveness and timing cause a problem for anyone? Yes, for a lot of users, and the exact impact would be dependent on the specific combination of kernel, drivers, workload, etc.

"Microseconds", and done during peak times with no noticeable impact - according to the people who implement the live patching.

How much would it improve security? Very close to nothing.

That, at least, we agree on - while I can see that the processes are in place to allow random relocation of parts of the kernel (and the article is about randomising the entire kernel layout at boot, after all, so the kernel isn't treated as one big block of code that can't be broken up, but as a number of smaller blocks that don't have to reside in any specific place in memory), I can also see that a) there are more exploits than just the few buffer overruns this may stop, and b) there have to be pointers to the code and data, and these must be known and thus discoverable (of course, isolating data and code would be helpful, but there must be reasons why that's not as common as we would like).

So while it would be an interesting idea and mitigate against a few attacks, it probably does add load with little gain.

patrickstar

kgraft is essentially based on the principles I've described, with some neat hacks on top. It does, indeed, work on a function-by-function basis, by inserting a trampoline at the start of the original.

On top of this it has a mechanism that ensures everyone 'sees' (well, executes) a consistent view, in case the changed behavior of one function is dependent on the changed behavior of another.

It doesn't actually replace functions "in-place" or such, for the obvious reasons, including what you mentioned (if it's bigger than the original it'd have to move everything after it, which can't be done for any function there might be function pointers to or any other external reference, like interrupt handlers).

See eg. http://events.linuxfoundation.org/sites/events/files/slides/kGraft.pdf (slide 11)

"-Callers are never patched

-Rather, callee's NOP is replaced by a JMP to the new function

- So a JMP remains forever

- But this takes care of function pointers, including in structures"

So you have the original around, but I think it does free up old patched versions if they are replaced in another patch applied after.

You could presumably remove (well, zero out) the code after the trampoline eventually.

However, if you're going to do this for the entire kernel, after one full "move" you are in exactly the scenario which was discussed here earlier - all function calls just go to a trampoline that calls the actual function. With exactly the same issues - just done a lot more inefficiently.

TL;DR kgraft doesn't actually MOVE code, which is what we're discussing here. It diverts execution. It's meant for small patches, not shuffling the entire kernel around.

Kiwi
Silver badge
Pint

kgraft is essentially based on the principles I've described, with some neat hacks on top.

Thanks for that. I'm much closer to understanding the argument now :)

patrickstar

The really important question that remains after reading the kgraft presentation is, of course, if "getties" really is the plural of "getty"?

(PS. Note that the presentation somewhat ambiguously refers to the 'World view' checking code as a 'trampoline'. This can indeed safely be removed once everything is done. The JMP at the start of the original function will remain, however. Also note that if they had more cooperation from the compiler, like the Windows hotpatch scheme, instead of ftrace piggybacking on the GCC profiling code, they could presumably do this without any delay whatsoever on the other CPUs, but I suspect this was evaluated and rejected for various reasons.)

jms222
Bronze badge

You don't move the kernel in physical memory because that would be silly. You just fiddle with the MMU (as you do when a new process is created anyway) to map it somewhere else. But agreed there must still be some overhead. It is also unclear whether this happens per process, per boot or with the weather. But then I'm writing without reading up on it properly like many people.

Aodhhan
Bronze badge

For all of those who don't get it...

Live relocation; copy/update kernel; trampolines... doesn't it make you want to shake your head?

It would actually be easier and more efficient (not to mention less buggy) to halt input, complete processing (yeah, this could take a bit of time, so think about it), clear cached inputs, archive data and reboot.

Now, if you think this is ridiculous, then think about what you're proposing: work out some 'random' locations and toss these into memory, pause input, halt processing, halt services, change memory locations, update pointers, then start everything back up; oh, every 15 minutes or 4 times a day (makes no difference). BTW, think about how this 'randomising, updating, restarting' routine has to work while everything else is in limbo.

If you think rebooting is inefficient and will take time, think about a system which is likely running more than one application along with an underlying OS to go along with your silly scheme.

YARR

Stating the obvious?

If the memory controller had a flag to make memory blocks read-only until they are freed up, then the kernel code would be immune to buffer overruns. Only the memory containing the kernel state (stack / heap) might need to be dynamically relocated.

patrickstar

Re: Stating the obvious?

Uhm, the kernel code itself is read-only. Kernel overflows target the stack and heap, which have to be read/write for obvious reasons.

Christian Berger
Silver badge

It doesn't matter that it doesn't relocate in RAM while running

Relocating it once per boot is enough. You essentially hide a 1 megabyte kernel in 4 GiB of space... or 16 EiB if you're on a 64-bit platform. Guessing the right address gives you a 1:4096 or 1:17592186044416 chance of successfully hitting anything inside the kernel. (I may be off by a factor of 2.)

And what happens if you guess wrong? Your kernel will have a page fault and cleanly terminate, resulting in a reboot and a new kernel layout.

BTW if you have guessed one address of the kernel directly, you still haven't won very much, you still need to guess what part of the kernel you've just found, and where the parts you want are.

patrickstar

Re: It doesn't matter that it doesn't relocate in RAM while running

Nitpicks:

First of all, the actual ASLR entropy is much lower than that in any sane implementation, for various reasons. Usually it's a couple of address bits.

Second of all, often you can figure out the ASLR base from a single leaked pointer. Then you have effectively defeated ASLR.

See https://grsecurity.net/~spender/exploits/wait_for_kaslr_to_be_effective.c for an example of how this is done against Linux.

This is one of the reasons why you should always build a custom kernel when security matters, and protect the build tree and kernel image itself from potential intruders. That way just leaking the kernel base is not enough - you still have no idea of exactly where specific code and data lives inside it.

Anonymous Coward
Anonymous Coward

Re: It doesn't matter that it doesn't relocate in RAM while running

But then how does Userland interact with the kernel without some way in, which an exploit can just as easily usurp? Customizing the kernel may help randomize jump points within the kernel, but you still need common interface points, unless you require all userland programs to be compiled against your custom kernel as well (which may not be possible if some of your userland programs are not open-source).

patrickstar

Re: It doesn't matter that it doesn't relocate in RAM while running

The 'common interface point' is the syscall interface. This doesn't have to reveal anything about the underlying memory layout, any kernel addresses, etc. In fact, when it does, it's considered a security issue and fixed.

See my earlier posting giving an example of a kernel address leak via a syscall. This turned into https://nvd.nist.gov/vuln/detail/CVE-2017-14954

Syscalls on x86-64 are typically done via the 'syscall' instruction (or the classic way of using a software interrupt, e.g. int 0x80 on Linux and int 0x2e on Windows). This does not, in itself, reveal any information useful to an attacker. Userland code just invokes the magic instruction, and some time later execution resumes, with a register changed to hold the return value/error code. That's it.
