Clever attack exploits fully-patched Linux kernel

A recently published attack exploiting newer versions of the Linux kernel is getting plenty of notice because it works even when security enhancements are running and the bug is virtually impossible to detect in source code reviews. The exploit code was released Friday by Brad Spengler of grsecurity, a developer of …

COMMENTS

This topic is closed for new posts.
  1. LaeMi Qian
    Linux

    It does sound rather...

    ...more like a compiler issue to me from what was said in the article: Optimising-out bounds- and/or null-checking code!!!

    Not impressed with the kernel devs' responses anyway.

  2. spendergrsec

    Bigger issue is the SELinux vulnerability

    Really the bigger issue here is the SELinux vulnerability, as that does exist on all current distributions using SELinux out there right now, and that particular vulnerability likely goes back several years. No vendor yet has mentioned how long exactly the systems have been vulnerable, but both Fedora 10 and 11 are known to be vulnerable. The vulnerability allows anyone to exploit the large class of null pointer dereference bugs in the kernel, which would not be possible with a regular kernel.

    -Brad

  3. Anonymous Coward
    FAIL

    @LaeMi Qian

    That was my first thought too

    Couldn't they use the same source code with... you know... a GOOD compiler that actually compiles all of the code?

  4. Tzvetan Mikov

    Wrong explanation!

    Come on, El Reg, I really expect better from you.

    The following: "Although the code correctly checks to make sure the tun variable doesn't point to NULL, the compiler removes the lines responsible for that inspection during optimization routines." is completely false.

    The bug is real, but it is a very simple one: the code uses the pointer before checking it for NULL. No, the code does _NOT_ correctly check for NULL, and that is what causes the problem. This has been blown way out of proportion.

    For those who know C, this is the relevant code:

    struct sock *sk = tun->sk;

    ...

    if (!tun) return POLLERR;

    The bug is in the 1st line - it uses tun before checking it for NULL. The check is a few lines below. A very simple bug that happens to the best of us.

    Now the exploit is extremely clever, but the bug itself is trivial.
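
    The fix is as simple as the bug; a minimal sketch of the corrected ordering (hypothetical, not necessarily the actual patch):

    struct sock *sk;

    if (!tun) return POLLERR;

    sk = tun->sk; /* safe: tun was checked first */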

  5. spendergrsec

    re: gcc

    It's not the fault of GCC that the kernel developers failed to use the proper optimization flags when building the kernel. There is a specific gcc flag, "-fno-delete-null-pointer-checks", that keeps bugs with this pattern from turning exploitable like mine did. This flag will be added in the next stable version of the kernel.

    -Brad

  6. Rick Giles
    Linux

    I can't believe...

    That any of this is an issue considering the sheer number of competent coders that are out there. This is the kind of stuff that I expect from M$.

  7. Robert Heffernan
    Grenade

    Security First

    I want to know whose idea it was to have the compiler remove NULL pointer checks by default. If you are testing for NULL you are doing it for a reason!

    The way I see it, this is not a failure of the kernel team for not specifying the -fno-delete-null-pointer-checks compiler flag; it is a failure of the gcc team for having the compiler do away with such checks by default!

    I, for one, would prefer a kernel that spends a few extra cycles testing for bad parameters to getting my system reamed by some pimply script kiddie in China!

  8. spendergrsec

    Mikov is wrong, explanation is correct

    Apparently everyone else gets it but Mikov (who posted a similar response on lwn.net). The fact that from a source review the bug looks unexploitable, and yet I have exploited it, is what makes this 'clever', as every other security expert (and Linus himself) has agreed.

    I'm sorry you don't seem to get it, but you don't make yourself look smarter by spamming your response on every site mentioning this vulnerability.

    Oh and for reference, Red Hat has marked the SELinux vulnerability I disclosed as "High Severity":

    https://bugzilla.redhat.com/show_bug.cgi?id=511143

    -Brad

  9. jake Silver badge

    @Rick Giles

    All major chunks of code have bugs. 'tis the nature of the beast.

    Unfortunately, it's human nature to point fingers, thus this minor brouhaha.

    It'll get fixed, the world will continue spinning, and hopefully people will learn something.

  10. Chris Gray 1
    Boffin

    gcc flag

    Well, this interested me, so I wanted to check. Here is what "info" says about that gcc flag:

    -fdelete-null-pointer-checks

    Use global dataflow analysis to identify and eliminate useless checks for null pointers. The compiler assumes that dereferencing a null pointer would have halted the program. If a pointer is checked after it has already been dereferenced, it cannot be null.

    In some environments, this assumption is not true, and programs can safely dereference null pointers. Use -fno-delete-null-pointer-checks to disable this optimization for programs which depend on that behavior.

    Enabled at levels -O2, -O3, -Os.

    I don't know the kernel environment, so I don't know what happens on a NULL pointer dereference there. But, with typical user code, what gcc is doing is reasonable, if a bit extreme.

    The bug is in the kernel code, where the check is *after* the dereference. Even if the author knows that that works in the kernel environment, I think it is still a bad idea because it is quite non-obvious. If performance is that critical, then add a comment explaining what is going on. Adding the gcc flag to the kernel compile flags will help.
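
    For anyone who wants to see the effect, here's a toy sketch (a hypothetical file, nothing to do with the kernel source):

    int f(int *p)
    {
        int v = *p;  /* dereference first... */
        if (!p)      /* ...so gcc may delete this check at -O2 */
            return -1;
        return v;
    }

    Compile with "gcc -O2 -S" and the test should vanish from the generated assembly; add -fno-delete-null-pointer-checks and it should stay.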

    All IMHO of course - I'm not a kernel developer.

  11. Anonymous Coward
    Anonymous Coward

    Completely avoidable problem

    This kind of defect can be found automatically by static analysis tools.

    Coverity Prevent, Klocwork and Microsoft Visual Studio should all be able to flag simple mistakes like this one.

  12. Alex 3
    Gates Horns

    Microsoft

    Funny, normally if it's a Windows vulnerability you have a trillion Linux heads jumping up and down in the comments forum. When it happens with Linux I notice nobody from the MS camp can be bothered :).

  13. BlueGreen

    @Tzvetan Mikov

    I'm about to demonstrate that I'm not an expert, but why don't non-kernel processes have their (virtual) first page/segment, into which any null (as zero[*]) would point, by default removed from the process' address space? The hardware would then catch it and hand it to the kernel on a plate.

    And perhaps get GCC to report check-after-use constructs like this which are clearly wrong.

    [*] null != zero in the C spec, but on any current machine I'd expect it to be.

  14. ElReg!comments!Pierre
    FAIL

    So, no danger whatsoever then?

    Let me check: this is a potential *compiling* problem, so the kernel code is sound. The compiler is OK, too. It's just a matter of passing the right options at compilation time. Hardly a Linux problem then. More like a *potential* vendor problem...

    Good to flag, so that self-compiling guys don't get caught with their pants down, but hardly the end of the world. Especially as, from what I gathered, any exploit would need to run with the setuid bit set, which, let's be honest, is not bloody likely to happen in any standard distribution, let alone hardened ones. Dubious setuid programs are likely to be prevented from running in the first place. It looks suspiciously like an "Oh my dog, if I run exploit code as root my system might be vulnerable!". Wake up people: regardless of the OS, if you run exploit code as root you're screwed. And any attack that needs admin privilege to be efficient is a non-attack to begin with. If I get admin access to your system, I am totally not going to try and exploit an obscure vuln in the kernel. There are much easier and more interesting things to be done. I side with Linus on this one. Any program running with the setuid bit set *is* a potential hazard and should be carefully reviewed; that's why it's considered bad practice, and that's why it's forbidden (or triggers massive warnings) in most serious distros. Now if your sysadmin is willing to make his system wide open, it's hardly an OS problem, is it?

    It's still a clever attack, one which might spread using social engineering to root Ubuntu n00bz. Oh, except that Ubuntu doesn't seem to be vulnerable (yet).

    Just one more thing: hardening a system doesn't mean running SELinux. It means (amongst other things) that only trusted code is allowed to run, so this attack code is never going to be allowed to run in the first place. In this regard, the article is misleading: hardened systems are completely, absolutely, positively, 100% safe.

    For one second I thought some of my systems could be vulnerable; I'll just relax and have a pint or ten now...

  15. Francis Vaughan

    Language subtleties

    Stepping back for a moment there seem to be a few lingering issues that are not really fully resolved.

    A null pointer dereference exception is a nice debugging aid. But in code that is supposed to be secure it can never be trusted. It depends upon the system protecting an area of memory at address zero. Typically a page. Clearly it may fail to trap the dereference if the data structure referenced by the pointer is larger than a page. This is not exactly a common thing, but it isn't impossible. Array indexing through a pointer with large array indexes might also come under this failure, and is a much more common thing.

    The point is that secure code can never rely upon null pointers being trapped. Code must always check. Always. The corollary is clearly that optimising out null pointer checks is always incorrect in secure code. Always.

    It seems that someone forgot that in kernel space you don't have the possibility of protecting memory like this, and that, a priori, this compiler optimisation is therefore invalid. Always, for all code.

    This is sort of worrying really. One would have thought that by now the kernel writers would have spent the time to work with gcc and identify all the optimisations and clearly understand which ones are inconsistent with the peculiar constraints of either secure or kernel mode code.

    One might hope that no SUID program anywhere in the OS is compiled with this optimisation. It isn't just the kernel, it is the entire OS build process potentially at risk. So this is an issue that reaches to each and every distribution packager.

    Indeed, it might be nice if the gcc developers took a moment out to provide a list of known good optimisations that do not rely on assumptions about memory layout, exception behaviour and likely, but not guaranteed, code structure. Maybe adding a --secure-optimisations-only flag would be a good thing. Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough.

  16. joe 14

    @Alex 3

    Microsoft Rules! Is that better?

    I think i'm going to wash my mouth out with soap now ;P

    LOL, your assumption is that MS people understand what these VERY intelligent people are talking about. I sort'a do and i'm just a mac-moron :p That might be a slight; let the war begin!

    ROTFL!!!

  17. spendergrsec

    The joke's on you Michael

    The reason that quote was included in my exploit was because of the incredible incorrectness of it, as I was indeed exploiting the kernel in every case, and in the case where SELinux was enabled, there was no setuid binary necessary at all. So Linus' analysis at the time was completely off. Linus is no security expert -- I don't understand why you Linux zealots prop him up as one. If you really want to know what Linus thinks about my exploit, why don't you ask him about it now that he's (presumably) actually seen it? I know what he's said about it in private, and he is most certainly not calling it "trivial bollocks." So with him as your idol, do you now also agree it's not "trivial bollocks" or do you have any critical thinking of your own? You ignore the response of every other legitimate security researcher and point to a quote from Linus in reference to a video of the exploit I posted last week, which was included in the exploit precisely because it was so horribly and hilariously wrong.

    "Trivial bollocks" that is currently unfixed and rated by Red Hat as "High Severity."

    It's exactly this "let's fix the bug, patch the software and get on with it" that perpetuates the cycle of "fix the bug, patch the software and get on with it." That's such a 1992AD security mentality, which the rest of the world has moved past, while the Linux upstream still lives in the security stone-age.

    -Brad

  18. Anonymous Coward
    Anonymous Coward

    @joe 14

    I develop code to run on MS platforms, and I know what most of you 'Linux people' are talking about. But the fact is that all software contains bugs and vulnerabilities. Jumping up and down and crying out about who's better than whom is just childish bullshit. Maybe 'we' realize that and 'you' don't? Personally I don't care which platform people choose. Right tool for the right job, I say.

    I mean really. When there is a bug on Windows, it is better for *everyone* if it is patched. Likewise when one is discovered on Linux. Nobody wants systems falling down or sending spam (unless you are the one doing the toppling or spamming!).

    Can't we all just get along? :)

  19. Crazy Operations Guy
    Unhappy

    @Alex 3

    You'll find that most Windows users are quite well-adjusted and not likely to jump on the fanboy bandwagon, in the way I see some Linux users/zealots do (I am not saying that all are like this, just enough to make it noticeable).

    I have grown tired of all the fanboyism that comes with any story that shows Microsoft / Linux / Mac / etc in a less-than-perfect light. I wonder when people will realize that it is just a personal choice, that nothing posted on a forum will ever change the minds of others, and that extreme thinking only takes away from your argument.

    I suppose this is the same as how everyone sees the Muslims / Christians / Jews: a few extremists cast the entire religion in a bad light, and then everyone else assumes that a person from another religion is a terrorist / bible-thumping racist / money-grubber.

    Life has taught me that there will always be trolls harming the adoption of good ideas, people constantly using inflammatory phrases in an attempt to convert others to their side, but rather harming their own position. Perhaps the only way to treat people like this is to ignore them, to effectively deny them the attention that they so crave.

  20. Shakje
    Stop

    ZZzzzzz

    It's a code bug, not a compiler bug. The compiler ignores the redundant NULL check after the pointer's been used (which, assuming the compiler is smart enough to work out whether a pointer may have changed its reference, is perfectly reasonable). So, surprise surprise, open source doesn't lead to perfect coding, however much some people believe it does. True, this isn't a major issue, but it's still slightly worrying that NULL reference checks aren't checked with a static analysis tool before releases. I would have thought that was pretty standard practice for something as important as an OS kernel.

  21. Anonymous Coward
    FAIL

    Enough rope to hang themselves

    So this is why people developed Java, C# and other "less flexible" environments.

    Because if you give them enough rope, the programmers will hang themselves.

    As they have, again.

  22. Richard Kettlewell

    -fdelete-null-pointer-checks

    What would be nice would be a -Wdeleted-null-pointer-checks (and similarly for any other optimization options that infer that certain bits of code can be deleted). Then it would be possible to have the optimization without also the risk of bugs.

  23. GettinSadda
    Boffin

    Major flaw?

    > "Setuid is well-known as a chronic security hole," Rob Graham, CEO of

    > Errata Security wrote in an email. "Torvalds is right, it's not a kernel issue,

    > but it is a design 'flaw' that is inherited from Unix. There is no easy

    > solution to the problem, though, so it's going to be with us for many

    > years to come."

    Um, so doesn't this translate as "Linux is known to have a major security hole that is unlikely to be fixed in the near future"?

  24. Tony Hoyle
    Stop

    Torvalds is right, really

    This 'exploit' requires the user to have root in the first place, to inject a setuid program into the system (which would be caught by the next run of tripwire and SELinux wouldn't let it run anyway, but let's not let facts get in the way of a good story).

    If the bad guy gets root = game over. Anything else they do is just icing.. even SELinux isn't an absolute defence against this.

    I agree the optimisation flag on gcc is the real bug - it should be flagging these dereferences as errors not deleting the tests.

  25. Peter Hartley

    @spendergrsec: no, Mikov is right, you're scaremongering

    The simple bug is dereferencing tun in the line "tun->sk". The fact that after that there's a NULL test on "tun" which GCC correctly optimises away, doesn't make it a more serious or unusual bug -- although it would certainly be nice if GCC issued a warning "optimising away NULL test because you've already dereferenced it". In particular, your contention that "from a source review the bug is unexploitable" is wrong, unless the source reviewer in question somehow misses the "tun->sk" line with the bug in.

    The bug in PulseAudio, which the Reg article somehow conflates with this one, is of course completely separate.

    Peter

  26. Boris the Cockroach Silver badge
    Linux

    Oh no

    Quick everyone, dump Linux and run back to the safety of Windows

    Actually, what I suspect will happen is an advisory to compile the kernel correctly

    But I wonder if Windows has the same vulnerability?

  27. Anonymous Coward
    Troll

    @Alex 3

    It's because the M$ Camp (me included) all have hangovers on Saturday morning, as we were all out last night with real women in real pubs, not geeking out over some compiler issue.

    Obligatory flame

    *nix sux - cry yourself to sleep cos some bloke with a beard made a mistake in your shitty OS

    (I don't care really - just joining in for the sake of it)

  28. Paul Shirley

    context

    The core problem here is that this is a dangerous optimisation that should only be enabled explicitly, not bundled into the standard -O levels. It's dangerous because it makes assumptions about the privilege level of the code being compiled and the system behaviour of the target. It should default conservatively, and doesn't.

    The source itself is strictly correct but inherently dangerous: it assumes knowledge the compiler doesn't automatically have, and could have been written more robustly. It's sloppy. Being blindsided by gcc gets them off the hook just once; they need to take this much more seriously. I want robust defensive coding in my kernel, not blame shifting.

  29. Ken Hagan Gold badge

    Re: Mikov is wrong, explanation is correct

    "Apparently everyone else gets it but Mikov (who posted a similar response on lwn.net). That from a source review, the bug is unexploitable and yet I have exploited it is what makes this 'clever' as every other security expert (and Linus himself) has agreed."

    I expect that's because he based his diagnosis on the much-quoted code fragment...

    struct sock *sk = tun->sk;

    ...

    if (!tun) return POLLERR;

    If this is the vulnerability, then it is indeed a trivial "used before checked" bug. "tun" is clearly used before it is set and any decent data flow analysis would pick it up even if it is buried in a long and confusing routine. LINT has done this sort of thing for years. I doubt the Linux kernel is marked up with all the annotations required, but to suggest that this can't be found by examining the source code says more about you than the state of the art.

    If this is not the vulnerability, perhaps you could enlighten Mikov and the rest of us.

  30. Rich 2 Silver badge
    Alert

    Well fix it then!

    I have not looked at the code in question and have no intention of doing so, but according to the previous comments, the problem is either a compiler bug (though I see that this is disputed) or the code is checking for a NULL reference AFTER it has dereferenced it, which even if it is legal on a particular platform, is a bloody stupid thing to be doing!!!

    Either way, this should be trivial to fix (ok - fixing the compiler could take a while if that really is the issue, but it's hardly insurmountable).

    But being Linux, I suppose those responsible need to slag each other off and argue and talk shite for a couple of months before anything actually happens. Mmm.... I think I'll stick with BSD, thanks.

  31. Dave Bell

    And the other programs?

    I was being told that there was a security update for PulseAudio, yesterday, but there were problems with accessing the repository.

    Odd...

  32. James Condron

    Source Code

    You should have just printed the 300 lines of comment as the article, very funny, though perhaps in places not intended.

    Fair play though, not a bad piece of code, but this is why we don't use brand spanking new kernels. That being said, it is a bit of a non-issue; there are some very specific circumstances and dependencies needed here, and the exploit is a tad flimsy in places - though hey, if it works...

  33. Anonymous Coward
    Boffin

    @Francis Vaughan

    > "This is sort of worrying really. One would have thought that by now the kernel writers would have spent the time to work with gcc and identify all the optimisations and clearly understand which ones are inconsistent with the peculiar constraints of either secure or kernel mode code."

    They do, and the gcc team are happy to work with them and add new options or modify optimisations to make the compiler more suitable for their usage. But I don't know of anywhere the kernel team have ever sat down and made a clear list of what they do and don't want the compiler to do; they're a bit reactive rather than proactive. What tends to happen is that some optimisation turns out to cause a problem for some bit of code in the kernel, the kernel team approaches the gcc team and gets the problem addressed, then six months later it all happens again...

    >" Indeed, it might be nice if the gcc developers took a moment out to provide a list of know good optimisations that do not rely on assumptions about memory layout, exception behaviour and likely, but not guarenteed, code structure. Maybe adding a --secure-optimisations-only flag would be a good thing. Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough. "

    Now you wait just a cotton-pickin' minute there. Kernel development is hard-core stuff, and not suitable for amateurs and dabblers. You need to know how a computer works from top to bottom to do it: you need to understand everything from hardware and busses and memory accesses and caching, to low-level assembly and synchronisation and threading techniques, up to the level of security and usage patterns and efficient algorithm design - and you need to understand how the toolchain works and what it does. Kernel developers have very special and unusual requirements, and a compiler is a general purpose tool for a broad audience. It is for kernel devs to know and clearly explain their requirements, not for non-experts (compiler devs) to attempt to second-guess them. They should absolutely be expected to read the fine print of the optimisation flags they want to use to build their code - it's only one more drop in the ocean of fine print they need to read and understand to write reliable kernel code.

  34. Jason Bloomberg Silver badge
    FAIL

    Passing the buck

    There seems to be confusion here as to whether it was a compiler bug or a coding error, whether the kernel is flawed or not ...

    If it is the compiler optimising away something when it shouldn't have, this is a fail for the compiler developers.

    If it's a case of the source code being correct and the compiler optimised away a null check this is a fail for the developers who built the kernel and their process, but not a fail for Linux per se.

    If the source code is incorrect then that's a fail for the kernel programmers and no amount of buck passing to the compiler or arguments that it's not a bug will wash.

    No matter how the bug arose, if it exists, is exploitable and demonstrably so in the field, it's a huge fail for Linux either way, and trying to claim it's not worth worrying over is simply trying to downplay the issue.

    If there's no source code error, the compiler did not optimise away something it should not have, and no exploitable bug exists, then I'll agree it's a storm in a tea-cup. Unfortunately that does not seem to be the case.

  35. Anonymous Coward
    Anonymous Coward

    @ Michael 2

    Yes, there's one problem with your theory.

    The Linux kernel has a known, demonstrably exploitable security problem in the field, and the kernel developers do not wish to fix it.

    Trivial or not, apparently it's not so trivial that they'll be fixing it any time soon.

    No, the reality is that too many Linux zealots including the kernel developers refuse to ever accept they're wrong on anything.

    This is why Linux is never going to gain traction while this attitude is so prevalent, and why it's stuck in a rut. Because Linux developers write the code that Linux developers want to write, usability be damned. Find a security exploit in their code? They'd rather leave it in there, claiming it's not their fault, than accept they're not perfect and are equally capable of making simple, blatant mistakes.

  36. Ed 4

    The real problem(s)

    There are two problems.

    The first, and most critical problem, is a bug in GCC, where it optimizes away null pointer checks in some cases where it should be giving a fatal compile time error.

    That's right. I'm saying GCC should bomb on this code, complaining that the pointer is used before the null pointer check.

    Note: I think GCC can safely optimize away redundant null pointer checks - and I've certainly seen code that has those. But optimizing away a null pointer check simply because it's already seen broken code is stupid.

    The second problem is that some of the people on the Linux kernel team apparently do not intuitively grasp the seriousness of this.

  37. Anonymous Coward
    Boffin

    The bug is in the kernel, not the compiler.

    The compiler is a red herring. It doesn't delete NULL pointer checks - *UNLESS* you've already dereferenced the pointer, in which case it quite reasonably assumes that you've already crashed before you get to the check anyway - and the exploit proves that it is correct in this assumption, or rather that crashing would be a best case! But testing for a NULL pointer at the end of the routine is just too late: the bug occurs here:

    struct sock *sk = tun->sk;

    At that moment, because it is allowed for a user process to map memory at address zero, what you have done is inject a user-controlled data structure into the kernel, which implicitly trusts its own data structures. That is the security violation, and it's nothing to do with the compiler, it's more a consequence of a false assumption in the kernel:

    1) I can trust all pointers to kernel objects, because they will only point into kernel space and only privileged (i.e. trusted) code can place anything in kernel space.

    2) But NULL is a pointer value and it is not in kernel space.

    3) The kernel trusts *all* pointer values, including the one that happens not to be in kernel space but is under user control.

    Ouch. The false assumption could be stated in a single sentence as "we can trust all pointer values, including NULL, because even though it's technically not a kernel address it will always make a crash if you access it". But no, it won't, that's just not true.

    What might work better would be to build an option in the compiler to use a value like 0xffffffff as a NULL pointer instead of numerical zero. Or for a few pages (maybe even a few meg?) down at the zero end of memory to be declared 'honorary kernel space', protected by the same kind of PTEs that prevent the user accessing kernel space, and not mmap'able, although that might only mitigate rather than fully block the entire class of exploit.

    I don't understand spendergrsec's response:

    >"from a source review, the bug is unexploitable"

    We've known for some time that dereferencing a possibly-NULL pointer is exploitable, it was first shown by that ARM exploit by Barnaby Jack

    http://www.theregister.co.uk/2007/06/13/null_exploit_interview/

    then there was the SWF/ActionScript null pointer dereference by Mark Dowd

    http://documents.iss.net/whitepapers/IBM_X-Force_WP_final.pdf

    so frankly, any source code audit that doesn't ask the question "Could this pointer possibly be NULL?" is not asking the right questions at all. I think it's fair to say that there are two bugs: the use-before-NULL-check bug is one, and the user-is-allowed-to-mmap-NULL is the underlying and more serious bug which is what enables this and the whole class of other similar bugs to be exploitable. (You could probably argue that a third bug is in the code that calls this routine while passing it a NULL pointer in the first place.)
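
    To make the trust failure concrete, here's a hypothetical sketch (invented names, nothing like the real tun code) of why an attacker-owned page zero turns a NULL dereference into attacker-controlled execution:

    struct ops { void (*poll)(void); };
    struct obj { struct ops *ops; };

    void kernel_path(struct obj *o)
    {
        /* if o == NULL and the attacker has mapped page zero, this
           reads an attacker-supplied ops pointer and then calls an
           attacker-chosen function in kernel context */
        o->ops->poll();
    }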

  38. Anonymous Coward
    Thumb Up

    so what!!

    Put it this way: I know which OS I would use to connect to the internet! The one I've been running for the last 7 years with no anti-virus, spamware etc.... that will be LINUX then !!!!

  39. Sean Timarco Baggaley
    WTF?

    The problem is C.

    That GCC was optimising out a null-ref check when it could clearly see that the variable had already been used is *expected behaviour* for the compiler, and clearly documented in its manual. If the programmer couldn't be bothered to RTFM, that's *his* fault, not the compiler developers'.

    That non-time-critical code for an allegedly modern operating system is being written in a portable assembly language that's getting on for 40 years old is the real bug here. Even my humble Nokia 2630 is more powerful than the computers C was created for.

    Oh yes: for those who haven't read the original article the Register's piece was based on, take note of the following quote from grsecurity's own website:

    "Due to Linux kernel developers continuing to silently fix exploitable bugs (in particular, trivially exploitable NULL ptr dereference bugs continue to be fixed without any mention of their security implications) we continue to suggest that the 2.6 kernels be avoided if possible."

    Note that they're referring to *multiple* instances of these kinds of bugs. The line people enjoy quoting was just one example, which the researcher used to write his proof of concept exploit. That the Linux kernel is *riddled* with such bugs reflects poorly on its developers.

    This is also the kind of bug which, had the software been written using half-decent tools, would never have made it into the released code in the first place. For all the criticisms of Microsoft's ".NET" languages, the simple fact is that C# wouldn't have let you write such bad code in the first place. When a user interface—and that's all programming languages are—makes unwanted actions easy to perform, it is time to replace it.

    Microsoft may have their flaws, but at least they're trying to do something about the appalling tools this industry insists on using. They're still a long way from development nirvana, but at least it's *something*.

    (Oh yes: my computer is a Macbook Pro, not a Windows box. So please don't waste time accusing me of fanboyism. There's no such thing as a "best" platform. Only a "least worst".)

  40. Anonymous Coward
    Boffin

    @The real problem(s)

    >"The first, and most critical problem, is a bug in GCC, where it optimizes away null pointer checks in some cases where it should be giving a fatal compile time error."

    It's not a bug. It is an explicitly documented feature in the manual, which also warns you to take care with it:

    > >" `-fdelete-null-pointer-checks'

    > >Assume that programs cannot safely dereference null pointers, and that no code or data element resides there. This enables simple constant folding optimizations at all optimization levels. In addition, other optimization passes in GCC use this flag to control global dataflow analyses that eliminate useless checks for null pointers; these assume that if a pointer is checked after it has already been dereferenced, it cannot be null.

    > >Note however that in some environments this assumption is not true. Use `-fno-delete-null-pointer-checks' to disable this optimization for programs which depend on that behavior.

    > >Some targets, especially embedded ones, disable this option at all levels. Otherwise it is enabled at all levels: `-O0', `-O1', `-O2', `-O3', `-Os'. Passes that use the information are enabled independently at different optimization levels. "

    >"That's right. I'm saying GCC should bomb on this code, complaining that the pointer is used before the null pointer check."

    I'm saying that when the user explicitly tells GCC to assume that it can do this, GCC should assume that it can do this, and if it is not true, the user should not have told lies to the compiler, and the compiler should do what the user tells it. Maybe you *want* to get a SEGV if the pointer is NULL because you plan to handle that elsewhere? The compiler can't second guess you. You told it that control flow cannot possibly reach that test if the pointer is NULL; it would be stupid of the compiler to bother inserting it.

    >"Note: I think GCC can safely optimize away redundant null pointer checks - and I've certainly seen code that has those. But optimizing away a null pointer check simply because it's already seen broken code is stupid."

    It's not stupid. The vast majority of these checks are not going to come from buggy code like this, but from places where any or all of macro expansion, inlining and templating have combined to generate inefficient code. If the compiler wasn't very aggressive about optimising this stuff, C++ templates would still be the hideous bloated monstrosities they used to be back in the 90s - i.e. practically unusable in anything that has to be the least bit efficient.

    You can argue about whether this is a "dangerous optimisation", and should only go in -O3 by default (I might agree with you), or whether it shouldn't be turned on by default at all but always left to the user to request (I'd probably disagree), and you can argue that it should generate a warning (I'd certainly agree with you, but I might consider that sufficient reason to leave it enabled at lower -O levels), but calling it "stupid" is simplistic and lacks insight into the issues. As I think I mentioned once before, GCC is a general purpose tool that must work for a huge range of different applications from small realtime embedded to overnight number crunching batch jobs. No single set of optimisations is ever going to be completely right for all those applications, and the -O levels are crude guidelines, but if you have a very specialised need, you need to take control of how you use your compiler.

  41. Anonymous Coward
    Anonymous Coward

    @Sean Timarco Baggaley

    Sure, get rid of C, that will free up our time to concentrate on reference counting, garbage collection, bytecode inefficiencies, blah blah blah :).

    Linux, Windows and Mac all run on C-based kernels. C may have 30-year-old problems, but at least they're *understood*.

    It's worth pointing out that there are many demands on software, not just security. Execution speed comes out pretty high on the list, and nobody wants to cripple their PC with a kernel that already killed their performance for them before they launch their first application.

    C is 'portable assembly' because that is precisely what is required for an efficient kernel implementation. If kernel developers could write in something else they would - they are not masochists!

  42. Shakje

    There's no reason at all

    To make it an error; doing that would pretty much go against the principles of C++, as there are plenty of reasons why you might want it to actually do that. Even putting it in as a warning is a bit dodgy as far as I'm concerned.

    A far better way of managing it would be to have a compiler switch that flags them as warnings instead of just optimising them away silently, then when you add new code you could easily keep track of it, especially if you've done some pointer intensive code.

  43. Anonymous Coward
    Anonymous Coward

    Optimisation

    Surely the reason for the optimisation is (among other things) code like this:

    inline char foo(char *p) { if (p == 0) return 0; else return *p; }

    char bar(char *p) { *p = 2; return foo(p); }

    int main() { char c = 0; return bar(&c); }

    If foo gets inlined into bar, the compiler can spot that the null pointer check in the inlined code is unnecessary and remove it. This is a most excellent optimisation (granted, in this example foo and bar do so little work that other optimisations may render it unnecessary).
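
    After inlining and deletion of the now-redundant check, bar effectively reduces to something like this (a sketch of the optimiser's view, not literal compiler output):

    char bar(char *p) { *p = 2; return *p; }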

    As far as the C standard is concerned, this optimisation doesn't have to assume that a null pointer dereference would halt the program. The dereference of a pointer which may or may not have been null means that the implementation can thereafter assume it wasn't null. If it was null the behaviour of the rest of the program is undefined anyway, so the tiny detail of the assumption being false doesn't make it invalid. If dereferencing null is valid and is supposed to have predictable behaviour, then you're into non-standard C, so you have to read the compiler docs. GCC's behaviour appears to be (a) standards compliant and (b) documented, so should come as no great surprise to the programmer.

    For my example code, the optimisation certainly should not result in a compiler warning or error. There's nothing wrong with either function foo or function bar. It's just that one of them takes the (perfectly reasonable) approach of checking its input, and the other one takes the (also perfectly reasonable) approach of requiring that its callers not pass in null pointers. Standard functions exist taking both approaches - compare for example time() and strlen().

  44. Werner McGoole
    Thumb Up

    Argument is good

    Maybe the point is being missed here.

    I doubt that anyone regards the existence of an exploitable bug/feature in Linux to be good. However, ask yourselves why there is argument about whose "fault" it is...

    In a complex system, one always tries to put the right solution in the right place. There are several ways this problem might be fixed. Choosing the wrong one might fix it more quickly, but may cause problems later. If speed is not the overriding issue (and it seems it isn't), then thinking carefully (and this means arguing) about whose responsibility it is to protect against this problem is the correct response.

    Once you have the correct protection installed in the correct place and everyone knows whose responsibility it is to look after this in future, then you have a more robust system. Failing to argue this out and fixing it the wrong way just starts you on the path towards a system that's unmanageable from a security point of view. I think you all know the example I'm thinking of...

  45. Tzvetan Mikov

    Much Ado About Nothing

    @spendergrsec: Brad, you should really stop tooting your own horn, and it would also help if you weren't unnecessarily rude. Everybody so far has acknowledged that the exploit is very impressive. Good work. I really mean that and have said it from the start. But please, don't let that go to your head.

    I am trying to clarify, for readers of El Reg who may not be experts in C or the Linux kernel (unlike the crowd on LWN), that contrary to what has been said, this is an ordinary run-of-the-mill bug, which is easy to spot and fix in a regular code review, and it is not caused by a flaw in GCC.

    @BlueGreen: Normally the hardware would catch the NULL pointer reference and it would result in a kernel oops. However part of the exploit is that it (relying on another bug) first maps valid memory at address 0. It really is a very clever exploit relying on unrelated kernel bugs.
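
    For the curious: claiming address 0 from user space is just an mmap call. A minimal sketch, assuming the kernel permits low mappings (hardened kernels refuse this via the vm.mmap_min_addr sysctl):

    #include <sys/mman.h>
    #include <stdio.h>

    int main(void)
    {
        /* MAP_FIXED forces the mapping to land at address 0 */
        void *p = mmap((void *)0, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap"); /* the kernel refused the low mapping */
            return 1;
        }
        printf("page zero mapped (p == %p)\n", p);
        return 0;
    }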

    The bug in question itself however is trivially noticeable and fixable. Any tool like LINT would have caught it (in theory; in practice it is not so easy to run LINT on the kernel).

  46. Anonymous Coward
    Happy

    Binary Is Better Then?

    So perhaps the notion that even the kernel should be distributed as source code and compiled is not such a great idea after all????

  47. The Jase
    Flame

    And this is why Linux is not popular

    When an article of this nature comes up for Mac or Windows, we flame each other, etc, but at the end of the day, the company fixes it. When this comes up for Linux, a whole bunch of code monkeys have a pissing contest ("I know more about coding than you, look at this crap I typed" and "mom, mom, he said I couldn't code, tell his mother so he gets spanked") and argue that it's a compiler issue.

    And this is why Linux is shite: it's for code monkeys, who actually like to spend their weekends coding and compiling, rather than having a life. As they say, you get what you pay for.

    TANSTAAFL

  48. Anonymous Coward
    FAIL

    I note a few things

    1) Big geek fight over whose fault it is - problem not addressed

    2) Much finger pointing - problem not addressed

    3) "It's a feature!" (of either the kernel or GCC) - problem not addressed

    So Linux out in the field has (or will have...) a critical security flaw and the freetards are too busy waving their pocket protectors about to actually fix the problem. With an attitude like that, is it any wonder most organisations who rely on IT to run their business would not touch Linux with a shitty stick?

    Get your act together you bunch of jumped-up primadonnas; you are not doing yourselves, Linux or the open source community any favours with your public bitch-fest.

  49. This post has been deleted by its author

  50. Defiant
    Thumb Down

    Really Sad

    I normally wouldn't bother posting on this subject matter because I'm not a fan of Binux, but I had a feeling the Binux geeks wouldn't be able to post without mentioning Microsoft, and you didn't let me down. Great, so you use Binux, but get over your fascination with Microsoft and your hatred for those who prefer it over your free alternative. Yes, Binux is free, and people still prefer the paid alternative. LMAO

  51. Francis Vaughan

    Following up

    I wrote: " Depending upon other developers to read the fine print of each and every optimisation flag is clearly not enough. "

    AC replied above: "Now you wait just a cotton-pickin' minute there. Kernel development is hard-core stuff, and not suitable for amateurs and dabblers. "

    I don't disagree. (I do have the background; I have been involved in a number of OS projects, have written kernel code, debugged production commercial kernels, taught computer architecture, etc etc.) The point was that - no matter what the needs - it is still clearly, as demonstrated, not enough. It wasn't this time, and I will bet it won't be again.

    One of my pet hobby horses comes to mind here too. A deeper problem that besets pretty well all code now is the whole idea of a null pointer. This is an age-old problem that comes in many forms. Basically you have out-of-band semantics being carried with the in-band data. In this case zero has special meaning for pointers. It could have been any number (ffffffff was mentioned above as an alternative, but it doesn't work any better). The semantics of "not-a-valid-pointer" are carried as a special-case value. We are so used to this as the only way of doing things that we forget an entire generation of computer architectures that never suffered from such issues. Tagged memory was around in the '60s, but here we are, 40 years later, and the total dominance of the wretched x86 has left computer system design moribund. It is like the Apollo missions: 40 years ago great things were done. Now we can't even do the same stuff, let alone progress beyond it.

  52. Sean Timarco Baggaley
    Stop

    Re. "@Sean Timarco Baggaley"

    The problem is that the developers chose a linear, text-based UI designed in the 1970s to convey their instructions to the computer.

    The 1970s was the tail-end of the era when most computer applications were non-interactive, transactional processes. E.g. payroll runs, bank account systems and the like. You started the program, let the machines chew through all that (serial) magnetic tape, (serial) paper tape or (serial) punched cards and waited until it went "BING!"

    Very few developers write linear, non-interactive applications like that any more. Yet we are still using textual programming languages. Consider that *all* languages are fundamentally linear: we read from left to right (or vice-versa in some societies), top to bottom, from the beginning to the end. Programming languages are *inherently* linear.

    When you try and remove that linearity—e.g. with many OOP attempts—you end up with a language so stuffed to the gills with structural scaffolding and similar fluff that you end up with code that's obfuscated behind umpteen layers of brackets, punctuation and meta crap. Because—I repeat—all languages are INHERENTLY linear! No matter what colour you paint your cat, it'll still be a cat.

    Now, textual user interfaces are still used in IT today, but they're no longer centre-stage. Most people prefer graphical interfaces. You can convey a lot more information visually than you can with words alone, yet software development tools are stuck with a "text in text files!" mentality that is arguably doing far more harm than good. Just because textual interfaces is how we've always written code in the past, it doesn't mean this is how we should continue writing code in the future.

    Linear, text-file-centric programming languages are the wrong tool for the job. That there are practically no mainstream alternatives reflects very poorly on the IT industry and its conservatism. (If the FSF movement really wanted to make a difference, this is where they should be concentrating. The world really does not need more UNIX clones.)

    The rise of multi-core CPUs should be a rallying cry to designers of development tools the world over. There's a massive market simply gagging for the right answer. C is not that answer. Neither are C++, C#, COBOL, Object Pascal, x86 assembly language or Java.

    To researchers and students who are looking into this field, please do not invent yet another linear, text-based programming language. We have far too many of those as it is.

    (I do have my own view on how programming *should* be done, but the essays I've written on the subject are long and unsuitable for a comments box like this. I'll post something to my website when I'm done evaluating CMS software for it.)

  53. Hi Wreck

    Of course C sucks...

    Like people haven't cut their fingers off on table saws before. For the *nix lovers who want to have their cake and eat it too, there's AuroraUX (http://auroraux.org/index.php/Main_Page), coded in Ada, because "Ada sucks the least" (http://www.osnews.com/comments/21123).

  54. Robert Forsyth

    It is not really a problem anymore.

    Until all the instances of dereferencing the pointer before testing for NULL have been fixed, you just set a compiler flag and the problem goes away.

    AFAIK, this kernel has not been released yet.

    Because it is open source, you can go and look at the source code and see the extent of the problem. And the kernel coders are 'embarrassed' into fixing it. Contrast that with closed source projects.
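
    For reference, the build-side fix is a one-liner; presumably something like this in the kernel's top-level Makefile (a sketch, per the flag Brad mentioned above):

    KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,)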

  55. Anonymous Coward
    FAIL

    Re: I not a few things - AC

    "With an attitude like that, is it any wonder most organisations who rely on IT to run their business would not touch Linux with a shitty stick?"

    LOL - you clearly know nothing about the industry. I work in IT, and vast swathes of companies rely on Linux, mostly RHEL or SuSE, for their business.

  56. Paul 4

    @The Jase

    I disagree. I am writing this from my Linux netbook. It is brilliant for sitting on the sofa or in bed browsing the web. The MS version would have cost a lot more to run well, and had a disk HD rather than the small SSD this can have for the same cost. It does what is needed, and if it all goes tits up I'll wipe it and start again.

    Wouldn't use *nix for much else though, short of servers, for just those reasons. The last thing I want to do when I get home from work is to piss about with code just to get the thing working.

    BTW, this crashes about once a week. My Vista PC? Never crashed. Although I'm sure *nix geeks will call me a liar.

  57. BlueGreen

    Was trying to not post but here goes..

    @Crazy Operations Guy + AC 09:19 + Defiant + others: I'm with you on the childishness of bringing MS into it. All shall have bugs: a serious one was a failure to honour FILE_FLAG_WRITE_THROUGH correctly in some circumstances for Win2k, this being fixed in SP3 (ie. it took far too bloody long). This was potentially very serious if you are, as I was, managing large DBs for clients, but did anyone hear about it? Nope. I only found out by accident. MS don't advertise their failures.

    @Sean Timarco Baggaley: You were making sense up until the point you started talking about linear text. Do some research on these (eg. Visual Programming Environments, paradigms & systems, Ephraim P. Glinert), actually design a language & learn the lessons (which are: dataflow driven execution behaviour becomes totally obvious and... that's it on the good side) and use a visual language. I & a colleague used several. They were all horrid; unwieldy, slow to program with, had to rely on text to do anything significant etc.

    Also, don't confuse the linear representation of text with the concepts the text imparts.

    Certain stuff is a very good fit for graphical expression, such as smallish automata, which are much more comprehensible, and probably high-level UIs (perhaps inc. stuff like Mathematica). Much else isn't IME because the necessary abstractions can't be expressed graphically (how do you represent addition but with a + sign? or looping n times without numbers? Or a 'max' function without using the word? or 'pick the nth item from this array'? How do you identify subroutines if you can't use text to name them? etc. forever.)

    @Francis Vaughan: Your comments on null make a kind of sense but tagged architectures simply offload complexity from the software to the hardware. Which I think is a fail. Keep hardware simple (& consequently correct) & fast, let the compilers do the work.

    @Tzvetan Mikov - thanks.

  58. David 141
    Megaphone

    The real issues?

    @Tzvetan Mikov : Normally the hardware would catch the NULL pointer reference and it would result in a kernel oops. However part of the exploit is that it (relying on another bug) first maps valid memory at address 0. It really is a very clever exploit relying on unrelated kernel bugs.

    So the exploit manages to (1) get code inserted at address 0, and then (2) exploits (waits for) an (unrelated) null pointer dereference to get it executed? Presumably (1) is the "overlooked weaknesses" and "whole class of vulnerabilities" the article mentioned but failed to provide any details on, because fixing (2) should be trivial and seems to be a red herring (along with GCC).

  59. Martin Usher
    Go

    Not a big deal....

    >For those who know C, this is the relevant code:

    >struct sock *sk = tun->sk;

    >if (!tun) return POLLERR; (Tzvetan Mikov)

    This is poorly written code, but it should do one of two things -- trap out, or nothing in particular. It's not good code -- I personally dislike implied casts (just because everyone assumes a NULL pointer is a zero doesn't make it so, and relying on the compiler to fill in the blanks is asking for trouble). Assuming it does give a NULL pointer (more accurately, a NULL segment) trap, then I'm not sure how an attacker could get code into that segment to execute. After all, one of the big weaknesses in Windows used to be that it never really did segmentation -- it just ran a big, flat address space, so you could abuse it, but I think even Windows today will just trap out on a NULL access.

    The fix is easy. Just revise the compiler options and rebuild the kernel (then go through the sources looking for this coding problem). That's one-upmanship over Windows....the whole thing can be done in less time than it takes to write a comment.....

  60. amanfromMars 1 Silver badge

    Just around the Next Bend and Bent Operation ...... so Be Prepared .

    "The fix is easy. Just revise the compiler options and rebuild the kernel (then go through the sources looking for this coding problem). That's one-upmanship over Windows....the whole thing can be done in less time than it takes to write a comment....." .... By Martin Usher Posted Monday 20th July 2009 04:13 GMT

    And that creates a NeuReal Ruling Elite, Martin Usher, and therefore you can fully understand why the Status Quo Pretenders do not mend themselves/their ways, and are therefore Destined to be Purged Catastrophically and Unceremoniously from Power with the Control Operating Systems Collapse/IMPlosion.

  61. This post has been deleted by its author

  62. Anonymous Coward
    FAIL

    @Anonymous Coward Posted Sunday 19th July 2009 20:20 GMT

    Wow. I must have missed all those Linux boxes strewn all over the workers' desks, running the control centres and generally saving life as we know it. Gosh and golly. Maybe I'm Linux blind?

    Or maybe Linux only has less than 1% penetration for a very good reason. For an example, see this geeky bitch-fest about who has the biggest calculator, meanwhile the kernel could be sitting wide open to attack.

    Well done. *slow clap* This is why your beloved kernel only sees the inside of geeks' bedrooms and certain niche applications. Which is a bit of a shame really, as it could be quite good if it was managed and dealt with in a professional (and customer/end-user focused) manner. At the very least it might give the incumbent OS maker pause for thought.

    But no - you keep waggling your GCC options in the air and missing the point entirely.

  63. Anonymous Coward
    WTF?

    This is being blown out of all proportion

    Kernel root hacks have been around since the dawn of Linux; I don't understand what is new about this, sorry. Finding vulnerabilities is part of coding.

    All operating systems are vulnerable when you have them at terminal level. The main challenge is any web-facing services, ensuring that you have security at that level - this is much more important. Microsoft's track record on this is not great at all compared to Linux.

    One comment for the Binux guy: go run your website on Bindows without doing updates, and see how long it lasts vs a 'Binux' box.

  64. Anonymous Coward
    Anonymous Coward

    Give us a break managed code fanboys

    For all of you muppets banging on about how C# or Java or some other managed-code sandboxed language would be sooo much better than C - have you cretins stopped to think what actually does the managing? We're talking about code for an operating system here, yes, OPERATING SYSTEM, the thing that operates the hardware. If it could somehow be written in a managed language, what exactly, pray, would be doing the underlying managing? A bunch of magic elves?? Or perhaps there should be an even lower hypervisor layer that would do it! Yes, that's it! Oh, but what could you write that in then? Surely not nasty unmanaged C or assembler?

    Why don't you muppet apps developers leave the system development discussion to real programmers who have a clue and you get on with writing your cutesy GUI apps in the fluffy managed language of your choice.

  65. Wize
    Megaphone

    What we really need to do is

    1) write a macro to pull out all the windows bashing comments made over the years, swap round the words Linux and Windows and repost them.

    2) to keep a note of the url of this story for the next time some Linux user posts to a story about a Windows bug claiming how bullet proof their system is.

  66. Anonymous Coward
    FAIL

    On a slight off-topic...

    ...me and some pals took a class on writing *assembly* code for PICs (ICs). Yes, we were writing code straight in assembler, to be assembled and transferred to ICs that would later run it (a program driving a simple LCD display).

    Assembling the thing straight out of our feeble assembly skills was downright scary, because the assembler would bomb us without mercy. Guess what: null pointer checks of any sort were *disabled* by default, but we were bombed nonetheless, until we did exactly what the teacher told us. Bottom line, the final code came in under 2k and fitted easily in the IC's memory. The LCD worked perfectly. Of course, the teacher was guiding us around the booby traps we kept planting beneath our own feet. Plus we were mapping every register and every nook and cranny of memory we used, according to the structure of the PIC we had, data sheet in hand.

    The teacher then showed us the same code produced by a C compiler. The thing bloated to 10k or more, could barely fit in the IC, and in fact ran about 50% slower (though since the PIC and the display IC ran at 4MHz, far faster than our tiny code needed, we couldn't tell). The C code seemed much simpler and care-free, though; it also looked sloppy. Any of us could have written it in half the time we took on the real hardcore assembly stuff.

    From this experience of writing raw assembly and opcodes versus using C, I now fully understand the kind of crap we run into, be that Windows or Linux. People are sloppy; they don't care if their code is bloated, or will be once compiled, and it is damned hard to find a bug once it has made it past all the compiler checks without a hitch.

    As a mind exercise, try writing *the whole* kernel in assembly, and come back in 5 to 10 years, after you have brushed up every bit and tested every memory loophole and pointer problem, and you will have a better kernel. From my point of view and tiny experience in programming, good things don't come easy.

    That's why we have the bug, or *feature*, in the first place. That's why we will probably find another. That's why M$ never got their act together. That's why *very* few people proactively search for bugs: they just fall into your lap when you are not looking for them.

  67. call me scruffy

    @Sean Timarco Baggaley

    Processors, even multi-pipelined and SMP processors, execute instructions in a linear fashion, one at a time.

    Data structures can only really be modified by one process at a time; this is why we have mutexes and well-understood criteria for claiming and releasing resources.

    Programs, even those with exotic event-driven models, are inherently linear. A task/thread/function is started (possibly asynchronously), it does something, it finishes, the caller picks up the result. That's how things work at the processor level, regardless of what language you're writing in, and for the most part it's how people think too.

    We can already partition and refactor code so that an operation can be described in a pseudo-spatial distribution rather than in strict linear fashion, but some degree of linearity in the entire programming experience is inevitable, as the guy at the keyboard will have to press one key after another.

    Your argument may as well be "The problem is computers as they exist today." Good luck inventing your hovercraft, but I suspect that at the end of the day you'll just invent your own wheel.

    I happen to think you're a troll, but never mind.

    @Everyone else, And I do mean everyone.

    Conspicuously bad spelling, typing, and grammar stopped being amusing a few years ago, didn't you get the memo?

    My main concern is that code like this (using an unchecked pointer) got into the kernel in the first place. It opens up a vector for blackhats to introduce vulnerabilities.

  68. tapanit
    Linux

    Fixed in 2.6.30.2

    Looks like this has been fixed in 2.6.30.2, according to

    http://lkml.indiana.edu/hypermail/linux/kernel/0907.2/00706.html:

    "[...] Fix NULL pointer dereference in tun_chr_pool() [...]"

    After the patch the code in question looks like this:

    struct sock *sk;

    if (!tun)
        return POLLERR;

    sk = tun->sk;

    I suspect 2.6.30 and 2.6.30.1 won't appear in many distributions.

  69. Peyton
    Paris Hilton

    A zillion comments in...

    and I'm still confused. Isn't the simple answer to just arrow down to the null check on tun, type dd (we are talking Linux, after all ;), arrow up to the line initializing sk, and hit shift+p? I get the impression the kernel isn't rife with this problem... as many have pointed out, it *is* something one can find in a code review... what's the big deal?

    @call me scruffy: Pedantry was never amusing in the first place. Didn't you get the memo?

  70. Happy Skeptic
    Linux

    Re: Fixed in 2.6.30.2

    "Looks like this has been fixed in 2.6.30.2, according to

    http://lkml.indiana.edu/hypermail/linux/kernel/0907.2/00706.html"

    So to sum it up, we have a vulnerability that appeared in a kernel release not yet (and now never to be) adopted by a single *release* version of a Linux distribution, for which a fix was available 3 days ago (so almost the same day as the disclosure of the bug?), and which apparently required root privileges to exploit anyway, rendering it redundant.

    We then have a torrent of Reg "commentards" writing off Linux as an operating system because OMG it has bugs! The comments are no better on the recent articles about IE and Windows security holes.

    It all has the feel of the Daily Mail about it: a sensationalist article which neglects a couple of small but important facts, followed by the predictable stream of knee-jerk comments based on people's prejudices against this or that.

  71. tiggertaebo
    Boffin

    OS in managed code

    @boltar - while I certainly wouldn't suggest that using managed code is necessarily The Future(TM) for OS development, how about you try widening that narrow mind of yours and check out the Singularity project:

    http://channel9.msdn.com/shows/Going+Deep/Singularity-A-research-OS-written-in-C/

    http://research.microsoft.com/pubs/69431/osr2007_rethinkingsoftwarestack.pdf

  72. This post has been deleted by its author

  73. Random Coolzip

    @Sean Timarco Baggaley

    "You can convey a lot more information visually than you can with words alone"

    True, but you can convey information a lot more quickly (and precisely) with words than you can visually. I can write a pseudocode description of how a particular function operates in far less than half the time it would take to create a simple UML sequence diagram conveying exactly the same information. That is, on a computer anyway; on a whiteboard it would probably take about half the time, since I wouldn't have to drag and drop a bunch of icons with a mouse. A picture may be worth a thousand words, but it's often faster to write a thousand words than to paint a picture.

  74. yossarianuk
    Linux

    Fixed (already) in 2.6.30.2 (phew, that was fast)

    Looks like it is already fixed in the latest kernel, 2.6.30.2 (released today).

    Looks like the open-source security method really works.

    Compare the speed of this fix with some of MS's well-known vulnerabilities (wasn't there a year-long vulnerability fixed in IE recently?)

  75. Ken Hagan Gold badge

    Re: OS in managed code

    I don't think Singularity is particularly mind-broadening at all. It is an emulator for a non-native instruction set (the CLR in this case) upon which you then run an OS for that architecture. There was a project about ten years ago to do exactly this with Java, a language that was rather popular at the time. However, all such projects are doomed, for what (in the present discussion) is a rather interesting reason.

    You see, an OS kernel is like an embedded system. You have complete control over all the interfaces and if you ship enough drivers as part of the system then it is entirely possible that every byte of executable code running at kernel level is your own. That makes tools like LINT (or PreFAST, for the Microsofties out there) especially powerful because they can do as much data flow analysis as you can afford. Under such conditions, most of the molly-coddling provided by a managed run-time can actually be done *statically*. Not only does this make the checks infinitely faster, it allows both the compiler and programmer to make further adjustments and improvements to the code.

  76. John Savard

    Interesting Explanation

    One of the comments clarified what was going on: an optimization, enabled by default at higher optimization levels, makes an assumption (that a pointer which has already been dereferenced cannot be NULL) which is false for kernel code.

    If one is compiling kernel code, I would have thought that one would have to set some special compiler flags in order to do so. Thus, maybe the best fix for this would be to change the compiler so that -fno-delete-null-pointer-checks is on by default, no matter how high the optimization level, if the compiler has reason to believe it is compiling for an environment in which this particular optimization is invalid.
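
    To illustrate the pattern (a minimal sketch; the function and field names are invented, not taken from the kernel):

        struct dev { int flags; };

        int dev_poll(struct dev *d)
        {
            int flags = d->flags; /* the dereference happens first... */

            if (!d)               /* ...so the optimiser may assume d != NULL */
                return -1;        /* and silently delete this test */

            return flags;
        }

    Built with plain gcc -O2, the NULL test can be optimised away, because a pointer that has already been dereferenced is assumed to be valid; adding -fno-delete-null-pointer-checks keeps the test in the generated code.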

  77. Anonymous Coward
    Anonymous Coward

    1960s kernels written in PL/I, Algol, etc.

    Back in the 1960s, many operating systems were written entirely in high-level programming languages, including their "kernel". For example, Burroughs used Algol for some systems, and the current generation of that line from Unisys still does.

    The Multics operating system, developed by Honeywell and MIT, was written entirely in PL/I. Many of the security and operating system concepts of today originated in Multics.

    Many describe Unix as a "child" of Multics, but it was a child of the earlier Multics effort by MIT, GE, and Bell Labs, which Bell Labs abandoned partly because of issues with the third-party-developed PL/I compiler required to implement everything else. After the split, GE sold its computer division to Honeywell (who could build a PL/I compiler, itself written in PL/I).

    While those at Bell Labs were involved with the design and other early work, they left before the widespread changes required to resolve issues found during development and to reflect ongoing research. Since their initial needs were single-user in a protected environment, they initially gave little attention to security, file systems, and such. (Much of this was later "fixed" as AT&T began using Unix for real projects and Unix became available to universities.) Their implementation language was "C", essentially an alternative assembler language for the PDP-11.

    VMS/OpenVMS is a truer descendant of Multics on one side and of the DEC TOPS-10 and RSX-11 operating systems on the other, addressing security (e.g. the "Orange Book"), clustering, sharing of resources, and much more, while requiring relatively few people to administer and support even massive networks of systems. Even today, none of the Unix, Linux, MS Windows, or OS X systems comes close to matching the reliability, security, clustering, or ease of use found in VMS 25 years ago.

    The VMS executive (kernel) is mostly written in BLISS, a system implementation language which DEC had already used to write parts of TOPS and RSX. BLISS has features and limits much like C's, lacking language features like I/O and UI, while producing modules that can run bare-bones in restricted contexts, without access to most system functions or libraries. As I remember, the code to process device interrupts had to be written in BLISS or MACRO, with C allowed later.

    Any supported language could be, and was, used to implement the ancillary processes, because they all used the language-independent calling standard and data structures. The same holds when user applications call system functions: the automatic validation and verification of arguments and data eliminates errors that routinely plague Windows, Unix, and Linux. Interestingly, when the "Open" software support was added to VMS (thus OpenVMS), the Unix-style system calls sometimes lacked the additional checking of their arguments required by VMS, resulting in several security issues involving VMS. VMS has been ported from VAX to Alpha and then to IA-64, and it is now owned and supported by HP.

    Finally, a note on tasking, threads, etc.

    We also have known for decades how to avoid all the issues related to multi-tasking, threads, sharing of data structures, deadlocks, rundown/cleanup on abnormal exit, etc. The PL/I standards committee examined this requirement extensively about 25 years ago and determined that none of the differing models found in existing implementations of PL/I could be the basis for a standard. The committee developed a general model for "real-time" work and the language constructs required to add it to the PL/I language, publishing an approved technical information bulletin to document their research and to guide possible implementors. This model uses block-structured critical regions to contain access to or modification of shared data, has high-level scheduling and locking functions, offers static (compile-time) deadlock prevention with the inclusion of a constraint on nesting shared regions and dynamic (runtime) deadlock detection without the constraint, and much more.

    This model allows you to code things like the following (I forget the exact syntax):

    WAIT (InputQueue^=NULL() & NumHandlers<MaxHandlers) LOCK(InputQueue,NumHandlers) BEGIN;
        ItemPtr=InputQueue;
        InputQueue=InputQueue->NextQueueItem;
        NumHandlers=NumHandlers+1;
    END;

    Exiting the Begin block by reaching the END releases the locks. A GOTO to a label outside the block also releases the locks.

    Compare this with what is required to do this safely in various programming languages.
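
    For comparison, here is a rough sketch of the same operation using POSIX threads in C (a minimal sketch; the names and types are invented for illustration):

        #include <pthread.h>
        #include <stddef.h>

        struct item { struct item *next; };

        static struct item *input_queue;   /* shared queue head */
        static int num_handlers;           /* shared counter */
        static const int max_handlers = 4;

        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
        static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

        struct item *take_item(void)
        {
            struct item *it;

            pthread_mutex_lock(&lock);

            /* the WAIT clause becomes an explicit condition loop */
            while (input_queue == NULL || num_handlers >= max_handlers)
                pthread_cond_wait(&cond, &lock);

            it = input_queue;
            input_queue = input_queue->next;
            num_handlers = num_handlers + 1;

            pthread_mutex_unlock(&lock); /* nothing releases this automatically */
            return it;
        }

    Every exit path has to release the lock by hand, and the compiler performs no deadlock analysis at all, which is exactly my point.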

  78. Happy Skeptic
    Linux

    Re: Interesting Explanation

    "Thus, maybe the best fix for this would be to change the compiler so that -fno-delete-null-pointer-checks is on by default, no matter how high the optimization level, if the compiler has reason to believe it is compiling for an environment in which this particular optimization is invalid."

    Exactly, and the fix for this was added at the same time as the fix for the main bug, on Friday 17th July:

    http://lkml.indiana.edu/hypermail/linux/kernel/0907.2/00705.html

    So much for the rants of some of the Reg commentards. I especially love this one, posted Saturday 18th July 2009 15:21 GMT, nearly a day after the bug was fixed: "The Linux kernel has a known, demonstrably exploitable security problem in the field, and the kernel developers do not wish to fix it. ...No, the reality is that too many Linux zealots including the kernel developers refuse to ever accept they're wrong on anything."

  79. Anonymous Coward
    Linux

    Several things are interesting ...

    One thing that's interesting about the original

    struct sock *sk = tun->sk;

    if (!tun)
        return POLLERR;

    is that the first line really does dereference "tun": it loads the value of the sk field from the memory "tun" points at; only "&tun->sk" would be pure address arithmetic. That dereference is exactly what entitles GCC to assume "tun" cannot be NULL and to optimise out the later test. The optimisation is permitted by the C standard, but it is unsafe for kernel code, where page zero may be mappable, which is why the kernel will now be built with -fno-delete-null-pointer-checks.
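
    For clarity, a self-contained fragment (not kernel code; the field layout is invented) showing the difference:

        struct sock { int dummy; };
        struct tun_struct { struct sock *sk; };

        void example(struct tun_struct *tun)
        {
            struct sock **pp = &tun->sk; /* address arithmetic only, typically no load */
            struct sock *sk = tun->sk;   /* reads memory through tun: a real dereference */
            (void)pp; (void)sk;
        }

    The driver does the second, and it is the second load that licenses the optimisation.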

    What's also interesting is all the guff about "this wouldn't happen if the kernel was written in some high-level language".

    Possibly.

    Actually, no. Not at all. I cannot think of a single language with enough expressiveness to write an operating system that doesn't have its own class of weird, annoying and sometimes not even subtle errors.

This topic is closed for new posts.
