No, he did not criticize it for having a fallback mode. He criticized it for not starting with the fallback mode. The fallback mode was only added after KVM broke. Linus was pointing out that this mindset is wrong: adding the fallback only after you notice something breaking. It should have been there from the start.
7 posts • joined 5 Apr 2014
Re: Sounds a bit too Microsoft to me
It has nothing to do with fixing security flaws. It's about hardening the kernel. The disconnect is about how to do the hardening. If your front door can be picked, that's the bug and everyone agrees that it should be fixed. The debate is how to protect yourself if it happens to be picked. Does an alarm go off (Linus wanting just a warning), or do you have it booby trapped to launch an arrow into whoever comes through the door (security folks crashing the kernel)? Linus is worried about the innocent person who comes through that door and takes an arrow in the chest.
Re: fairly sensible explanation ...
I see a lot of comments taking Linus's rant as being against fixing exploits. The thread he commented on is not about bug fixes, but about "hardening" the kernel. For a bug (an overflow bug, for example) to be exploitable, an attacker must find it, and then find a way to exploit it. The hardening effort is to make it more difficult to find the location in memory to do the exploit. This particular patch series is about limiting what memory in the kernel the "user copy" functions can write to, by adding "white lists". This prevents the sort of bug where the kernel lets the "user copy" functions point anywhere and change how the kernel is supposed to work. It's not about fixing a bug. It's about how to keep bugs from becoming exploits. It's not all black and white.
The issue is that the original patch series would crash the kernel when a "user copy" function accessed something not in the white list. Only at the end of the series was it changed to have a "fall back" method that just warns, because it wasn't until then that KVM was found to crash because of it. This mentality of the security folks to crash the kernel first is what triggered Linus's rant. He stated that it should have been a warning from the start. This is the disconnect Linus has with the security folks, and it is where he says that the security folks are done when the exploit is closed, but for the kernel developers the work only begins. The kernel developers now have to find all the places that need to be white listed that currently are not. Without the fall back feature, things that used to work (like virtual machines) now crash the kernel. That is not acceptable.
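To make the warn-versus-crash difference concrete, here is a minimal userspace sketch in Python. It is not the kernel patch: the region names, the `WHITELIST` table, and `check_usercopy` are all made up for illustration, and an exception stands in for the kernel crashing.

```python
import warnings


class UsercopyViolation(Exception):
    """Stands in for the kernel crashing on a non-whitelisted copy."""


# Hypothetical white list: region name -> list of (start, end) byte ranges
# that the "user copy" functions are allowed to touch.
WHITELIST = {
    "region_a": [(0, 64)],
}


def check_usercopy(region, offset, length, fallback=True):
    """Return True if the copy stays inside a white listed range.

    fallback=True  -> a bad copy only produces a warning (returns False).
    fallback=False -> a bad copy raises, i.e. "crashes the kernel".
    """
    end = offset + length
    for start, stop in WHITELIST.get(region, []):
        if start <= offset and end <= stop:
            return True

    msg = f"usercopy outside white list: {region}[{offset}:{end}]"
    if fallback:
        # Warn, so the missing white list entry gets found and fixed
        # without taking down something that used to work.
        warnings.warn(msg, RuntimeWarning)
        return False
    raise UsercopyViolation(msg)
```

With `fallback=True`, a caller that nobody remembered to white list (think of the KVM case) keeps running and leaves a warning behind. With `fallback=False`, the same call takes the whole system down.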
It's not that Linus hates security, far from it. He's trying to educate them to see the bigger picture. There are folks in the security world that agree with Linus. Just read the response from Jason A. Donenfeld, and Linus's response to him (which this article is about).
Re: He's right. Again.
I don't know. I haven't seen him wrong yet ;-) Although, he has said in the past something like, "if I'm wrong, I'm just an idiot". But he doesn't go into a rant unless he's confident that he's right.
And I read the thread where this article came from, and I see nothing personally offensive from Linus. Yeah, he swears and calls patches crap. But he's managing 10,000 changes a release, and needs to be efficient in letting people understand what he'll accept or not. And this is his way of saying "what you are doing is a show stopper, now stop that". I see people argue that he could be "nicer" and accomplish the same thing. Honestly, I don't buy that. I've seen Linus be "nice" and people don't "get it" until he starts swearing. The whole Arm mess didn't change after Linus asked nicely several times, but once he went into his swearing tirade, the entire Arm community fixed their crap.
Re: He's right. Again.
I'm one of the people that directly work with Linus. And it's a lie when he says he's "not a nice guy". Because really, he's one of the nicest people I know. The problem is, like many other people I know, when he gets upset, he can act like a jerk. The reason most people tolerate that, is because when he gets upset, you probably did something really stupid. I've only been on his bad side once, and at the end of that conversation, I realized that I was in the wrong.
It makes Linus look much worse that the only time he is in the headlines is when he's giving one of his rants. But that's really 1% of the time. Want to know how the community really is? Back in 1998, I had a ThinkPad that Linux wasn't recognizing the floppy for. I posted to the Linux Kernel Mailing List with what I found with my own sloppy debugging, and Linus himself replied back to me. He worked with me for several hours to help me get my floppy drive working. And I wasn't a one off. Linus and other top kernel developers have spent lots of time helping people get their kernels working. That is what got me hooked on kernel development.
Re: Odd timing
T.F.M, nice write up.
I'd also like to add that people are saying that the kernel should never let a user space app crash it. Well, systemd is no normal userspace application. It's not a word processor or a web server. It's PID 1, the first process the kernel starts and the parent of all other processes. It's responsible for starting everything that mounts file systems, starts network services, and the works. When you boot up Linux, some distros show an [OK] after each service has started; that's PID 1 doing the work (or one of the tasks it created). If PID 1 dies, the system panics. This is the way it has always worked. PID 1 is as *important* to the system as the kernel is.
Second, /dev/kmsg is a file that lets privileged (root only) tasks write into the kernel logging system. Systemd uses this to write messages in early boot up because there's no place else to write to. The filesystems haven't even been mounted yet. /dev/kmsg hooks into the kernel's own logging that prints out the messages you see on boot up, like the Linux kernel banner. This is also what the kernel uses to print out oops messages. The output is considered critical, and writes to it wait to make sure the data is seen before continuing, as we want as much data out as possible before the system crashes. It's a critical logger, and not something to take lightly.
The bug was that in early boot up, systemd would write loads of data into /dev/kmsg, and because this is a critical logger, systemd had to wait till those messages made it out to the console before continuing. If it also had some timeout that would trigger more prints, this could cause systemd to "live lock". That is, by the time it printed out a message, the timeout would trigger, and it would print out the same message, and the timeout would trigger again, never letting systemd make any forward progress. At this point, the system is hung. Remember, systemd *is* PID 1, and if it fails early, so does everything else. Nothing happens, and you cannot even log in.
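The live lock can be shown with a toy simulation. This is only a sketch under my own assumptions: `boot_makes_progress`, `write_cost` and `timeout` are invented names, time is in arbitrary units, and it models nothing more than "a blocking write that takes at least as long as the timeout keeps re-arming itself".

```python
def boot_makes_progress(write_cost, timeout, max_steps=1000):
    """Return True if the queue of pending messages ever drains.

    write_cost: how long one blocking console write takes.
    timeout:    how often the timeout fires and queues another message.
    """
    pending = 1            # one message waiting to go out
    clock = 0
    next_timeout = timeout

    for _ in range(max_steps):
        if pending == 0:
            return True    # nothing left to print, boot can continue

        # The write blocks until the message is out on the console.
        clock += write_cost
        pending -= 1

        # Every timeout that expired while we were blocked queues
        # the same message again.
        while next_timeout <= clock:
            pending += 1
            next_timeout += timeout

    return False           # still printing after max_steps: live locked
```

For example, `boot_makes_progress(write_cost=1, timeout=10)` drains the queue, while `boot_makes_progress(write_cost=10, timeout=10)` never does, because every write finishes just in time for the next timeout to queue another one.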
I read people saying that the kernel should not let userspace hang it, but it really did not. It was systemd hanging, but that's pretty much the same as the kernel hanging. The problem is that kmsg is a special file that most userspace is not allowed to write to. If you put too much data into it, it can slow the system to a crawl, as things must wait till it finishes. But a patch was created because of this thread that rate limits the data to /dev/kmsg. This patch was written by Linus Torvalds himself. What it does is, if there's too much data written to /dev/kmsg, it starts dropping new data. This prevents the writes from taking so long that things stop running. But it also means that you might be losing important data you want to print. There needs to be a balance. You want as much debug output as possible printed, but not so much that the system hangs trying to get that data out to the console immediately. And no, you can't let things progress, because the logging is so important that it must get out before the true bug locks up the box.
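The "drop new data once there is too much" idea can be sketched as an interval/burst rate limiter. This is not Linus's patch, just an illustration of the trade-off: `RateLimiter`, `write_kmsg`, and the `interval` and `burst` values are assumptions of mine, and a Python list stands in for the kernel log.

```python
class RateLimiter:
    """Allow at most `burst` messages per `interval` seconds, drop the rest."""

    def __init__(self, interval, burst):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.printed = 0
        self.dropped = 0   # how much (possibly important) data was lost

    def allow(self, now):
        # Open a fresh window once the current one has expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.dropped += 1
        return False


def write_kmsg(limiter, log, message, now):
    """Append `message` to `log` unless the limiter says to drop it."""
    if limiter.allow(now):
        log.append(message)
        return True
    return False
```

A flood of writes at the same instant gets only its first `burst` messages through, and `dropped` counts the rest. A larger `burst` keeps more debug data but lets a misbehaving writer stall the box for longer, which is exactly the balance described above.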