Am I missing something?
Surely embedded systems could profit.
Some heralded Docker's acquisition of UK-based Unikernel Systems last week as the golden dawn of a post-container era. Others showed healthy skepticism. One person firmly in the latter camp is Bryan Cantrill, who typed up a long blog post on why he believes unikernels are "unfit" for production. Cantrill is chief technology …
Cantrill seems to be promoting the idea that "proper" OSes, like Solaris / Unix, are more reliable because there is an interface that stops user-space mistakes migrating into kernel space. This is obviously flawed, as anyone who's ever made a system call with incorrect parameters will know. Or as anyone whose application sits waiting on I/O to a networked device can see, after that device (or the network) has gone away.
In theory, what he proposes has merit. A reliable, resilient, impenetrable, wall between the two. However faults in device drivers and poorly written code, APIs or bad implementations mean we never get this in practice.
And then there's the performance issue. Moving between kernel and user space takes time. The more checks, tests and privilege validations you put in place, the longer it takes, and the slower your machine gets when you scale up to production levels of load. (I recall that Sun moved their telnet server from user-space to kernel-space in the 90s for this very reason.)
One area that he does flag up is the ability to debug your applications. But isn't this just a function of the tools that (would) be built into a unikernel? If they aren't there now, that doesn't mean they couldn't be in the future. It might even bring about the return of hardware based debugging - which has the advantage of sitting outside the running system and therefore not affecting its performance or logic flow.
So you go from the real-OS situation where it takes a flaw in something like a device driver, to a DOS world where it takes a flaw LITERALLY ANYWHERE. I take it you're one of those "what good are static types? I'm clever!" guys.
No, it has to be a flaw in the app that breaks through the built-in protection provided by the hypervisor. That needs to target some vulnerability in the hypervisor... so no different to running on a conventional OS then.
Only, of course, if your app runs on a bare hypervisor rather than a conventional hypervisor/OS/app stack you only have a single layer of vulnerability rather than two. You also have only a single layer of, e.g., memory management running rather than two, and that running mostly in silicon rather than needing another emulated software layer on top - yes, even with the assistance of hardware virtualisation.
No, it's not for everyone but for a VM that is only running a single app I can't see the issue.
Oh, did you want your single-purpose unikernel app to write to your production database?
Where is your hypervisor now?
Precisely where it should be: staying out of the way.
If you build access control into your clients come back when you have something meaningful to all.
The more software you have involved the greater the vulnerable surface. How many 0 days are in your operating system? How many affect you if there is no OS?
... the more secure your OS and application are.
Yes, it takes time. That's what fast processors are for. Security needs additional processing - you can get rid of it and obtain faster applications. DOS was very, very fast. Utterly insecure, though.
While I can't speak to how much more efficient using the unikernel approach might work, in all the software packages I wrote sanity and validation checks consumed a few percentage points of performance. Having survived the DOS world with some of my sanity and humor intact, I'll take a pass on the unikernel approach, thank you.
[Actually I predate DOS, having used VMS and an introduction to AT&T Unix in my teens. Fun times. DOS was never fun!]
> DOS was very, very fast.
No, it wasn't. Display calls to MS-DOS were very, very slow. Display calls to BIOS were passably fast. If you wanted very fast display you bypassed both and did direct screen writes, just like most professional software did.
MS-DOS was also very slow on file access, in particular on large data files that required random access, due to the way FAT worked. Large ISAM files were particularly slow compared to other systems because, in order to access a particular position within the file, the OS had to start at the directory entry for each access and follow the FAT chain until it found the appropriate cluster. That is why defragging was required: by bringing all the FAT entries for a file together, it reduced the number of data blocks that had to be read. Inode-based systems, for example, could access any part of a large data file with many fewer block reads.
DR-DOS had a feature that was not available in standard MS-DOS and that was the cluster size could be specified when a partition was formatted (other utilities could also do this). On a particular partition size MS-DOS would only give, say, 2KB cluster size. Using DR-DOS to give an 8KB cluster size would give an improvement of 3x for random access to a 1Megabyte ISAM data file with no other change. This was solely because there were 4x fewer FAT entries to access.
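To make the FAT point above concrete, here is a rough Python sketch of a toy FAT. It is not real MS-DOS code, and the names `build_chain` and `fat_lookups` are made up for the illustration. It assumes one unfragmented file and simply counts how many FAT entries have to be followed to reach a byte offset when every access starts again from the file's first cluster.

```python
def build_chain(file_size, cluster_size, first_cluster=2):
    """Build a toy FAT for one contiguous file.

    fat[c] holds the number of the next cluster; None marks end of chain.
    """
    n_clusters = -(-file_size // cluster_size)  # ceiling division
    fat = {}
    for i in range(n_clusters):
        cluster = first_cluster + i
        fat[cluster] = cluster + 1 if i < n_clusters - 1 else None
    return fat


def fat_lookups(fat, first_cluster, offset, cluster_size):
    """Follow the chain from the first cluster to the one holding `offset`.

    Returns (cluster, hops), where hops is the number of FAT entries read.
    """
    cluster = first_cluster
    hops = 0
    for _ in range(offset // cluster_size):
        cluster = fat[cluster]
        if cluster is None:
            raise ValueError("offset beyond end of file")
        hops += 1
    return cluster, hops


FILE_SIZE = 1024 * 1024  # the 1 MB data file from the comment above
for cluster_size in (2048, 8192):
    fat = build_chain(FILE_SIZE, cluster_size)
    _, hops = fat_lookups(fat, 2, FILE_SIZE - 1, cluster_size)
    print(cluster_size, len(fat), hops)
# prints:
# 2048 512 511
# 8192 128 127
```

The sketch only counts chain entries, not actual disk reads, so it shows the 4x reduction in FAT entries rather than the measured 3x speed-up.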
The only reason that MS-DOS was perceived as being 'fast' was because it could be bypassed by the programs and didn't get in the way.
I find that I need a "proceed with caution" icon, as this is a bit outside my area of expertise and thus what I am about to write might be less sage wisdom and more senseless drivel.
Well, more so than usual, anyway:
It seems to me that if a VM is running only a single application anyway, then a DOS-like approach might not be so senseless, if it gives a performance boost and/or makes life easier for the application and/or the OS and their developers. After all, it's now the hypervisor that's separating the different VMs and their applications, protecting the system (IE, the virtualization host and all the different running VMs) from those that misbehave.
That's my assumption as well - there is still separation between container and hypervisor, by means of a switch from "Ring - 1" i.e. virtual kernel mode, to the actual kernel in hypervisor mode i.e. Ring 0, and the approach offered by unikernel means that the user mode application (i.e. container) simply does not bother to exit from its "Ring - 1", but from the point of view of the hypervisor it is a user mode application which does not need to run as root etc.
I'd like to know if this assumption is correct, actually.
"Hypervisors aren't inherently safe, even if they aim to be".
All code is inherently buggy, yes.
Hell, even HW is buggy:
So I guess there's a case to be made for "defence in depth" for Important Stuff™. But probably not applicable for most workloads, I think.
OS kernels are not inherently safe, either. But I like the approach of "do one thing, and do it well" and the unikernel design seems to promote it (perhaps not directly). This is because you can have 1) an actual kernel running the hypervisor with not much functionality enabled besides (e.g. no KEYS - see the more recent advisory in the Linux kernel) and then on top of it, and with (hopefully !!) proper isolation, 2) user processes running the application code as efficiently as possible and with little dependency on high-level services provided by the hypervisor. And these user processes actually happen to be unikernels, each dedicated to running a single application only.
But, as with anything, I imagine you can also have a bad system built from the same blocks - e.g. with unnecessary services provided (and possibly breached) in the hypervisor and/or unnecessary coupling of containers.
I will be watching on the sidelines.
The hypervisor could protect access to shared resources of the host, but would have no way to protect data inside the unikernel app. Unless the app is very simple, there could be different threads at different privileges running even inside a single app. Take for example a server that needs to control access to different resources depending on the user requesting it (be it SMB, HTTP or whatever). If it is the kernel that enforces process/thread security, and the application has no way (short of bugs/vulnerabilities) to modify it, security is much higher. Make everything run in a single security context, with everything accessible, and enforcing security becomes much more difficult.
It looks to me as if the simplistic model of HTTP requests with no security but some form of cookie just handled at the application level, and no mapping to the underlying OS security model, is the driver of these approaches - but many systems are far more complex, with stronger security needs than a simple web request, and need security enforced beyond the application code.
From my operating system classes, I remember the idea of running single-tasking operating systems inside a hypervisor was developed already by IBM in the 1960's as their solution for this new-fangled idea of timesharing. Wikipedia has a writeup here: https://en.wikipedia.org/wiki/CP/CMS
On an OS which doesn't isolate apps, it only takes a pointer mistake and the OS can crash hard and you have no idea why, so you have to guess more!
Passing data between app and OS does not need to be slow if you use lightweight queued message passing and minimise memory copying.
I wouldn't want to write a unikernel app in C for this reason, but MirageOS is written in OCaml and either that makes pointer mistakes impossible and the entire program simpler to verify correct before deployment or we're not ready for unikernels. Given how bug-free Haskell/Scala code tends to be I can believe that OCaml is similarly reliably correct.
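On the message-passing point above, a minimal Python sketch of the idea, under loud assumptions: `shared`, `mailbox`, `send` and `receive` are made-up names, and a `bytearray` stands in for memory visible to both the app and the OS. Messages are queued as `memoryview` slices, so the payload bytes are never copied on the way through the queue.

```python
from collections import deque

# Hypothetical shared buffer standing in for memory visible to both sides.
shared = bytearray(b"request:read block 42")
mailbox = deque()  # the message queue between "app" and "OS"


def send(start, end):
    """Queue a message as a view onto the shared buffer; no bytes are copied."""
    mailbox.append(memoryview(shared)[start:end])


def receive():
    """Take the oldest queued message (still just a view, not a copy)."""
    return mailbox.popleft()


send(8, 21)
msg = receive()
print(bytes(msg))     # b'read block 42'

# Changing the buffer in place is visible through the received view,
# which shows the message really does share memory with the sender.
shared[8] = ord("R")
print(bytes(msg))     # b'Read block 42'
```

The flip side is visible in the last two lines: without a copy, the sender can still change what the receiver sees, which is exactly the kind of sharing an isolating OS has to manage carefully.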
This sounds like User Mode Linux which has loaded an executable into the UML VM kernel space.
That is what? 17 or 18 year old tech which has been beaten into oblivion by now and has an extremely well known performance envelope.
What it can do:
1. It can perform network-wise - replacing the stock network drivers gives you 3G+ per core forwarding speeds under pure paravirtualization and 6G unidirectional to "blackhole sink" - same as bare metal Linux. (been there, done that, patches are published, but not in mainline kernel).
2. It can perform disk-wise - same, the stock drivers can be replaced to jack up performance 2x (been there, done that, patches are published, but not in mainline kernel).
3. When you combine 1+2 you can perform app-wise in a monolithic single-threaded app (be it in kernel space or in UML emulated userspace) if it preallocates all memory it needs. If you feel very UNI just run the app instead of sysV init (been there done that, this requires no patching).
4. You get all kernel APIs and/or userspace APIs depending on how you use it so you do not need to reinvent the wheel. You for all practical purposes can (ab)use the Linux kernel as a library this way.
This works around the issue known as "UML is slow", which under the hood is that it sucks rocks when it tries to exec a new process, because that causes full memory synchronisation and a TLB flush. That, in the case of a "uni-app", especially in virtualized kernel space, is not a problem. It never execs and it can rock (and blast at stupid speeds per core which kvm, xen and vmware can only dream of).
There is nothing new here, move along. Just a classic case of Rabid Californication - one over inflated Californicating Entity overpaying a stupid amount of money for another Californicating Entity. Anywhere outside the Silly Valley distortion field the money exchanged for the Californicating Cross Pollination will beggar belief.
Posting anon - so people do not associate my El Reg nom de guerre with my real name :)
If you are spinning up a set of instances for a specific task, then does the occasional instance failure matter?
I mean I'm with him on the debugging thing - it would be nice to know that bad data was the cause of the crash - but in the case of spinning up instances is there not an argument that:
a) efficiency of spinup is more important than normal
b) reliability is less so
Depends what is happening inside that instance. Does an instance serve a single user or transaction? If not, it will bring down all user activities or transactions. How much are people going to push the granularity of containerized applications?
Take a simple instance of a web server - how many users will it be serving concurrently? Otherwise a unikernel app should be trimmed down to single processes/threads to minimize impact - but then you have a single process/thread running on a kernel (the hypervisor...) - back to square one (with maybe even more overhead due to virtualization...)
A Unikernel is just a process without the standard syscalls.
It looks fast at the moment because most of the POSIX features are missing. In future these features will get re-added to form a bloated and slow mess, but effectively it will be recognisable as a Linux process...
"They who misunderstand UNIX are destined to reinvent it badly..."
DTrace/Zones/ZFS ran into a couple of problems. They matured when Sun were falling to bits. Then Oracle bought them. Hard to come back from that.
The people at SmartOS *ought* to be supported as they will, hopefully, keep those technologies going in the open world.
Docker is a bodge. I can see that. It allows you to run up an application sat on top of a socket. It is not a general purpose OS - it's an application spoof thing. I'm investing time and (not much) money in it as Google and Amazon are cutting their throats on competing and I can trial some stuff for free or, at least, very cheap. Hey, if a company wants to give me stuff cheap so I can make money then who am I to complain.
I was looking at Google's AppEngine but there is too much of a technology buy-in there - Python 2.7 + Google's API. I want to avoid tying myself to a technology and a company.
If the stuff is successful then I will scale to a SmartOS hosting company when I need to get bigger.
I understand the limitations of Docker. I'm not sure a lot of people (hipsters mainly, and management) do.
One prediction on Docker - they'll run into the same problems with UnionFS on Linux that everyone does. They'll get 95% of the way and have to fudge. At least this will speed up the improvement of ZFS on Linux as that's the only mature filesystem that supports the features Docker needs.
Linux for? Beardy hipsters?
Seriously, running anything that needs a graphics driver with hardware less than 7 years old.
I've found BSD graphics support to be more miss than hit compared with Linux.
And you get those funny enet devices that crop up that are not supported by BSD.
FreeBSD is great - but you do have to be careful with the HW combos.
It's fine, been stable for long enough and performs well (I use a small NVMe for the ZIL device and L2ARC). There are always some open issues at ZoL but find me a non-trivial project with no open issues :) Here's my favourite . The trick is to find the right recipe for your distribution - I use Arch with archzfs and my own fixes
I have been using it for over a year on Debian for my molecular modeling machine. The DKMS mechanism works very well.
It is very stable, has snapshots (copy on write), and I can get 500MB/s (0.5GB/s) from the 5 enterprise disks it lives on, RAID-Z3. Granted it is SAS, and I have 256GB of main memory, but it is invisible IMHO.
BTRFS has got much better, though Nov14 when I got this box, BTRFS nearly shredded my SSD...
BTRFS is the long range vision (compatible license), ZFS is for right now.
Given that FreeBSD has a better network stack and ZFS, what exactly is Linux for these days?
Linux is probably a bit more versatile (and does support a rather obscene amount of obscure hardware) as a desktop/workstation.
For a server I do prefer *BSD given a choice. That is not to say Linux doesn't work as server, of course it does.
As El Reg is a .co.uk site I would have expected the spelling "sceptical". That's how we do things over here.
My dictionary, however, tells me that the root of the word is the Ancient Greek "Skeptikos" ("σκεπτικός"), meaning "one who observes", so the use of the letter 'k' has some history as a transliteration of the Greek letter Kappa.
Were it applied to both Kappas in the word I could buy that argument, but then we should have the spelling: "skeptikal".
Really? These days?
Real operating systems have had a user/kernel split for 40 years or more.
It was a fundamental feature of UNIX since Version/Edition 6 (my earliest experience, possibly longer), and in other OSs like DEC RSX-11 and VAX/VMS, and probably a host of other OSs from the same era.
Even in the Microsoft world, Windows/NT must be 20 years old at least.
DOS was a retrograde step that should have been strangled as soon as the 80286 became the dominant processor, and MS should really not have compromised on the initial security design of NT.
What would have been even better would have been a desktop UNIX on a suitable architecture at a cost that suited the industry! Linux just came along too late!
"... and in other OSs like DEC RSX-11 and VAX/VMS, and probably a host of other OSs from the same era.
Even in the Microsoft world, Windows/NT must be 20 years old at least. ..."
Not surprising - really - that they all share this feature (among others), as Dave Cutler was responsible for RSX-11M, VAX/VMS and then moved to Micro$haft to create Windows/NT! (Cool or what?)
Why is that pedantic? I was aware that Dave Cutler did all of these, and I nearly mentioned it myself. I listed UNIX and the DEC operating systems, as these were from my own experience.
I'm pretty certain that PrimeOS, MPE, VME, AOS, VOS, MTS (just a list of other time-sharing OSs that spring to mind from this era) also had this feature.
"I'm pretty certain that PrimeOS, MPE, VME, AOS, VOS, MTS (just a list of other time-sharing OSs that spring to mind from this era) also had this feature."
Hmm, should I worry that I have used all of the above?
MPE V was fun to write for, and HP's manuals on the intrinsics were quite good (if my memory serves).
"DOS was a retrograde step that should have been strangled as soon as the 80286 became the dominant processor, "
There was an attempt to do so. It was called OS/2.
Did you ever try to write segmented code in assembler for Intel 8086/80186/80286? Oh, the horror.
Progress had to wait for 80386, with a big, flat memory model.