Cores are cheap, it's how to use them...
Having taken the plunge into embedded software development, the unikernel concept doesn't seem very different from the standard embedded way of doing things.
Perhaps a differerence in that in embedded-land, specs are much "harder" and clearly thought-through, and testing much more thorough than in web-service land.
However there is a better way, that (of course) was first developed in concept more than 25 years ago.
That is, a bunch of unikernels that are functionally separate and only interact by exchanging messages. This re-introduces isolation, both conceptual and physical, without corresponding overheads. The model is called Communicating Sequential Processes.
CSP also includes a concept of low-overhead fork/join and yes, works best with languages that are amenable to high-quality static analysis (so that resource requirements can be computed in advance). XMOS microcontrollers and their C-derivative language XC are the modern-day examples; how I wish they'd produce some proper _fast_ processors!
The key idea with the XMOS chips is that there is direct hardware support for all the CSP constructs, so message-passing overheads and task switching are very efficient.
Back in the land of normal processors these kind of facilities are not available; but cores are now so cheap that it is possible in concept to dedicate a core to managing each peripheral. I'm not thinking of the x86 architecture here, but devices like the lowrisc chip, Zynq MPSoC, and especially TI's DaVinci media processors.
What is still lacking is decent message-passing hardware in mainstream processors, so that hand-off of requests to lightweight coprocessors does not need to go through a memory-mapped interface (where the resource management handshaking always seems to get ugly).