Hmm ...
Still no standard library or keywords for multithreading and synchronization, though? (Something akin to Java's, or Concurrent Pascal's, would have been welcome ...)
It seems that the basic idea from the 1970s (Brinch Hansen, Hoare, ...) of "safe" languages, that is, trading some runtime speed for an easier implementation, still fails to catch on :). While waiting for that, libmudflap (or BoundsChecker, ...) can be used to achieve similar results with C/C++ during testing and development; in practice it catches a bunch of bugs for "free". These days many of the actual runtime environments can support such tools for weeding out bugs. For an embedded system without an MMU and/or the memory to spare, one should be able to run most of the same code on a workstation for testing purposes (at the cost of some extra work for setting up some kind of simulation).
I originally owe the above insight to converting a 1990s real-mode x86 (DOS) C (C++) system to 16-bit 286 protected mode (a DOS extender providing an environment like 16-bit protected-mode Win 3.x). As a side effect, I ended up with a system where most logical pieces of memory had an individual hardware bounds check at runtime. The number of bugs this caught was quite phenomenal and, importantly, I was able to set up an exception dump facility that allowed me to trace a run-time exception, even one that would happen only once in a blue moon, back to the source line and machine instruction. With the CPU register values it was usually easy enough to figure out what had gone wrong (the typical fix being to add an explicit bounds check to an array access: unchecked, such accesses may crash the system down the line, in some other section of code entirely ...).
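For illustration, the kind of explicit bounds check mentioned above might look roughly like this in C. This is only a minimal sketch, not the code from that system: the names table, TABLE_SIZE, table_set and table_get are made up for the example, and reporting the error via a return code is just one possible choice.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-size table; the name and size are illustration only. */
#define TABLE_SIZE 8

int table[TABLE_SIZE];

/* Store value at index. Returns 0 on success, -1 if index is out of range.
 * Without the check, an out-of-range write would silently corrupt whatever
 * happens to live next to the table and show up as a crash somewhere else. */
int table_set(size_t index, int value)
{
    if (index >= TABLE_SIZE) {
        fprintf(stderr, "table_set: index %lu out of range\n",
                (unsigned long)index);
        return -1;
    }
    table[index] = value;
    return 0;
}

/* Read the value at index into *out. Returns 0 on success, -1 if index is
 * out of range or out is NULL; *out is left untouched on failure. */
int table_get(size_t index, int *out)
{
    if (index >= TABLE_SIZE || out == NULL) {
        fprintf(stderr, "table_get: bad access at index %lu\n",
                (unsigned long)index);
        return -1;
    }
    *out = table[index];
    return 0;
}
```

A caller would then test the return value (for example, treating -1 from table_set as a reportable error) instead of writing to table[index] directly, so the failure surfaces at the faulty access rather than in unrelated code later on.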