Having worked with Inmos Transputers and Intel i860s in the 90s, this does all sound rather familiar.
However, although there definitely is a need for a parallel programming language for high performance applications, for most needs this is a red herring.
Modern operating systems, even Windows, run many separate processes to perform individual tasks. So once the kernel can handle a large number of processors (not necessarily a trivial task), the general application layer can carry on as before, with the many processes genuinely running in parallel.
Indeed, when I was doing parallel simulations, it was impossible to beat the overall efficiency of running a complete simulation on each processor, rather than spreading each simulation across all the processors and running them one after another really quickly.
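
To make the "one complete simulation per processor" idea concrete, here is a rough sketch in Python. It is not the code I ran back then: `simulate` is a hypothetical stand-in (a toy random walk) and `run_ensemble` is a made-up helper name. The only point it illustrates is that each worker process runs a whole job from start to finish and never talks to the others.

```python
import random
from multiprocessing import Pool, cpu_count


def simulate(seed, steps=10_000):
    """Hypothetical stand-in for one complete, independent simulation run."""
    rng = random.Random(seed)  # each run owns its own seeded generator
    position = 0
    for _ in range(steps):
        position += rng.choice((-1, 1))  # toy 1-D random walk
    return position


def run_ensemble(seeds, workers=None):
    """Hand whole simulations to worker processes; runs never communicate."""
    with Pool(processes=workers or cpu_count()) as pool:
        return pool.map(simulate, seeds)


if __name__ == "__main__":
    # One seed per job; the OS schedules the worker processes across the cores.
    print(run_ensemble(range(4)))
```

Because the runs share nothing, there is no communication or synchronisation overhead to pay, which is roughly why this layout is so hard to beat on overall throughput. The price is that any single run takes just as long as it would on one processor.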