Sounds a lot like SCSI queue re-ordering to me.
Also, if the program needs the data before it can proceed, what does it do until the cache has fetched it? Presumably it blocks. Or do you have to register all your data requirements up front? That is rarely easy to do.
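To make the question concrete, here is a rough sketch of the "register everything up front, block only when you actually need it" pattern. It is not how the scheme under discussion works; `slow_fetch` and `run` are made-up names, and the sleep just stands in for a slow read.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_fetch(key):
    # Hypothetical stand-in for a slow read (disk, network, cache miss).
    time.sleep(0.05)
    return key.upper()

def run(keys):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Register every request up front so the fetches can overlap.
        pending = {k: pool.submit(slow_fetch, k) for k in keys}
        # Unrelated work could go here while the fetches are in flight.
        # result() blocks only if that particular fetch has not finished yet.
        return [pending[k].result() for k in keys]
```

The catch is the first line of `run`: it only helps if you know `keys` ahead of time, which is exactly the part that is rarely easy.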
That said, I've often thought that programmers write code in the order that makes sense to them, even though much of it could be completely re-ordered and still work the same. I assumed modern compilers and processors already know this and do it.
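A toy illustration of what I mean, with made-up function names: the first two statements do not depend on each other, so swapping them changes nothing observable. Real compilers and out-of-order CPUs do this at the instruction level, not on Python source, so treat this as an analogy only.

```python
def as_written(x, y):
    a = x * 2      # step 1
    b = y + 10     # step 2, does not read or write anything step 1 touches
    return a + b   # step 3, depends on both

def reordered(x, y):
    b = y + 10     # step 2 moved ahead of step 1
    a = x * 2
    return a + b   # still has to come last
```

Only the final line is pinned in place, because it needs both results.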