@Tim
It's the quality of code generated by the JVM compared to gcc -O2. In C you can do structure/object peeling, so more structures/objects fit in the cache (line); in Java it's doable but a lot harder - that would be a major one. The other, which I mentioned already, is vector manipulation - SSE-style SIMD is not Java's forte. The peeling part does require profiling and non-trivial skills, but I'll grant it.
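To make the peeling point concrete, here's a minimal sketch (class and field names are made up for illustration): the same sum over a "hot" field, first through an array of objects, then with that field peeled out into a dense primitive array so a 64-byte cache line holds 8 consecutive values.

```java
public class Peeling {
    // Object layout: each Particle is a separate heap allocation with a
    // header; iterating over x alone still drags in y and mass.
    static class Particle {
        double x, y, mass;
        Particle(double x, double y, double mass) { this.x = x; this.y = y; this.mass = mass; }
    }

    static double sumXObjects(Particle[] ps) {
        double s = 0;
        for (Particle p : ps) s += p.x;   // pointer chase per element
        return s;
    }

    // Peeled layout: the hot field lives in one dense primitive array.
    static double sumXPeeled(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;       // sequential, cache-friendly access
        return s;
    }

    public static void main(String[] args) {
        int n = 4;
        Particle[] ps = new Particle[n];
        double[] xs = new double[n];
        for (int i = 0; i < n; i++) { ps[i] = new Particle(i, 0, 1); xs[i] = i; }
        System.out.println(sumXObjects(ps)); // 6.0
        System.out.println(sumXPeeled(xs));  // 6.0
    }
}
```

In C you'd get the same effect with a struct-of-arrays layout; in Java you have to maintain the parallel arrays by hand, which is why it's "doable but way harder".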
The JVM does profile-guided compilation by default, which is not the norm in C (still doable, though). Stack allocation (in C) is no big win, as "new" in Java is just a pointer bump (when the allocation is elided it costs nothing, but that doesn't happen often enough).
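A hedged sketch of the elision case (names are illustrative): the short-lived temporaries below never escape the method, so after inlining, HotSpot's escape analysis can scalar-replace them - the fields live in registers and no allocation happens at all. When that fails, each "new" is still just a TLAB pointer bump.

```java
public class EscapeDemo {
    static final class Complex {
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
        Complex plus(Complex o) { return new Complex(re + o.re, im + o.im); }
    }

    // All Complex instances stay local to this method: a prime candidate
    // for scalar replacement once plus() is inlined.
    static double sumRe(double[] res) {
        Complex acc = new Complex(0, 0);
        for (double r : res) acc = acc.plus(new Complex(r, 0));
        return acc.re;
    }

    public static void main(String[] args) {
        System.out.println(sumRe(new double[]{1, 2, 3})); // 6.0
    }
}
```

Whether the elision actually kicks in depends on inlining succeeding, which is exactly why it's "not often enough".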
The other one often mentioned is bounds checking, but well-written Java code (you need to know when it will be optimized) will have the bounds checks removed by the JIT. For instance, I got a ~20% perf. increase in jzlib by adding an extra check that shows the compiler the loop variable stays in bounds. In C you don't have to do that, although imo automatic bounds checks are a really, really good feature. Knowing when the JVM can inline code is probably important as well.
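A sketch of that kind of trick (method and parameter names are mine, not the jzlib code): one explicit range check up front proves to the JIT that the loop index stays inside the array, so the per-iteration bounds check inside the loop can be eliminated.

```java
public class Bce {
    static int checksum(byte[] buf, int off, int len) {
        // Explicit guard: proves off..off+len-1 is in range once, up front.
        if (off < 0 || len < 0 || off + len > buf.length)
            throw new ArrayIndexOutOfBoundsException();
        int s = 0;
        for (int i = off; i < off + len; i++)
            s += buf[i];   // JIT can now remove the bounds check here
        return s;
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, 4};
        System.out.println(checksum(b, 1, 2)); // 5
    }
}
```

The guard looks redundant, but it changes what the compiler can prove about the loop - that's the whole point.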
The human factor would be the main strength of C - the developers are generally more experienced/skilled/talented. Few Java developers have a clue how cache coherency works.
However, "time-to-market" and the lack of explicit memory management (especially in a concurrent environment) are a clear win for Java.
This is sort of anecdotal evidence: Azul didn't have a native zlib, so they coded it in Java; the difference was minuscule, and when the code was parallelized (having 768 cores), the Java version was a lot better.
For large-scale projects Java is still much better: it's easier to debug, with good stack traces (no debug builds or anything). You can even get a memory dump in production (still slow, but it doesn't stop the process or attach a debugger for real; jstack/jmap technically use the debugging interfaces to communicate).