Yes and no.
I'd say "yes and no". Some HPC tasks require each node to have fast access to the entire data set (weather modelling and hydrological modelling are the two I know of offhand). This type of computing requires either one large box (SGI built a few Altix systems with over 4000 nodes with one large pool of memory) or a very fast interconnect such as HyperTransport or InfiniBand. Regular clustering will not work for this type of computation; it favors the fastest possible CPUs with large amounts of local memory and very fast access to remote memory.
A lot of computations split up fine though, and the ARM is a beast per MHz. Even if each CPU is a little slower, they are so compact, low-power, and cheap to purchase compared to traditional compute nodes that you'd come out well ahead (in cost and power) just by buying somewhat more nodes to make up the per-CPU speed deficit.
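To put rough numbers on that trade-off, here's a back-of-the-envelope sketch in Python. Every figure in it (per-node throughput, price, wattage) is made up for illustration and not measured from any real hardware, and the function names are my own. It also assumes the work splits perfectly across nodes and ignores interconnect, cooling, and rack space, so it only applies to the "splits up fine" case.

```python
import math

def nodes_needed(target_throughput, per_node_throughput):
    """Nodes required to reach a target, assuming perfect scaling."""
    return math.ceil(target_throughput / per_node_throughput)

def cluster_totals(target_throughput, per_node_throughput, node_cost, node_watts):
    """Total node count, purchase cost, and power draw for one node type."""
    n = nodes_needed(target_throughput, per_node_throughput)
    return {"nodes": n, "cost": n * node_cost, "watts": n * node_watts}

# Hypothetical numbers only: a "traditional" node that is 4x faster per node
# but 8x the price and 12x the power draw of a small ARM node.
TARGET = 1000.0  # arbitrary units of work per second

traditional = cluster_totals(TARGET, per_node_throughput=10.0,
                             node_cost=4000.0, node_watts=300.0)
arm = cluster_totals(TARGET, per_node_throughput=2.5,
                     node_cost=500.0, node_watts=25.0)

for name, totals in (("traditional", traditional), ("arm", arm)):
    print(name, totals)
```

With these invented inputs the ARM cluster needs four times as many nodes but still comes in at half the purchase cost and a third of the power. Change the ratios and the answer can flip, which is really the whole point: it depends on how big the per-CPU deficit is relative to the price and power gap.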


ARM+GPU = dream super?