Re: Re: I could use such power
Many image processing tools are easily broken up into small, independent loads. The tools we develop are so-called connected filters, in which data from any part of the image may influence all other parts. Often you want to avoid this, but for, e.g., the analysis of complete road networks, you cannot predict which parts need to communicate with which others. To cut communication overhead down, you need comparatively coarse-grained parallelism.
We have been able to devise a scheme (which we are now implementing) that cuts local memory requirements from O(N) down to O(N/Np + sqrt(N)), with N the number of pixels and Np the number of processor nodes; an ideally partitionable problem needs just O(N/Np) per node. Communication overhead drops from O(N) to O(sqrt(N) log(N)). This has moved the problem from the impossible to the "possible, but you need a coarse-grained cluster."
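To get a feel for what those bounds mean in practice, here is a back-of-envelope sketch (my own illustration, not the actual scheme) that plugs concrete numbers into the O() expressions, assuming all hidden constants are 1 and memory is counted in pixels:

```python
import math

def naive_per_node(N, Np):
    # Naive global filter: every node needs the whole image, O(N).
    return N

def scheme_per_node(N, Np):
    # Proposed scheme: local tile plus boundary data, O(N/Np + sqrt(N)).
    return N / Np + math.sqrt(N)

N = 10_000 ** 2   # a hypothetical 10,000 x 10,000 pixel image
Np = 64           # a hypothetical 64-node cluster

print(naive_per_node(N, Np))   # 100,000,000 pixels per node
print(scheme_per_node(N, Np))  # 1,572,500 pixels per node
```

With these (made-up) numbers, the scheme needs roughly 60x less memory per node, and the sqrt(N) boundary term is small next to the N/Np tile term, so the scheme sits close to the ideal O(N/Np).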
Of course, the first (shared memory) parallel algorithm for these filters dates from 2008, so we still have much to learn. Other problems in science can also require global interaction between particles (gravity has infinite reach). A lot of work goes into cutting communication down, but often at the expense of accuracy.