Useful if combined with other stuff
What they've *actually* shown is that they can recover speech, specifically, from a signal that is very noisy (which amounts to the same thing as very quiet), heavily band-limited, way below what you'd expect to need, and non-linearly distorted. There are other covert sensors that can give you exactly that kind of input.
There are a couple of very nice papers that experimentally demonstrate moderately good sound recovery from smartphone video, shot through a window, of objects inside the room. The principle is that something like a Coke can vibrates very slightly with the sound, and because the camera reads the image out row by row (sequential scan, i.e. rolling shutter), the straight lines on the object appear very slightly wavy.
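To make the sequential-scan effect described above concrete, here is a rough Python sketch, not the method from those papers. It assumes an idealised camera whose row readout fills the whole frame period (real sensors have a blanking gap between frames, so the samples are not perfectly uniform), a noiseless pure tone, and an edge position that can be measured exactly in every row. The function names and the 30 fps / 480 row figures are made up for illustration.

```python
import numpy as np


def effective_sample_rate(fps, rows):
    """Each row is exposed at a slightly different time, so one frame
    yields `rows` time samples instead of one."""
    return fps * rows


def simulate_rolling_shutter_edge(freq_hz, amp_px, fps=30.0, rows=480, frames=3):
    """Horizontal position of a vertical edge as seen in every row.

    Assumption: readout spans the full frame period, so row r of frame f
    is captured at time (f * rows + r) / (fps * rows).
    """
    row_time = 1.0 / (fps * rows)
    sample_index = np.arange(frames * rows).reshape(frames, rows)
    t = sample_index * row_time
    # The object vibrates sideways; a straight edge therefore shows up
    # as a slightly wavy line down the frame.
    edge_x = amp_px * np.sin(2.0 * np.pi * freq_hz * t)
    return t, edge_x


def dominant_frequency(samples, sample_rate):
    """Frequency of the largest FFT peak in a uniformly sampled signal."""
    samples = samples - samples.mean()
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(samples.size, d=1.0 / sample_rate)
    return float(freqs[np.argmax(spectrum)])


if __name__ == "__main__":
    fps, rows = 30.0, 480
    t, edge_x = simulate_rolling_shutter_edge(440.0, amp_px=0.2, fps=fps, rows=rows)
    rate = effective_sample_rate(fps, rows)
    # Reading the per-row edge offsets in capture order gives an audio-rate signal.
    print("effective sample rate (Hz):", rate)
    print("estimated tone (Hz):", dominant_frequency(edge_x.ravel(), rate))
```

The point of the sketch is only the sampling argument: at one sample per frame a 30 fps camera could represent nothing above 15 Hz, whereas one sample per row puts a 440 Hz tone well inside the recoverable band. Noise, sub-pixel edge estimation and the inter-frame gap are exactly where the real difficulty lies.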
This research might improve the speech recovery from such a system, which, if you listen to it, is currently at the level of making out some syllables and having to guess at others, namely the ones with more high-frequency content.