I think he means to draw a distinction between the time domain and the frequency domain. Assuming perfect instantaneous sampling, everything in between samples in the time domain is lost. But, frequency-wise, there's no new information between samples to miss.

But that's the whole point - you have defined a frequency band. A real-life audio signal does not keep to neat boundaries, so something like a clash of cymbals, for instance, will reach well into ultrasound territory. If you are sampling at 44.1 kHz, anything above 22.05 kHz (half the sampling rate) is going to be lost. The fact that you are defining a region of interest - presumably some "human hearing" range - is itself an acknowledgement of that. The data is lost regardless of whether you were interested in it or not.
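
To make that concrete, here is a rough sketch in Python. It assumes an ideal sampler with no anti-aliasing filter, and the 30 kHz test tone is just a made-up stand-in for the ultrasonic part of a cymbal crash. Sampled at 44.1 kHz, it produces exactly the same sample values as a 14.1 kHz tone, so nothing in the samples can tell you the ultrasonic component was ever there.

```python
import math

FS = 44_100              # sampling rate in Hz
F_ULTRA = 30_000         # hypothetical ultrasonic component, above FS / 2
F_ALIAS = FS - F_ULTRA   # 14_100 Hz, which sits inside the audible band


def sample_cosine(freq_hz, n_samples, fs=FS):
    """Return n_samples ideal, instantaneous samples of a unit cosine."""
    return [math.cos(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


ultra = sample_cosine(F_ULTRA, 1000)
alias = sample_cosine(F_ALIAS, 1000)

# The two tones differ only between the sampling instants, so the sampled
# values agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(ultra, alias))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

A real converter would low-pass filter before sampling, so the 30 kHz content would be removed rather than folded down to 14.1 kHz. Either way, it cannot be recovered from the samples.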

