>" "throwing away" data pre-determined as "uninteresting" for a long while yet. Probably forever"
The detectors have layers of processing just because it's impossible to get the raw data out of the experiment fast enough (there isn't enough space for the cables).
So each layer tries to throw away as much as possible and only passes interesting events up the chain. When they were first designing ATLAS, I know they were planning to sample only a fraction of even the final data stream because they couldn't keep up.
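Conceptually it's a cascade of increasingly expensive filters, each one only seeing the (much smaller) stream that survived the level before it. A rough sketch of the idea in Python; the level names, cuts, and event fields are invented for illustration and are nothing like the real trigger logic:

```python
import random

# Toy tiered trigger: each level applies a more expensive test to the
# events that survived the previous level. Everything else is gone forever.
# Thresholds, field names, and rates are made up for illustration only.

def level1(event):
    # cheap, hardware-style cut: coarse energy threshold
    return event["energy"] > 20.0

def level2(event):
    # slightly more detailed reconstruction on the survivors of level 1
    return event["energy"] > 50.0 and event["tracks"] >= 2

def high_level_trigger(event):
    # full software reconstruction; only these events get written to storage
    return event["energy"] > 100.0

def process(events):
    kept = []
    for ev in events:
        if level1(ev) and level2(ev) and high_level_trigger(ev):
            kept.append(ev)  # a tiny fraction of what came in
    return kept

if __name__ == "__main__":
    events = [{"energy": random.expovariate(1 / 30.0),
               "tracks": random.randint(0, 5)} for _ in range(100_000)]
    kept = process(events)
    print(f"kept {len(kept)} of {len(events)} events")
```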
The big problem with archiving all the results is the associated knowledge/context.
You can keep all the configuration/calibration info linked to the data and design the data formats to be future-proof - but it's much harder to capture the institutional knowledge of "those results were taken when we were having problems with the flux capacitor, so anything above 1.21 GW is probably a bit dodgy; Fred knows about it but he left".
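One partial mitigation is to write the run conditions and known caveats into a structured provenance record stored right next to the data, so at least some of that context survives the people. A toy sketch, with every field name invented for illustration:

```python
import json

# Hypothetical provenance record kept alongside a data segment.
# All field names and values here are made up.
record = {
    "dataset": "run_2042_segment_7",
    "calibration_version": "calib-v3.2",
    "detector_config": {"flux_capacitor": "unit B (known faulty)"},
    "known_issues": [
        {
            "description": "flux capacitor misbehaving; readings above "
                           "1.21 GW are unreliable",
            "affected_range": {"power_gw": [1.21, None]},
            "contact": "Fred (has since left the collaboration)",
        }
    ],
}

with open("run_2042_segment_7.meta.json", "w") as f:
    json.dump(record, f, indent=2)
```

Even then, a record like this only captures the caveats someone thought to write down at the time.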
In HEP it rapidly becomes cheaper/easier to re-run the experiment with newer kit than to try to wade through the old stuff.
In astronomy it's different: we can't rerun the universe, so we religiously save every scrap of imagery.