The world's hard drive shortage, caused by deadly flooding in Thailand, is holding back CERN's antimatter research, a top scientist at the boffinry nerve center said last night. Analysis of figures spewing out of the Large Hadron Collider was compromised by a lack of storage space, said Peter Clarke, who works on the CERN LHCb …
I have a few gigs spare
Can't they do a crowd source thing like they did with distributed computing?
I guess some clever data duplication methods might help to make sure all the data is available all the time...
Re: I have a few gigs spare
Nice idea, but my quick and dirty estimates say that if they wanted to crowd-source 1 petabyte and the associated calculations using distributed PC-type equipment, they'd need 10 million volunteers, and the public networks would take a big hit on bandwidth.
It's still a nice idea :)
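As a sanity check on those numbers (my own back-of-envelope, assuming 1 PB = 10^15 bytes and a single stored copy), the raw disk share per volunteer is actually modest; the 10-million figure is presumably driven by compute and bandwidth rather than capacity:

```python
# Back-of-envelope check of the 10-million-volunteer estimate above.
# Assumptions are mine, not CERN's: 1 PB = 1e15 bytes, one stored copy.
PETABYTE = 1e15
volunteers = 10_000_000

share = PETABYTE / volunteers  # bytes of data per volunteer
print(f"{share / 1e6:.0f} MB per volunteer")  # 100 MB per volunteer

# With 3x replication (volunteers come and go), the shipped volume triples:
replicated = 3 * PETABYTE
print(f"{replicated / 1e15:.0f} PB total over public networks")  # 3 PB total
```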
If they are really desperate I can probably round up 5TB of unused disks here, and I bet other hobbyists across Europe could as well. LVM could turn them into one big bit bucket.
If a pan european call to donate hardware went out, I'd box some stuff up and post it to them.
7PB of storage
That's an awful lot of Paris Hilton angles. 1,048,576 Paris Hilton angles assuming an uncompressed 60 minute tape at any rate.
Re: 7PB of storage
Had someone from CERN on "The Life Scientific" on R4 this week and he described how they have something like a Gpixel "3-d camera" taking a picture every few nanoseconds when they run experiments ... hence the huge amount of data!
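Taking that description at face value gives a startling raw rate. A rough sketch, using illustrative numbers of my own (one byte per pixel and the ~25 ns LHC bunch-crossing interval), not official CERN figures:

```python
# Rough data-rate arithmetic for the "Gpixel camera every few nanoseconds"
# description above. All figures are illustrative assumptions.
pixels = 1e9               # ~1 gigapixel per snapshot
bytes_per_pixel = 1        # assume one byte of readout per channel
snapshot_interval = 25e-9  # LHC bunch crossings are ~25 ns apart

raw_rate = pixels * bytes_per_pixel / snapshot_interval  # bytes per second
print(f"~{raw_rate / 1e15:.0f} PB/s of raw readout")  # ~40 PB/s
```

No wonder the trigger systems throw almost all of it away before anything touches a disk.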
No we ain't....
...short on space. We just paid a bit more for it this year.
Re: No we ain't....
Exactly, surely his actual quote is more along the lines of "…storage is much more expensive, so we can't afford what we need…"; there's a bunch of big companies that'll sell you 1+PB of storage, but this year you'll be paying an arm and a leg for it.
Perhaps they could assist Carpathia with their MegaUpload legal battle and free up about 6 months worth of storage for themselves?
Alternative data storage organisations
CERN & NSA -> All the data never gets seen again.
CERN & Google -> Google gets to index the raw structure of the universe - what could go wrong?
CERN & Amazon -> You now get recommendations from Amazon on what elementary particle other people have been using.
CERN & NASA -> the data gets translated into imperial format and then gets lost.
CERN & Microsoft -> ooh, where to start... Microsoft offer to reformat the data, and from then on you need a succession of patches to read the data.
CERN & Apple -> The data gets formatted to make it look really pretty, but no-one else can read it.
hmm - who am I missing?
Re: Alternative data storage organisations
CERN & The UK Civil Service -> The data is on several million USB sticks left behind on trains.
CERN & MPAA -> There will be a delay while DRM is applied to all existing data.
CERN & PC World -> For a small fee, the data comes with an extended warranty.
Re: Alternative data storage organisations
Well it'll only make sense to Australians, but:
CERN & NBN -> You collect and transfer all your data at blazing speed, but just when you're ready to start analysis the Liberal Party wins the election and Tony Abbott has all data deleted.
Did they try zipping it?
They flat out state that they have more than enough processing and network capacity, so why not shift some of that to *compressing* that huge gob of data they produce? Or maybe spend a few bucks on de-duplication technologies; I am pretty sure that would remove a huge amount of the storage needed.
How do you think "50 million petabytes a year" gets reduced to "15PB" (a factor of about 3 million:1)? They're already compressing it incredibly.
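Taking the two figures in that comment at face value, the arithmetic does check out:

```python
# Sanity check of the "factor of about 3 million:1" quoted above,
# using the raw and stored volumes exactly as stated in the comment.
raw_pb_per_year = 50_000_000  # "50 million petabytes a year"
stored_pb = 15                # "15PB"

factor = raw_pb_per_year / stored_pb
print(f"{factor:,.0f}:1")  # 3,333,333:1
```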
They are not compressing the data; they are just throwing away uninteresting collision events.
Filtered *and* compressed
They only record 'interesting' events, and they store the events gzipped, IIRC. I may be wrong because I left the field a few years ago, but I think the issue is getting enough storage at all the replication sites to allow efficient analysis by all the physicists in the collaboration, not that they can't afford the storage for a single copy of the events. That would be nuts, clearly.
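A minimal illustration of the "stored gzipped" point, using made-up event records (the real experiments use their own formats; this is only a sketch of why highly regular records compress well):

```python
# Toy demonstration: repetitive event-like records shrink a lot under gzip.
# The record layout here is invented purely for illustration.
import gzip
import json

events = [{"event_id": i, "hits": [i % 7] * 50} for i in range(1000)]
raw = json.dumps(events).encode()
packed = gzip.compress(raw)

print(len(raw), len(packed))  # compressed output is far smaller
assert len(packed) < len(raw)
```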
Have they considered
Throwing a few megabucks at HP to get their memristor chip stacks working?
That would solve the storage problem, just dump the raw data directly to MR-Flash then do the processing as needed.
Alternative idea: get people with PS3s etc. to do the preprocessing, like SETI@home, and locally cache the data for them.
If it gets lost, no biggie, as it's duplicated across multiple machines.
Each PS3 does a different subset of the task on the same data block, and they are all reconstructed at the end.
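The replicated-volunteer scheme described above can be sketched in a few lines. Everything here (machine names, block counts, the round-robin placement) is illustrative, not how any real grid assigns work:

```python
# Toy sketch: place each data block on several machines so that losing
# any single volunteer leaves every block still available elsewhere.
from collections import defaultdict

def assign(blocks, machines, replication=3):
    """Round-robin each block onto `replication` distinct machines."""
    plan = defaultdict(list)
    for i, block in enumerate(blocks):
        for r in range(replication):
            plan[machines[(i + r) % len(machines)]].append(block)
    return plan

plan = assign(blocks=list(range(10)),
              machines=["ps3_a", "ps3_b", "ps3_c", "ps3_d"])

# Simulate one volunteer dropping out: every block is still covered.
survivors = {m: b for m, b in plan.items() if m != "ps3_a"}
covered = set().union(*survivors.values())
print(sorted(covered))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```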