Mapping the universe at 30 Terabytes a night

It makes for one heck of a project mission statement. Explore the nature of dark matter, chart the Solar System in exhaustive detail, discover and analyze rare objects such as neutron stars and black hole binaries, and map out the structure of the Galaxy. The Large Synoptic Survey Telescope (LSST) is, in the words of Jeff …

COMMENTS

  1. Anonymous Coward
    Alert

    MySQL?

    Much as I like it, I'm not sure I'd consider it the optimal DB for a system like that.

  2. Anonymous Coward
    Anonymous Coward

    Backlog? What backlog?

    Let's see: 30 terabytes each night, transmitted over a 2.5Gbps link. With zero overhead, that would take (30 TB) * (8 bit/B) / (2.5 Gbit/s) = 96000 s = 26 hours 40 minutes. Every day.

    Bring another bit bucket, Betty, this one's full.

  3. Anonymous Coward
    Coat

    AC Bit Bucket Brigade

    Never forget the bandwidth of a car full of media! Anyway, data compression should work wonders. These piccies are mostly black, aren't they? How do you find dark matter? It's dark, duh!*

    *OMG I'm falling for popular culture and pretending to be pig ignorant in a posting, gahh!

    Mine's the white one with the hoopy straps on the arms.

  4. Martin Silver badge
    Boffin

    Sneakernet

    Surprised about the link.

    We had a much smaller system years ago, only doing 72GB/night, and the only way to handle it was to post hard drives home.

    Sun are part of the data processing consortium, so MySQL is probably a given.

  5. Martin Silver badge

    oops, not Sun

    Sorry, I was thinking of another project; ironically, Microsoft (well, Gates and Simonyi) are paying for $30M worth of this one!

  6. Andrew Moore

    No backlog...

    when you take into account active compression.

  7. Luther Blissett

    Peering into the blue yonder, one espies a black Thing

    > One wonders how an automated system could be written to discover previously unknown classes of rare objects - part of the telescope's mission statement.

    You cannot know something before you know it. So how come huge tax "receipts" get thrown at the manifestly illogical? Repeatedly, and institutionally.

    1. Dream on.... Imagine something silly, preferably scary, e.g. star-gobbling monsters, or freakin' killer viruses with 110% fatality, or water on Mars(*). The best thing is something impossible to see and/or very difficult to detect. Remember the cookie monster that lived under the rug? A grown-up's version of that. With balls on.

    2. Work up some crap mathematics. This can either be real mathematics based on totally unrealistic assumptions, or bad mathematics, e.g. making silly mistakes like forgetting to move the system origin, or simply asserting that a non-linear system can be handled as linear. The best way of course is not to know any maths - simply feed your (massaged) data into a stats package or computer model. Anyone can play. Random data? No problem, kid, have yourself a linear regression. It's on the house.

    3. Put on your little Jack Horner suit, sit in a corner, and wait for a tasty looking pie to come along from the empirical researchers. Exhibit a plum and say it is the thing you first dreamed up at Step 1. If you grin cheekily, you won't be disbelieved. (Everyone loves a cute baby). A bit of metaphysical gobbledy-goo helps. (Girls, you really want to try that). Who is going to call your bluff?

    4. Get a Big Media complete ignoramus to write the press release. Mention "climate change" and "our children's future". "Big bang" is good too, but only if you're sure there'll be no-one nerdy reading it.

    5. Pass Go. Collect $200. Goto Step 3 and repeat. Or if your balls are really brassy, Goto Step 1 and repeat with something different.

    Science no longer devises theories to explain extant observations of reality. It looks for new observations to bolster extant prejudices about reality.

    Needless to say, if this carries on, we are all doomed, even as we wait for our gigabyte per millisecond data pipes to arrive so we can freetard or iPlay or pr0n, or do all that at the same time.

    * Why is this scary? When beardy Branson starts flogging flights to Martian colonies, there will be more than fighting in the streets to get on board.

  8. John Doe

    Can't even imagine...

    ...how they back up that thing! Must be using 10 Gbps links to the backup server(s) and a fairly large tape library. LTO-5, coming out next year, should do about 900 GB per tape with hardware compression at up to 160 MB/sec. A single library would hold about 5.9 PB with 6881 tape slots across 192 frames. You'd need 26 entire LTO-5 libraries just for 150 PB! And that's not factoring in any data growth, either.
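    For anyone checking my arithmetic, a minimal Python sketch (the 900 GB/tape and 6881-slot figures are my own projections above, treating the prefixes as binary):

        import math

        TAPE_GB = 900          # projected LTO-5 capacity with compression
        SLOTS   = 6881         # cartridge slots in one fully-built library

        library_pb = TAPE_GB * SLOTS / 2**20     # ~5.9 PB per library
        print(round(library_pb, 1))              # 5.9
        print(math.ceil(150 / library_pb))       # 26 libraries for 150 PB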

    Doing any large scale restores must suck, too. :-)

  9. Rex Alfie Lee
    Linux

    Not Windoze?

    Stevie babee,

    It's not using Windoze at all. Surely you should scream & yell & rant & rave about the unfairness of all this. Call the NSA, call the CIA, call the FBI & tell them they can't get the data. I'm sure they'll interfere for you. It's just not right, eh, Balmy? I bet Google's involved! Bastards!

  10. Aditya Krishnan

    But...

    Can it run Crysis?!

  11. F Seiler

    @Backlog AC

    I think you forgot that the data size is most likely measured in powers of two while bandwidth is measured in powers of ten. That way it's somewhat above 105553 seconds.
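    A quick sketch of both figures (Python; assumes the article's 2.5 Gbit/s is a decimal rate, as link speeds usually are):

        LINK_BPS = 2.5e9                          # 2.5 Gbit/s, decimal

        decimal_s = 30 * 10**12 * 8 / LINK_BPS    # 30 TB as decimal bytes
        binary_s  = 30 * 2**40  * 8 / LINK_BPS    # 30 TiB as binary bytes

        print(decimal_s)    # 96000.0 s   = 26 h 40 min
        print(binary_s)     # ~105553.1 s = about 29 h 19 min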

    About the data compression, I don't know, but "mostly black" doesn't seem right to me. If the telescope is sensitive enough, it shouldn't be that way - more like pretty "random" data. Of course, this only applies if they want to transmit what they actually measured and not just pretty images for humans to use as posters.

  12. Anonymous Coward
    Thumb Down

    It's big

    It's mostly black!

    What's to discover?

  13. Seán

    @LB

    Unfortunately I read your post twice and it was, as I suspected, a waste of my valuable time.

  14. Anonymous Coward
    Anonymous Coward

    @Luther Blissett

    You are very strange/insane/imbalanced/bored. I hope you are aware of this.

  15. Zmodem

    compress

    Only a real freak might be able to tell the difference between a TGA/bitmap and a 100% quality JPEG.

  16. Anonymous John

    That's a lot of data

    to lose on public transport.

  17. Anonymous Coward
    Stop

    Hmm

    It may sound like a lot of data now, but from the story it's still another 8 years before the survey starts, and it'll be another 10 years before the database has all those PB in it. By then, multi-TB disks will come with your cornflakes (if, indeed, we're still using disks).

    Best strategy is probably to buy the kit the week before you need it.

  18. Anonymous Coward
    IT Angle

    Why assume ...

    That an SQL database (any one) is necessarily the best way to store and index this type of data?

    I wouldn't take that as a project requirement or automatically make this assumption given such a specification.

  19. Squits

    @Backlog? What backlog?

    Andrew is right - it says the data goes through initial compression, then two jumps later it's compressed again, so they might capture 30TB of images but only end up with 8TB. It all depends on the ratios they use.

  20. Anonymous Coward
    Anonymous Coward

    Oh forgot to add:

    C++ and Python as real-time processing tools? As much as I adore Python, wouldn't assembler be kinda better for this kind of bit-shuffling?

    Oh I forgot: no one does (or understands) that anymore.

  21. Joe K

    Nice article

    I like the detail.

    I can't help wondering though if by 2016 we'll all have 30 terabytes on our desktops.

    In 2000 most of us were still on sub-gigabyte drives, and the idea of picking up a terabyte HD for £90 would have been thought of as witchcraft.

    Seems a bit daft planning all this now - there's no telling where storage and processor (see the Cell) tech will be in 8 years.

  22. jamie
    Thumb Down

    what if this is turned on the very place it emanates from

    What if this process were turned towards Earth, and could observe every action and event taking place - 30TB of data a night at 2.5 gigabits per second - without regard for others' privacy or health and wellbeing? Sounds scary to me. With current technology, a satellite, as an example, can see through a house using various spectrums and resolutions.

  23. b
    Thumb Up

    wow

    big project :)

  24. Anonymous Bastard
    Boffin

    Why is this wrong?

    150 Petabytes over 10 years is closer to 42 Terabytes per 24 hours. Not 30.
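    The difference between 41 and 42 is just the choice of prefixes; either way it's well above 30. A two-line check (Python):

        # 150 PB spread over 10 years, binary vs decimal prefixes
        print(150 * 1024 / (10 * 365.25))   # ~42.0 TB/day (binary)
        print(150 * 1000 / (10 * 365.25))   # ~41.1 TB/day (decimal)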

  25. James

    "Largest proprietory dataset"?

    They're ignoring the computing and storage grid we've already put together for the LHC experiments, then? 15PB a year from this year - errm, sorry, next year.

  26. James Hughes

    No JPEG....

    They have to use lossless compression for this, so the figures won't be that great (compared with JPEG or whatever we are using in 10 years' time), but still a good percentage, I would think.

    Not lossy, because you might compress out something faint but important - which is most of what astronomy is about.

  27. Norfolk Enchants Paris
    Thumb Up

    Big piccies

    Well, my maths will probably get flamed but what the hey.

    Doing some rough numbers and assuming JPEGs, low compression and 32-bit colour depth, these pictures would have a resolution of about 120000 x 90000 pixels.

    That would give a photo-quality print approximately 8 metres by 6 metres (rough arithmetic sketched below).

    Crikey!
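    For reference, the print-size sums (Python; the 360 dpi photo-quality figure is my assumption, not from the article):

        DPI    = 360                     # assumed photo-print resolution
        INCH_M = 0.0254                  # metres per inch

        print(120000 / DPI * INCH_M)     # ~8.5 m wide
        print(90000  / DPI * INCH_M)     # ~6.4 m tall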

This topic is closed for new posts.