A 16 Petaflop Cray: The key to fantastic summer barbecues

Successful BBQ at the weekend, or did a cloudburst stop play, leaving your fridge groaning beneath a mountain of uncooked ribs and sausages? If only the weather forecast had been more reliable. By 2017 that might be possible, as the Met Office – responsible for generating more than 3,000 tailored forecasts and briefings each …

  1. tony
    Happy

    Chair,

    That Cray picture disappoints me; where's the natty mix of raw computing power and '70s-era futuristic furniture gone?

    1. montyburns56

      Re: Chair,

      Although it's not that well known in the wider computing world, Cray engineers discovered that a vinyl cushion will increase CPU frequency by at least 5%.

    2. Anonymous Coward

      Re: Chair,

      It's even more ironic that the picture is actually of the Sonexion storage for XCE, not the computing part of any of the clusters.

      What you see are some pretty ordinary racks with some Dell servers in them (as storage and management servers), and many drawers of Seagate disks, with a smattering of 10GigE and Infiniband switches (the storage servers are not connected to the Aries interconnect). Lots of lights, and if you could see around the back, water cooled doors.

      Mind you, the compute frames are exceptionally dull black racks, without even any significant numbers of lights on them! Very loud, though.

      It's also ironic that in the background you can see two black frames with large green vertical stripes, which are part of one of the IBM P7 775 systems that are being replaced.

  2. Robert Ramsay

    Misunderstanding

    I thought you meant the Cray ran hot enough to cook your BBQ on it...

  3. Bronek Kozicki

    2 million lines of FORTRAN code

    ... doing Monte Carlo simulation. Oh my, I feel 25 years younger now.

    It's a bit of a shame that they have no plans to rewrite this to work on a massively parallel architecture such as GPGPU (with a few thousand compute units on a single card) but, given this much investment and reliance on existing code, it's not a surprise. Hope they eventually make the step, though.

    1. This post has been deleted by its author

      1. streaky

        Re: 2 million lines of FORTRAN code

        the point is that FORTRAN code is nice and portable

        So is C. It's a question of aged Met scientists who refuse to write C/C++, and whom the Met Office pointedly *refuses* to fire, wasting CPU cycles on what is effectively BASIC: Scientist Edition.

        Never trust a language that has a PUNCH language construct.

        1. Peter Gathercole Silver badge

          Re: 2 million lines of FORTRAN code

          I had this discussion with a couple of FORTRAN programmers some time back, and actually wrote some test code to see what they were talking about.

          The problem with C-derived languages is that everything is done by reference (essentially pointers), and this adds an indirection, following the pointer, that needs to be resolved for almost all structured data references, particularly arrays (very commonly used for this type of work).

          On the other hand, FORTRAN works much more directly with the addresses of data in memory (once it has the base address of an array, for example, it can do direct arithmetic on the index rather than having to resolve the pointer again). This means that it is much easier for the data prefetch mechanism to spot consecutive addresses so as to fill the cache. This is especially so as the FORTRAN standards dictate the way that certain structures have to be laid out, which has actually conditioned processor design in the past, and enables the FORTRAN programmer to make some intelligent decisions about which rank of a multi-dimensional array to traverse to maximise the effect of the cache.

          I can't remember the figures exactly, but I found that FORTRAN code actually ran faster than its similarly written C equivalent. You had to write some very unusual C to narrow the gap. There was not a lot in it, but when you are trying to get as much as you can from a system, every clock cycle counts.

          This is comparing FORTRAN and C. If you want to include the OO overheads of C++ to the equation, then things get even worse! And the discrepancies are not always fixed by optimising compilers.
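
A minimal sketch of the layout point made above, written in Python rather than Fortran or C so that it stays self-contained. Fortran stores arrays in column-major order and C in row-major order; the helper names (`offset`, `visit_offsets`, `largest_jump`) are made up for this illustration. It only lists the memory offsets a loop nest would touch, it does not measure speed.

```python
def offset(i, j, n_rows, n_cols, layout):
    """Flat memory offset of element (i, j) in an n_rows x n_cols array."""
    if layout == "column":   # Fortran: first index varies fastest in memory
        return i + j * n_rows
    if layout == "row":      # C: last index varies fastest in memory
        return i * n_cols + j
    raise ValueError("layout must be 'column' or 'row'")


def visit_offsets(n_rows, n_cols, layout, inner):
    """Offsets touched, in order, when `inner` ('i' or 'j') is the innermost loop."""
    if inner == "i":
        pairs = [(i, j) for j in range(n_cols) for i in range(n_rows)]
    elif inner == "j":
        pairs = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    else:
        raise ValueError("inner must be 'i' or 'j'")
    return [offset(i, j, n_rows, n_cols, layout) for i, j in pairs]


def largest_jump(offsets):
    """Largest distance between two consecutively touched offsets."""
    return max(abs(b - a) for a, b in zip(offsets, offsets[1:]))
```

With a 3 x 4 array, `visit_offsets(3, 4, "column", "i")` walks offsets 0 to 11 in order, while making `j` the inner loop on the same layout starts 0, 3, 6, 9. Consecutive offsets are the pattern a hardware prefetcher finds easiest to follow, which is the "which rank to traverse" decision the comment describes.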

          1. Chemist

            Re: 2 million lines of FORTRAN code

            "but I found that FORTRAN code actually ran faster than it's similarly written C equivalent."

            I've mostly written my scientific software in C, but one of the major points in favour of Fortran seems to be the masses of thoroughly debugged routines/libraries (including source) available for a vast range of topics.

        2. Ian Bush
          Headmaster

          Re: 2 million lines of FORTRAN code

          Errrr, Standard Fortran (note spelling) has never had a punch construct. And your history is wrong as well

          1. streaky

            Re: 2 million lines of FORTRAN code

            I found that FORTRAN code actually ran faster than it's similarly written C equivalent

            Blind benchmarks are the best benchmarks.

            Errrr, Standard Fortran (note spelling) has never had a punch construct. And your history is wrong as well

            Sorry I never meant to give the impression I cared about either Fortran or FORTRAN.

            1. Peter Gathercole Silver badge

              Re: 2 million lines of FORTRAN code

              Blind benchmarks would work if you take identical code and run them on two separate machines.

              Unfortunately, when comparing different languages, the way that the problem is coded is partly conditioned by style, fashion (you don't think fashion is present? Wait until you've been around a while; accepted standards for writing code change over the years), and personal preference. This makes direct comparisons more difficult, because different people will code the same problem in the same language differently, and some differences can have quite an effect on performance.

              Beautifully written code is not always the fastest!

          2. Peter Gathercole Silver badge
            Alert

            Re: 2 million lines of FORTRAN code @Ian Bush

            When I learnt it, it was FORTRAN. It's only since that trendy upstart Fortran 90 came along, with its free-form input, long variable names and pointers (amongst other corruptions) that the capitalisation changed.

            I've not written much in the last 25 years, so it's still FORTRAN to me.

            But please note, a lot of the Unified Model is still written in FORTRAN 77 and earlier, so the case is a moot point!

        3. Anonymous Coward
          Boffin

          Re: 2 million lines of FORTRAN code

          Actually this isn't the problem at all. Fortran is really a single-purpose language: it does high-performance, large-scale, numerical code. No-one working on a Fortran system is worrying about the performance of OS kernels, video games or database interfaces, because it isn't used for that: all they have to worry about is getting large-scale numerical codes to run, really fast. And there are people who have a lot of money to spend on this – the kind of people who are interested in CFD simulations which run for millionths of a second of model time.

          The end result of all this is that Fortran systems have very, very good numerical performance. C systems, empirically, don't, and C++ systems don't even have support in the standard for it. Fortran is really the only game in town for this stuff, as it has always been.

    2. Peter Simpson 1
      Thumb Up

      Re: 2 million lines of FORTRAN code

      Real Programmers use FORTRAN to write weather models!

      Tip of the hat to those men (and women) who write code nobody else does. Interstellar navigation, weather models, power grid models...now, that's a challenge!

      // COBOL programmers also in high demand, I understand

    3. malcolmus_rex

      Re: 2 million lines of FORTRAN code

      My (limited) understanding is that GPGPU hardware isn't good enough at interprocessor communication for high resolution weather forecasting to work well. This sort of workload needs a lot of compute but also, inherently, a lot of comms between the compute. That's a major bottleneck.

      You can bet it's been tried and tested, and it's an exciting avenue for research. For actual usage, not just yet, maybe next time.

      Also, on FORTRAN, it's probably here to stay in HPC and scientific computing for a long time yet. A nice discussion is here:

      http://arstechnica.com/science/2014/05/scientific-computings-future-can-any-coding-language-top-a-1950s-behemoth/1/

    4. Peter Gathercole Silver badge

      Re: 2 million lines of FORTRAN code @Bronek

      The problem is twofold.

      Firstly, the Unified Model needs to have quite large sets of data per cell. Currently, the systems are sized with ~2GB per core, and each core is at any one time calculating one cell on the grid. This is to do with the way that the information is arranged, and although current GPUs can address large(ish) amounts of memory, they cannot manage to provide enough memory per core for a "few thousand compute units on a single card". Until the GPUs have the same level of access to main memory that the CPUs and DMA communication devices have, this will always be a block.
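
A rough worked version of the arithmetic in the paragraph above. The ~2GB per core figure is from the comment; the 3,000 compute units ("a few thousand") and the 12 GB of on-card memory are assumptions picked purely for illustration, not figures for any particular GPU or for the Met Office systems.

```python
GB_PER_CORE = 2.0        # from the comment: systems sized with ~2GB per core
UNITS_PER_CARD = 3000    # assumption: stands in for "a few thousand compute units"
CARD_MEMORY_GB = 12.0    # assumption: illustrative on-card memory size


def total_memory_gb(compute_units, gb_per_unit):
    """Memory needed if every compute unit gets the same share as a CPU core."""
    return compute_units * gb_per_unit


def memory_per_unit_gb(card_memory_gb, compute_units):
    """Memory each compute unit actually gets if the card's memory is shared evenly."""
    return card_memory_gb / compute_units


needed = total_memory_gb(UNITS_PER_CARD, GB_PER_CORE)          # GB per card at CPU-style sizing
available = memory_per_unit_gb(CARD_MEMORY_GB, UNITS_PER_CARD)  # GB per unit on the assumed card
print(f"needed per card: {needed:.0f} GB, available per unit: {available * 1024:.1f} MB")
```

Under these assumptions the card would need about 6 TB to match the CPU sizing, against roughly 4 MB per compute unit actually available, which is the gap the comment is pointing at.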

      Secondly, all of the time steps are lock-stepped together, and at the end of each time step, results from each cell are transferred to all of the surrounding cells in three dimensions (called the 'halo'). As I understand it, the halo is being expanded so it is not just the immediately neighbouring cells, but the next 'shell' out as well. This makes weather modelling more of a communication problem than a computational one, and one of the deciding factors in the choice of architecture was not how much compute power there was, but how much bandwidth the interconnect has.

      To do this work on a system using GPUs for some of the computational work would require significantly more memory than can conveniently be addressed in the current GPU models, and because there are different GPU-to-main-memory models around with each generation of hybrid machine, getting the data into and out of the GPUs is not generic, and currently has to be written specifically for every different model. There are also no standardised tools to assist.
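
A toy sketch of the halo idea described above, reduced to one dimension and plain Python lists. It is not how the Unified Model does it; the function names, the periodic boundary and the 3-point averaging stencil are all assumptions chosen to keep the example small. Each chunk stands in for the cells owned by one processor, and `exchange_halos` stands in for the interconnect traffic at the end of a time step.

```python
def step_global(u):
    """One 3-point averaging step on a periodic 1D grid held in one piece."""
    n = len(u)
    return [(u[(k - 1) % n] + u[k] + u[(k + 1) % n]) / 3.0 for k in range(n)]


def split(u, parts):
    """Cut the grid into equal chunks (assumes len(u) is divisible by parts)."""
    size = len(u) // parts
    return [u[p * size:(p + 1) * size] for p in range(parts)]


def exchange_halos(chunks, width=1):
    """Pad each chunk with `width` cells copied from its left and right neighbours."""
    n = len(chunks)
    return [chunks[(p - 1) % n][-width:] + chunk + chunks[(p + 1) % n][:width]
            for p, chunk in enumerate(chunks)]


def step_local(padded, width=1):
    """Update only the cells a chunk owns, reading the halo cells but not writing them."""
    return [(padded[k - 1] + padded[k] + padded[k + 1]) / 3.0
            for k in range(width, len(padded) - width)]


def step_decomposed(u, parts):
    """Same time step as step_global, done chunk by chunk after a halo exchange."""
    result = []
    for padded in exchange_halos(split(u, parts)):
        result.extend(step_local(padded))
    return result


def halo_cells_sent(parts, width):
    """Cells copied between neighbours per time step in this 1D layout."""
    return 2 * parts * width
```

The decomposed step gives the same answer as the single-piece step only because every chunk waits for its halo first, which is the lock-stepping the comment mentions. Widening the halo from one shell to two doubles `halo_cells_sent`, and in three dimensions the growth is steeper.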

      Personally, I feel that the current GPU hybrid machines are a dead-end for HPC, as were the DSP assisted systems 30 years ago (nothing is new any more), but what we will see is more and more different types of instruction units added to each core, making what we see as GPUs today just another type of instruction unit inside the core (think Altivec crossed with Intel MIC if you like).

      1. Anonymous Coward

        Re: 2 million lines of FORTRAN code @Bronek

        "Personally, I feel that the current GPU hybrid machines are a dead-end for HPC, as were the DSP assisted systems 30 years ago (nothing is new any more), but what we will see is more and more different types of instruction units added to each core, making what we see as GPUs today just another type of instruction unit inside the core (think Altivec crossed with Intel MIC if you like)."

        Your hypothesis will be put to the test with the Lawrence Livermore and Oak Ridge hybrid IBM Power/nVidia GPU supercomputers. Unlike current hybrid designs, they will have fast interconnects between CPU/GPU/main memory, and a single shared memory space. The Theta and Aurora Cray designs to be installed at Argonne are simpler/more conservative, Intel many-cores with a Dragonfly topology for the interconnect.

    5. Roj Blake Silver badge

      Re: 2 million lines of FORTRAN code

      "... doing Monte Carlo simulation. Oh my, I feel 25 years younger now."

      I know the feeling. In 1993 the third year project for my degree involved using Fortran to run Monte Carlo simulations to work out how lead atoms behave.

      Those were the days when I still knew how to do quantum stuff.

  4. This post has been deleted by its author

  5. Anonymous Coward

    "more than 3,000 forecasts and predictions each day"

    and every one of them wrong...

    1. Peter Gathercole Silver badge

      Re: "more than 3,000 forecasts and predictions each day" @AC

      It's not so easy to determine what is right and wrong when the forecast gives probabilities, not yes/no answers.

      What you are talking about is the forecaster's attempt to turn a complicated forecast into something that numpties like you have some chance of understanding, all in the space of three minutes or less. It's always going to be wrong for someone, because the weather for a whole region over a space of hours will never be the same across that whole region.

      What you are complaining about is the generalisations that you hear on the radio or TV not being detailed enough for where you are. Look at the more detailed local forecasts on the BBC or Met Office web sites or apps, and you will find it is quite a bit closer to what actually happens.

      There's another thing though. When you have ensemble runs (you run the forecast many times with slightly different parameters, which is why there are up to 3,000 per day), it is quite likely that at least one member of the ensemble will actually be right!

    2. R Callan

      Re: "more than 3,000 forecasts and predictions each day"

      At least when they are for 4 days later. Those for the remainder of the century are just very expensive fiction.

      1. Anonymous Coward

        Re: "more than 3,000 forecasts and predictions each day"

        Although it is derivatives of the same models that are used for climate research, the way the models are run for forecast and climate research is very different.

        There is no way that anybody is going to claim that they can tell what the weather is going to be on Christmas day 2099 (well, not unless they have a time machine), but what climate work is trying to tell is, for example in the decade of 2090, whether the average temperature will be higher or lower than it is in this decade.

        My own view is that we still don't understand all the inputs to the global weather and climate systems well enough to model them over longer periods, but trying to model them is a journey that may, eventually, give us something that is more representative. Unless we start this journey, we will never get to that point, which justifies the effort.

        It's the people who think that we are at the end of that journey already, and that the current models are accurate that really get my goat!

  6. Kubla Cant

    Can anyone clarify?

    Let me be the first to confess that my knowledge of this kind of computing, and indeed of statistics, is rudimentary. Perhaps that's why I'm having trouble understanding some of this article.

    computational algorithms that produce results that can be explained in terms of certainty – the probability that a given will take place

    Which is it, certainty or probability? They surely can't be the same thing, even in the domain of weather forecasting.

    ...customers who make sophisticated, risk-based decisions can benefit from having probabilistic rather than deterministic decision on events of weighing up the probable chances of an outcome rather than working with a black and white.

    Come again? I can imagine it's a challenge to render this stuff into plain journalistic English, and I sympathise wholeheartedly. But that last para seems to be the product of some kind of random word generator.

    1. John H Woods Silver badge

      Re: Can anyone clarify?

      I think there are a few terms mixed up here. First of all they have used 'certainty' as in 'degrees of certainty' when they could have just said "computational algorithms produce results indicating the probability that a given event [e.g. rain between 12:00 and 13:00 in Hyde Park] will take place"

      The second paragraph seems to mean this: "Some customers prefer to make risk-based decisions, on the basis of probabilities: e.g. these customers will benefit from being told that the probability of the aforementioned event is 56% rather than being told 'it will rain' simply because the probability that it will is better than evens"
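
A small sketch of why a figure like 56% can be more useful than a flat "it will rain". It uses the textbook cost/loss rule: protect against the event when the probability times the loss exceeds the cost of protecting. The function names and the money figures are invented for illustration.

```python
def should_protect(p_event, cost, loss):
    """True when the expected loss avoided is larger than the cost of protecting."""
    return p_event * loss > cost


def expected_expense(p_event, cost, loss, protect):
    """Average outlay: the fixed cost if protecting, else the probability-weighted loss."""
    return cost if protect else p_event * loss


P_RAIN = 0.56  # the probability used in the comment's example

# Two hypothetical customers facing the same forecast and the same potential loss.
cheap_cover = should_protect(P_RAIN, cost=200.0, loss=1000.0)   # cover is cheap
costly_cover = should_protect(P_RAIN, cost=700.0, loss=1000.0)  # cover is expensive
print(cheap_cover, costly_cover)
```

Given only "it will rain", both customers would pay for cover. Given 56%, the second can see that covering costs more than the loss expected on average and may decide to take the risk.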

      I noticed when I visited the USA that many people seemed happy with a forecast of "there's a 30% chance of snow" whereas the standard British response to this on a TV weather forecast seems to be "well, will it or won't it?" so I think they may have a cultural issue on their hands as well.

      1. nijam Silver badge

        Re: Can anyone clarify?

        > I noticed when I visited the USA that many people seemed happy with a forecast of "there's a 30% chance of snow" whereas the standard British response to this on a TV weather forecast seems to be "well, will it or won't it?" so I think they may have a cultural issue on their hands as well.

        In your example, the USA version is essentially useless. In information-theoretic terms they may as well have said nothing at all, as far as the "person in the street" is concerned.

      2. Anonymous Coward

        Re: Can anyone clarify?

        I remember once sitting in Atlanta airport, listening to the radio on my headphones. In the weather report it said "there is a 30% chance of rain in Atlanta today". At that very instant, it was throwing it down in buckets outside the terminal.

        So, what does "there is a 30% probability of rain today" actually *mean*? If they said there was a 100% probability of rain, but it didn't (or 0% and it did), would the universe fold up? Should these probabilities come with probabilities that they are wrong?

        One interpretation I can think of is: "historically, in days where the inputs to our models were similar to today, in three out of ten of those days it actually rained". However there is such a wide variety of inputs (many thousands of measurements of temperature, pressure, humidity, wind speed etc) that it seems hard to classify any two days as being "similar". Also, this interpretation must be wrong because it removes the need to actually run the models!

        Alternatively: "we ran our simulation program 100 times, and in 30 cases it predicted rain"? But that's still not the actual probability of it raining in real life, unless you believe that the model is perfect.
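
One common way of giving "30%" a testable meaning is to check it after the fact: of all the occasions labelled 30%, did it rain on roughly 30% of them? The sketch below does that bookkeeping and also computes the Brier score, a standard measure for probability forecasts (lower is better). The records are invented for illustration and are not real forecast data.

```python
from collections import defaultdict


def observed_frequency(records):
    """Map each stated probability to the fraction of those occasions on which it rained."""
    counts = defaultdict(lambda: [0, 0])  # stated probability -> [rainy occasions, all occasions]
    for prob, rained in records:
        counts[prob][0] += 1 if rained else 0
        counts[prob][1] += 1
    return {prob: rainy / total for prob, (rainy, total) in counts.items()}


def brier_score(records):
    """Mean squared difference between the stated probability and what happened (1 or 0)."""
    return sum((prob - (1.0 if rained else 0.0)) ** 2 for prob, rained in records) / len(records)


# Invented history: (stated probability of rain, whether it actually rained)
history = ([(0.3, True)] * 3 + [(0.3, False)] * 7 +
           [(0.8, True)] * 4 + [(0.8, False)] * 1)

print(observed_frequency(history))
print(brier_score(history))
```

On this reading, a downpour on a single "30%" day does not show the forecast was wrong. Only the long-run tally can do that, which also gets at the question of whether the probabilities themselves can be trusted.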

        Maybe we should all just get one of these:

        http://www.theweatherstone.co.uk/

        1. Captain DaFt

          Re: Can anyone clarify?

          "there is a 30% chance of rain in Atlanta today". At that very instant, it was throwing it down in buckets outside the terminal.

          So, what does "there is a 30% probability of rain today" actually *mean*?

          It meant exactly what you observed.

          Casual observations over the years show that, if you're in the southern US, anything less than a 60% chance may be safely ignored, unless it's 30%. Then it's guaranteed to be pissing down in buckets.

          I suspect it's a little inside joke of the Weather Bureau.

        2. VeganVegan
          Holmes

          Re: Can anyone clarify?

          I actually asked a U.S. National Weather Service forecaster this same question. After all, if it is raining where you are, it is certainly raining; if not, not. It is a binary choice, no probabilities needed.

          His answer was that when it says x% chance of rain/snow/thunder/whatever, it means either of 2 (!) things;

          a. That there is an x% chance of the event happening during the forecast time period;

          or,

          b. That x% of the forecast area will experience the event.

          While either interpretation seems not unreasonable to me, I asked him which was the correct one. He refused to be pinned down.

          This was some 30 years ago, and they might have made up their minds in the meantime, and I have become more tolerant of ambiguity, so I never followed up my enquiry.

          1. Chemist

            Re: Can anyone clarify?

            "After all, if it is raining where you are, it is certainly raining; if not, not. It is a binary choice, no probabilities needed."

            If it's raining where you are you don't need a weather forecast !!!

            1. ravenviz Silver badge
              Facepalm

              Re: Can anyone clarify?

              IIRC a forecast is about the future rather than the present!

        3. ravenviz Silver badge
          Boffin

          100% probability of rain?

          AFAIK the Met Office only quotes as high as '> 95%'.

    2. Anonymous Coward

      Re: Can anyone clarify?

      Monte Carlo simulations give results and the probability of those results being correct. Given that, the "certainty" in the article seems to refer to probability (possibly plus a confidence level) and not to a mathematical certainty (100% probability).

      As for the probabilistic vs deterministic: probabilistic models include some randomness (or probabilities), so multiple runs won't produce the same output; deterministic models are same input, same output (but if you don't really have all the variables, as in weather forecasting, your output isn't going to be so accurate).
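
A toy illustration of the distinction drawn above, assuming a stand-in "model" that is just the logistic map, a textbook chaotic system. It has nothing to do with the Met Office code, and the function names and numbers are invented. The single run is deterministic; the ensemble repeats it from slightly perturbed starting values and reports the fraction of members in which the "event" (final value above 0.5) occurs.

```python
import random


def run_model(x0, steps):
    """Deterministic toy model: iterate the logistic map x -> 4x(1 - x)."""
    x = x0
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
    return x


def ensemble_probability(x0, steps, members, spread, seed):
    """Fraction of perturbed runs in which the final value exceeds 0.5."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    hits = 0
    for _ in range(members):
        # Nudge the starting value a little, keeping it inside [0, 1].
        perturbed = min(1.0, max(0.0, x0 + rng.uniform(-spread, spread)))
        if run_model(perturbed, steps) > 0.5:
            hits += 1
    return hits / members


print(run_model(0.2, 50) == run_model(0.2, 50))  # same input, same output
print(ensemble_probability(0.2, 50, members=100, spread=1e-6, seed=1))
```

Because the map is chaotic, starting values that differ by a millionth end up in quite different places after 50 steps, so the ensemble yields a fraction rather than a single yes or no. As other comments note, that fraction is only as good as the model behind it.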

      1. Anonymous Coward

        Re: Can anyone clarify?

        There was a nice little documentary on BBC4 last week about this very subject. It visited the Met Office and explained the use of Monte Carlo as a method of prediction. It'll be on iPlayer, I expect.

  7. Ben Bonsall

    Finally, the Met Office has a computer which has power and heat characteristics significant enough to require inclusion in the climate models it's designed to run...

  8. Andy The Hat Silver badge

    Unit fail?

    I'm sorry but 'flops per watt' is so old hat ... surely there's an alternative reg notation?

    Obviously the 'flopperwatt' or 'flatt' ... but how does that relate to olympic swimming pools, blue whales or wrists per second?

    1. Anonymous Coward

      Re: Unit fail?

      There doesn't seem to be an official reg notation so here's a possibility:

      let's start with the Watt. There are no direct reg units for Watt but 1W = 1J/s and 1 J = 1Nm. Since 1 N = 0.01No, we can say that 1W = 0.1No m / s.

      Now from the highly reliable (sic!) Wikipedia we can find that to emulate a human brain in real time we need at least 36.8 petaflops.

      Thus we can create an equivalence between flops/watt and brain/Norris :

      For instance, BlueGene/Q at 2100.88MFLOPS / W is equivalent to ~570 nanobrains per No m /s or 570 nanobrainseconds per Norris metre (unless I missed a decimal point somewhere, which is not unlikely)
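
The arithmetic above, worked through with the comment's own figures taken at face value. The comment gives both 1 N = 0.01 No and 1W = 0.1No m / s, which are not the same conversion, so the watt-to-Norris factor is left as a parameter here. The function name is made up for this sketch.

```python
BLUEGENE_FLOPS_PER_WATT = 2100.88e6  # from the comment: 2100.88 MFLOPS / W
BRAIN_FLOPS = 36.8e15                # from the comment: 36.8 petaflops per brain


def nanobrains_per_norris_metre_per_second(flops_per_watt, brain_flops, norris_m_s_per_watt):
    """Convert flops/W into nanobrains per (No m / s), given 1 W = norris_m_s_per_watt No m / s."""
    brains_per_watt = flops_per_watt / brain_flops
    brains_per_unit = brains_per_watt / norris_m_s_per_watt
    return brains_per_unit * 1e9


for factor in (0.1, 0.01):
    value = nanobrains_per_norris_metre_per_second(BLUEGENE_FLOPS_PER_WATT, BRAIN_FLOPS, factor)
    print(f"1 W = {factor} No m/s -> {value:.0f} nanobrains per No m/s")
```

With a factor of 0.1 this gives about 571, in line with the ~570 quoted; with 0.01 it gives about 5,709, so the decimal point the commenter worries about does matter.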

  9. Tromos

    Ensemble modeling builds forecasts that deliver a range of probable outcomes

    A hundred megaquid to tell us it might rain tomorrow - or it might not.

  10. The Boojum

    More power to their elbow!

    Sounds a great deal of fun, but I can't help but think that the justification for doing it was created by Reason, the program described in Douglas Adams's book 'Dirk Gently's Holistic Detective Agency.' Basically you gave it the facts of the subject and the decision you wanted it to make and it came up with a set of incontrovertible reasons why the decision was the right one, as in

    "Gordon was able to buy himself a Porsche almost immediately despite being completely broke and a hopeless driver."

  11. Anonymous Coward

    There are factual inaccuracies in this article

    and I have submitted some corrections, but just to summarise:

    There are four Cray XC40s currently installed at the Met Office headquarters: two modest-sized test and development systems called XCT and XCD (although this last one may be renamed), and two larger production ones, XCE and XCF, which will initially run the bulk of the production forecast and climate research work. The two production systems are either currently, or will very shortly be, in parallel running tests with the existing IBM P7 775s to prove reliability and that the forecast results are consistent.

    The larger ones will have their capacity increased when the IBM P7 775s are removed (freeing some power budget) later this year, with the increased capacity coming on-stream in 2016.

    The rest of the resource will not be installed at the Met Office headquarters, but in a data centre that is currently being built on the Exeter Business Park. This will come on-stream sometime in 2017, according to the timetable I've seen. This alternate location is because of power restrictions at the current site.

    I'm sure that David Underwood would not have provided incorrect information. Maybe something was lost in the reporting?

    1. Smig

      Re: There are factual inaccuracies in this article

      You are David Underwood and I claim my five pounds.

  12. Ashton Black
    Joke

    That's all very well, but can it run a Doom LAN?

    (Nod to User Friendly Comic)

  13. AndrueC Silver badge
    Meh

    Apropos of nothing really

    30 years ago when I was at school I used to play in those fields. Now someone has plonked a load of offices and some powerful computing kit on top of them. And the great play area known as 'The Orchard between Honiton Road and Sowton Industrial Estate' is gone and built over. No more apples and damsons from there then.

    How times change :)
