'Amazon can't do what we do': Twitter-miner's BYO data centre heresy

Sometimes floating on somebody else’s cloud isn’t enough. Sometimes you just have to float alone – no matter how young you are. DataSift, the five-year-old big data company mining billions of tweets and Wikipedia edits, reckons it’s just one year away from building its own data centre. DataSift sucks down 2TB of data from …

COMMENTS

This topic is closed for new posts.
  1. Simon Barnes

    what's a billion between friends ?

    "... and has just spent hundreds of billions building its own centres"

    I'm thinking you're out by 3 orders of magnitude there ??

  2. Ru

    Is "The Cloud" really about being cheap?

    Seems to me that the principal benefit of Amazon's offering is that every Tom, Dick and Harry with a credit card and a sensibly engineered application can now scale up to a silly number of nodes at short notice, for a short time.

    For that sort of bursty traffic, it makes reasonable financial sense. If you've just got a big, steady server load that doesn't exhibit any significant spikes, cloudy service providers aren't so great. Especially not when they've demonstrated that they aren't vastly better at preventing service outages and data loss than the next datacentre.

    1. Kristian Walsh Silver badge

      For "Amazon", read "Regus"

      Indeed. Cloud application-hosting is really the equivalent of serviced office-space. It's an easy option when you need to set up quickly, or only stay in the area for a few months or a year, but it gets expensive when you know that you're going to have an established presence.

      But, that's the whole point of the Cloud, isn't it? Give businesses a means to grasp those short-term opportunities without having to put in all the capital, while sucking some nice money from the fast-ripe, fast-rotten companies of the Web2.0 circus, without the nasty hangover that the likes of Cisco had when all that gear they'd sold to startups hit the market at liquidators' prices.

  3. A Non e-mouse Silver badge

    There was an article (I'm sure it was here on El Reg) a while ago, where a company said that they have their own servers for base-line load, and use Amazon et al. for peak provisioning.

  4. Anonymous Coward

    We are doing something meaningful with big data

    Imagine the catastrophic consequences if there was a power cut in Reading and 250 million tweets a day didn't get sifted.

    Oh noes !

  5. Hayden Clark Silver badge

    How much do they pay Twitter for the feed?

    ... thought so.

    1. PerspexAvenger

      Re: How much do they pay Twitter for the feed?

      I'm glad I wasn't the only one thinking that - it's a reasonable data-rate and I'm surprised they've not been ToSed out of access yet...

      1. Anonymous Coward

        Re: How much do they pay Twitter for the feed?

        Er, didn't you read the article?

        They don't have access to the feed using the general access API like most (and dwindling) companies. They have access to the main firehose, and that does not come easily.

        They are practically Twitter partners, I am sure their devs talk to each other, they may even have a dedicated line or a few to keep up with the data, conference calls etc.

        These guys are not the kind of Proles who have to worry about ToS.

  6. Svein Skogen

    Infrastructure. Rent vs Own

    That's what the cloud is about. Do you want to rent the infrastructure you use, or do you want to own it?

    For short term use, renting makes more sense, since you don't have the added cost of buying the stuff, and training the staff.

    For long term, and permanent, use, you want stuff in-house.

    //Svein

    1. Eddie Edwards

      Re: Infrastructure. Rent vs Own

      Damned right. The problem with AWS is by the time you're actually *using* some high level of performance at some high level of occupancy, you're paying through the nose for it, and you'd have been better off with Rackspace, or a sub-organization of your own device.

      What I don't understand is why any nascent business uses AWS. Rackspace is *cheap* - your first server runs way less than minimum wage for a single person, and will see you to at least your first thousand concurrent users (which could be 50,000 registered users). People are spending way more than this just putting stuff onto AWS in the first place. Where's the cost savings, even at startup ground zero? I can see why it might suit your garage business started on $30 by someone with tons of free time, but people are using AWS in actual capitalized startups. Why?

      (I must confess I haven't done serious analysis of this, it's just a gut feeling I have from looking at basic price structures for both.)
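The rent-versus-own argument running through this thread can be put into rough numbers. What follows is only a minimal sketch: the function names, the 730-hour month and every figure in the usage note are hypothetical, and nothing here is DataSift's, Amazon's or Rackspace's actual pricing. It assumes renting is billed per node-hour actually used, while owning costs a one-off capital outlay plus a flat monthly running cost.

```python
def rented_monthly_cost(nodes, hourly_rate, utilisation, hours_per_month=730):
    """Monthly bill for rented nodes that are only paid for while in use.

    utilisation is the fraction of the month the nodes are running
    (0.05 for a short burst, 1.0 for a steady round-the-clock load).
    """
    return nodes * hourly_rate * hours_per_month * utilisation


def break_even_months(capex, own_monthly, rent_monthly):
    """First whole month in which owning has cost less in total than renting.

    Cumulative owning cost is capex + own_monthly * m and cumulative renting
    cost is rent_monthly * m. Returns None if renting is never more expensive
    per month than running your own kit, so owning never catches up.
    """
    saving = rent_monthly - own_monthly
    if saving <= 0:
        return None
    # Smallest integer m with m * saving > capex.
    return int(capex // saving) + 1
```

With made-up figures of 120,000 up front and 5,000 a month to run your own kit, `break_even_months(120000, 5000, 15000)` gives 13 months against a 15,000-a-month rental bill, while a bursty workload whose rental bill stays below the 5,000 running cost returns `None`, which is the short-term versus long-term split described above.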

  7. Hawknic

    In vs Out

    Same question for anything, not just IT.

    If you have stable demand that will use a full unit of resource, and you have the skills to manage it, and it is available, then it makes sense to bring it in-house - outsourcing means that you are paying someone margin after all, so it will never be cheaper unless there is some additional value being delivered above and beyond the commodity (which I suppose means that it isn't a commodity).

    For unpredictable demand or resources that you can't use fully or manage or find, then you need more flexible (i.e. external) sources. As someone said already, that is one of the points of cloud, a great example because so many different skills and resources are needed to make a datacentre or any large scale IT work well.

    The argument is ignored by many big co's and governments who have jumped on the "out is good" bandwagon, a decision often based on being too lazy or incompetent to manage what they have (or previously having been too lazy or incompetent to manage it, resulting in ridiculously wasteful practices and dysfunctional relationships with unions etc).

    Nice to hear some sense from a company with ambition (though that communication with the market may be the point).

  8. mjwalshe

    Hadoop networking

    I seem to recall that you use InfiniBand for HPC to get the internode latency down.

  9. Nate Amsden

    SHOCKER public cloud not cost effective

    Nice to see someone come out and spill the beans on the lack of cost effectiveness of IaaS clouds. Whoever runs that joint obviously has some skill!

    Hopefully the trend continues.

  10. melts

    so...

    i read the article and couldn't fathom what these people do that makes it worth paying for.

    sorry but collating twitter is like collating turds. only twitter shit can't be used to fertilise anything.

    i guess i just don't follow why so many web companies are worth more than the hardware they own and a few cents per user. and then you have companies like autodesk that make products that build cities, and they are worth pocket change to these new media monkeys. unless facebook start charging to connect people they'll never approach a hundredth of the value they have been given. as it is i don't see how they could be worth 5 billion with every user on the planet connected. but that's just me
