Microsoft’s Revolution Analytics buy pays off, Linux-based R Server launched

Microsoft has released R Server – for statistical analysis using the R language – based on software from Revolution Analytics, a company acquired by the tech giant in April 2015. What's used is a distribution of R now called "Microsoft R Open", which is available for Windows, SUSE Linux, and Red Hat Linux. R itself is a free …

  1. Doctor Syntax Silver badge

    "The reason vendors are racing to improve support for data analytics is that the IoT era already generates (and will continue to do so) huge amounts of data which is otherwise useless."

    Does this mean that data analytics will stop the IoT era generating huge amounts of data?

    1. Gordon 10

      I for one expect the IoT data will remain useless regardless of which package is used to process it.

      Volume <> quality or insight

      There is a series of science fiction novels where the protagonist manages to accomplish all sorts of interesting things purely because he spends time talking to IoT devices that have been so over-engineered they are AI class, and so bored to death that they act for him simply because he's been kind enough to talk to them. Seems pretty credible right now!

  2. ZSn

    Free

    I'm obviously missing something. R is already free, so has Microsoft just integrated it better into SQL Server or some such? Or are the IDE tools better?

    1. Anonymous Coward

      Re: Free

      "R is already free, has Microsoft just integrated it better into SQL server or some such?"

      Yes that's coming soon.

      Probably they will also port it to compile into C# .NET, which should make it a good bit faster than it is now. Think Minecraft in .NET versus Minecraft in Java. If you haven't seen it, the .NET one is many times faster.

  3. staringatclouds

    "...available for Red Hat Linux, SUSE Linux, Hadoop on Red Hat, and Teradata Database on SUSE Linux."

    ...All of which will start displaying GWX nagware.

    1. Doctor Syntax Silver badge

      "All of which will start displaying GWX nagware."

      Or "telemetry".

  4. Anonymous Coward

    R Language, R Services.... how about just using a stats lib for whatever language one normally uses?

    Being a language geek, I've looked into R briefly... IIRC it's slow, ad hoc, and its syntax doesn't even offer any particular advantages over a lousy run-of-the-mill general purpose lang. It's only popular because a bunch of new-to-programming stats geeks latched onto it a couple of decades ago. Am I right?

    1. captain veg Silver badge

      Re: Am I right?

      Partly.

      The fact that it does vector arithmetic intrinsically is one reason. But the biggy is the enormous quantity of library code out there.

      If you need speed, compilers are available, for a price. Or try Ox.

      -A.
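To illustrate the "vector arithmetic intrinsically" point: in R, an expression such as `c(1, 2, 3) * 2 + c(10, 20, 30)` works element by element with no explicit loop. A minimal sketch of what a general purpose language has to spell out instead, using plain Python lists. The function name `scale_and_add` is hypothetical, and unlike R this sketch rejects vectors of unequal length rather than recycling the shorter one.

```python
def scale_and_add(xs, factor, ys):
    """Element-wise xs * factor + ys, written out by hand.

    Rough equivalent of the R expression `xs * factor + ys`
    for two vectors of the same length.
    """
    if len(xs) != len(ys):
        # Simplifying assumption: R would recycle the shorter vector,
        # this sketch just refuses.
        raise ValueError("vectors must have the same length")
    return [x * factor + y for x, y in zip(xs, ys)]


if __name__ == "__main__":
    print(scale_and_add([1, 2, 3], 2, [10, 20, 30]))
```

In R the loop (and the zip) is implied by the operators themselves, which is a large part of why short statistical scripts stay short.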

    2. Spacedman
      Thumb Down

      Well, if you want to call John Chambers a "new-to-programming stats geek" then yes:

      http://statweb.stanford.edu/~jmc4/vitae.html

      But yes, it's still a horrendous language but great for data munging and advanced stats methods that are almost exclusively first implemented in R (but Python is creeping up there).

      1. Lysenko

        but Python is creeping up there

        ...and Python can do a hell of a lot more than just wrangle numbers. You can, for example, write your IoT wireless comms handling largely in Python. I wouldn't fancy trying that in R.

        R isn't really a programmer's product; it is meant for people whose tool set would otherwise be SPSS, MATLAB and Excel.

        1. Spacedman

          Re: but Python is creeping up there

          Maybe now, back in t'day it (well, "S") was for people whose tool set was FORTRAN. An interactive data analysis n stats platform with graphics? Flip yeah. No more compiling, linking, running, dumping output to some format UNIGRAPH could understand for plotting...

          Python is really getting there for data analysis with numpy and pandas, though it still lags R on graphics. However I reckon the machine learning people - and let's face it, that's what Teh Big Bznesses want to use their analytical engines for - are 50/50 Python/R.

        2. Anonymous Coward

          Re: but Python is creeping up there

          > ...and Python can do a hell of a lot more than just wrangle numbers.

          Some of us have an insuperable aversion to Python on account of its opinionated ideas about source code formatting and other idiosyncrasies.

          I also have the impression that Python adoption has been somewhat stalled by the recent increase in popularity of JavaScript. No hard data to back this up, and it might just be my own subjective impression due to the places and people I frequent these days.

          1. Lysenko

            Re: but Python is creeping up there

            Python has certainly taken a hit in the web application back end arena with the rise of JS centric "Onesies" (SPAs - Single Page Applications) and that will inevitably depress its market share in terms of overall job postings, but in the context under discussion here that is irrelevant.

            As someone else posted in respect of R, it is as much about the libraries as the language. In the case of Python that means NumPy. There is nothing in the JS ecosystem that comes remotely close - ECMAScript V6 and node.js aren't going to change that.

            If anything is likely (IMO) to make inroads in this area it would be Go, if Google open source some serious math libraries, but even then it would probably be for distributed production quality analytics, not ad hoc short lifetime hackery which Python and R excel (sic) at.

            1. Anonymous Coward

              Re: but Python is creeping up there

              > In the case of Python that means NumPy. There is nothing in the JS ecosystem that comes remotely close

              That is true, and you also make a good point about JS's rise relative to Python being primarily in the web arena (server side as much as client side). Whether that will translate into a move into other arenas remains to be seen, but I wouldn't bet against it.

              From my narrow field of view, the JS crowd are quite happy to use R for their data analysis needs. In fact, I met some bloke a few weeks ago whose startup does some sort of JS/R integration.

              1. Lysenko

                Whether ... move into other arenas ... but I wouldn't bet against it.

                Depends where the project emphasis is. Server side JS (node etc.) isn't driven by JS or its ecosystem being particularly well suited to server side processing. It is driven by "full stack" concepts that promote the idea that it is a good idea to use the same language on both client and server with the implicit assumption that there actually IS a C/S division and that the client is a web browser.

                That doesn't hold for a lot of analytics. Frequently the app crunches data to a PDF, XLS or RDBMS. No web browser so no incentive to use JS besides it possibly being the only skill you've got and in that case R isn't an option either.

                For a lot of IoT "full stack" means Embedded C in microcontrollers to C/C++ wireless comms server to Python/Java/Go/Scala/Whatever database storage. JS/node can maybe handle the last part, but then so can C/C++ and that is the non-negotiable layer here. Programming an MSP430 in JS isn't going to happen. Ever.

                Python/NumPy fits best (IMO) because you can leverage it to do both the analytics and the "glue" between the comms feed and storage. Use JS/node for glue and you need to bolt in yet another language for analytics ... it causes more "tower of babelism", not less. Angular, React, JQuery etc. aren't in the game, so JS's main drivers simply don't apply.

                1. Anonymous Coward

                  Re: Whether ... move into other arenas ... but I wouldn't bet against it.

                  > Server side JS (node etc.) [....] is driven by "full stack" concepts

                  I have to disagree with you there. Well, partially disagree. The driver that you mention definitely exists, as one can see just by looking at job postings and things like that (as to whether the "full stack" concept is quite so great, I remain unconvinced).

                  However, I think that JS might be gaining ground because of some intrinsic advantages, first and foremost the existence of a choice of pretty powerful yet lean embeddable engines, and the language itself being resource-friendly (after all, it was designed to run on a browser at a time when neither browsers nor computers had all that much spare grunt). That single-thread, asynchronous execution approach may sound like bollocks at first, until you realise what a strike of genius it is. It is also to be noted that Python is more an application than a language--as far as I'm aware, there is only one implementation of it? Nothing wrong with that I suppose, except for being hit by the proverbial bus, whereas there are dozens of JS (Ok, ECMAScript) implementations.

                  Be that as it may, I am seeing more JS being used in domains where it had zero presence five years ago, notably on embedded devices and desktop programming (Qt5).
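For anyone unfamiliar with the single-thread, asynchronous execution model mentioned above, here is a minimal sketch of the same idea using Python's standard `asyncio` rather than a JS engine. The names `run_demo` and `worker` are made up for illustration. Two tasks take turns on one thread, and each gives up control only at an explicit `await`.

```python
import asyncio
import threading


def run_demo():
    """Run two cooperating tasks on a single thread and record what happened."""
    log = []            # order in which the tasks ran their steps
    thread_ids = set()  # every thread that executed task code

    async def worker(name, steps):
        for i in range(steps):
            log.append(f"{name}{i}")
            thread_ids.add(threading.get_ident())
            # Explicitly hand control back to the event loop so the
            # other task gets a turn. Nothing pre-empts us otherwise.
            await asyncio.sleep(0)

    async def main():
        await asyncio.gather(worker("a", 2), worker("b", 2))

    asyncio.run(main())
    return log, thread_ids


if __name__ == "__main__":
    log, thread_ids = run_demo()
    print(log)
    print(len(thread_ids), "thread used")
```

The steps interleave even though only one thread is ever involved. A task that never awaits would starve the other one, which is the trade-off of this model.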

                  1. Lysenko

                    Re: Whether ... move into other arenas ... but I wouldn't bet against it.

                    >>existence of a choice of pretty powerful yet lean embeddable engines,

                    Lua. Designed for precisely this. You can embed it anywhere from a WiFi chip to a mainframe.

                    >>and the language itself being resource-friendly (after all, it was designed to run on a browser at a time when neither browsers nor computers had all that much spare grunt).

                    Running in a browser doesn't limit the scale of the resources you can use, just the type (sandboxing). It is a bit irrelevant to big data analytics though. If you're trying to crunch gigabytes of data then the RAM footprint of the VM isn't a significant concern.

                    >>That single-thread, asynchronous execution approach may sound like bollocks at first, until you realise what a strike of genius it is.

                    No, it sounds like Win16 co-operative multitasking ... which is exactly what it is. Have a look at all the WSA... calls in the Windows socket API. Writing JS is almost exactly the same as writing for Windows 3.x (in this regard).

                    >>It is also to be noted that Python is more an application than a language--as far as I'm aware

                    Nope. You can fire up a CLI, but all other "application" aspects (like an IDE) are third party add-ons. The Jython implementation targets the JVM - you wouldn't call Java an "application".

                    >>there is only one implementation of it?

                    One reference implementation (CPython) sure. But also Jython, PyPy, MicroPython, IronPython etc.

    3. Anonymous Coward

      > how about just using a stats lib for whatever language one normally uses?

      I do not think you are aware of the vast amount, and quality, of libraries that have been written for R.

      Could you please point to a stats library that:

      * Approaches R in terms of features.

      * Approaches R in terms of extensibility.

      * Approaches R in terms of functionality, directly supported or via libraries/extensions.

      * Approaches R in terms of rapid prototyping or exploratory analysis.

      * Approaches R in terms of supported data sources.

      * Approaches R in terms of APIs.

      * Approaches R in terms of available documentation.

      * Approaches R in terms of scientific community acceptance.

      * Approaches R in terms of price.

      As with any complex tool, I feel that you cannot form an opinion from a "brief" look. It was only after I wrote my first real-world R application that I came to appreciate its usefulness and convenience.

      1. BebopWeBop

        It's horses for courses. I use R a great deal to get a handle on what the data might mean, and as a prototyping tool it is fantastic. I might not be so keen on using it as a production system for reasons of clarity and maintenance. Efficiency, well I really haven't run too many very very large data sets to worry about it.

        What does concern me, about R and many other languages, is that while they are extremely good at compact solutions to problems that a relatively small group of people who communicate well need to solve, scaling is difficult if a proper lifecycle is considered.

        Not that I have any better solutions though :-(

        1. Zimmer

          .....maybe you have

          "Not that I have any better solutions though :-("

          Possibly your solution is SAS software, but it is expensive....

      2. Uberseehandel

        Yes but R wasn't created in a hot Northern Hemisphere university by a white guy in a lab coat. So the ignorant are going to diss it.

        1. Gordon 10

          An important consideration is not just the quantity of R libraries but the quality of them. It's a very good example of Linus's law in action.

          Its downsides are that it's not a general purpose programming language and it's now beginning to suffer library overload, where too many of the libraries do 95% of the same thing but you need that missing 5% from each one, and then you find the overlapping bits start interfering with each other.

          It's likely to eat the lunch of the low to mid end data analytics space, as it's now directly supported by both Oracle and Microsoft on their flagship DBs, plus all the commercial Hadoop vendors like Cloudera and Hortonworks, and then SparkR is lurking too.

  5. Uberseehandel

    Why Is Everybody So Rude About R?

    Well, it was developed by a couple of Jokers in Auckland, innit?

    Data analytics is hot right now, R provides the tools.

  6. This post has been deleted by its author

    1. Anonymous Coward

      > Big data is BS. In statistical analysis, anything meaningful comes up even with much smaller random samples.

      You are correct about the statistical aspects, but I understood big data to be not (primarily) about stats but about keeping track of many facets of every individual subject. If I am mistaken I would welcome clarification from people who are into this, please.
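On the purely statistical side of that claim, here is a rough sketch of why a modest random sample is often enough to estimate a summary figure. Everything in it is assumed for illustration: the data set is synthetic (normally distributed readings with mean 50 and standard deviation 10), and the function name `compare_sample_to_population` is made up. It compares the mean of the full data set with the mean of a 1% random sample.

```python
import random
import statistics


def compare_sample_to_population(pop_size=200_000, sample_size=2_000, seed=42):
    """Return (population mean, sample mean) for a synthetic data set."""
    rng = random.Random(seed)
    # Stand-in for the "big" data set: made-up readings, mean 50, sd 10.
    population = [rng.gauss(50.0, 10.0) for _ in range(pop_size)]
    # Simple random sample without replacement.
    sample = rng.sample(population, sample_size)
    return statistics.fmean(population), statistics.fmean(sample)


if __name__ == "__main__":
    pop_mean, sample_mean = compare_sample_to_population()
    print(f"population mean: {pop_mean:.3f}")
    print(f"sample mean:     {sample_mean:.3f}")
```

With these assumed numbers the standard error of the sample mean is about 10 / sqrt(2000), roughly 0.22, so the two figures should normally agree closely. This says nothing about the per-individual tracking use case, where sampling does not help.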

      1. Mark 65

        The way I understand it, it is about targeting for sales. Think supermarket loyalty schemes sending you particular promotions. They want to squeeze every last drop of sales cash out of you by analysing your past behaviour and shopping patterns along with trends occurring in similar customers. Where a company owns multiple stores/brands they want to piece that data together to get a totally invasive picture of your behaviour. Not sure how a myriad of useless shite from connected light-bulbs will help.

  7. packrat

    r.

    anal lytics.

    predictive software and the profiles on police calls get them shooting first and asking questions later.

    GIGO.

    how 'bout the number of kids gone into private schools vs girls dyeing blond hair black?

    (canada here. RC schools, privates, publics. Or you could be a religious loon + home school.)

    gigo, again. history character impact when the whole data-set sucks.

    hot air? let us vent!

    packrat

  8. Gartal

    the IoT era already generates (and will continue to do so) huge amounts of data which is otherwise useless

    the IoT era already generates (and will continue to do so) huge amounts of data which is (>delete<otherwise) useless
