Python explosion blamed on pandas

Not content to bait developers by declaring that Python is the fastest-growing major programming language, coding community site Stack Overflow has revealed the reason for its metastasis. Coming a day after Programmer Day, which falls on the 256th day of the year – except January 7: – the explanatory post by data scientist …

  1. Will Godfrey Silver badge
    Happy

    Hmm

    So, pandering to the audience.

    1. soaklord

      Re: Hmm

      I see what you did there.

    2. ecofeco Silver badge

      Re: Hmm

      BA DUMP BA!

  2. Anonymous Coward

    How did the pandas get explosives?

    1. eswan

      > How did the pandas get explosives?

      From the guerrillas, duh.

  3. Schultz
    Boffin

    Execution speed...

    when you play with big quantities of data in science, the speed is usually limited by inefficient code, not by inherent properties of the language. When I crunch my 5 GB dataset, making a for loop a little faster won't make my code run in a reasonable time -- but moving to a sparse data representation or avoiding the loop altogether will. Python makes those things easy, that's why it is a game changer for science.
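As a rough illustration of the point about sparse representations, here is a minimal sketch using only the standard library. The data, the names `readings` and `weights`, and the helper functions are all hypothetical; a real analysis would more likely reach for a library such as SciPy's sparse matrices. The idea is simply that storing only the non-zero entries lets the work scale with the number of stored values rather than with the full length of the data.

```python
# Hypothetical data: a vector of 100,000 readings, almost all of them zero.
N = 100_000
readings = {10: 2.5, 12_345: -1.5, 99_999: 4.0}   # index -> non-zero value
weights = {10: 2.0, 500: 7.0, 12_345: 2.0}

def to_dense(sparse, size):
    """Expand a {index: value} dict into a full list, zeros included."""
    dense = [0.0] * size
    for index, value in sparse.items():
        dense[index] = value
    return dense

def dense_dot(a, b):
    """Element-by-element loop: touches every position in the lists."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

def sparse_dot(a, b):
    """Only visits the indices stored in the smaller dict."""
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)

result = sparse_dot(readings, weights)  # 2.5*2.0 + (-1.5)*2.0 = 2.0
```

Both functions give the same answer, but `sparse_dot` does three lookups where `dense_dot` does 100,000 multiplications, which is the kind of saving no amount of loop tuning would deliver.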

    1. Anonymous Coward

      Re: Execution speed...

      > Python makes those things easy, that's why it is a game changer for science.

      When talking about science, it is best to avoid hyperbole and exaggeration. Is python a sometimes convenient tool? Yes. Is it helpful to have in some situations? Absolutely. Is it a "game changer"? Hell, no.

      New algorithms for data processing and representation, or new models to analyse that data, or new theories to explain it, or new experimental techniques to measure it can be a critical new development, or a "game changer" if you will. A computer language which allows you to quickly slap together a bit of code neither your colleagues nor yourself will understand in two months' time? Not so much.

      1. Charlie Clark Silver badge

        Re: Execution speed...

        > Absolutely. Is it a "game changer"? Hell, no.

        You should probably talk to more scientists. Python has become popular among a huge range of scientists with no formal computing qualifications who are required to process large amounts of data. I've met several who would never have got their work done without Python. So, yes, for some it really is a game changer.

        1. M. Poolman
          Boffin

          Re: Execution speed...

          > Is it a "game changer"?

          For me absolutely, for at least a dozen reasons, including resource management, the ability to use it interactively and the fact that you can easily interface to C (or other low-level language) libraries. Although it's probably true that there's nothing you can do in python that couldn't be done in C, and the C would ultimately run faster, the development time in python is orders of magnitude faster, which in a scientific, especially research, context is far more important. Consider, for example, some real-world data set which could be represented by nested dictionaries in which keys can either be real numbers or strings. This is trivial in python and can be taught to science students without a programming background; doing it in C would probably be a 2nd year undergraduate programming exercise.

          Yes, there are potential drawbacks to Python, but in my view they are mainly social (untrained people tend to try to run before they can walk), not technical.
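The nested-dictionary case described above might look something like the following sketch. The sample names, temperatures, and values are invented for illustration, as is the `numeric_mean` helper. One caveat worth stating: floating-point keys are matched by exact equality, so this works best when the keys come from a fixed set of values rather than from arithmetic.

```python
# Hypothetical measurements: outer keys are sample names (strings),
# inner keys are either temperatures in kelvin (floats) or string labels.
readings = {
    "sample_a": {273.15: 0.25, 298.15: 0.75},
    "sample_b": {273.15: 0.125, "notes": "contaminated"},
}

def numeric_mean(inner):
    """Mean of the values whose keys are real numbers; string keys are skipped."""
    values = [v for k, v in inner.items() if isinstance(k, (int, float))]
    return sum(values) / len(values) if values else None

# One mean per sample, computed without declaring any types up front.
means = {name: numeric_mean(inner) for name, inner in readings.items()}
```

Mixing key types in one mapping needs no extra declarations or custom hash tables, which is roughly the gap in effort the comment is pointing at.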

        2. scarletherring

          Re: Execution speed...

          Absolutely. I work as a scientific developer at a university and we've begun offering Python bootcamps because so many researchers want to learn it.

        3. BeakUpBottom

          Re: Execution speed...

          You could make that argument for Excel/Access/VBA. But I'd rather you didn't.

          At some point it all ends up in front of an experienced programmer as a pile of novice code, a huge problem, a short deadline and requirements of "I can't quite get it working, will you take a look?"

          It's got its merits but, like everything else in this industry for the past umpteen years, all the breathless hyperbole is a bit of a turn off.

          1. M. Poolman

            Re: Execution speed...

            @ BeakUpBottom (At some point ...)

            That is an example of the social problem I mentioned. When I teach short courses to non-programmers (for very specific tasks), the first thing I say is along the lines of "you will be learning to use a programming language, but this is NOT going to turn you into programmers" (and repeat several times thereafter!)

          2. Charlie Clark Silver badge

            Re: Execution speed...

            > You could make that argument for Excel/Access/VBA. But I'd rather you didn't.

            I know no one who prefers VB(A) over Python and lots of people who've moved from VB(A) to Python and have embraced it wholeheartedly, especially as some of us work hard to make it possible to work with MS Office files without having to start Word or Excel.

            While in Python you can't simply record a macro to get something done, it's a good example of nearly literate programming. Most new users are keen to write good code and respond well to suggested improvements and I've almost never come across unreadable newbie code. I know some people hate the whitespace but it makes a real difference in these environments.

            VBA on the other hand has access to some fantastic API but as a programming language is akin to self-harm.

          3. pete23

            Re: Execution speed...

            There are millions of EUCs in excel/VBA that never need to get in front of an "experienced programmer".

            Python is filling a similar niche for people with more specialised number crunching requirements. In terms of hyperbole it's solving a smaller set of problems than the spreadsheet but it's solving them exceptionally well...

            This is actually a super-happy story in that we've actually managed to grow some combination of language and tooling that people want to use!

      2. The Indomitable Gall

        Re: Execution speed...

        @AC:

        " When talking about science, it is best to avoid hyperbole and exaggeration. Is python a sometimes convenient tool? Yes. Is it helpful to have in some situations? Absolutely. Is it a "game changer"? Hell, no. "

        That's every bit as strong a statement as the one you're seeking to refute.

        In the early days of computer programming, most people were just scheduling batch jobs (hence "programming") using a scripting language.

        The problem is, most shell scripting languages are rubbish. Most attempts at more powerful shell scripting languages (e.g. Tcl) were contorted, byzantine affairs. Javascript was clumsy to start off with, and when people tried to put it into the shell, it just felt weird.

        What is often overlooked is that Python is a shell scripting language, and it manages to maintain a pretty high level of flexibility and power while still being more learner-friendly than most languages.

        When people complain about its lack of speed, they're kind of missing the point, because in applications like data science, all the heavy lifting is done by libraries, which are generally compiled C code.

        Python with Pandas is a bit like a massively updated version of calling grep from a bash script.

        It has changed the game.
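To make the grep comparison concrete, here is a small sketch that uses only the standard library as a stand-in for pandas. The CSV contents and the `grep_rows` helper are hypothetical. In pandas the same job would typically be a boolean mask followed by a group-by, with the row scanning done inside the library rather than in a Python loop.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical data standing in for a real file of question counts per tag.
RAW = """tag,year,questions
pandas,2016,120
numpy,2016,80
pandas,2017,200
django,2017,150
"""

def grep_rows(text, column, pattern):
    """Yield CSV rows whose `column` matches `pattern`, like grep on one field."""
    regex = re.compile(pattern)
    for row in csv.DictReader(io.StringIO(text)):
        if regex.search(row[column]):
            yield row

# Filter to two tags, then total the question counts per tag.
totals = Counter()
for row in grep_rows(RAW, "tag", r"^(pandas|numpy)$"):
    totals[row["tag"]] += int(row["questions"])
```

The script itself is glue: it says what to match and what to add up, and leaves the parsing and pattern matching to library code.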

        1. Richard Plinston Silver badge

          Re: Execution speed...

          > What is often overlooked is that Python is a shell scripting language,

          Python is a computer language. The most common implementation can be used as a 'shell scripting language', or as an application programming language, or as a statement evaluation tool. Other implementations can be used as an embedded language or can compile to various VMs and/or can use JIT compilation.

    2. tfb Silver badge
      Boffin

      Re: Execution speed...

      I used to think this was true as well, but it's not. If you are dealing with large quantities of numerical data (and 5GB is not a large quantity in this sense: our jobs create terabytes a day) then having something which implements various numerical array-bashing operations efficiently does actually matter. Hence NumPy.

      1. Charlie Clark Silver badge

        Re: Execution speed...

        > Hence NumPy.

        … and Numba and PyPy, etc. Python has always followed the principle of avoiding premature optimisations and the libraries allow us to continue it.

  4. Vaidotas Zemlys

    approachable for novice programmers?

    So how exactly do you install pandas? Every time I want to try out python, I do not know which version to use; I just hope that the random pip command du jour will work and nothing will break. Should I use the Python version which came with the Mac? Or the one I installed with Homebrew? Where do the packages reside?

    1. hplasm Silver badge
      Happy

      Re: approachable for novice programmers?

      "Where do the packages reside?"

      China -Where the Panda repositories are.

    2. Charlie Clark Silver badge

      Re: approachable for novice programmers?

      Python's packaging remains a problem. However, in general you should avoid installing user libraries for a system language.

      Personally, I create a separate virtual environment for every Python project and install the required libraries only there. However, when it comes to Pandas you can also install Anaconda (from the maintainers of Pandas) which comes with its own package manager for a set of well-maintained and pre-compiled libraries.
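For anyone unsure what "a separate virtual environment for every project" involves, here is a minimal sketch using the standard library's `venv` module. The `make_project_env` helper and the temporary project directory are hypothetical. From a terminal, the usual equivalent is `python3 -m venv .venv` run inside the project folder.

```python
import tempfile
import venv
from pathlib import Path

def make_project_env(project_dir):
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False skips bootstrapping pip to keep this demo quick;
    # a real project would normally want with_pip=True.
    venv.create(env_dir, with_pip=False)
    return env_dir

# Hypothetical project directory, created in a temp location for the demo.
project = tempfile.mkdtemp(prefix="demo_project_")
env_path = make_project_env(project)
```

Packages installed with that environment's own interpreter end up under the `.venv` directory, which answers the "where do the packages reside?" question and leaves the system Python untouched.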

    3. Anonymous Coward

      Re: approachable for novice programmers?

      "So how exactly do you install pandas?"

      For engineers such as myself, this is one of the biggest headaches with Python. Compared to more engineering-focused ecosystems, package and dependency management are a pain in the arse.

      However there's also a pretty simple answer that satisfies most use cases.

      Step 1: Install anaconda

      Step 2: Set up a virtual environment per project

      Step 3: Distribute that virtualenv as a docker container

      Job done.

    4. Alistair Silver badge
      Coat

      Re: approachable for novice programmers?

      "So how exactly do you install pandas?"

      Typically one gets a zoo on board about 2 to 3 years ahead of schedule, and arranges government funding to build a specialist panda enclosure, and then one writes up an agreement with the Chinese Government and the panda breeding associations. From what I've been reading of late it costs between US$85,000 and US$1.1 million a month to host the pandas. Most of the money is supposed to go back to the breeding and protection of the species, but I've no proof of that.

      <ref TorStar article "Pandas Installed at Toronto Zoo over objections from (Free speech advocacy group) >

  5. Notas Badoff

    "It's fun (for a programming language)

    It's readable

    It has lots of libraries

    It's approachable for novice programmers"

    .

    alt.sysadmin.recovery always had a very useful motto: "All hardware sucks. All software sucks. They all suck the same."

    As applicable here, all programming languages suck. 'Fun' is an orthogonal concept.

    Libraries, people, documentation - that's the package that makes progress possible in any particular language. The language is a circumstance.

    1. Anonymous Coward

      It is the masochistic kind of fun

      It is the masochistic kind of fun. BDSM. All that matters is your pain threshold - how painful do you want it to be before you enjoy it.

      Python is for people with low pain threshold.

      Javascript and C - same but for those who enjoy the quantity - more beatings, more fun. Just light ones every time.

      Perl is for the really kinky ones - it does not hurt a lot, but hurts in some really weird places.

      Java, C++ - for those of us who need to have a glimpse of the light eternal before they get a kick out of it. A good equivalent would be - BDSM fans of strangulation.

  6. hplasm Silver badge
    Mushroom

    Python explosion blamed on pandas

    Obviously.

    Put a panda into a python and it will explode.

    viz. -->

    1. Stoneshop Silver badge
      Holmes

      Re: Python explosion blamed on pandas

      Well, there are pics on Da Intarwebz of a python explosion caused by a crocodile, but none by pandas. So I am disinclined to accept the statement posited above as true.

      1. FrancisKing
        Unhappy

        Re: Python explosion blamed on pandas

        Small pandas are OK. If the python gets greedy, then bad things happen.

  7. Uncle Slacky Silver badge
    Stop

    Why not Fortran?

    If it can't be done in Fortran, it ain't worth doing...

    1. Stoneshop Silver badge
      Boffin

      Re: Why not Fortran?

      Real Programmers can write FORTRAN in any language.

      And Real Real Programmers can write assembler in FORTRAN.

  8. Anomyous Curd

    Ansible for systems management, Pandas for small data analysis, PySpark for big data analysis, Notebooks for presentation and encapsulation.

    Python: actually pretty nice to use for almost everyone.

    1. Merrill

      Notebooks in the Azure Cloud

      Microsoft offers free Jupyter notebooks in the Azure Cloud at notebooks.azure.com for those interested in investigating Python notebooks. There are also two very basic Python courses from Microsoft on edX suitable for the rank beginner that use the notebooks.

      There are free Jupyter notebooks for the Julia language at juliabox.com for those interested in Julia, and there is a Coursera course on Julia that assumes you know other languages. The objective of Julia is to provide the ease of use of Python, R, and Matlab while running as fast as C or Fortran. See juliacomputing.com

  9. kbutler.toledo
    Thumb Up

    OFF-BY-ONE

    Off by one.... in a binary system... Oh, well.

  10. Roman_Yanush

    ...Go DataFrames! Pandas is Python's answer and the challenge that underpowered Excel will never accept!


Biting the hand that feeds IT © 1998–2019