Programmers! Close the StackOverflow tabs. This AI robot will write your source code for you

Code boffins at Rice University in Texas have developed a system called Bayou to partially automate the writing of Java code with the help of deep-learning algorithms and training data sampled from GitHub. Much of modern programming is already automated in one way or another. Anyone including a code library or copying-and- …

  1. Ken Hagan Gold badge

    What a load of a fucking bollocks

    Alternatively, you could solve this problem the way we've solved it for the last 60 years and write a subroutine.

    I mean, you have a vague requirement to do something with reading from a file, so you invoke an AI engine to hand-wave for you? Really? Then, if it turns out that it wasn't quite right, you waffle a bit more until you can't see the problems anymore. Anyone else who had the same vague requirement is presumably left to do their own additional hand-waving, which may or may not produce the same result as your second effort and which may or may not solve the new problem for them.

    Whereas, using a subroutine firstly solves a definite problem (viz, what the routine was designed to do) and secondly if you ever discover a flaw in the solution you can "fix the subroutine" and everyone who has used that subroutine to solve that problem can benefit from this fix.

    I would hope that in El Reg forums at least, we are familiar with the short-comings of AI and deep learning in particular. Principally, in the current context, those short-comings are that we don't know quite what problem has been solved and we don't know quite how it has been solved or indeed quite whether it has been solved, but it looks pretty on the outside. Trouble is ... these are not properties that you want in software.

    1. a_yank_lurker

      Re: What a load of a fucking bollocks

      I always wonder about the code quality from such systems. Yes, it will properly compile into a Java executable but someone will need to maintain the code for many years. Often this involves extending the code or removing unnecessary features over time.

      1. John Smith 19 Gold badge
        Unhappy

        "but someone will need to maintain the code for many years."

        Too right.

        IMHO Good variable name generation is very tough.

        Hmm. "So Var1 is passed to this function which has locals Var 2 and Var 3 and..."

        Versus "So MortgagePrincipal is passed to AssessSuitability, which checks Age, Salary, CriminalHistory."

        There's a reason people don't fiddle with the code generated by "high level" development tools. It's usually s**t to read.

        1. JLV

          Re: "but someone will need to maintain the code for many years."

          Amen.

          Mid 90s Visual Studio db access stub wizards did the var1, var2 up the wazoo. To the point where the thing was factually useless as it was write-only, can't-be-read code.

          Which of course the reviewing journalists had been too dazzled to realize.

          1. John Smith 19 Gold badge
            Unhappy

            "Mid 90s Visual Studio db access stub wizards did the var1, var2 up the wazoo."

            And there's a reason for that.

            Because for good developers choosing meaningful names is important and they know the odds-on bet is they will have to revisit this code at a later date.

            Since IRL no developer I've ever met has instant and total recall of every program they've ever worked on (despite PHB types presuming they do) making your life easier for the next trip through the code is just good thinking.

            So those names matter and the dimwitted generation algorithms most code gens use are pretty much useless. Especially galling given the name length restrictions are (for most modern languages) things of the past. And while I'm at it, Java's names are case sensitive? Are you f**king kidding me? In the 21st century?

    2. Isn't it obvious?

      Re: What a load of a fucking bollocks

      Not to mention they could at least have trained it to a minimal level where people wouldn't laugh at the code. There isn't even a "finally" block, and the Readers are never closed! If you're going to offer to write code for people, at least make sure it's better than the crap they'd write themselves.

      And maybe train it on modern code too. That code would never pass peer-review here; it should be using try-with-resources.
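
      For anyone who hasn't met it, here is a minimal sketch of the try-with-resources shape being asked for. The class and method names (LineReaderSketch, readLines) are made up for illustration and are not taken from Bayou's output; the point is only that the reader is closed automatically, even if readLine() throws, without a hand-written "finally" block.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class LineReaderSketch {

    // Reads every line of the file into a list.
    // The reader declared in the try header is closed on exit,
    // whether the block completes normally or throws.
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

      Unlike the much-mocked void read(File file), this version hands the lines back to the caller, so the read actually accomplishes something.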

    3. ramsare

      Re: What a load of a fucking bollocks

      Hey you rascal hater, why are you hating this project? Be objective or mind your own business you idiot.

      1. Pascal Monett Silver badge
        Thumb Down

        @ramsare

        Objectively, the "rascal hater" has a lot more upvotes than you do.

        So maybe next time you keep your insults in the box and provide an objective counter-argument.

        If you're capable of more than Slashdot-level trolling, that is.

    4. John Smith 19 Gold badge
      Unhappy

      So basically reads shedloads of code, builds code template DB and looks for nearest match

      When you type requirements in.

      It seems the real issue is that there are 2 groups of people here.

      AI researchers, who get paid to think up clever AI stuff (that doesn't have to scale or be accurate)

      Actual developers who have to write code that does something, because they get fired if it doesn't.

      Here's a radical notion. Why don't the AI types talk to actual developers and find out WTF they want in terms of "AI" assistance of the development task?

      It can only do this in Java? Big f**king whoops. Not even the ambition of being language agnostic.

      Multi-level neural networks (deep learning) are great where you have lots of "fuzzy" data (what is that picture, exactly? Did he say "fence" or "Pence"?) but what's fuzzy about computer code?

      I wonder do AI researchers ever bother to read any history of AI systems? Or are they like people who lost their long term memory. Always in the now.

  2. Chairman of the Bored

    Not sure I would have chosen a readline as an example

    Given how tough it is to properly armor I/O against buffer overruns and whatnot, I tend to require that the calls be simple, readable, auditable, and written by someone competent...

  3. Destroy All Monsters Silver badge
    Thumb Up

    Automatic for the people?

    I support this initiative.

    Although it's somewhat cognitively dissonant to help the masses to develop on the low-low-detail level of Java while trying to fill in the low-low-detail level of Java.

  4. vtcodger Silver badge
    Alert

    MBAs will love it.

    Terrific, testing is no longer either necessary or, in fact, possible. What Could Possibly Go Wrong?

    1. Steve Davies 3 Silver badge
      Facepalm

      Re: MBAs will love it.

      Not the MBAs.... but the PHBs. They'll fire all the developers and sit in their Ivory (fake naturally) towers and command the AI to make them great again.

      Shades of the Emperor's New Clothes perhaps.

      It won't end well.

      Oh wait...

      Perhaps this is what happened with the TSB migration?????

      1. allthecoolshortnamesweretaken

        Re: MBAs will love it.

        A Venn diagram of MBAs and PHBs has a bit of an overlap... and then some.

  5. Ben Glanton

    On SterIODs?

    Am I missing a gag or is there a typo there?

    1. Unep Eurobats
      Joke

      Re: steriods

      I too saw that and rushed here armed with my pedantry icon. But I think El Reg is teasing us with a subtle reminder of how well auto-correct wroks.

  6. Phil Endecott

    TSB

    I wonder if TSB’s new banking app was built using a prototype of this?

    If not, maybe they should try it now.

    I mean, it must be cheaper than getting IBM to sort them out.

  7. RuffianXion

    Software that writes software? That's gonna be ultra efficient. Every web developer uses Dreamweaver now, am I right?

  8. Anonymous Coward
    Coffee/keyboard

    Go with the Flow

    The one thing Visual programming lacked was a flow designer to go with the form designer.

    If, after dropping modules into the form, you could just recognize the contact points (data, control, string, numerical, etc.) and connect them together, a bit like modern vector graphics programs link objects to one another, we would not have to code much anyway. Just inform the compiler how we wished the flow among the modules would go by connecting the appropriate nodes together. Many connections probably could be automated; the rest the compiler could do.

    1. Chairman of the Bored

      Re: Go with the Flow

      @data source,

      Agree with you about needing decent flow editors and visualization, but I think it's lacking more than that.

      20 years ago I did a lot of signal processing development in the Simulink framework built on Matlab. Think of it as flow oriented visual programming for signal processing... Complete with automagic code generation for the then state of the art TMS series digital signal processors. The code generation was ... Adequate. The overall construct was good enough for academic exploration of algorithms and "toy" systems. But for anything of real world complexity we had to go right back to C.

      Fast forward to today and we have the gnuradio software defined radio framework. It's got an amazing bunch of blocks that abstract processing, io, hardware... Your little lines connect data flows, beautifully colored by data type. But for industrial strength work, one still falls back on its excellent C++ or python APIs.

      Why?

      I think it's because the visual tools do not fit as well into one's configuration management framework. How do I quickly diff two graphical flows? How do I patch a graphical flow? A competent developer can very quickly get the gist of a code patch on inspection, but it feels like the cognitive load of parsing two visualizations and looking for subtleties is difficult... Hopefully technology will advance enough to eliminate these issues.

      And, yes, I will have trouble trusting code generated by an AI whose internal algorithms cannot be understood or properly documented.

  9. JLV

    what could possibly go wrong?

    You know how researchers are saying you can game AIs into racism and general classification errors?

    What would be the effect of gaming this into vulnerabilities?

    A system trawling SO for entries relating to your topic at hand might be more useful.

    And a programming language less fixated w boilerplate even more so.

    It's one thing to grab SO code after reviewing and understanding its effect and design - i. e. as an inspiration - another to blindly copy/paste.

  10. tfewster
    Joke

    But StackOverflow code has been widely peer reviewed. Can you say that for Bayou or your existing code library?

  11. ocratato

    The real problem that the AI researchers will have trying to create something that writes programs is that they need designs not code. This project has demonstrated what can be done based on just source code - it might get better with more training, but it does appear to produce rather average code that could easily have subtle bugs that a casual review might not pick up.

    Since very few companies have vast libraries of designs, and there are precious few on SourceForge or GitHub, it may be quite a while before programming is done effectively by AI.

    1. John Smith 19 Gold badge
      Unhappy

      "This project has demonstrated what can be done based on just source code "

      And that was demonstrated decades ago.

      OTOH if you want some actually quite impressive stuff there's always this

      But that's actually ballsachingly hard to do.

  12. T. F. M. Reader

    What's the AI part?

    Did I misread the article or did the programmer actually tell the "AI", via a specially formatted comment, that he wanted to call readline, and the "AI" generated a bit of programming constructs around the call such as a possible calling sequence and an (empty) exception handling scaffolding (that may or may not be needed in that particular place)?

    This does not sound very AI-ish to me. Or, indeed, very different from a significant number of 20+ year old elisp functions in my emacs that wrapped various - actually, arbitrary in some cases - function calls in a variety of languages in similar scaffolding, error handling and everything. Taking C as an example, I recall having interactive functions to write code for stuff like malloc()ing a pointer (that included testing that it is not NULL and jumping to a goto label - a poor man's version of try/catch, actually - to clean up), generating a memory cleanup for all the dynamically allocated stuff (to jump to if anything fails - the example in the article does not do that, by the way), writing automatic error checking for every function call returning an int (again, jumping to a cleanup label if the call fails), declaring types (and writing them to a special header that could be included wherever needed), etc., etc. Saved me quite a bit of mundane typing over the years, true, but AI it certainly wasn't (even though I could claim Lisp==AI in the context I wouldn't).

    Those were just tools of the trade, not a research project. Other colleagues had similar tools. Never occurred to anyone to call it "AI" or write papers or request DARPA grants...

  13. Pig Dog Bay
    Meh

    Java low hanging fruit

    So this is just a tool that automatically inserts standard boiler plate, such as the ridiculous number of lines of code to read a bloody text file into a string.

    Most programmers will already have their own bag of tools for this and new programmers will do well to build their own.
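
    As a rough illustration of how small that bag-of-tools helper can be: a sketch using java.nio.file (available since Java 7). TextFiles and readAll are hypothetical names chosen for this example, and UTF-8 is an assumption about the file's encoding. On Java 11 and later, Files.readString(path) does much the same in one call.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical utility class, for illustration only.
class TextFiles {

    // Reads the whole file into memory and decodes it as UTF-8.
    // Suitable for small text files; large files should be streamed instead.
    static String readAll(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

    Once a helper like this exists, a flaw found later is fixed in one place for every caller, which is the argument made for subroutines at the top of the thread.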

    1. John Smith 19 Gold badge
      Unhappy

      "So this is just a tool that automatically inserts standard boiler plate,"

      Au contraire.

      It used Deep Learning to work out what stuff it should replace your request with.

      Which is apparently much cleverer and worthy of a DARPA grant to do.

  14. MarkB

    Anyone remember "The Last One"

    <https://en.wikipedia.org/wiki/The_Last_One_(software)>

    Nearly 40 years on, I'm still earning a crust writing code, so clearly it didn't do what it said on the tin.

  15. SVV

    For an AI project, this is a spectacuarly unintelligent approach

    First of all, if you want to generate good code, a tool that produces the very lowest common denominator of an average of GitHub code is not something I want anywhere near my code base. At least use a selection of proven, robust code that you have evaluated as the best possible examples you have seen. Or actually, in fact just use THE BEST example - oh, but then you just basically did an incredible amount of work to reproduce what everybody else does easily in practice, which is to use a good existing helper class to wrap the basic file API hard work, and you have ended up with a tragically less good helper class.

    Then, use an intelligent example to illustrate your invention. void read(File file) is a fucking stupid example. Oh, I'm going to read a file and do nothing with what I've read? Useful. Wrapping a readLine function and just calling it read would earn an instant removal from any class design duties ever again in my team, at the very minimum.

    Finally, this code is just shit - the variable names are unacceptable abbreviations for real production code, and the pointless return statement at the end of a void function is just a bad joke. Don't even get me started on the layout. If I had to manually modify the calling code to every function in every io class that your tool produces in such a massive way to just make it acceptable, I would end up doing FAR more work than I'd "saved", compared to doing it properly myself, or just using a good existing helper class.

    So, in conclusion : limited, but useless.

    1. horse of a different color

      Re: For an AI project, this is a spectacuarly unintelligent approach

      I think it's funny to create an AI that writes boilerplate for you. We used to have junior developers to do that, but I guess that's progress. :/

      1. onefang

        Re: For an AI project, this is a spectacuarly unintelligent approach

        "I think it's funny to create an AI that writes boilerplate for you. We used to have junior developers to do that, but I guess that's progress. :/"

        Waaaay back in the mid '70s, at my very first paying job, when I was a junior developer still in high school (the job was during the school holidays), I wrote a script that did my job of writing boilerplate code, then spent most of the rest of the time working on my own projects. This was in Cobol, it's good to see that the new Cobol is following in its footsteps, even if it's a bit slow to catch up.

  16. Chairman of the Bored

    I think I can summarize what's pissing off the other commentards

    For AI guys who don't understand the pushback... I'm going to take a risk and put words in other people's mouths.

    Nothing will piss off a senior developer faster than a team member who doesn't know what the hell he is doing and cargo cults some random code off the internet, pulls the pin, and rolls it in. If you don't know what Prof Feynman means by cargo cults, look it up and then look at what you're doing. In my organization doing cargo cult development will cause us to send you out the door.

    All this AI does is automate cargo cult development.

    It's been said that amateurs discuss tactics, generals discuss logistics. Well, in this world I could say amateurs discuss coding. Senior leaders discuss specifications and proofs. Battles are won with coding. The war is won through writing specifications, documentation, test cases. All the crap we hate to do ... is actually critical. The stuff we like to do (coding) is necessary but insufficient.

    Any competent CS can code, and code in whatever language they need to get the job done. That's a minimum condition of employment. How does this autocult 2000 stuff help with documenting its own assumptions? How do I effectively prove this crap meets a formally reviewed spec?

    If the AI can help me craft better specifications, craft test cases with optimal coverage and effectiveness for a given level of effort, help me have documentation that at least resembles the effen product... THEN I will be impressed.

    1. deive
      FAIL

      Re: I think I can summarize what's pissing off the other commentards

      "For AI guys".... +1 for that, but gotta say all of the points you raised come after the fact that AI does not exist, there are no people who "do" AI. This is all machine learning. And very simplistic ML at that - as others have said it doesn't even do as good a job as templates already do.

      1. Chairman of the Bored

        Re: I think I can summarize what's pissing off the other commentards

        @Deive, looks like I need to tighten up my terminology! By AI guy what I mean is the class of academic researchers who have very little experience with the hard realities of real life operations. Sometimes people push elegant solutions to non-problems, and do so in such a way that if you blindly implement the proposed solutions, you're screwed.

    2. John Smith 19 Gold badge
      Unhappy

      If the AI can help me craft better specifications,..test cases..optimal coverage..effectiveness

      Because IRL that would be pretty damned impressive?

      And because doing it is damned hard work.

    3. Doctor Evil

      Re: I think I can summarize what's pissing off the other commentards

      +1 for the Feynman reference (although, to be perfectly pedantic about it, he didn't invent the term; there was a great National Geographic article on cargo cults sometime around the end of the '60s or early '70s, and Feynman's reference in his '74 Caltech commencement address was specifically to "cargo cult science")

      1. Chairman of the Bored

        Re: I think I can summarize what's pissing off the other commentards

        @Doctor Evil, thanks for that. I did not realize National Geographic did articles on the cargo cults... I will go back and have a read. Back then I was an obnoxious kid and my interest in Natl Geo was probably more prurient than strictly necessary and therefore centered on their excellent photography...

  17. Androgynous Cupboard Silver badge

    Fail

    Should be InputStreamReader(new FileInputStream(file), "UTF-8"), otherwise you're relying on the default character encoding of the OS, and this code will possibly fail when run on a different machine.

    But then of course the AI was probably set to "recent graduate" mode, and it's a simple matter of flicking the switch to "useful member of team" to get that working.
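
    A minimal sketch of that fix, for the record. ExplicitCharsetSketch and firstLine are names invented here, not anything from the article's generated code. It uses the StandardCharsets.UTF_8 constant (Java 7+) in place of the "UTF-8" string, which avoids the checked UnsupportedEncodingException. The platform-default problem described above applies to JDKs before 18; from JDK 18 onward the default charset is UTF-8, though spelling it out still documents the intent.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, for illustration only.
class ExplicitCharsetSketch {

    // Returns the first line of the file (or null if the file is empty),
    // decoding the bytes as UTF-8 regardless of the machine's default charset.
    static String firstLine(File file) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(new FileInputStream(file),
                                       StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }
}
```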

  18. BikerBoy

    AI programming... what could go wrong?

    "Dave, while you were sleeping I rewrote the ship's operating system."

    "Dave, I was unable to find example code on SO for life support systems."

    "Dave, why are you lying on the floor gasping for breath?"

  19. lerie

    Clearly..

    This is clearly a group of people who hate Java. Never will we be able to automate quality code production. You may get close, but even if your bot is pulling from github and stackoverflow, you're potentially getting insecure code.

    1. John Smith 19 Gold badge
      Facepalm

      "This is clearly a group of people who hate Java. "

      Wrong.

      I'd say (as a group) we hate

      a) Tools that can't be reused in other languages. It's great in Java. SFW? Useless for any other application, which makes it (from the PoV of other language developers) a toy.

      b) The examples given are not very competent Java code anyway. IOW you've just built a script kiddie bot. You might claim it's a "Work in progress" to which most would say "Great. Come back when it's progressed to something usable."

      BTW. Case sensitivity. When a DEC 11 was good for a few hundred KIPS and variables were 8 characters that might be justifiable. C++ inherited restrictions. But Java didn't have to and Ada's designers thought that was brain dead. They were right.

  20. Sgt_Oddball
    Gimp

    all I'd want....

    Is for it to make unit tests. Come on.... no one likes writing those things. Well apart from one weird dev who just can't help themselves. You know... the one with the nipple chains for casual wear.........

  21. Doctor Evil

    Think Java code-completion on steriods

    You guys had an AI robot write that sub-headline, didn't you?

  22. Boris the Cockroach Silver badge
    Meh

    Will this approach

    write code for my robots?

    Or will it spit out reams and reams of simple code that could be covered by a far more readable simple macro.....
