Google's neural network learns to translate languages it hasn't been trained on

The gap between human and machine translators could be narrowing as researchers find a new way to improve the learning capabilities of Google Translate’s neural network. On the same day that Google announced its translation services were now operating with its Neural Machine Translation (NMT) system, a team of researchers …

  1. gerdesj Silver badge
    Childcatcher

    Not bad

    "Although the researchers only used a maximum of 12 language pairs for a single model, it would be easy to add more, and simple to use, as it operates on the same architecture as the NMT currently being used by Google Translate."

    "Obwohl die Forscher nur maximal 12 Sprachpaare für ein einziges Modell benutzten, wäre es einfach, mehr hinzuzufügen und einfach zu verwenden, da sie auf der gleichen Architektur arbeitet wie die NMT, die derzeit von Google Translate verwendet wird."

    "Although the researchers used only a maximum of 12 pairs of languages for a single model, it would be easy to add and use more simply because it is based on the same architecture as the NMT currently used by Google Translate."

    Pretty good (that is en -> de ->en)

    Cheers

    Jon
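
The en -> de -> en round trip above can be given a rough number. The following is a minimal sketch, assuming a plain word-overlap ratio from Python's standard `difflib` is an acceptable stand-in for "how close did it get back"; the helper name `word_similarity` is made up for illustration and is not how Google or anyone else evaluates translations.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio computed over lowercase word tokens."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


# The original English sentence and its en -> de -> en round trip, as quoted above.
original = (
    "Although the researchers only used a maximum of 12 language pairs for a "
    "single model, it would be easy to add more, and simple to use, as it "
    "operates on the same architecture as the NMT currently being used by "
    "Google Translate."
)
round_trip = (
    "Although the researchers used only a maximum of 12 pairs of languages for "
    "a single model, it would be easy to add and use more simply because it is "
    "based on the same architecture as the NMT currently used by Google Translate."
)

score = word_similarity(original, round_trip)
print(f"round-trip similarity: {score:.2f}")
```

A high score here only says that the two English sentences share wording. It says nothing about whether the German in the middle was any good.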

    1. big_D Silver badge
      Headmaster

      Re: Not bad

      That is a move in the right direction. Until now the DE-EN EN-DE translations have been hilarious at best, dangerous at worst.

      Google did one safety translation that told me to "open the case, high voltage inside". Not sure if it had something to do with me using a Windows Phone at the time.

      I am currently working at a translation office and I can tell you that Google Translate cannot, generally, translate very well. Certainly not enough that it would make me worried about working as a translator.

    2. Loud Speaker Bronze badge

      Re: Not bad

      Here is a really useful intermediate language, already recognised as a standard ...

      http://www.bbc.co.uk/news/world-africa-38000387

    3. Truckle The Uncivil

      Re: Not bad

      Maybe. How is it at translating the subtext?

      1. big_D Silver badge

        Re: Not bad

        I can only speak for DE - EN, but it has problems with the actual subject most of the time, let alone the subtext. Something else you have to think about is that translating from a language to a foreign language and back does not really mean that the foreign translation is of good quality or even accurate, just that the translation engine can understand itself well enough to get there and back again.

        A friend of mine's daughter had to hand in her Doctoral thesis in English and he ran it through Google Translate and thought it would be OK. After I stopped laughing, I re-wrote it.

        In general, before the introduction of this new AI technology, I would say that Google Translate has about a 30% hit rate for an average piece of text - by that I mean that at most 30% is good enough that it has the right subject matter and that the sentence makes any sense.

        It seems to work better, from English to German, when you use abbreviated text, for example "do not" often misses the "not" from the translation, whereas "don't" is generally translated correctly.

        I went through a phase of sending corrections to Google, but I usually don't have the time.

        Generally, I just keep Leo.org or Linguee open and just translate certain words that stump me - and sometimes it is the easy, everyday words that escape you. You know the word in English and you know the word in German, but somehow your brain doesn't make the connection.

        1. Frank Rysanek

          Re: human translator jobs

          Have an upvote for mentioning Linguee :-)

          For the last couple years, I've been trying to gradually improve my German by writing in German to our German suppliers. I tend to combine several dictionaries: one from my own (non-English) mother tongue which has some common cultural background/history with the Germans, then the Beolingus at TU Chemnitz, and I combine that with Linguee and general Google to prove that the meanings and phrases that I come up with sound remotely plausible in German. The downside is, that Sprechentlich es mir noch weh tut, weil ich die Wörterbücher nicht verwenden kann...

    4. Frank Rysanek

      Re: Not bad

      Impressive. Considering how much trouble it gives me to translate "precisely" from English to my mother tongue or vice versa, I probably wouldn't get anywhere near this precision on a "back and forth" double translation (for proof) - separated by a couple days so that I don't remember the original in any great detail.

      BTW... "da sie auf der gleichen Architektur arbeitet" : if I was to nitpick, the "sie" (she) seems to depart from the original's gender. In German, "Modell" is a neutral gender, but the translation engine chose to refer to it as a "she"... or maybe it picked up the gender from a broader context of the article? (Wrong subject on my part?) Doesn't seem so: the previous paragraph contains "the system" as a subject, which is also neutral gender in German...

  2. Charles 9 Silver badge

    To Japanese...

    "研究者は1つのモデルに最大12の言語ペアしか使用していませんでしたが、Google翻訳で現在使用されているNMTと同じアーキテクチャで動作するため、使いやすく簡単に追加できます。"

    ...then back to English:

    "Researchers used only a maximum of 12 language pairs per model, but they work with the same architecture as the NMT currently used in Google Translate, so they can be added easily and easily."

    Probably needs some more work.

    1. Frumious Bandersnatch Silver badge

      To be honest, the Japanese doesn't look too bad, though as it's a single long run-on sentence, it's hard to deal with anaphoric references. That aside, it appears that the only real problem with the final translation is not knowing what to do with 使いやすい and 簡単に, which both get translated to "easily".

      This, surely, is an artefact of focusing on collocation data. On the one hand, I think that this is a very sensible approach to translation between language pairs (eg 彼は背が高い versus "he is tall"), but on the other, the more hops you take through intermediate languages, the more it becomes a case of Chinese whispers. Once you start stringing together the little islands that make up sensible, mutually intelligible utterances without any reference to the underlying semantics, you're bound to end up with an archipelago where the first and last island will definitely not be comprehensible to each other.

      I don't know if you speak Japanese, or if you just picked it as an intermediate language for its strangeness factor. If you do, I'm sure that you can come up with many examples where the character of each individual language and (to take a slightly Whorfian viewpoint) the cultural backdrops and implied meanings make it difficult to translate things exactly. Stuff like the differences between I shall/will vs "going to" in English or conditional + いい[のに] (or ちょっと) in Japanese, plus all the rules for ellipsis in each language and what they mean, plus, obviously, things like explicit anaphora in English vs implicit topics and referents in Japanese. Handling all of that needs deep understanding of both target languages at both a linguistic and (sometimes) a cultural level, so it's no surprise that this "island hopping" leads to mutual unintelligibility at the ends of the chain.
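
The "island hopping" point can be illustrated with a back-of-the-envelope model. This is a sketch under loud assumptions: it pretends each hop independently preserves the meaning of a sentence with the same probability, and the 0.9 figure is invented purely for illustration, not measured from any translation system.

```python
def chain_fidelity(per_hop: float, hops: int) -> float:
    """Chance that meaning survives `hops` translations in a row, assuming
    each hop independently preserves it with probability `per_hop`."""
    return per_hop ** hops


# Hypothetical per-hop figure; real hops are neither independent nor equal.
PER_HOP = 0.9

for hops in (1, 2, 4, 8):
    print(f"{hops} hop(s): {chain_fidelity(PER_HOP, hops):.2f}")
```

Even with a generous per-hop figure, the product shrinks with every extra intermediate language, which is the Chinese-whispers effect in numbers.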

  3. Anonymous Coward
    Anonymous Coward

    ummm

    Precisely what did it do that it wasn't programmed to do? All this hype in pretending AI is magical and intelligent when it really isn't. Compute machines don't learn and apply that learning in a complete knowledge vacuum. Somebody programmed the damn thing on its decision trees and comparative functions, right? Cool stuff still, but it's from programming, not "intelligence".

    1. Charles 9 Silver badge

      Re: ummm

      The thing is, are we any different? We don't come up with stuff from scratch, either. We take our experiences and what we've been told by others and apply them to new stuff. See where this is going? As I recall, no one told the thing to realize Portuguese has similarities to Spanish and so on; it figured that out as it went (as would most people who study both languages; they're both Romance languages and the two countries are adjacent geographically).

    2. curi0s1

      Re: ummm

      You should read about neural networks. They aren't programmed how to do what they do, but rather trained. After training, they can be presented with new input that they haven't been trained on, and come up with useful output.
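
To make the trained-versus-programmed distinction concrete, here is a minimal sketch: a single perceptron, which is vastly simpler than the NMT system in the article and is not Google's method. The data points, learning rate and function names are all made up for illustration. Nobody writes the decision rule by hand; the weights are adjusted from labelled examples, and the result is then applied to inputs that were never in the training set.

```python
def predict(weights, bias, point):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if total > 0 else 0


def train(samples, epochs=20, rate=0.1):
    """Classic perceptron rule: nudge the weights whenever a guess is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for point, label in samples:
            error = label - predict(weights, bias, point)
            weights = [w + rate * error * x for w, x in zip(weights, point)]
            bias += rate * error
    return weights, bias


# Illustrative training set: label 1 when both inputs are large, 0 when small.
samples = [
    ((0.0, 0.0), 0), ((0.2, 0.1), 0), ((0.1, 0.3), 0),
    ((0.9, 1.0), 1), ((1.0, 0.8), 1), ((0.8, 0.9), 1),
]
weights, bias = train(samples)

# Inputs the perceptron has never seen during training.
print(predict(weights, bias, (0.95, 0.95)))
print(predict(weights, bias, (0.05, 0.15)))
```

The code that runs is still ordinary code, as the original poster says; what changes is that the numbers steering its output come from the examples rather than from a programmer.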

      1. Loud Speaker Bronze badge

        Re: ummm

        and come up with output.

        FTFY

    3. Anonymous Coward
      Anonymous Coward

      Re: ummm

      OP of ummm

      My first post ever with more thumbs down than up.. interesting. And all I did was tell the truth. I studied LISP in the late 80's. It is software that runs on hardware just like the current "AI" is software that runs on hardware. This brings up the paraphrase about "to certain people, sufficiently advanced technology is indistinguishable from magic." Add a touch of serious anthropomorphism and you have the general population understanding of AI as something that isn't code or programmed, because it is "trained" and "learns". At the end of the day it is code that was written by humans and everything an AI system or a neural network does is governed by that code, umm, period. Is it complex? Yes. Does it seem to do mysterious and unpredictable things? Most certainly. But it is not intelligent, it is code that was told what to do. Maybe someday, AI will be so mysterious that some will consider it a deity because they have no idea how it does what it does.

      1. Anonymous Coward
        Anonymous Coward

        Re: ummm

        I also studied LISP in the late 1980s. But from memory, it had 3/10s of Sweet Fanny Adams to do with neural networks.

        1. Anonymous Coward
          Anonymous Coward

          Re: ummm

          Common LISP Modules: Artificial Intelligence in the Era of Neural Networks and Chaos Theory...

          By Mark Watson

          Artificial Intelligence with Common Lisp: Fundamentals of Symbolic and Numeric Processing...

          By James L. Noyes

          Plus a google search on LISP and Neural Networks brings up a lot of info...

          Software Code written by man running on a hardware platform. Its "intelligence" is purely a case of anthropomorphism. The system translates input into numbers, compares those with what it is told to compare it to and then provides the output that it is told to provide as a result of the comparison. Dumb as a stump without good programming.

  4. Anonymous Coward
    Anonymous Coward

    "[....] and the system can even cope better with sentences that are written in a mixture of languages."

    That is where things start to get interestingly useful, as in regular conversation a lot, if not most, of us code-switch between two or more languages.

    1. Anonymous Coward
      Anonymous Coward

      A good mixed language test would be a French review of the German Neuer Knabenchor Hamburg singing an Agnus Dei in Copenhagen's Vor Frue Kirke.

      Occasionally Google translate appears to know when NOT to translate words - but without any consistency.

      It is interesting that sometimes the "detect language" mode fails to translate a single compound word on its own. It gives the translation as the original letters having decided it is some very uncommon language. Point it at the correct language and it gets it right.

  5. Notas Badoff Silver badge
    Unhappy

    Then there are the nasty details...

    Things called words. I do wish they'd fix the niggling little gotchas like "fragrant" -> Chinese -> "sweet-smelting" "I like your hot perfume - it's burnt a hole through my heart!"

  6. Jason Hindle Bronze badge

    So when it becomes self aware

    It seems unlikely it will lead to Judgement Day, but it may well develop the personality of a snooty Parisian waiter who responds to all your attempts, at speaking French, in English.

    1. Anonymous Coward
      Anonymous Coward

      Re: So when it becomes self aware

      > it may well develop the personality of a snooty Parisian waiter who responds to all your attempts, at speaking French, in English.

      Things are definitely not what they used to be. A Frenchman *not* speaking French, and you call that snooty [sic]?

      To be fair, badly-accented French can be very difficult for a native speaker to decipher. They're not necessarily being snotty there, just efficient.

    2. mosw

      Re: So when it becomes self aware

      I used to fear our future AI overlords, but given the human overlords that have recently come to power, I now welcome them.

  7. Chris Miller

    Portuguese and Spanish are very closely related, to the point where native speakers of one can easily comprehend the other (though going into a shop in Lisbon and speaking Castilian won't necessarily get you a warm welcome). There are also 'half-way houses', such as Gallego and Portenhol.

    This would have been a lot more impressive if the AI had learnt Basque or Finnish from scratch (let alone Khmer or Sioux ...)

    1. Anonymous Coward
      Anonymous Coward

      Swedish, Danish, and Norwegian appear similar in the written form.

      It is said that a Swedish speaker has trouble understanding Danish pronunciation - but that the meaning of the sentences is not a problem.

      A Swedish speaker has no problem hearing Norwegian words - but the meaning is often different.

      1. Sweep

        Danish sounds like a Swede trying to talk while being sick.

      2. Frank Rysanek

        Re: Swedish/Danish/Norwegian

        ...I seem to recall that Norwegian and Danish are closer to each other than Swedish to either of the two... some historical reasons. But it's just a faint memory of some tour guide's explanation. (Myself coming from a slavic background.)

    2. Charles 9 Silver badge

      Part of the way neural networks work is by finding things in common. Going from one family to an unrelated one would be a bridge too far even for us without a starting point.

  8. cloth

    *the researchers found evidence that....*

    What fascinates me about this stuff (and I know enough to be dangerous) is that you don't know what the NN is going to do before it happens. Nor do you know where "the logic" is when it's finished. To that end, progress will always be slow - it's taken us 25 years to get to this state.

    When I look at the NN research of today it's not moved on hugely but it's just that we have more researchers now playing with it as the "simpler" problems of enterprise solutions have been cracked so we've got time/money to focus on it. I hope we find a new tech that beats NNs because I'm not convinced they are the answer.

    1. Frank Rysanek

      Re: *the researchers found evidence that....*

      > Nor do you know where "the logic" is when it's finished

      Actually... if you know what to look for, and you equip your ANN engine with the right instrumentation, and/or you structure the ANN deliberately in particular ways, in the end you can get a pretty good insight into what the ANN does, how it's doing that, how fine-grained the learned model is etc. This is judging by what Ray Kurzweil has to say about his speech models, and the recent "deep" image recognition projects/startups also tend to produce visualized data learned in the upper layers...

      Matt Zeiler has some nice videos on YouTube: https://www.youtube.com/watch?v=ghEmQSxT6tw

      You may object that giving some a priori "structure" to the network is cheating. Well without some a priori architecture, borrowed from natural brains, our ANN's would take geological timescales to learn something useful - using GA's to start with and then some "conventional" learning...

      This is actually where I see quite some room for further progress: adding more "architecture" to the ANN's. Not just more layers on top for more abstraction - maybe some loopy structures? Denser cross-connects to link multiple sensory and motor control subsystems? Reciprocating structures vaguely resembling control by feedback loop, but applied on mental reasoning tasks, attention, symbol manipulation... driven by goals/motives, maybe stratified by priority. I would hazard a guess that a cunning ANN macro-architecture could bring some radical results (at a technology demo level) even without "throwing even more raw crunching horsepower at the problem". Ray Kurzweil hints at how our brain is structured in "How to create a Mind" - someone should start playing with a coarse structure along those lines and extrapolate from there... Kurzweil himself merely concludes that we need orders of magnitude more compute horsepower and "the mind will somehow emerge on its own". I would not be so sure :-)

  9. Mutton Jeff

    My nipples explode with delight

    have an XKCD https://xkcd.com/902/

  10. atomez

    Translating from Portuguese to Spanish, or more correctly Castilian, or the other way around is pretty straightforward as both languages share the same syntactic structure, Latin based. It's much more difficult to translate between languages with very different syntactic structures, as in Portuguese to Japanese or even German.

    1. Charles 9 Silver badge

      And English is actually easier to go with German than with Spanish because English has Germanic roots unlike Spanish which has Latin roots. That's why I like to try it with Japanese, since Far Eastern languages have much less in common with each other. That's another good challenge because of its odd semantics.

      1. Charles 9 Silver badge

        I meant to say Thai. Curse the mobile site's inability to edit.

  11. Frank Rysanek

    follow-up questions

    1) is Ray Kurzweil still at the helm? He's not mentioned in the article, nor in the referenced sources, but this is pretty much in the vein of what he was hired for, and correlated to his past work.

    2) on the title photo, what does the finger-puppet with the red hat actually say? Is it equivalent to Hello or is it some prank? :-)


Biting the hand that feeds IT © 1998–2019