Boffins build a NAZI AI – wait, let's check that... OK, it's a grammar nazi

Pedants, imagine how much more relaxed your life would be if artificial intelligence automatically corrected grammar mistake's in online forum and social network posts. Never again would you explode with frustration and anger over misplaced apostrophe's, commas full stop's and exclamation! marks! The faults could be fixed up …

  1. frank ly Silver badge

    "... misplaced apostrophe's, ..."

    I see what you did there and in other places. Are you also a grocer?

    1. Mycho Silver badge

      Re: "... misplaced apostrophe's, ..."

      Definitely a top quality troll, no miner league stuff here.

      1. MiguelC Silver badge

        Re: "... misplaced apostrophe's, ..."

        Unfortunately, that correction would be out of scope. What a shame.

      2. Anonymous Coward

        Re: "... misplaced apostrophe's, ..."

        Trolling like a champ

        1. 's water music Silver badge
          Happy

          Re: "... misplaced apostrophe's, ..."

          Trolling like a champ

          Would of been if the author had included some more egregious 'errors',,,

    2. Magani
      Headmaster

      Re: "... misplaced apostrophe's, ..."

      ...misplaced apostrophe's, commas, full stop's and exclamation! marks! ...

      Well plaid, but you forgot the apostrophe in "comma's".

      1. Benchops

        Re: "... misplaced apostrophe's, ..."

        I think you'll find the first person pluralization is "We'll"

      2. David 132 Silver badge

        Re: "... misplaced apostrophe's, ..."

        So many grammar errors. I'm loosing patience.

        1. J.G.Harston Silver badge

          Re: "... misplaced apostrophe's, ..."

          Agh! And subject-verb-counting mismatches. So.... hard... to... read... without screaming!

    3. Anonymous Coward

      Re: "... misplaced apostrophe's, ..."

      Did find myself counting the commas, and there seems to be a lot more than I would expect in a Reg article....

    4. Borg.King
      Coat

      Re: "... misplaced apostrophe's, ..."

      I see what you did there and in other places. Are you also a grocer?

      Granville, fetch a cloth.

      (Mine's the shopkeepers coat)

  2. Anonymous Coward

    The correct term is "Theiyr're", no need for AI now.

  3. Giovani Tapini

    It will be the end of puns as we know them

    Any kind of playful use of language will be translated into AI generated "Newspeak" and turgid prose will come of the computer controlled word sausage machine.

    Why not simply develop an AI to write the text in the first place? It's probably easier than trying to fix language it cannot really appreciate.

    I look forward to the AI art critic suggesting "It's just a pile of bricks..."

    1. ArrZarr Silver badge

      Re: It will be the end of puns as we know them

      What a Brave new World.

      1. John Brown (no body) Silver badge
        Coat

        Re: It will be the end of puns as we know them

        "What a Brave, new World."

        FTFY

    2. phuzz Silver badge
      Alien

      Re: It will be the end of puns as we know them

      Why not simply develop an AI to write the text in the first place?

      Have you met amanfrommars?

      More seriously, AI is already being used to do the boring bits.

      1. Chris G Silver badge

        Re: It will be the end of puns as we know them

        There are not now, and never will be, any AIs that could embed as many concepts into a single sentence as AMFM does.

    3. Michael Wojcik Silver badge

      Re: It will be the end of puns as we know them

      Why not simply develop an AI to write the text in the first place?

      Done long ago. One well-known example is Phillip Parker's patented book-generation system, which has been used to create hundreds of thousands of books on specialty topics. Which, yes, he sells, and apparently makes quite a lot of money from.

      Generating usable natural-language prose is actually quite easy, just like generating passable music (algorithmic generation of classical and jazz music good enough to fool expert judges has been demonstrated for decades). Creating writing that's stylistically interesting, and generating new ideas on a subject, are somewhat more difficult challenges.

      In any case, the point of Shan's system, and others like it, isn't to fix broken prose. It's to attempt to add punctuation to text streams that lack it, such as ASR (speech-to-text) output, to make it easier to parse correctly. This was in the article.

  4. CT

    "At the moment, it can only deal with commas and full stops, the most common and easiest of English's punctuation marks."

    If they were that easy how come so many people do without them writing enormous walls of text without so much as a pause as if their taking one deep breath and just letting out a single massive belch of their stream of consciousness ooh look a cat video ?

    1. A K Stiles
      Headmaster

      a pause as if their taking one deep breath

      You did that deliberately, didn't you?

      1. CT

        of coarse I did

      2. onefang Silver badge

        "You did that deliberately, didn't you?"

        Considering the goal was to leave out any form of punctuation except the final question mark, I'd say that dodging an apostrophe was deliberate.

    2. Ken Hagan Gold badge

      Possibly because they are trying to imitate lawyers, who appear to believe that punctuation is subjective and therefore has no place in legal text.

      1. JLV Silver badge

        You’re assumption may need correcting and their is an example:

        https://www.nytimes.com/2006/10/25/business/worldbusiness/25comma.html

    3. Giovani Tapini

      @CT Becausewhenyouarewritingforfacebookandonlywanttosay

      whatyouhadforbreakfastwhatpyjamasyouwerewearingyoudontwriteassuminganyoneisactuallygoingtoreadyyourwalloftextyouareminlesslyfidgetingwithyourphoneinsteadofinteractingwiththosearoundyouwhichleadstoyouhavingnofriendsapartfromyourphoneandtellitallaboutyourlifebecuasethereisnootherrealityaroundyou.

  5. Wellyboot Silver badge

    oh. Harvey Mudd...

    First thought was the old Star Trek comedy villain! (Harry Mudd)

    A quick web-trawl seems to show that Mudd is a decent STEM college.

    and back to the comments in hand -

    What! Will! Happen! To! Reg! Yahoo! Headlines!

    1. werdsmith Silver badge

      Re: oh. Harvey Mudd...

      Harvey Mudd is one of the Claremont colleges where I did 6 months (actually at Pitzer but I think the swimming pool I used was at Harvey Mudd). Memory is a bit cloudy, same for a lot of people at those colleges. There was a lot of sex.

  6. jake Silver badge

    Futile.

    Written language (especially kludge known as "English"!) is entirely too flexible for a mere computer to figure out. See such (t)witticisms as "Ode to a Spell Checker" for one way to completely balls-up an AI-bot that most readers wouldn't even realize was an issue. There are many more.

    1. AndyS

      Re: Futile.

      For the curious:

      Ode to the Spell Checker

      Eye halve a spelling chequer

      It came with my pea sea

      It plainly marques four my revue

      Miss steaks eye kin knot sea.

      Eye strike a key and type a word

      And weight four it two say

      Weather eye am wrong oar write

      It shows me strait a weigh.

      As soon as a mist ache is maid

      It nose bee fore two long

      And eye can put the error rite

      Its rare lea ever wrong.

      Eye have run this poem threw it

      I am shore your pleased two no

      Its letter perfect awl the weigh

      My chequer tolled me sew.

      1. jake Silver badge

        Re: Futile.

        Or, if you want the hole thing, with attribution:

        https://forums.theregister.co.uk/forum/containing/811857

      2. Anonymous Coward

        Re: Futile.

        "Eye halve a spelling chequer [...]"

        Living in a small village near Stockholm was a good place to learn Swedish - which I did mostly by reading Asterix the Gaul. That gave me a fairly good grasp of everyday usage - but did little for my pronunciation.

        One day I went into the bakery shop and used my new skills to ask for my favourite cake - a long pastry crusted with nuts. "En av den där nötter kakor, tack". (one of those nut cakes please).

        I knew that "den" was locally pronounced as "dom" - but wasn't sure about "av" so tried my best guess.

        She picked the cake up - good - and then started to cut it in half!

        My mistake was to pronounce "av" sounding like "halv" (=half) - rather than the same sound as the English "of". Presumably there was a prior context for customers only wanting a half of that cake.

        1. onefang Silver badge

          Re: Futile.

          "Living in a small village near Stockholm was a good place to learn Swedish - which I did mostly by reading Asterix the Gaul. That gave me a fairly good grasp of everyday usage - but did little for my pronunciation."

          I have Esperanto translations of Asterix (er Asteriks I mean) books for that same reason. Though apparently my pronunciation is perfect. That's why they moved me to the advanced Esperanto class, so they could all listen to my pronunciation in awe, despite the fact I had no idea what it was I was saying. Which is why I left those classes, it wasn't teaching me anything. I later found out I pronounce Esperanto with the same thick Aussie accent I pronounce English with, just all the other Aussie Esperanto students and teachers didn't notice.

          1. Anonymous Coward

            Re: Futile.

            " I later found out I pronounce Esperanto with the same thick Aussie accent I pronounce English with, [...]"

            My Swedish colleagues in the Stockholm office said my accent was good - like a native of Gothenburg. They then explained that the Gothenburg Swedish accent is equivalent to the Scouse accent in English.

    2. Michael Wojcik Silver badge

      Re: Futile.

      Written language (especially kludge known as "English"!) is entirely too flexible for a mere computer to figure out

      So are you claiming human beings rely on something formally more powerful than a Turing machine to interpret language? What might that be?

      Of course this is a long-standing debate. Searle, though he argued forcefully against one particular approach ("symbolic manipulation") to strong AI with his Chinese Room thought experiment, believed that the human mind was a mechanical effect, and therefore that someday, assuming continued progress, we would eventually have machines that were human-mind-equivalent. Penrose believes otherwise, and thinks human minds are formally more powerful. There are many others on both sides.

  7. Sgt_Oddball Silver badge
    Coffee/keyboard

    not a religious lot then...

    Surely they should have weighed in on tabs vs spaces... much more relivant to the audience.

    (Tabs forever)

    1. Anonymous Coward

      Re: not a religious lot then...

      Upvoted until I saw the final line and switched to downvote!

    2. Martin
      Headmaster

      Re: not a religious lot then...

      Downvoted for tabs AND relivant...!

  8. Anonymous Coward

    A panda in the zoo eats shoots and leaves.

    The gangster in a restaurant eats, shoots, and leaves.

    1. Arthur the cat Silver badge
      Mushroom

      The gangster in a restaurant eats, shoots, and leaves.

      Oxford commas are a religious war by themselves.

      1. Mycho Silver badge
        Trollface

        I propose we start calling the pluralising apostrophe an inverted oxford comma.

      2. Symon Silver badge
        Coat

        Oxford comma. Who knew what old Mr. Mandela's hobby was?

        "By train, plane and sedan chair, Peter Ustinov retraces a journey made by Mark Twain a century ago. The highlights of his global tour include encounters with Nelson Mandela, an 800-year-old demigod and a dildo collector." -- The Times

      3. Anonymous Coward

        "Oxford commas are a religious war by themselves."

        Was " shoots, and leaves" an Oxford comma? It was merely a pause - not a comma separated list of items.

        1. Michael Wojcik Silver badge

          Was " shoots, and leaves" an Oxford comma? It was merely a pause - not a comma separated list of items.

          I'm afraid you're wrong; it was indeed a serial comma. The series in question is three verbs in a compound predicate. (They could also be described as three clauses, the latter two abbreviated. It comes down to the same thing.)

          Truss's eats-shoots-leaves example isn't actually much of an argument for or against the serial ("Oxford") comma, because the comma that's important for distinguishing the sense of the two constructions is the one between "eats" and "shoots". The second comma is largely irrelevant to interpretation.[1]

          The Ustinov-Mandela example someone else quoted above is a better one. In general, the serial comma really pulls its weight in cases like this, where it helps the reader distinguish between a series on the one hand, and an appositive or parenthetical phrase on the other.[2]

          The interesting thing, to me, about the serial-comma war is that it cuts across the lines of the other Great Comma War, between the "naturalists" and the "scientifics".[3] The former want English punctuation to reflect style, pacing, and often the rhythm of speech. The latter want it to conform to some sort of grammatical principles: this construction calls for a comma, and that one does not.

          You might think that the scientifics would endorse the serial comma, say, because it can clarify an ambiguous phrase. But it seems plenty of them simply classify it as "unnecessary" and therefore undesirable. And similarly the naturalists are divided between those who abhor it as an ugly interruption, and those who feel its omission is lazy and jarring.

          And then there's the ongoing fight over comma typography, specifically whether commas should be moved within closing quotation marks, in the style still preferred by many US copy-editors, or left unmolested when they aren't part of the quotation. It's a holdover from the days of lead type, and now pointless, but habits die hard.

          [1] Which makes it no less contentious, of course, since proponents and opponents are perfectly happy to wage this war over questions of style, euphony, and consistency.

          [2] Alas, a great many writers have trouble with appositives in general, or rather with treating adjectival phrases as appositives. I particularly note this when people put unnecessary commas after job titles: "Department chair, Bob Smith, said...". Those commas are not preferred and serve no purpose - "department chair" is an adjectival noun phrase preceding the compound noun it modifies. Now, if the phrase were "The department chair, Bob Smith, said..." then "Bob Smith" is an appositive phrase, and it is customary to set those off with commas. It's an appositive because "the department chair" is a complete noun phrase on its own. Really, it's not hard.

          [3] I'm ignoring the war between descriptivists and prescriptivists, because the latter are patently wrong and there's no point in discussing that further.

    2. Pseudonymous Howard

      Commas can save lives!

      Let's eat, Grandpa!

      or

      Let's eat Grandpa!

    3. MaltaMaggot

      well... if you've heard the one about the panda entering the brothel, bento box in hand, then you'll know that the panda eats, shoots, and leaves as well...

  9. Arctic fox
    Headmaster

    "..... that the word "but" is more likely to be followed by a comma....."

    It should then be trained to assassinate anyone who follows "but" with a comma. Icon? What else?

    1. Michael Wojcik Silver badge

      Re: "..... that the word "but" is more likely to be followed by a comma....."

      It should then be trained to assassinate anyone who follows "but" with a comma.

      I was tempted to agree, but, on reflection, it occurred to me that there are many cases where the clause introduced by "but" will begin with some type of phrase that is traditionally set off with commas, such as an adverbial.

      That said, there is a nasty tendency among some writers these days to move the comma that traditionally appeared before a coordinating conjunction (such as "but") to after it, and this should be greeted with scorn and derision.

  10. Primus Secundus Tertius Silver badge

    Commas and clauses

    My experience of reading junior engineers' English was of seeing clause after clause separated by commas, with only the occasional full stop. No other type of punctuation mark.

    It was English written as it is spoken, but often with very limited vocabulary. No concept that writing is a more formal performance.

    One of them told me once that I was the first person who had ever gone through their writing to point out the mistakes. This from a person in their early twenties.

    1. Nifty

      Re: Commas and clauses

      "My experience of reading junior engineers' English"

      Do we have to call in the 'use of the noun "engineer" in Anglo-Saxon culture' police?

      Or would that be the same engineer that fixes the taps in our washrooms?

  11. Anonymous Coward

    Grammar Nazi?

    Is that a racist old woman?

    1. Chris G Silver badge

      Re: Grammar Nazi?

      No, " Ve haff vays of making you talk, correctly!"

      1. Laura Kerr
        Headmaster

        Re: Grammar Nazi?

        'No, " Ve haff vays of making you talk, correctly!"'

        Diese Kommas sind verboten! Bring ihn raus und erschieß ihn! (These commas are forbidden! Take him outside and shoot him!)

        Die Korrektur (the correction):

        No. "Ve haff vays of making you talk correctly!"

      2. Anonymous South African Coward Silver badge

        Re: Grammar Nazi?

        No, " Ve haff vays of making you talk, correctly!"

        Nein nein nein nein.

        "Ve haff vays of makink you tok, sveethot"

  12. Frumious Bandersnatch Silver badge

    hahahaha... parse this, AI fool!

    I like The The. I can say that without a but. Therapy? is more fun, though.

    1. Symon Silver badge
      Holmes

      Re: hahahaha... parse this, AI fool!

      The The. Yep!

      "It was when Johnson and Christopherson flew to South America to film the videos for "Infected" and "Mercy Beat" that events started to spiral out of control. Filming in the Peruvian jungle in Iquitos, Johnson used the services of a local Indian tribe as guides. The Indians introduced Johnson, already an enthusiastic user of drugs, to the hallucinogenic concoctions used in their tribal rituals. The video for "Mercy Beat" captures a scene where during filming the crew were attacked by a rally of Communist rebel fighters, angry at the appearance of what they considered Western intruders. Johnson confirmed that the scene was genuine and unscripted, and admitted that at the time he was "so high", recalling the madness that had ensued: "Someone produced a snake which I was grappling with, and I hate snakes. A monkey bit me, and then me and this guy, who I'd only just met, cut each other and we became blood brothers, rubbing blood over each other's face, stuff like that." "

      https://en.wikipedia.org/wiki/Infected_(The_The_album)#Infected_video_film

  13. Charles King

    What we really need

    For this to really be useful, it needs to be built into those smart speakers they have these days. Then whip up a bit of code to have it scan the audio from my TV and shout 'That's *fewer* goals, you fucking moron! FEWER!' at appropriate times while I sit and relax on a Saturday afternoon. That would save me a lot of work.

    ;>

    1. 's water music Silver badge

      Re: What we really need

      Then whip up a bit of code to have it scan the audio from my TV and shout 'That's *fewer* goals,

      I trained my eldest kid to do this. She can't help herself now. Hasn't increased her popularity at school.

      1. Michael Wojcik Silver badge

        Re: What we really need

        The rejection of "less" for discrete quantities is a relatively recent trend - it started in the seventeenth century. Before then "fewer" and "less" were used more or less interchangeably. But then many of our contemporary shibboleths were introduced by the Augustans in the seventeenth, typically based on rules they derived from classical languages (such as the prohibition on "split infinitives") or etymology (as is the case with fewer/less).

        Many of these probably won't survive much longer. Usage restrictions that don't affect interpretation are typically preserved as class markers, and spoken and written language seems to be falling out of favor as a category of class distinction.

        1. Francis Boyle Silver badge

          Re: What we really need

          I understand what you're saying but I don't "we need to buy fewer milk" has ever been grammatical English.

          1. onefang Silver badge
            Headmaster

            Re: What we really need

            'I understand what you're saying but I don't "we need to buy fewer milk" has ever been grammatical English.'

            As always, when pointing out the grammatical mistakes of others, you have made your own. Isn't there a law about that or something?

            1. jake Silver badge

              Re: What we really need

              Muphry's Law, early 1990s. There are several variations on the theme (Skitt's law, Bell's first law of Usenet, etc.). It has almost always been considered an obligation to include an error of your own when correcting another's post on Usenet, Fido, email lists & etc.

  14. PNGuinn
    Go

    I'll refrain from further comment ...

    ... Until the more upright netizens have had a chance to train it and turn it into a really intelligent ... potty mouthed, racist, bigoted ....

    Yay! Tay mk ii.

    Plz canz we av an Clippy wiv ai's??

  15. Anonymous South African Coward Silver badge

    Pedants, imagine how much more relaxed your life would be if artificial intelligence automatically corrected grammar mistake's in online forum and social network posts.

    Beautiful. Made the inner grammar nazi squirm and twitch :)

  16. Spanners Silver badge
    Pirate

    The problem

    Will this thing be trained to do US "English" or the English used by just about everyone else?

    People have already mentioned the Oxford Comma, which my, Oxford educated, English teacher taught us not to do. I suspect that there are other variations from correct English that may end up in the system if this system is set to use LeftPondian language.

    They do seem to be keen on the comma splice and an unhealthy quantity of exclamation marks. What else I wonder...

    1. not.known@this.address Bronze badge

      Re: The problem

      What Spanner said.

      And how is it going to cope when it starts coming across things like quotations in foreign languages? Convert them to English even if the whole point of it being there is to inform the reader that the speaker *might not* be an honest, deity-fearing English Gentleman but could be one of them damn furriners in disguise...

      It's bad enough I have to suffer spiel chuckers that try to change every spelling to Leftpondian without them starting on grammar and sentence structure too.

      Here's a really radical idea - how about teaching children how to do this properly at school, rather than filling their heads with trendy nonsense like phonetics to teach spelling.

      1. FlossyThePig

        Re: The problem

        @not.known@this.address

        how about teaching children how to do this properly at school, rather than filling their heads with trendy nonsense like phonetics to teach spelling.

        In the '50s I learned to read using phonics. Since then trendy methods have come and gone and we are back to using phonics that was used to teach my grandchildren how to read. You do learn the differences in spelling words like "F"arming and "PH"onetics (how do you spell "ff").

        I had a colleague a few years ago who was (mis)taught to read using the Initial Teaching Alphabet. He admitted that even in his forties he had difficulty reading.

      2. Anonymous Coward

        Re: The problem

        "spiel chuckers"

        Why bring M*A*S*H into it?

      3. J.G.Harston Silver badge

        Re: The problem

        It's not phonetics that's the problem, it's phonics.

    2. Primus Secundus Tertius Silver badge

      Re: The problem

      @Spanners

      There are some 300 million Americans and 60 million British. So for obvious commercial reasons the American version will be developed first.

      At a seminar a questioner asked about American attitudes to British English. An answerer said there are two conflicting attitudes: first a feeling that British English is something special; and secondly that most British people use very poor English.

      1. the Jim bloke Silver badge

        Re: The problem

        I heard someplace that the largest population of 'English' users are actually in Asia.

        For this to be correct on a user-population count, it would need to be correcting grammar into Engrish.

        1. Laura Kerr

          Re: The problem

          Or Hinglish.

  17. Anonymous Coward

    I did NAZI

    that coming.

    (gets coat, and takes his meds before someone finds him through IP address, phone triangulation via FM radio leakage or any one of a dozen other ways)

  18. Jason Bloomberg Silver badge
    Coat

    Yadda, Yadda, Yoda

    Wake me up when it gets to the advanced stage when it can figure out if and means and or and means or and or means or or or means and and quite possibly both.

  19. Yet Another Anonymous coward Silver badge

    Am I missing something ?

    We already have excellent grammar checkers built into stuff like Word.

    So all this did was throw Project Gutenberg at the simplest form of neural net and look at the frequency of words near commas and full stops?

    1. CT

      Re: Am I missing something ?

      Minor correction: We already have grammar checkers built in to stuff like Word.

      1. Anonymous South African Coward Silver badge
        Trollface

        Re: Am I missing something ?

        An even more minor correction : We already have grammar checkers built in to stuff things like Word up totally.

        1. John Brown (no body) Silver badge

          Re: Am I missing something ?

          Allow me to offer another additional correction: We already have grammar checkers built in to stuff things like Word up totally into American English.

          1. Primus Secundus Tertius Silver badge

            Re: Am I missing something ?

            I find it worth seeking Word's opinion of my writing. I am a poor typist and it does spot a lot of typos.

            You can tell it to use British spelling or even more exotic spellings, but its 'grammar' check seems to reflect what is fashionable in California.

            1. Alistair Silver badge
              Windows

              Re: Am I missing something ?

              @PST:

              Eeeeeeeeeeeew, gag me with a spoonspune!

          2. Michael Wojcik Silver badge

            Re: Am I missing something ?

            We already have grammar checkers built in to stuff things like Word up totally into American English.

            While we're at it, they're not "grammar checkers". They're systems that apply a bunch of mostly-inappropriate heuristics, nearly all concerned with usage, mechanics, diction, and other things which are not grammar, to prose which has been mechanically chunked but not actually parsed.

            Those things may help marginal writers massage prose into something closer to someone's idea of preferred form, but they are by no means a substitute for learning to write well.

            And they have nothing whatsoever to do with the system described in this article. Really, I can't imagine how the OP thought they're at all relevant.

  20. Dropper

    Tip of the hat

    For the deliberate attempts to antagonize the few on this site that past there English language exam's.

  21. JeffyPoooh Silver badge
    Pint

    Commas (missing or extraneous) can change meaning

    So how would the "AI" be able to determine the original intent?

    Thus, I call (partial) "AI BS" on this one.

    But I'll grant that it could work most of the time, only occasionally causing a World War due to an unexpected 'AI correction' in a treaty.

    Eats shoots and leaves. = Panda

    Eats, shoots, and leaves. = Clint Eastwood

    1. onefang Silver badge

      Re: Commas (missing or extraneous) can change meaning

      "Eats shoots and leaves. = Panda

      Eats, shoots, and leaves. = Clint Eastwood"

      Eats roots and leaves. = Wombat

      Eats, roots, and leaves. = Aussie bloke.

      1. jake Silver badge

        Re: Commas (missing or extraneous) can change meaning

        Aussie blokes can't shoot? That would explain lots of things.

    2. Michael Wojcik Silver badge

      Re: Commas (missing or extraneous) can change meaning

      So how would the "AI" be able to determine the original intent?

      Thus, I call (partial) "AI BS" on this one.

      Argh. Did you read the article?

      The point of punctuation-replacement systems, such as this one, is to take text streams that lack punctuation (such as from ASR) and attempt to inject appropriate punctuation to improve parsing.

      "Determin[ing] the original intent" isn't the goal. Yes, there are ambiguous phrases in natural languages. Every single person who works on natural language processing is aware of that, and the many people who commented on this article to point it out may congratulate themselves on having made what might be the most obvious point conceivable.

      A punctuation-replacement system is a model-based transformation. That model could, in principle, be extremely sophisticated. It could build competing parse trees and select among them based on sentiment, metadata, rhetorical structure analysis, or other secondary features. It could keep whole-document context. It could use a world model to determine probable meaning of text segments. Or it could just be something like an LSTM network (or even something simpler) trained on a large corpus.
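
      To make the "or even something simpler" end of that range concrete, here is a minimal sketch of a purely frequency-based restorer. It is an illustration only, not Shan's system: the function names `train` and `restore`, the 0.5 threshold, and the choice to look at one word at a time are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict


def train(corpus: str):
    """Count, for each word, which mark follows it: ',', '.', or '' for none."""
    counts = defaultdict(Counter)
    # Each match is a word plus an optional trailing comma or full stop.
    for word, mark in re.findall(r"([A-Za-z']+)([,.]?)", corpus):
        counts[word.lower()][mark] += 1
    return counts


def restore(words, counts, threshold=0.5):
    """Re-insert a mark after a word when the corpus usually had one there."""
    out = []
    for word in words:
        seen = counts.get(word.lower())
        mark = ""
        if seen:
            best, n = seen.most_common(1)[0]
            # Only punctuate when a real mark is the clear majority outcome.
            if best and n / sum(seen.values()) > threshold:
                mark = best
        out.append(word + mark)
    return " ".join(out)


counts = train("We ate, then we left. However, we came back.")
print(restore("we ate then we left".split(), counts))
```

      A model like this has no context at all, which is exactly why the more sophisticated options listed above exist; it is only meant to show what "model-based transformation" means at its crudest.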

      But it won't "determine the original intent", any more than human readers do. That's the intentional fallacy. Writers (or speakers) don't transmit their intent through language to readers (or listeners). Readers construct interpretations, which will correspond to some degree with the writer's interpretation.

      And there's no "BS" here. Shan claims the system injects punctuation with an F1 of around 0.7, if my memory of the article serves (I'm not going back to check because the details don't really matter). That's the claim: it's a specific one, about the measured output of the system run against a particular set of inputs, compared to the ideal output.
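
      For anyone unsure what an F1 figure measures here, the following is a rough sketch of one way to score punctuation restoration. How Shan actually computed the score is not stated in this thread, so the slot-based scheme below (one possible mark after each word, a prediction counting as correct only if it is the right mark) and the name `punctuation_f1` are assumptions for illustration.

```python
def punctuation_f1(reference, predicted):
    """Score predicted marks against reference marks, one slot per word.

    Each list holds ',' or '.' where a mark follows the word, '' otherwise.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if p and p == r)  # right mark, right place
    fp = sum(1 for r, p in pairs if p and p != r)  # mark inserted wrongly
    fn = sum(1 for r, p in pairs if r and p != r)  # mark missed or wrong
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


reference = ["", ",", "", "", "."]
predicted = ["", ",", ",", "", ""]
print(punctuation_f1(reference, predicted))  # 0.5
```

      In this toy case one comma is right, one is spurious, and the full stop is missed, so precision and recall are both 0.5 and so is F1.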

  22. onefang Silver badge

    'tis only wen ya noes da rulez, dat ya can break 'em, coz reasons.

    With apologies to every English teacher that has ever had the misfortune of having me in their classes.

  23. Alistair Silver badge
    Windows

    AI can figure out comma's period's?

    I tought commas were male.

  24. W.S.Gosset Bronze badge

    *cough*

    > The AI thus ought to pick up that the word "but" is more likely to be followed by a comma than a full stop

    "preceded"

Biting the hand that feeds IT © 1998–2019