Australian test finds robot essay assessors on par with human teachers

Software has emerged as the equal of humans when it comes to marking essays in an Australian study. The test of test-marking software was conducted by the Australian Curriculum, Assessment and Reporting Authority, which administers standardised tests called the National Assessment Program – Literacy and Numeracy (NAPLAN). …

  1. a_yank_lurker

    Not sure

    What kind of essays were being written? If it was for a class, this might not be so successful, but for a mass exam (US SAT), it might work.

    1. Michael Wojcik Silver badge

      Re: Not sure

      Yes, it's the latter sort, according to the article. The rubrics for standardized-test essays are very narrow and shallow, since they seek to eliminate as much subjective judgement as possible.

      Even in the relatively innocent early '80s, when machine evaluation of exam essays was infeasible, and even for the US AP English exam's essays (which were exercises in literary criticism and as such weighted substance rather more highly than the sort of "can Johnny write" test described here), the grading rubric was a substantial document, several pages long. It allowed quite a bit of latitude but took pains to try to emphasize a consistent interpretation of those qualities the exam was supposed to test.

      In the '90s, when portfolio evaluation was the rage in college composition classes, instructors would hold calibration sessions where they'd all evaluate the same set of portfolios, then compare and discuss the grades they assigned. That's a much more nuanced and useful way to get consistent human judging, but of course it's very resource-intensive.

      There's a huge body of research on evaluating writing, which standardized testing companies blithely ignore. It's quite an active area of research, apparently, with many contentious disagreements. I have friends in the industry who don't believe evaluation, as it's currently conceived, is even meaningful. It's not that they think everyone's prose is equivalent - just that the ways in which we conceive of writing as "better" or "improving" are so subjective, culturally specific, and inconsistent that we should stop pretending the metrics we've been using actually mean anything.

      So what this present study boils down to is the nearly tautological observation that if you can reduce the evaluation of writing to something sufficiently mechanical, then you can mechanize it. Well, yes. Whether you've achieved anything useful thereby is rather another question.

      1. This post has been deleted by its author

      2. antiquam bombulum

        Re: Not sure

        The article is not so much about 'automated marking is always better for every purpose', but a much more specific one about whether, for the purposes of marking the writing component of the NAPLAN test, an automated system could do at least as well as human marking.

        I think much of the intent of the NAPLAN marking automation was directed at optimising the logistics of the process. The time taken to manually mark was of the order of 6 weeks, if memory serves, and there was considerable pressure to reduce the time between sitting the tests (April, I think) and final results being published (September?). Despite hiring experienced people (usually former teachers and occasionally, principals) and conducting training beforehand, it was still necessary to remove markers who could not mark consistently (i.e. against pre-marked test scripts randomly inserted into the system). So if a piece of software can do the job consistently and at least as well as experienced markers, it's a simple choice.
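        The consistency check described above - markers compared against pre-marked scripts randomly inserted into the system - can be sketched as a simple calibration routine. This is a hypothetical illustration, not ACARA's actual procedure: each marker's scores on the seeded scripts are compared with the pre-agreed scores, and anyone drifting beyond a tolerance is flagged.

```python
def marker_consistency(marker_scores, gold_scores, tolerance=1.0):
    """Mean absolute deviation of a marker's scores from the pre-agreed
    scores on seeded calibration scripts. Markers whose deviation
    exceeds the tolerance would be flagged for removal."""
    diffs = [abs(m - g) for m, g in zip(marker_scores, gold_scores)]
    mad = sum(diffs) / len(diffs)
    return mad, mad <= tolerance

# A marker who tracks the agreed scores closely passes (deviation ~0.67)...
print(marker_consistency([5, 6, 4], [5, 5, 5]))
# ...while one who drifts badly is flagged (deviation ~3.33).
print(marker_consistency([9, 2, 8], [5, 5, 5]))
```

        The tolerance value and the deviation measure are both assumptions; any real programme would calibrate these against its own rubric scale.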

        Speed of marking obviously goes up, as does consistency. Handwriting difficulties are removed (in the fully automated process, the students type their answers, rather than hand-writing them and then having them scanned, as at present). A considerable amount of garbage-removal is also avoided, because there are fewer opportunities for mis-identifying scripts - it was a constant bugbear that school staff would cross out the pre-printed name on a test book, hand-write a different student's name on the cover and then give it to the new student (usually done when the original student was absent and they'd run out of test books). They would leave the barcode that identified the original student, and the test score would get automatically assigned to the original student. The list of things like this that have to be hunted down and cleaned up is lengthy. Eliminating the scope for stuff like that improves both timeliness and accuracy. (Any process that involves attempting to obtain the coordinated action of more than a handful of teachers and principals is like trying to herd cats, and is best avoided.)

        The wider issue of 'can automated marking get at the true subtleties and creative abilities of humans?' is a complex argument, but for every example of where software falls short, you can point to equally many where highly experienced and knowledgeable humans do too. The Leavisite 'wars' are a good example of where it can go wrong in a particularly destructive way. But on a more practical level, where a student has been given a mark that a school or parent believes is anomalous, a re-mark can always be requested. Re-marks do occasionally happen under the present system, and almost certainly would under a fully automated one too.

  2. Old Handle

    Next Step

    Someone needs to apply the same principles in reverse and make an automatic essay generator.

    1. frank ly

      Re: Next Step

      The essay generator could pass its output to the essay marker to get feedback of its own score, then adjust its own output so as to maximise its score. If done properly, this would result in a 'perfect' essay.

      1. Michael Wojcik Silver badge

        Re: Next Step

        The essay generator could pass its output to the essay marker to get feedback of its own score, then adjust its own output so as to maximise its score. If done properly, this would result in a 'perfect' essay.

        If the generator is simply running the evaluator's model backwards, as Shannon suggested decades ago, there's no need for it to check the output against the evaluator. It already knows what the result will be.

        If, on the other hand, the generator uses a different model, then you have the same situation as you have today with "Turing Test" chatbots, many of which are trained by running them against NLP evaluators in just this manner.

        It's a standard technique, in other words.
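        The generate-score-mutate loop described above is easy to sketch. The marker below is a deliberately dumb stand-in of my own (it rewards long words and varied vocabulary, purely for illustration), but the hill-climbing loop against it is the standard technique in miniature: any mechanical scorer can be climbed against in exactly this way.

```python
import random

def mark_essay(words):
    """Toy stand-in for an automated marker: rewards longer words and
    a 'varied' vocabulary. Purely an illustrative scoring rule."""
    if not words:
        return 0.0
    avg_len = sum(len(w) for w in words) / len(words)
    variety = len(set(words)) / len(words)
    return avg_len + 5.0 * variety

def optimise_essay(vocab, length=20, steps=500, seed=0):
    """Hill-climb: mutate one word at a time, keep any change that
    raises the marker's score."""
    rng = random.Random(seed)
    essay = [rng.choice(vocab) for _ in range(length)]
    score = mark_essay(essay)
    for _ in range(steps):
        candidate = essay[:]
        candidate[rng.randrange(length)] = rng.choice(vocab)
        new_score = mark_essay(candidate)
        if new_score > score:
            essay, score = candidate, new_score
    return essay, score

vocab = ["cat", "sat", "mat", "perspicacious", "notwithstanding", "epistemology"]
essay, score = optimise_essay(vocab)
print(round(score, 2))  # higher than the unoptimised starting essay
```

        The result is gibberish that scores well - which is rather the point of the objection to mechanical marking.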

    2. Michael Wojcik Silver badge

      Re: Next Step

      Someone needs to apply the same principles in reverse and make an automatic essay generator.

      Been done many times, with varying degrees of success. Modern chatbots are very nearly essay generators, and could easily be tweaked into them. Academic-paper generators are well-known.

      Automatically generating salable books is also an established field, and would be even more widespread if it weren't for Philip Parker's (very lucrative) patent.

      There are more interesting uses of Natural Language Processing in the writing classroom. One project I've been involved in uses it as one component of a multivariate system that ranks students on how helpful they are to their peers when they engage in peer review of each other's writing. The rankings aren't used to grade the students' class performance, but to help them see whether their peers find their feedback useful, and to help instructors see which peer groups are performing well and which ones are having difficulty.

  3. Anonymous Coward
    Anonymous Coward

    In defence of teachers

    It would be nice to give teachers back more time.

    With two siblings who are teachers, I know that significant unpaid hours are spent marking tests, prepping lesson plans, etc.

    The extra annual holidays don't come close to compensating.

    Apple?

    1. Filippo Silver badge

      Re: In defence of teachers

      Hear hear. I guess you got downvoted by one of those people who are convinced that teachers only work while in class. Which makes just about as much sense as believing that farmers only work at harvest.

      At the school where my SO teaches, 50+ hour work weeks are fairly regular, and the current week looks like it could break 60. Yet there is no overtime pay at all, and a large part of that work is actually unpaid. If she could somehow trade the extra holidays (which, btw, aren't three months, not even close) for an enforced timetable, she'd do it in a heartbeat.

      1. DaveyDaveDave

        Re: In defence of teachers

        I don't think anybody thinks teachers only work while in class. We all know that they have lots of planning and marking to do, because every single teacher feels the need to constantly remind us of the fact.

        Those of us who are somewhat less sympathetic would tend to point at the fact that they knew this when they took the job, and they still know it now that they are doing the job, and are entirely within their rights to leave, but they don't. OK, I know lots of teachers do leave, but those who are still teaching but complaining about it clearly aren't leaving. Evidently something about the job (the feeling of pride when little Johnny finally learns to count past 3?) makes the cost/benefit decision mean it's worth staying, for that particular person. Exactly the same way everyone else decides whether or not to stay in their job, in every other walk of life.

        Of course, the fact that I've already started seeing my teacher friends on Facebook smugly counting down until their 2 week Christmas holiday, while I have to count myself lucky if I get to leave a bit early on Christmas eve doesn't help me feel any more sympathetic.

      2. Phil O'Sophical Silver badge

        Re: In defence of teachers

        a large part of that work is actually unpaid.

        Unless teachers are paid an hourly wage, which I doubt, none of it is "unpaid". Like the rest of us, they're paid a salary to do a job, and like the rest of us they sometimes find that job encroaches on times they would prefer to have free for personal use.

        They may not consider their salary sufficient compensation for the work they do, that's a different issue, but the work is all paid for.

      3. Bob Dole (tm)

        Re: In defence of teachers

        >>I guess you got downvoted by one of those people who are convinced that teachers only work while in class.

        Quite the contrary. I'm well aware that teachers work while not in class, grading papers etc. I'm also well aware that MOST of the REST of the world ALSO performs work while not at the office, and yet they don't get 3 months off in the summer and extra-long other holidays. In other words, please stop pretending that teachers work longer hours than all the rest of the work force. They don't - not even close.

        1. Anonymous Coward
          Anonymous Coward

          Re: In defence of teachers

          RE: Bob Dole (tm)

          C-

          See me after school

          1. Bob Dole (tm)

            Re: In defence of teachers

            >>See me after school

            I'll be at my desk job when school gets out, but I could probably be there around 9PM - does that work for you?

    2. Anonymous Coward
      Anonymous Coward

      Re: In defence of teachers

      > It would be nice to give teachers back more time.

      Well, if they can have automated essay marking programs, let the students use automated essay writing programs, it's only fair. Then the teachers can get 100% free time.

      1. e^iπ+1=0

        Re: In defence of teachers

        "Well, if they can have automated essay marking programs, let the students use automated essay writing programs, it's only fair. Then the teachers can get 100% free time."

        That's all well and good; however, how are automated essays going to help in subjects where essays are relatively unimportant, such as, say, maths?

        Don't tell me - automated maths solutions. Nowadays one can bring a device into certain maths exams which facilitates answers, such as a calculator.

        1. jonathanb Silver badge

          Re: In defence of teachers

          Given that there is usually a single correct answer to a Maths question, that should be easier for a computer to mark than an essay.

    3. Michael Wojcik Silver badge

      Re: In defence of teachers

      Responding to student writing was why I quit teaching. Hour after hour of reading their work and trying to decide what a judicious amount of comments would be - how to make the most important critiques and most useful suggestions without drowning them in feedback. Simply mind-numbing after a while.

      Oh, and Phil O'Sophical: In the US, at least, many teachers are paid hourly. Graduate assistants and a lot of other part-time faculty generally have contracts that are written in terms of hourly wages, with a fixed stipend computed based on a set number of hours per week. The latter figure generally bears only a passing resemblance to reality, at least for the better sort of teacher.

  4. JLV

    Anyone see a parallel with buzzword-heavy professional writing? If all you are assessing is the vocabulary, jargon and structure, any crap will do, if presented nicely. Creativity and relevance? Optional ;)

    Seriously then why bother putting your heart into it if a machine is grading? Much better off figuring out ways to game the system.

    1. Terry 6 Silver badge

      What this really points to is the way education has been influenced by press campaigns, lobby groups, and a demand for measurable results, at the expense of real education. We teach what can be measured, rather than finding a way to measure what we teach.

      The teaching has become mechanical so that the success criteria can be measured by a machine.

    2. Michael Wojcik Silver badge

      Much better off figuring out ways to game the system.

      That's true of nearly all of the standardized testing I've participated in. (There are some exceptions, where human judges were employed, at significant expense, and asked to judge such elusive attributes as creativity and relevance. The AP English exams I mentioned in an earlier post are an example.)

      Perhaps the most frequent critique of the increasing spread of standardized testing (for example with No Child Left Behind in the US) is that it leads to "teaching to the test", which is simply one way of gaming the system. When you heavily bias the payoff matrix in a predictable way, that's what you get.

  5. Anonymous Coward
    Anonymous Coward

    This isn't a problem if the teachers use it to help them

    Teachers probably waste a lot of time marking for stuff that would be better performed by a neutral unbiased machine anyway. This will give them more time to read the essay for quality of thought and defense of the author's position (at least in theory, in practice they might spend more time on Facebook, but either way the quality and consistency of marking on basic grammar etc. should go up)

    You could even make a game of it, if a student disputes the machine marking them down on something they don't believe is incorrect, and can prove their position to the teacher - extra credit!

  6. Voland's right hand Silver badge

    Who told you that teaching is well educated middle class

    It was. Like 50 years ago. It has been anything but middle class in the UK for a few decades now, and we are starting to reap the consequences.

    1. J.G.Harston Silver badge

      Re: Who told you that teaching is well educated middle class

      When my grandparents were teachers it was a profession on a par with being a solicitor or doctor. Since then teachers' unions and the government have worked hand in glove to turn it into a job, unions so they get their lumpen "mass of labour" membership they depend on and government so they can get their lumpen "mass of labour" they can control.

  7. Adam 52 Silver badge

    Ideas

    One of the scoring criteria is "ideas". The algorithm and the human score pretty much the same. I'm not sure how a computer can mark originality, but it looks like the teachers weren't sure either.

    1. Anonymous Coward
      Anonymous Coward

      Re: Ideas

      I'm not sure how a computer can mark originality, but it looks like the teachers weren't sure either

      In a school context, you have pupils, not students. As such they've not really had much choice in the subjects they study, they are being force-fed what is regarded as a minimum standard of education, and the idea of originality of thought is not really relevant. At higher and tertiary education that changes (in theory, less so in practice), but for schools I can't see why you're expecting a fourteen year old to offer much originality.

      Possibly you were a child prodigy (I certainly wasn't), but as a rule the mass-education system doesn't cater for, nor have many children in, that category.

      Having said that, this is an article about Oz. And there's more than a few people think the Australian education system can only be explained as the witchy revenge of the "indigenous peoples" (you know of whom I speak).

      1. Phil O'Sophical Silver badge

        Re: Ideas

        I can't see why you're expecting a fourteen year old to offer much originality.

        That says a lot for your expectations of the educational system. When I was 14, originality in essays certainly was expected, and marks given for it. We certainly weren't prodigies, just average (although one of my classmates is now a well-respected SF author).

        As always, tell children that you don't expect them to go anywhere and, unsurprisingly, they won't. Give them a challenge and you may be surprised how they rise to it.

        1. Pompous Git Silver badge

          Re: Ideas

          As always, tell children that you don't expect them to go anywhere and, unsurprisingly, they won't. Give them a challenge and you may be surprised how they rise to it.

          Depends on the child. Shortly after I met Billy Thorpe, he purchased an Aston Martin DB4 sports car*. He immediately drove all the way from Melbourne to Brisbane where he went to school and did a few donuts in the school car park while giving the bird to the teachers who told him he would never amount to anything.

          * When Billy first tried to buy the car, the salesman told him to piss off because he was far too scruffy (long hair) to be able to afford such a vehicle. So Billy came back from the bank with a suitcase full of cash that he emptied all over the salesdroid's desk. "Now sell me that fucking car!"

          1. Yugguy

            Re: Ideas

            To be realistic, for every one Billy Thorpe there's several hundred McDonald's workers about whom the teachers were absolutely right.

            1. Pompous Git Silver badge

              Re: Ideas

              Which is absolutely true, though not particularly relevant. There's a whole bunch in the middle there with no great aspirations, but who nevertheless make a valuable contribution to the community, no matter how menial the work. When I was conducting job clubs, my first task was to build the self-esteem they needed to actually go out there and do the necessary to find useful work; self-esteem that teachers had gone to some trouble to eliminate. For someone whose background is two generations on the dole, any kind of a regular paying job is a great achievement.

              No matter how well-qualified they are and experts in a certain area, nothing gives teachers the right to destroy the aspirations of those they are supposedly committed to help.

          2. Vic

            Re: Ideas

            When Billy first tried to buy the car, the salesman told him to piss off because he was far too scruffy

            Lionel Richie has a similar story about going to buy a Cadillac. The salesman had no idea who he was, and was ready to tell him to get stuffed. He eventually rang the bank (after much insistence). I would love to have been a fly on the wall...

            Vic.

        2. Yugguy

          Re: Ideas

          Indeed.

          In 1981 I took the 11plus exam, at two different schools. Along with the usual IQ type questions, pattern analysis, language comprehension etc., both independent exams required me to write a short fictional story which could not even now be rated by a machine (yet).

          The point being that at age 11 they expected us to show imagination and originality.

          There is no such requirement in today's 11plus.

          1. Michael Wojcik Silver badge

            Re: Ideas

            a short fictional story which could not even now be rated by a machine (yet)

            Sure it could. Here's a sample implementation:

            int RateFictionalStory(const char *text) {return 7; /* out of 10 */ }

            Whether any extant algorithm could produce an evaluation of that story, according to some rubric, which on average came close to agreeing with, say, the evaluations of a panel of expert human judges ... well, that's a different question.

            But in fact there are many things that NLP algorithms can determine about a candidate piece of fiction, and some of them are even interesting and useful. I'm no fan of automatic analysis of writing as a way to evaluate writers; I don't see what good that sort of instrumentalist reduction does, except perhaps as a very basic skills check, and relying on it for that just demonstrates an essential failure in the educational system. But that doesn't mean we don't already have the capability to make mechanical judgements of fiction on a number of interesting metrics.

            NLP has come a long way from things like the Gunning Fog Index.
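            For reference, the Gunning Fog Index itself is trivially mechanical - a minimal sketch, using a crude vowel-group syllable counter and ignoring the formula's usual exclusions (proper nouns, compound words, familiar jargon):

```python
import re

def syllables(word):
    """Crude syllable estimate: count groups of consecutive vowels.
    Good enough for illustration, wrong on plenty of real words."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def gunning_fog(text):
    """Classic readability formula:
    0.4 * (average sentence length + percentage of 3+ syllable words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

print(round(gunning_fog("The cat sat. The dog ran."), 2))  # → 1.2
```

            A metric like this measures surface difficulty only, which is exactly why NLP had somewhere to go from it.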

    2. Michael Wojcik Silver badge

      Re: Ideas

      I'm not sure how a computer can mark originality

      First, some assumptions: The subject matter is constrained and not esoteric. The writers are not subject-matter experts or expert writers. The evaluation system has a large corpus of work on the subject. We're interested in conceptual originality, not, say, originality of style.

      Then as long as you have a decent semantic-content extraction algorithm, you just create a model of the topic based on your corpus, and then for each candidate essay compute a series of metrics in increasing dimensionality: How far does the candidate differ from the first-order semantic content (individual ideas, very roughly speaking)? How far does it differ in second-order semantic content (ideas paired by a relation)? In third-order (networks with three significant concepts)?

      Doing that even for third-order relationships, in this kind of constrained environment, is likely to give you an "originality" metric that nearly always corresponds closely with how human expert judges would rate the candidates, and is relatively difficult to game.
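      A toy version of the first- and second-order comparison above - my own illustration, not a real semantic-content extractor - uses bare content words and within-sentence word pairs as crude stand-ins for "individual ideas" and "ideas paired by a relation", then scores a candidate by the fraction of its units absent from the corpus:

```python
from itertools import combinations

def content_units(essay, order):
    """First order: individual content words; second order: unordered
    word pairs co-occurring in a sentence. A crude proxy for
    semantic-content extraction."""
    units = set()
    for sentence in essay.lower().split("."):
        words = {w for w in sentence.split() if len(w) > 3}  # crude stopword filter
        if order == 1:
            units |= words
        else:
            units |= set(combinations(sorted(words), order))
    return units

def originality(candidate, corpus, order=2):
    """Fraction of the candidate's nth-order units not found anywhere
    in the corpus: 0.0 = everything already said, 1.0 = all novel."""
    seen = set()
    for essay in corpus:
        seen |= content_units(essay, order)
    cand = content_units(candidate, order)
    if not cand:
        return 0.0
    return len(cand - seen) / len(cand)

corpus = ["dogs make loyal pets. dogs need daily walks.",
          "cats make independent pets."]
print(originality("dogs make loyal pets.", corpus))       # → 0.0 (nothing new)
print(originality("parrots mimic human speech.", corpus)) # → 1.0 (all new)
```

      A real system would replace the word sets with extracted semantic relations and would weight frequency rather than using a binary seen/unseen test, but the shape of the computation is the same.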

      There are other abstract attributes which are much more difficult to reduce to this sort of model-correspondence, however. Beauty (aesthetic appeal of the prose style, imagery, etc) is one. Logical consistency is another, as is psychological consistency. (Those two require sophisticated world models which have historically proven very tough to build.)

      In other words, it's not that hard to mechanically analyze what a sample of prose is. It's hard to analyze what it does for a human audience. But the beauty of standardized tests, for the people who want to see this kind of tech, is that they don't want to measure the latter, because it's too subjective.

  8. James 51

    What happens if someone writes a good essay but it falls outside of the standard format/criteria? Or takes an original slant that doesn't easily fall into a particular pigeon hole? Computer says no could be a real problem, particularly if you've got handwriting like mine.

    1. Anonymous Coward
      Anonymous Coward

      handwriting?! I thought this was banned around the time of Terminator 3 - Rise of the Keyboard?

    2. Pompous Git Silver badge

      What happens if someone writes a good essay but it falls outside of the standard format/criteria?

      Back in the late 60s my English teacher had the marks on a bunch of essays blotted out and re-marked them a year after his original assessment. One essay stood out for coming dead last the first time round and near the top when he re-marked it. He recently wrote his autobiography and I'm toying with the idea of marking it as if it were a class assignment :-)

    3. Glen Turner 666

      @James51 and originality

      The NAPLAN test is the worst sort of high-stakes testing. Writing an essay outside of the standard criteria will --- even with humans marking --- get you poor results, as it won't fit within the marking rubrics. These rubrics -- 'marking criteria' would be the less jargony phrase -- are designed to allow no scope for creativity. As a trivial example of creativity: if you gave the answer as a poem, that would garner no additional marks and would threaten the marks allowed for grammar and spelling.

      The NAPLAN system is gamed by schools, with weeks of "teaching to the test" being commonplace. Although the government denies it, NAPLAN preparation constrains the time available for actual teaching of material. In particular, the Year 9 NAPLAN falls exactly when algebra is being taught, and in corridor chat at a recent teaching conference there was consensus that there has been a fall in student ability in basic symbolic manipulation, because NAPLAN has vacuumed time away from that foundation skill.

      The government denies the tests are high stakes. But in reality they gate admission to all advanced programmes. Even for trades, oversubscribed programmes are often determined by NAPLAN ranking -- so why wouldn't you drag up your school's average given the opportunity?

      1. david 12 Silver badge

        "teaching to the test"

        I don't suppose you even have any idea what that meant, and why it was supposed to be bad. So, for your benefit, and for the benefit of any interested readers, I'll note:

        It doesn't mean "learning"

        Using a test to define the curricula is not itself a bad thing.

        If, as is common, a test helps teachers and students focus on the intended subject matter, that is a good thing. It doesn't magically become a bad thing just because some under-educated teacher has learned the phrase "teaching to the test" only as an insult.

      2. Tim Roberts 1

        Re: @James51 and originality

        Quite! NAPLAN is a scam, if for no other reason than that it purports to assess something that doesn't exist.

        Until we get the National Curriculum properly underway - something I support, by the way - NAPALM (deliberate misspelling) is a waste of taxpayer money.

        Few parents realise that they can pull their children out of the test, and I believe that is an option for schools as well. If I still had school aged children they definitely would not be sitting the test.

        The stress it causes some children at the school where I work is terrible.

        1. antiquam bombulum

          Re: @James51 and originality

          Schools can only withdraw a child on the grounds of disability, whether this is physical or psychological. Each instance must be justified by the principal. Students sometimes do have meltdowns about the test and schools are free to withdraw any child for whom there are good grounds for believing that they may suffer undue stress from doing the test.

          Parental withdrawals have in some states been showing an increase, so there is some evidence for either an increased awareness by parents of their right to withdraw, or perhaps of more parents having reservations about the test where their child is concerned.

          Withdrawing children is not a simple issue, as absence of NAPLAN scores can affect assessments of applications to selective-entry schools, for example. A lot of the heat about the test has come from the way the media jump on each year's results and look for evidence of systemic failure. Early on they wanted to create 'league tables' of schools, again to highlight what they see as underperforming schools. It is a pity that the press have decided to use NAPLAN results in this way, because it was intended to help schools and students by measuring against a common benchmark. Had the stakes not been so ludicrously raised as a result of the press wanting to use NAPLAN results to demonise schools, it might have been possible to make appropriate use of it - as just another view of each child's performance, to give some sense of balance amongst all the other pieces of information.

  9. mr.K

    Neverending job

    Teaching is one of the areas into which you can pour nearly endless amounts of resources and still have an effect. In addition, there is a limit to how big a group of children one adult can manage. It does not matter what technical aid you have at your disposal. A rather big part of a teacher's job in a classroom is to keep the pupils focused and maintain some sort of order. That these people are able to do that with thirty children is quite amazing, and do not think you can expand much on that number. The idea that you can hand thirty 14-year-olds some sort of tablet, leave the room, and learning will happen is rather stupid.

    So the teacher is not going anywhere. What can happen, of course, is that he or she will get to spend more time in the classroom and thus have more hours per week actually teaching. So in the long run maybe fewer teachers are needed, but not by many. But technology will free up resources. Grading in any meaningful way is one thing I think is far off; the concept of the flipped classroom, with lectures on video, is a far more promising avenue. But since you can't get rid of the teacher, these extra resources will hopefully result in more teaching and learning.

    1. david 12 Silver badge

      Re: Neverending job

      "That these people are able to do that with thirty children is quite amazing, and do not think you can expand much on that number."

      Certainly you can expand on that number. I understand that "history" is not an important subject for the average classroom teacher, but sometimes I think it would be nice if they were given just a little more context for the skills they should be required to learn.

      Classroom sizes sit at their present size because it is cheaper to qualify more teachers for small classes than fewer teachers for large classes. Extra skills are required for large classes, and there is no point in teaching those skills if you can graduate as many small-class teachers as you want anyway.

  10. G.Y.

    It is said the writing part of the US SAT can be graded from across the room ...

  11. Primus Secundus Tertius

    Language Assessment

    I am surprised at the proposition that computers can mark essays. MS Word's checking for, e.g., "singular subject needs singular verb" gets confused in any sentence with two or more clauses.

    Are these essay marking software products really that much better than Word?

    1. Michael Wojcik Silver badge

      Re: Language Assessment

      Are these essay marking software products really that much better than Word?

      They had better be, because Word isn't even in the same category. It's almost a meaningless comparison.

      Well, maybe that's not entirely fair. Word does employ some better-than-a-toy versions of some NLP algorithms, such as its summary-generating feature. But Word's usage-checking (usually incorrectly called a "grammar-checking") engine isn't among them.

      Jurafsky and Martin's Speech and Language Processing, a widely used textbook on the subject, is a good introduction to the field. Of course it doesn't represent the latest or most complex techniques - for that you'd have to read journals and conference proceedings. It's a very active field.

  12. Brock Knudsen

    Should have it also search random phrases on the internet

    Let it find any rampant plagiarism and cheating as part of its algorithm and it gets even more accurate. Half the class gets zeroes.

    1. Michael Wojcik Silver badge

      Re: Should have it also search random phrases on the internet

      That's already a common NLP application, available commercially in various forms such as turnitin.com. I'd be surprised if vendors pitching standardized-test evaluation don't include that feature as a matter of course.
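
      For illustration, the core n-gram overlap idea behind such detectors can be sketched in a few lines. Commercial pipelines like turnitin.com's are proprietary and far more elaborate (fingerprinting, web-scale indexes); this is just the textbook "shingling plus Jaccard similarity" approach:

      ```python
      # Compare two texts by the word n-grams ("shingles") they share.

      def shingles(text: str, n: int = 3) -> set:
          """Return the set of n-word shingles in the text."""
          words = text.lower().split()
          return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

      def jaccard(a: str, b: str, n: int = 3) -> float:
          """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
          sa, sb = shingles(a, n), shingles(b, n)
          if not sa or not sb:
              return 0.0
          return len(sa & sb) / len(sa | sb)

      submitted = "the quick brown fox jumps over the lazy dog at dawn"
      source    = "the quick brown fox jumps over the lazy dog at dusk"
      print(round(jaccard(submitted, source), 2))  # 0.8
      ```

      Two essays sharing long verbatim runs score close to 1.0, which is why copying with only a word or two changed barely dents the score.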

  13. Anonymous Coward
    Anonymous Coward

    Teacher hours...

    ...are a combination of rigidly regimented, high-pressure, minute-to-minute performances in front of groups of dozens to hundreds of students during tightly prescribed hours, plus the unscheduled time required to meet all the performance requirements of the job. It defies direct numerical comparison to most other types of work. If anyone believes that teachers are too eager to remind the general public that they don't do easy work with short hours, they should consider that it is simply a reaction to a segment of our population that has been bashing teachers for many years: people who, interestingly, decided not to become teachers themselves.

  14. Bill Michaelson

    The type of grading feedback that a computer can give to a student...

    ...is probably valuable enough to apply to less than 5% of student work. Beyond the most rudimentary assessment of spelling, grammar and simple structure, it is nearly useless. Additionally, if students learn to apply the formulaic style that is likely to elicit the best grades from the machine, we are probably doing more harm than good. There is something to be said for ignoring simple, rigid rules and allowing some creativity to flourish. I doubt the application in question is sophisticated enough to strike a good balance. Very likely, there is no balance at all.

    Training and education can be seen as distinct and complementary processes.

    1. Michael Wojcik Silver badge

      Re: The type of grading feedback that a computer can give to a student...

      Most standardized testing isn't designed to be valuable for the student. It's supposed to be useful for the bureaucracy.

      There is something to be said for ignoring simple, rigid rules and allowing some creativity to flourish.

      Yes, and there's too little of that even in many college-level composition classes in the US (which traditionally has favored creativity more than tertiary education systems in most other countries). When I taught writing, I advised my students against using style guides, because they tend to lead to undistinguished prose. Richard Ohmann's classic essay "Use Definite, Specific, Concrete Language" - an argument against the eponymous recommendation in Strunk & White - does a great job of illustrating the problem.

  15. Thorne
    WTF?

    ???

    OK, I can see how it works for checking spelling, grammar and punctuation, but how does it tell how well you actually answered the question?

    Things like English and General Studies can have multiple correct answers depending on your point of view and how you argue your point.

    It might make marking easier by picking up the little mistakes but not how well someone understands the course material.

  16. shovelDriver

    Software, uh, Intelligence, Optional

    "Software has emerged as the equal of humans when it comes to marking essays "

    I suspect the "testing" actually says more about the test designers - and administrators / proctors - than anything else.
