Hire-car data scraper becomes Catcher in the Rye

Software used to target ads for rental cars has been successfully applied to keeping British youngsters in education or employment after leaving school. The predictive risk modelling software from IBM was turned to an unusual use by the Kent-based Medway Youth Trust after an employee had a brainwave. Aware that it is easier …


This topic is closed for new posts.
Anonymous Coward

medical records?!

presumably they get informed consent before using them for a non-medical purpose like this.

Silver badge

Thin edge of the wedge time...

Pre-Crime anyone?

Data trawling by private companies of personal data in order to identify pre-crime?

Consent signed by teenagers who often can't legally consent to other things because they're technically children?

Are we happy with this?

Personally, not. Not one tiny bit.


Bad measure of success

"of the 732 Year 11 students identified by the software in February, 648 were currently in some kind of further education or job"

That isn't a good measure of how well the system works. If the system is producing a lot of false positives (i.e. marking kids as at risk when they would do just fine on their own), then of course the stats will look good.

Surely the measure of success should be how many kids of that age in the area are classified as NEETs.
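To make the false-positive point concrete, here is a rough sketch in Python. Only the 732 and 648 figures come from the article; the two scenarios, the function name and every other number are hypothetical, chosen purely to show that very different situations produce the same headline figure.

```python
def placed_count(flagged, truly_at_risk, p_success_at_risk, p_success_otherwise=1.0):
    """Expected number of flagged students who end up in education or work.

    Hypothetical model: students flagged in error succeed with probability
    p_success_otherwise, genuinely at-risk ones with p_success_at_risk.
    """
    false_positives = flagged - truly_at_risk
    return truly_at_risk * p_success_at_risk + false_positives * p_success_otherwise


FLAGGED, PLACED = 732, 648  # the two figures quoted from the article

# Scenario A (hypothetical): every flagged student really was at risk,
# and the intervention got 648 of them placed.
scenario_a = placed_count(FLAGGED, truly_at_risk=732, p_success_at_risk=PLACED / FLAGGED)

# Scenario B (hypothetical): only 84 were really at risk, the intervention
# did nothing for them, and the other 648 would have been fine anyway.
scenario_b = placed_count(FLAGGED, truly_at_risk=84, p_success_at_risk=0.0)

print(round(scenario_a), round(scenario_b))
print(f"headline rate: {PLACED / FLAGGED:.1%}")
```

Both scenarios give the same count, so the quoted figure alone cannot tell a scheme that works from one that merely flags a lot of kids who were never in trouble.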

I'd also be interested to see what the stats are for those who refused to be part of the scheme. There's a risk that this is a self-selecting group.

Anonymous Coward

Like Autonomy but not like Autonomy?

How does this kind of "text search" thing distinguish between

"I went to see my GP to talk about my alcohol problem. He has a cold. Lots of GPs have colds."

and

"I went to see my GP to talk about my cold. He has an alcohol problem. Lots of GPs have alcohol problems."

Can it make that distinction?

Does it matter if it can't?

Incidentally, when I said "like Autonomy", I didn't mean "similar to Autonomy", I meant the variant involved in "do you like Autonomy".

But you knew that, didn't you, because you're not a computer.

Or maybe not.



Whether it can distinguish between those two cases depends on the types of algorithms it uses. A bag-of-words algorithm doesn't preserve information about word arrangement, so it couldn't; but there are many NLP (Natural Language Processing) algorithms that do operate on phrasal chunks or parse trees, and are quite capable of operating on associations between agents (like the speaker and the GP) and attributes (like colds and alcoholism).
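A quick way to see the bag-of-words limitation is the Python sketch below. The two sentences are simplified ones made up for illustration (not the exact quotes from the post above), and the word-pair comparison is only a crude stand-in for the phrasal-chunk and parse-tree methods mentioned; none of this is taken from the SPSS product.

```python
import re
from collections import Counter


def tokens(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Word counts only: all information about word order is discarded."""
    return Counter(tokens(text))


def bigrams(text):
    """Counts of adjacent word pairs, which keep a little local order."""
    words = tokens(text)
    return Counter(zip(words, words[1:]))


# Hypothetical sentences: same words, opposite meaning.
a = "I have an alcohol problem and the GP has a cold"
b = "I have a cold and the GP has an alcohol problem"

print(bag_of_words(a) == bag_of_words(b))  # identical bags of words
print(bigrams(a) == bigrams(b))            # word pairs tell them apart
```

Even word pairs only capture local order. Working out who has the cold and who has the alcohol problem takes the kind of agent/attribute analysis described above.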

I don't know this SPSS product, but the old SPSS was a good firm, and IBM's strong in NLP research, so I wouldn't be surprised if the product does operate on grammatical and semantic relationship information.

(Some NLP factions at IBM are heavily into bag-of-words algorithms, such as latent semantic analysis, which they patented back in '88. But IBM's a big firm, and SPSS are a recent acquisition.)

Anonymous Coward


>Using data to predict who's going to get into trouble

Marshmallow test

A lot cheaper and probably more accurate.

Why do we always need software to tell us things that have been known about and studied for years?


Biting the hand that feeds IT © 1998–2017