Use of big data can lead to 'harmful exclusion, discrimination' – FTC

Businesses should take steps to avoid causing "harmful exclusion or discrimination" when using aggregated consumer data that they have analysed, a US regulator has said. In a new report the Federal Trade Commission called on companies to check how representative their data sets are, whether their "data model" takes account of …

  1. BebopWeBop Silver badge
    Unhappy

    The problem is that, as we all know, many companies will simply ignore corner cases - it is too expensive - and many of the people making the decisions trust the correlations and confuse them with causes, because frankly they don't understand how the analysis was arrived at.

    A simple and topical example is flood insurance. Insurance companies use postcode data to assess flood risk. That works in, say, the middle of Carlisle, where postcodes refer to a very small area. Forty-five miles away, a postcode covers a much larger area. A colleague of mine and a number of his neighbours found that they were either refused insurance or hit with massive premium hikes by a number of companies, because their area includes a stream that floods - and having seen it this winter, a 2 metre rise in water levels is quite impressive.

    All well and good, but there is only one house down there. All the other houses in that postcode are at least 50 metres above the level of the water. It seems those companies have decided to use what is, in practice, a very crude and inaccurate risk assessment simply because it works elsewhere. They won't change, because their business processes would make it too expensive to do so.
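    The mismatch can be sketched in a few lines. This is a purely hypothetical illustration - the postcode, elevations, and flood level are made up - showing how a postcode-level model flags every property as high risk when only one actually sits by the stream:

    ```python
    # Hypothetical data: one postcode, three properties, only one near the stream.
    properties = [
        {"id": "A", "elevation_m": 2},   # beside the stream
        {"id": "B", "elevation_m": 55},  # 50+ metres above the water
        {"id": "C", "elevation_m": 60},
    ]

    FLOOD_LEVEL_M = 3  # assumed worst-case rise in water level

    def postcode_risk(props):
        """Crude model: if any property in the postcode floods, all are 'high'."""
        flooded = any(p["elevation_m"] < FLOOD_LEVEL_M for p in props)
        return {p["id"]: "high" if flooded else "low" for p in props}

    def property_risk(props):
        """Finer model: assess each property against the flood level individually."""
        return {p["id"]: "high" if p["elevation_m"] < FLOOD_LEVEL_M else "low"
                for p in props}

    print(postcode_risk(properties))  # all three flagged high
    print(property_risk(properties))  # only 'A' flagged high
    ```

    The crude model is cheaper to run, which is exactly the trade-off the insurers appear to be making.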

    A trivial example, but far more subtle decisions are being made on the basis of complex data analysis, and I fear that no matter what organisations such as the FTC 'mandate', they will carry on discriminating through ignorance.

    1. LucreLout Silver badge

      Insurance companies make use of postcode data to assess flood risk

      Well, yes, because calculating the likelihood of a postcode experiencing flooding is trivial and cheap compared to calculating the specific risk to each individual property or part thereof.

      It will work in, say, the middle of Carlisle where postcodes uniquely refer to a very small area. 45 miles away, the postcode refers to a much larger area.

      Well, yes, it will. However, the vast majority of people live in towns and cities with small postcodes, or in the country at an elevation which never floods. Insurance being a competitive industry, it's hardly surprising that they use the cheaper model for calculating their risk.

      Since Tesco opened nearby, the number of car accidents in my postcode has soared, as have insurance prices. The irony being that neither the vehicle hit nor the vehicle hitting it is ever a local one - it's empty-headed shoppers competing to be the first to buy TV dinners and pinot who are banging into each other, not those of us who live here. It is what it is.

      1. Anonymous Coward
        Anonymous Coward

        I'm a little surprised elevation data hasn't been added to these insurance models - it'd be pretty trivial.

        1. LucreLout Silver badge

          @AC

          Elevation is important, but you can be plenty elevated when a river bursts its banks and still end up wet.

  2. Thesheep

    Using people can lead to 'harmful exclusion, discrimination'

    At first I wondered why the 'big' in this report. After all, they talk first about aggregated data...

    Then I realised that what they really mean is that people are very good at ignoring the things they are taught about - especially being aware of, and avoiding, biases. There are whole bookshelves on this stuff. There are even pop-science books on it (Thinking, Fast and Slow springs to mind).

    This isn't a problem with data, 'big' or otherwise, it's a problem with dumb people doing things in a dumb way, drawing dumb conclusions, and taking dumb actions. And then forgetting to monitor the outcomes. It applies to pretty much every form of human action since the first inference was drawn.

    1. Mark 85 Silver badge

      Re: Using people can lead to 'harmful exclusion, discrimination'

      This isn't a problem with data, 'big' or otherwise, it's a problem with dumb people doing things in a dumb way, drawing dumb conclusions, and taking dumb actions. And then forgetting to monitor the outcomes.

      Ah.. politicians and most of the inhabitants of the C-Suites then.

  3. Stephen 1

    Weapons Of Math Destruction

    The use of unaccountable algorithms is very disturbing indeed. I would recommend watching Cathy O'Neil's presentation on "Weapons Of Math Destruction".

    Boing Boing has a link and summary here:

    http://boingboing.net/2016/01/06/weapons-of-math-destruction-h.html

  4. LDS Silver badge

    "employees who live closer to their jobs stay at these jobs longer"

    Did they need big data to understand it?

    1. Richard Jones 1
      Flame

      Re: "employees who live closer to their jobs stay at these jobs longer"

      Yes, because they were too stupid, or not allowed to think by their 'let's-all-be-stupid' employer, and have to use a poorly programmed computer. It started with banks; now it is everywhere.

      1. Chika
        Flame

        Re: "employees who live closer to their jobs stay at these jobs longer"

        I'm not even convinced that stupidity is the only answer. It's cost-cutting too.

        Each user of this kind of data model does so because they believe it represents the panacea of decision making. They give it the name "big data" because the actual name, generalisation, has far too many negative connotations to allow it to be used - then they wonder why the conclusions they arrive at don't always work. As is often said, all generalisations have exceptions.

        Actually, the way in which big data is used is worse than ordinary generalisations since it can often include layers of generalisations, one of which was mentioned in the article. But it's cheaper and easier to deal with data than it is to deal with people.

  5. AceRimmer
    Headmaster

    "Companies should remember that while big data is very good at detecting correlations, it does not explain which correlations are meaningful,"

    Big data detects nothing; it's just data.

    Statistical analysis of (big) data does the detection.

  6. Ken Hagan Gold badge

    GIGO ?

    I've always understood this to include "bias in, bias out" as a particular case. Students in the hard sciences are taught about systematic error, which is similar. Do the business studies crowd have nothing similar?
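    "Bias in, bias out" is easy to demonstrate with toy numbers (all hypothetical): if the collection process systematically over-samples one group, the resulting estimate is wrong, and no amount of extra data fixes it - the error is systematic, not random.

    ```python
    # Hypothetical population: two income groups, a genuine 50/50 split.
    population = [30_000] * 5_000 + [60_000] * 5_000
    true_mean = sum(population) / len(population)            # 45,000

    # Biased collection: the higher-income group is three times as likely
    # to end up in the data set (same total sample size).
    biased_sample = [30_000] * 2_500 + [60_000] * 7_500
    biased_mean = sum(biased_sample) / len(biased_sample)    # 52,500

    # The gap is systematic error: collecting ten times as much data the
    # same way would reproduce the same 7,500 offset.
    print(true_mean, biased_mean)
    ```

    That is exactly the check the FTC report asks for: how representative is the data set in the first place?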

  7. Anonymous Coward
    Anonymous Coward

    It appears that big data is the new "management consultancy"; it costs so much that people feel obliged to act on what it tells them.

    1. allthecoolshortnamesweretaken Silver badge

      Hey, consultards need to eat, too...

  8. Anonymous Coward
    Anonymous Coward

    Positive discrimination for data.
