WANTED: A plan to DESTROY metadata, not just retain it

Australia's data retention proposal suggests the nation's telcos and ISPs need to store data for two years. But agencies accessing the data can seemingly keep it forever and are not, to date, required to securely store or destroy data they retrieve from the nation's putative data trove of personal information, miscalled " …

  1. veti Silver badge

    Unfortunately,

    ... "what's no longer needed" is, the police will strenuously insist, an empty set. You never know when some tidbit of information from 1993 is suddenly going to turn out to be the key to a current investigation.

    Personally, I think our only plausible salvation from police abuse of this system is police bureaucracy, which I would model as follows:

    - Every time an officer runs a query on a database, force them to complete a 'reason' field referencing a case they're working on.

    - Insist that results should be recorded only on dedicated police equipment (if that means buying everyone a brand-new top-of-the-line tablet every year, then fine, that's a small price to pay). Copying the results to any other media, including paper napkins or Post-it notes, should be a serious disciplinary offence.

    - Program the devices to automatically delete any database query results after, let's say, 24 hours, unless the officer flags them as "interesting".

    - Periodically, conduct an audit where you randomly review a handful of cases on the officer's tablet, and require them to justify the results they've flagged as "interesting".
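
The four steps above can be sketched roughly as follows. This is only an illustration, not a description of any real police system: the names `QueryResult`, `ResultStore`, `case_ref` and `audit_sample` are hypothetical, and the 24-hour window is the "let's say" figure from the list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import random

# Assumption: the "let's say, 24 hours" retention window from the list above.
RETENTION = timedelta(hours=24)


@dataclass
class QueryResult:
    officer: str
    case_ref: str          # the mandatory 'reason' field, referencing a case
    payload: str
    retrieved_at: datetime
    flagged: bool = False  # True once the officer marks it "interesting"


class ResultStore:
    """Hypothetical on-device store for database query results."""

    def __init__(self):
        self.results = []

    def record(self, officer, case_ref, payload, now):
        # Refuse to store a result that has no case reference attached.
        if not case_ref.strip():
            raise ValueError("a case reference is required for every query")
        result = QueryResult(officer, case_ref, payload, now)
        self.results.append(result)
        return result

    def purge_expired(self, now):
        """Drop unflagged results older than RETENTION; return the number removed."""
        keep = [r for r in self.results
                if r.flagged or now - r.retrieved_at < RETENTION]
        removed = len(self.results) - len(keep)
        self.results = keep
        return removed

    def audit_sample(self, officer, k, rng=random):
        """Pick up to k flagged results at random for the officer to justify."""
        flagged = [r for r in self.results if r.officer == officer and r.flagged]
        return rng.sample(flagged, min(k, len(flagged)))
```

The sketch leaves out the hard parts, namely stopping anyone copying results onto a Post-it note and making sure `purge_expired` actually runs on every device.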

    1. big_D Silver badge

      Re: Unfortunately,

      Flag them as interesting? No.

      I would say that they need to be given an official evidence number (and have to therefore obey the rules for evidence gathering) for an active case, otherwise they will be deleted.

  2. P. Lee

    Mission creep

    (What's with the "meta-data" rubbish in the headline? It's data. Actually, it's like having a CCTV camera pointed at your screen all the time, but which can't see the actual user. Only it's a bit less accurate than that.)

    As you increase the age of data you decrease its reliability and usefulness.

    IP source/destination & port numbers might be fine for today's data. What happens when it was two years ago? Who was living at the house? Now you need to keep name/address/account data as well as the traffic data. So you have another privately-held database to complement the land registry. This isn't current data, this goes back... forever?

    That kiddie porn that was accessed - was the house owner's daughter back from college for that weekend? Did she use Dad's computer? Has the disk or the laptop been changed, so you can't tell who had access? Is the Mr Sayeed who was living there seven years ago the same Mr Sayeed who is there now? How about the house-share full of college students? Can you track them down, do you even know if the people on the rental agreement are real ones who were there? Even for phone data - did Ms Sayeed use her Dad's phone to call her zealous boyfriend? Did she use it as a hotspot?

    The thing about "digital footprints" is that they might be accurate enough for ad-slingers, but they aren't anywhere close to accurate enough for legal purposes and the further back you go, the more difficult it is to see the unknowns and the easier it is to make assumptions.

    "Don't tell anyone your PIN" is all well and good, but it's hard to use a phone a lot and have a PIN no-one has seen, even if they aren't looking. If they are trying to cover their tracks by secretly using your kit, it's pretty hard to stop them. It's your phone, it's been "secured" and you were in the house. "It wasn't me but I don't remember who it could have been" will be difficult to pull off, if you look suspicious.

    We haven't even begun to look at what happens when websites move addresses.

    The data shouldn't be collected because it is often misleading. That doesn't bother advertisers and (other) criminals who don't care about failures or historical accuracy and for whom a scatter-gun approach is valid. The longer you store it, the worse the situation becomes so you need more data to try to keep it accurate, but still, as events fade from human memory, the data becomes more and more misleading.

  3. Winkypop Silver badge
    Devil

    You're all guilty of something!

    The Feds just haven't written up the charge sheet yet.

  4. Britt Johnston
    Go

    This needs extending to all applications (and includes security)

    I recently went to a bookstore, where the cashier asked if I was eligible for a loyalty discount. I had no idea, so we delved into the company database together. Of the four family members under our current address, my address was correct. They had not registered my wife (she didn't ask, presumably); one daughter has been abroad for 5 years, and the other died 5 years ago: so 50% success. But the example indicates some practicable solutions.

    - regularly mark metadata with no transaction activity as "for deletion"

    - automate this process, and use only consolidated data in evaluations

    - allow record holders and those affected web access to propose corrections for wrong and near-duplicate data (including credit ratings)

    - add a functional data complaints procedure
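
As a rough sketch of the first two points only (regularly and automatically marking inactive records "for deletion"), something like the following would do. The field names `customer_id`, `last_transaction` and `for_deletion`, and the 730-day idle threshold, are assumptions made up for illustration, not taken from any real system.

```python
from datetime import date, timedelta


def mark_stale(records, today, max_idle_days=730):
    """Flag records with no transaction activity in the last max_idle_days.

    Returns the customer_ids newly marked "for deletion". Records are only
    marked, not deleted, so a corrections or complaints step can still run
    before anything is actually removed.
    """
    cutoff = today - timedelta(days=max_idle_days)
    marked = []
    for rec in records:
        if rec["last_transaction"] < cutoff and not rec.get("for_deletion"):
            rec["for_deletion"] = True
            marked.append(rec["customer_id"])
    return marked


# Example run with made-up records.
customers = [
    {"customer_id": "A1", "last_transaction": date(2014, 11, 3)},
    {"customer_id": "B2", "last_transaction": date(2009, 6, 20)},
]
stale_ids = mark_stale(customers, today=date(2015, 1, 15))
```

Run from a scheduled job, this covers the "automate this process" point; evaluations would then read only the records not marked for deletion.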

    I don't know how many big databases could meet these criteria with a web add-on. I know that most of my employer's ones wouldn't, as I've been working on a data consolidation programme for the last 4 years.

    We also found that I had never been offered any loyalty programme, perhaps a missed opportunity. Not updating the mistakes certainly was.

    1. Pascal Monett Silver badge

      Counterpoint :

      The daughter that lives abroad now might come back at some point, either permanently or just visiting, and be happy to find that her account is still available and her brownie points still recorded. Of course, finding out that her account expired after a given period is not too much of a heartbreaker either.

      The fact that her address is wrong is irrelevant - it shouldn't even be in there anyway. Are they going to mail her something? Send goods via post? Don't think so, so it is not pertinent data.

      1. Britt Johnston
        Unhappy

        More data than necessary

        You are right, in the sense that this data needs limiting - though systems use additional data to make identification unique. Does your customer system have any John Smiths?
