Google takes the PIS out of advertising: New algo securely analyzes shared encrypted data sets without leaking contents

Google on Wednesday released source code for a project called Private Join and Compute that allows two parties to analyze and compare shared sets of data without revealing the contents of each set to the other party. This is useful if you want to see how your private encrypted data set of, say, ad-clicks-to-sales conversion …
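Under the hood, Private Join and Compute builds on private set intersection via commutative blinding: each party raises hashed identifiers to its own secret exponent, and because the exponentiations commute, doubly-blinded values match exactly for shared items. A minimal Python sketch of that blinding idea, with toy parameters and invented data sets (not Google's actual implementation):

```python
# Toy sketch of the commutative-encryption idea behind private set
# intersection. Parameters and sets are illustrative, NOT a secure
# implementation of Google's protocol.
import hashlib
import secrets

# Work modulo a large prime. Exponentiation commutes:
# (h^a)^b == (h^b)^a mod P, which is what makes double blinding work.
P = 2**127 - 1  # a Mersenne prime; fine for a demo

def h(item: str) -> int:
    """Hash an item into the group."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % P

a_key = secrets.randbelow(P - 2) + 1   # Alice's secret exponent
b_key = secrets.randbelow(P - 2) + 1   # Bob's secret exponent

alice_set = {"alice@example.com", "carol@example.com"}
bob_set   = {"alice@example.com", "dave@example.com"}

# Each side blinds its own items, then the other side blinds them again.
alice_once  = {pow(h(x), a_key, P) for x in alice_set}
bob_once    = {pow(h(x), b_key, P) for x in bob_set}
alice_twice = {pow(y, b_key, P) for y in alice_once}  # Bob re-blinds Alice's
bob_twice   = {pow(y, a_key, P) for y in bob_once}    # Alice re-blinds Bob's

# Doubly-blinded values match only for items both sides hold, so the
# parties learn the size of the intersection without seeing raw items.
print(len(alice_twice & bob_twice))  # → 1 (the one shared address)
```

The real protocol pairs this intersection step with additively homomorphic encryption, so one party can learn an aggregate (e.g. total spend) over the matched records without learning which individual records matched.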

  1. JohnFen Silver badge

    A tiny step

    This sounds like a "better than nothing" sort of approach to the underlying problem to me.

    "The city's rider data set and the point-of-sale data set from merchants can be processed using Private Join and Compute in a way that allows the city to determine the total number of train riders who made a purchase at a local store without revealing any identifying information."

    In my view, the real problem is that the merchants can identify which individuals purchased what, and the city can identify which individuals traveled where and when. Sure, it's great to give them the chance to compare notes in a way that is a bit less invasive, but the privacy incursion has already occurred before that happens.

    1. Pascal Monett Silver badge

      "the merchants can identify which individuals purchased what, and the city can identify which individuals traveled where and when"

      Not if the individuals pay in cash - which is likely happening less and less what with all the phone payment and contactless stuff that is happening these days. But honestly, you can't blame the merchant for recording the fact that you bought something, nor can you blame the city for recording the fact that you used its transport system. What I mind very much is somebody going all Big Data on the two different data sets, and this tool (in beta and forever will be until it is dropped, as usual for Google) allows for that.

      What I would accept more readily is the city gives it a try and asks the merchants to tell it if they had a better day because of it. Nobody is messing with anybody else's data and they still get the answer.

      This is just another excuse to hook up third-party data, and I am against that by principle.

      1. JohnFen Silver badge

        "Not if the individuals pay in cash"

        True, which is why I pay with cash about 99% of the time.

        "But honestly, you can't blame the merchant for recording the fact that you bought something, nor can you blame the city for recording the fact that you used its transport system."

        "Blame" is too strong, but I can, and do, strongly disapprove of them doing that for any purpose beyond payment processing.

        "What I mind very much is somebody going all Big Data on the two different data sets"

        We are in complete agreement here.

  2. JassMan Silver badge

    Liberal splashings of snake oil

    Unless Google are planning on funnelling unique identifying info from all their customers (merchants, travel operators, congestion authorities, etc.) into their own systems, parsing the data and then finding the intersecting sets, I don't see how this can work.

    Point 1. When I buy in a glass-fronted shop, I never give authority to email me (not even electronic receipts), nor to share any data I might have had to fill in for the guarantee.

    Point 2. I always use one card for travelling (contactless) and a different card for high-value purchases. I also have a third, very-low-balance card for use in places where skimming is more likely, such as petrol stations and restaurants.

    1. Wellyboot Silver badge

      Re: Liberal splashings of snake oil

      Selling snake oil - The foundation of many interweb business opportunities

      +1 for point 2. - A sensible fraud mitigation technique that many more people should use.

    2. Anonymous Coward

      Re: Liberal splashings of snake oil

      Point 1: I can generate aliases on the fly for my email system, so any email address I give out is in supplier+code@specifically_set_up_domain.com format. The result is that any mass marketing leak instantly and provably points at the perpetrator.

      Point 2: I NEVER use a contactless card, nor does mine have a mag swipe that is worth cloning. With Revolut, you can disable both, and I worked on research that allowed us to read a contactless card from some 10m distance (in the lab, which means it's not impossible to hit 3-4m in the real world). I have one throwaway for small, untrusted purchases, a major one for big/trusted real world purchases, a virtual one for electronic subscriptions, and I use virtual cards for anything else on the principle that they won't be a problem when they leak (they change every purchase).

      Point 3: I aggressively filter out trackers and presence beacons in my web browsing, and that includes stripping tracker features from incoming URLs and some 127.0.0.1 redirects in a hosts file (but that only works properly because I never surf on my phone).

      Point 4: I never, ever give my mobile phone number out. Everyone including LinkedIn is trying to get it, but Facebook will only ever get it because it acquired it via WhatsApp, which is something I can't do much about just now. I am busy with some idea on proxying, but it's too early to tell if that will work.

      Oh, and I have stripped all card information from PayPal. Anything that involves PayPal as processor gets a virtual card by default. That still lets them trace the purchases I make that way, but at least a breach of my account or a mass data theft à la Equifax won't matter much.

      Yes, it's a lot of work to actually enjoy something that is supposed to be a default Human Right, isn't it?
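The per-supplier alias trick from Point 1 can be sketched like this; the salt, domain, and supplier names below are made-up placeholders, not the commenter's actual setup:

```python
# Sketch of leak-attributing email aliases: each supplier gets a unique,
# stable address, so any spam sent to an alias identifies who leaked it.
# The salt and domain are illustrative assumptions.
import hashlib

def alias_for(supplier: str, salt: str = "my-private-salt",
              domain: str = "specifically-set-up-domain.example") -> str:
    # Derive a short, deterministic code from the supplier name plus a
    # private salt, so outsiders can't predict other suppliers' aliases.
    code = hashlib.sha256(f"{salt}:{supplier.lower()}".encode()).hexdigest()[:6]
    return f"{supplier.lower()}+{code}@{domain}"

# Same supplier always yields the same alias; different suppliers differ.
print(alias_for("linkedin"))  # linkedin+<6 hex chars>@...
```

Any mail server that supports plus-addressing (or a catch-all domain) can receive these without per-alias configuration.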

  3. Wellyboot Silver badge

    Given that most companies want the analytics and would prefer not to be accused of snooping, I think this is a more likely scenario.

    >>>the greatest benefit of this technology accrues to companies that would have otherwise foregone outsourced analytics altogether for fear of privacy problems.<<<

  4. Anonymous Coward

    Oh. Whoop. De. Doo.

    Another way to get:

    Yet more ads for stuff I ALREADY HAVE.

    Yet more ads for suppliers I ALREADY KNOW ABOUT

    Yet more ads for stuff I have already looked at and DONT WANT

    Yet more ads for what is "Trending" (as if I should give a rat's arse.)

    Whoop indeed. De. Doo.

  5. NonSSL-Login

    Algorithmic Hash then?

    Sounds like an advanced kind of hash with lots of potential flaws which will probably be used to try and get around GDPR data sharing.

    Recently I had a form to fill in which stated the data would be anonymised and shared with governments. Reading further through government white papers and official PDFs, it seems the data is actually pseudonymous, specifically so they can run the same algorithm/hash on the same sample points (i.e. name, DoB) in other 'anonymised' data sets, get the same unique identifier, and match the different 'anonymous' data sets up to the same people/family. Thus they create one big combined dataset of a person/family unit from different data sources that claimed on paper to be completely anonymous.
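The linkage trick described above is easy to demonstrate: hashing the same stable fields deterministically yields the same token in every "anonymised" dataset, so the datasets can be rejoined per person. Field names and data below are invented for illustration:

```python
# Minimal sketch of pseudonymisation-by-hashing and why it is linkable:
# the same (name, DoB) always hashes to the same token, so two
# independently "anonymised" datasets can be joined on that token.
import hashlib

def pseudo_id(name: str, dob: str) -> str:
    return hashlib.sha256(f"{name.lower()}|{dob}".encode()).hexdigest()[:16]

# Two unrelated datasets, each "anonymised" with the same scheme.
health_data = {pseudo_id("Jane Doe", "1980-01-02"): "asthma"}
travel_data = {pseudo_id("Jane Doe", "1980-01-02"): "daily rail commuter"}

# Anyone holding both datasets can rejoin them on the shared token,
# rebuilding a combined profile of a supposedly anonymous person.
for token, condition in health_data.items():
    if token in travel_data:
        print(token, "->", condition, "+", travel_data[token])
```

Genuine anonymisation would require per-dataset salts (or aggregation) so tokens cannot be matched across datasets.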

    So I tend to look deeper now at anything that claims to make my data anonymous via some hashing or similar tricks. In the above case I opted out of the data share, and I now do so every chance I get.

    Just remember, they are only doing this so they can share and monetise data and it's likely your data.

    Never trust those wanting your data and saying they will keep it secure. It's like saying your data in storage is safe because the site uses SSL and has a padlock icon from an AV company...

    1. Anonymous Coward

      Re: Algorithmic Hash then?

      You have hit the nail on the head. Whilst this is an interesting technology, the motivation for it is to attempt to circumvent data sharing laws. It needs to be recognised as such and slapped down accordingly before it becomes an acceptable boundary case.

  6. Milton Silver badge

    Interested?

    For anyone intrigued by this concept, poke around using "homomorphic encryption" as a jumping-off phrase. As techies you'll find it leads the curious mind to some interesting places.

    The use cases in this article seem a little stretched (clouded not least by one's inability to trust the hilariously monickered Don't Be Evil with any data, ever) and one wonders what's wrong with some sensible anonymisation algorithms, but there are certainly situations where encrypted-but-queryable data sets would be of tremendous value.

    Although fully homomorphic encryption (FHE) is of debatable practicality right now (performance is dismal, and query types and operations are severely restricted), things may change when quantum computing matures. I have an inkling that FHE might be one of those things that will, as it were, sprint from the pack and surprise us when QC's limitations (error rates, scalability, provability) are solved.
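For a feel of what "homomorphic" means, here is the classic toy example: unpadded (textbook) RSA is multiplicatively homomorphic, so multiplying two ciphertexts multiplies the hidden plaintexts. Deliberately tiny, insecure parameters; this illustrates the property only, and is neither FHE nor usable crypto:

```python
# Textbook RSA with toy parameters, showing its multiplicative
# homomorphism: dec(enc(a) * enc(b)) == a * b. Insecure on purpose.
p, q = 61, 53
n, e = p * q, 17                # n = 3233, public exponent e
d = pow(e, -1, 780)            # 780 = lcm(p-1, q-1); d is the private key

enc = lambda m: pow(m, e, n)   # encrypt: m^e mod n
dec = lambda c: pow(c, d, n)   # decrypt: c^d mod n

c1, c2 = enc(6), enc(7)
# A third party can multiply the ciphertexts without the private key;
# the product decrypts to the product of the plaintexts.
assert dec(c1 * c2 % n) == 42  # 6 * 7, computed "inside" the encryption
```

FHE extends this idea to support both addition and multiplication on ciphertexts, which is what makes arbitrary encrypted computation possible (and expensive).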

    And given that Google are deeply involved in quantum processing ... watch this space.


Biting the hand that feeds IT © 1998–2019