Chap asks Facebook for data on his web activity, Facebook says no, now watchdog's on the case

Facebook's refusal to hand over the data it holds on users' web activity is to be probed by the Irish Data Protection Commissioner after a complaint from a UK-based academic. Under the General Data Protection Regulation, which came into force on 25 May, people can demand that organisations hand over the data they hold on them …

  1. Doctor Syntax Silver badge

    The report just says he's asking for browsing activity off Facebook. It's not clear whether he also has a FB account or whether he's a non-account holding innocent bystander.

    1. big_D Silver badge

      That doesn't make any difference in the eyes of the law. They are collecting the information, so they have to hand it over within a reasonable time. They also have a legal requirement to hold the minimum amount of data on a person in order to provide their service.

      If they have so much data, that they themselves can't access it all in a timely manner, then it breaks that part of GDPR as well.

      1. Aqua Marina Silver badge

        Can we please have the same for Google and Apple? Pretty please!

        1. bombastic bob Silver badge
          Devil

          Can we please ALSO get the same with MICROSOFT (in addition to Google, Apple, others) ?

          You know, that 'Microsoft Logon' that they strongarm* you into using, JUST so you can access your _OWN_ Windows 10 PC? What info is being stored along with THAT??? Hmmm???

          Yeah I think the U.S.A. needs a GDPR, too. And _ONLY_ 'opt-in' authorization for data collection. And the ability to edit/erase the data. And so on. It can't be THAT hard for FB and the others to write a simple generic SQL query web interface to do this. It's just they don't wanna unzip their pants and let people see what's REALLY behind the curtain...

          * last time I had to build a Win-10-nic VM with a very recent downloaded ISO image from MSDN, I ran into the same 'how do I prevent having to use a Micro-$#!+ login' problem... as I'd forgotten the 2-step hoop jump you have to do to make this work. Eventually I remembered, but it _IS_ strong-arming when you force people to do this JUST to avoid your tracking/slurping/cloudy/online logon for a LOCALLY INSTALLED COMPUTER. In other words, they *NEVER* *FIXED* *THIS*.

          1. TheVogon Silver badge

            "You know, that 'Microsoft Logon' that they strongarm* you into using, "

            My Windows 10 seems to work just fine with a local user account and password only. Using a Microsoft ID / the Windows Store is optional.

            1. Tony Paulazzo

              My Windows 10 seems to work just fine with a local user account

              But to be fair there's a huge 'sign-in with MS account or create one' and a tiny line near the bottom of the screen with 'local account login' that in no way implies MS is trying to hide the option.

              As for the GDPR and tracking details, someone needs to create an 'easy to ask for all my data' website for all Europeans to really fuck MS, Google, Facebook et al.

              It would be really funny.

      2. ratfox Silver badge
        Paris Hilton

        If he doesn't have an account, it might be difficult for Facebook to identify his data, though. They might well have a complete history of what AnonymousUser142857 has done on the web, but I'm not sure how they could connect that with Joe Bloggs from Ipswich. Google certainly also creates a profile of users that have no account, but the My Activity website only works if you are logged in to a Google account.

        Which leads to the depressing idea that you have to create an account with them so that they can tell you exactly what they know about you. On the other hand, depending on what the law says, that might mean that they are not allowed to create a profile of you if you don't have an account with them. Hmmmmm...?

        1. Gotno iShit Wantno iShit

          They might well have a complete history of what AnonymousUser142857 has done on the web, but I'm not sure how they could connect that with Joe Bloggs from Ipswich.

          Dear Faecbook, please provide a copy of and then delete all data held by yourselves regarding the owner of phone IMEI aa-bbbbbb-cccccc-ee. Regards, Joe from Ipswich.

        2. JohnFen Silver badge

          "If he doesn't have an account, it might be difficult for Facebook to identify his data, though"

          If that's actually true, it's a very strong argument that Facebook (and Google, etc.) needs to stop collecting and storing this data entirely. Or, at a minimum, stop storing it.

      3. David McCarthy

        I have this notion that GDPR says you have to tell people what data you're collecting and why. This would apply whether or not you have an account. This clearly isn't happening.

        Also, only the necessary data should be collected (by default) to enable the service to be provided (to the individual). If they don't have an account, or are not logged in ... then Facebook isn't providing a service ... and so shouldn't be collecting ANY data.

        1. Mike 33

          You can argue that on a website with a 'like button' or 'login with Facebook', that Facebook is providing data processing and the site itself is the data controller...

          Which potentially also means you can request the information from the website owner itself...

          1. Anonymous Coward

            > You can argue that on a website with a 'like button' or 'login with Facebook', that Facebook is providing data processing and the site itself is the data controller...

            Nope. That "like" button being on some site or another is irrelevant: that only determines how you end up accessing Farcebook's systems and is no different than, say, landing on the Farcebook site from a search engine results page vs typing the hostname on the address bar. Either way, your browser will be opening a connection to Farcebook and they will be collecting or storing information directly from/to your systems. That information may (will) include the site that you had visited, but that's it.

            TL;DR: the website hosting the like button is neither a data controller nor a data processor in relation to data held by Farcebook about you.

            For OAUTH2 logins that's slightly different, but that's not important right now.

        2. GeekyDee

          Dear user,

          We collect everything because we can and no one can stop us as we have all the money

          Regards, Mark

      4. Remy Redert

        It does matter if they can't prove informed consent was present when they gathered this data. Of course that's separate to his request for all the data they have on him.

        Get all the information they have, then sue them over the data they hold, because they never had your consent.

      5. StewartWhite

        Re. "If they have so much data, that they themselves can't access it all in a timely manner, then it breaks that part of GDPR as well.", it ain't necessarily so. In the UK the ICO has already ruled that Queen Mary University London (QMUL) did not have to comply with a request to provide the raw data used to produce the results of the discredited PACE trial. This was under the Freedom of Information Act rather than GDPR but it's quite conceivable that the precedent will remain in the UK. The spurious argument that the ICO accepted was that the database is very large (it isn't), that the process involved in extracting the information would be too onerous (it wouldn't) and that they would be "creating the data" (they wouldn't). Facebook might be able to use this kind of perverse logic with the additional point that their dataset is far larger and more complex than QMUL's.

        "QMUL explained to the Commissioner that the relevant raw data is held in a very large database of 3000 variables with 640 rows. It went on to explain the steps required in order to provide the information to the complainant. The Commissioner considered the explanation of the steps required to locate, retrieve and extract the information. He determined that the application of section 12 was not appropriate in the circumstances of the case. QMUL was, in fact, stating that it would be ‘creating’ the information and the information was therefore ‘not held’."

        https://ico.org.uk/media/action-weve-taken/decision-notices/2015/1043578/fs_50557646.pdf (section 12)

        1. Anonymous Coward

          @StewartWhite

          The argument that they have so much data and it's not indexed properly in Hive is in fact false.

          FB does in fact have the ability to index and access the data quickly. However, they don't want to do it.

          Yes, I am posting anon because I am both familiar with their environment and a 'Big Data' expert. They could easily afford the cost of adding indexing as well as converting the data that they store in Hive. Actually, the truth is that Hive is a SQL-like language used to query data stored in files on HDFS, which could be raw log files but are really parsed files stored in Parquet. They could use HBase (which they have) to hold secondary indexes, and then join them against the primary or base table. (The underlying storage mechanism is abstracted, so you can have one table in Parquet, another in Hive's native ^A, ^B format, or comma-delimited, or even HBase.)
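
          To make the secondary-index idea concrete, here is a minimal sketch in plain Python rather than HBase or Hive. The table contents, the `browser_id` field and all function names are hypothetical and say nothing about how FB actually lays out its data; the point is only that an index turns "read everything" into "read the rows the index points at".

```python
from collections import defaultdict

# Hypothetical parsed log rows keyed by row id, standing in for a huge base table.
BASE_TABLE = {
    0: {"browser_id": "anon-142857", "url": "https://example.com/cars"},
    1: {"browser_id": "anon-000001", "url": "https://example.org/news"},
    2: {"browser_id": "anon-142857", "url": "https://example.net/shop"},
}


def full_scan(table, browser_id):
    """Read every row and keep the matches, as an unindexed query must."""
    return [row for row in table.values() if row["browser_id"] == browser_id]


def build_secondary_index(table):
    """Map each browser_id to the ids of the rows that mention it."""
    index = defaultdict(list)
    for row_id, row in table.items():
        index[row["browser_id"]].append(row_id)
    return index


def indexed_lookup(table, index, browser_id):
    """Fetch only the rows the index points at (the join against the base table)."""
    return [table[row_id] for row_id in index.get(browser_id, [])]


index = build_secondary_index(BASE_TABLE)
rows = indexed_lookup(BASE_TABLE, index, "anon-142857")
print([row["url"] for row in rows])
```

          Both functions return the same rows; the difference is that the scan touches the whole table on every request, while the index is built once and then consulted per request.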

          The whole section of the article on the 'Hive mind' is pure spin by FB and it falls flat. While it's true that they don't have the capability to do these queries in a timely fashion, it's more due to a lack of CPU than a lack of technology or money. They could, and probably already have, expanded into a compute/storage model, and using Kubernetes they can spin up compute clusters that run their 'hive' or SparkSQL queries against the data.

          So I call BS on FB.

          1. Mark 85 Silver badge

            Re: @StewartWhite RE: Anonymous Coward

            If what FB says were true, then why would they store all that data in their "Hive"? Hives cost money, so obviously, it's BS. FB has become the new Big Brother.

            1. Anonymous Coward

              @Mark 85 Re: @StewartWhite RE: Anonymous Coward

              Hive is a query language that gets translated into a map/reduce job.

              The data is translated from a log message into a semi-structured state and stored in potentially a couple of different storage formats (CSV, Parquet, HBase, ORC, etc.) on HDFS. The key is that these files are stored in a key/value manner with further partitioning within a hierarchy of directories.
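
              A rough illustration of what partitioning within a hierarchy of directories buys you. Hive-style partitions show up as `key=value` directory names; everything else here (the paths, the `event_date` key, the function names) is made up for the example and is not FB's real layout.

```python
# Hypothetical file listing; Hive-style partitions appear as key=value directories.
FILES = [
    "/warehouse/pixel_events/event_date=2018-05-24/part-00000.parquet",
    "/warehouse/pixel_events/event_date=2018-05-25/part-00000.parquet",
    "/warehouse/pixel_events/event_date=2018-05-25/part-00001.parquet",
]


def partition_values(path):
    """Extract key=value pairs from the directory components of a path."""
    values = {}
    for component in path.split("/")[:-1]:  # skip the file name itself
        if "=" in component:
            key, _, value = component.partition("=")
            values[key] = value
    return values


def prune(files, **wanted):
    """Keep only files whose partition directories match every wanted value."""
    return [
        path
        for path in files
        if all(partition_values(path).get(k) == v for k, v in wanted.items())
    ]


selected = prune(FILES, event_date="2018-05-25")
print(len(selected), "of", len(FILES), "files need reading")
```

              A filter on the partition key skips whole directories. A filter on something that is not a partition key, such as one person's browser identifier in this toy layout, gets no such help and has to read every file, which is where extra indexing or extra compute comes in.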

              It's not always the most efficient, and there are other tools that can be used in combination with Hive to improve performance, like Spark, Presto, Drill, HBase, Tez, etc.

              So while what they said is partially true, the key is that they could spend money and could manage their data centers better. But then again FB has this inbred desire to use FOSS only with few exceptions.

              1. whitepines Silver badge

                Re: @Mark 85 @StewartWhite RE: Anonymous Coward

                Don't see where FOSS is the problem here. Facebook could either buy proprietary tools or pay developers to write new FOSS ones. The problem seems to be getting them to pry open their wallet to follow the law. Hopefully that will be dealt with in an example-setting way (fines, etc.)

          2. JohnFen Silver badge

            Re: @StewartWhite

            "So I call BS on FB."

            At this point, I pretty much assume that anything Facebook says is BS.

          3. Jack of Shadows Silver badge

            Re: @StewartWhite

            Bingo! Saved me a ton of typing. I've been working with databases since 1975, can't say I've missed many since especially when I like to collect and learn about the new types. Given that Hive is explicitly scale-out by design, the only thing that'd be holding them back from such queries would be a lack of infrastructure. Well, guess what Facebook, you need to buy some more to meet the letter of the GDPR. I almost have sympathy for them. Almost.

            Filing this one away for when/if we get the new law here in California up and running. I'm not exactly expecting that to happen without a fight.

          4. Byham

            Re: @StewartWhite

            As Facebook uses the information generated on each user to provide close to immediate tailoring of the 'FB Experience' with ads, news, and other information suited to the identity logged in, I agree it is obvious BS that they cannot access the information: they act on it at every single login and possibly on every single transaction. They may not be able to easily go back to each of the interactions that were used to build the information/picture of the identity, but they and all their advertisers have immediate real-time access to the information about each and every identity logging in. Otherwise it does not support the purpose for which it was collected and therefore does not meet the requirements of GDPR.

        2. Anonymous Coward

          > In the UK the ICO

          ...are worse than toothless: they have a tendency to side with companies against consumers. To dissimulate a bit, they do the occasional slapping of fine on either some nobody or another governmental organisation (so that's consumers footing the bill again), but they're never going to touch anyone of any size.

        3. Alan Brown Silver badge

          " In the UK the ICO has already ruled that Queen Mary University London (QMUL) did not have to comply with a request to provide the raw data used to produce the results of the discredited PACE trial."

          The ICO has issued a number of flawed decisions, but an ICO decision is far from the end of the line - it's not even precedent setting on the NEXT decision they make (never mind they're not a court of law).

          There are ICO appeal procedures and then the law courts - and the courts have been less than kind to the ICO's strange interpretations of the law in the past, with particular criticism given to the way they side with orgs declining to disclose data. (Judges understand the principles of FOI far better than the ICO - remember that most ICO employees handling cases are underpaid, underqualified, overworked civil servants and that's exactly the way Whitehall wants it)

          1. Ben Tasker Silver badge

            The ICO has issued a number of flawed decisions, but an ICO decision is far from the end of the line - it's not even precedent setting on the NEXT decision they make (never mind they're not a court of law).

            Not to mention the complaint has gone to the Irish Data Commissioner, so the ICO are entirely irrelevant here anyway.

    2. Anonymous Coward

      What's to stop a criminal pretending to be someone else and asking for all the data Facebook hold on them?

      Seems like identity theft could get a huge boost if it becomes too easy to access this info.

      1. big_D Silver badge

        Given that you have to be able to identify yourself in the first place, you would have already had to have stolen their identity.

      2. JohnFen Silver badge

        "Seems like identity theft could get a huge boost if it becomes too easy to access this info."

        Yet another excellent reason why these companies need to stop storing all this data.

    3. Ian Michael Gumby Silver badge
      Boffin

      @Doctor Syntax

      The report just says he's asking for browsing activity off Facebook. It's not clear whether he also has a FB account or whether he's a non-account holding innocent bystander.

      It doesn't matter.

      Under GDPR, any data collected by FB on a non-account holder would be a violation since the non-account holder has no way to 'opt-in' to their capture and retention of his data. Where they may claim you approved is that the site you visited had a banner that said they use cookies, therefore if you visit the site, you agree to their capturing data on you, and they imply that this carries over to their third parties like FB.

      (And that's questionable at the start) or if they use .js from FB which has nothing to do with the cookies.

      So the UK has every right to go after them.

      If he is an account holder, then under the GDPR they have to detail what details they collect and how they use it so that the user/punter/shill has the option to 'opt-in' giving them an informed consent to track him.

      That's not so clear therefore it too is against the law.

      Either way you cook it... it's still tainted meat and you will get sick. ;-)

      1. tallenglish

        Re: @Doctor Syntax

        I think the way they may weasel out of it is that the site that has FB links has to have shown the GDPR 'accept all' request, and part of that is allowing 3rd parties to use the data from that site, and in this case FB is the 3rd party.

        So ironically FB will blame everyone else, in the "you must have clicked accept all on xyz.com website to view it and part of that acceptance is to allow FB to track you while using that site".

        What I am not sure about is the legality of how they merge that site tracking into one unique ID, or whether they do at all. If they do, then they must have some identifiable info about the user and could easily grant the request (albeit with the data taking a long time to collate). If not, then it is fully anonymised, which only allows for custom advertising on that website based on what type of users are visiting it, and that wouldn't require GDPR data to be given to any one user, I think.

        Question is what are they slurping: are they inferring things like sex and other preferences from the type of site visited (like men mainly go to car websites), and sexuality based on what you search for in pornhub, etc?

        1. Anonymous Coward

          Re: @Doctor Syntax

          > I think the way they may weasel out of it is the site that has FB links has to have shown the GDPR accept all request and part of that is allowing 3rd parties

          That would be extremely dodgy from a compliance standpoint. But then again, lots of people are just playing dumb anyway.

  2. Anonymous Coward

    'It's not clear whether he also has a FB account or whether he's a non-account'

    "Info collected on folk outside the social network 'not readily accessible'" ... "Michael Veale, who works at University College London, submitted a SAR to the social media giant on 25 May asking it to hand over the information it has collected on his browsing behaviour and activities off Facebook."

    1. Doctor Syntax Silver badge

      Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

      Yes. All it says is off Facebook.

      Let's back up two paragraphs before what we both wrote about: "The crux of the issue is the data the firm slurps up via its Facebook Pixel, the widely used tracking code on multiple websites"

      Note that these multiple websites extend far beyond those Facebook runs.

      Now look at the next paragraph; it makes the point that the tools Facebook provides are "to access the data collected on the platform [i.e. Facebook's own platform] – for instance, ad preferences" and not those collected off it, i.e. those collected by the means described in the preceding paragraph.

      And that's what "off Facebook" means. It gives no indication as to whether he has an account with them or not because he's not asking about data collected on the platform.

      1. djack

        Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

        Why does it matter whether he's a member or not? It's personal data that they have collected about him.

        From a technical aspect, if he's a member it should make it easier for them to extract and collate the relevant data. If he's not a member, they have no justification or permission whatsoever for collecting and processing that data in the first place.

        1. Doctor Syntax Silver badge

          Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

          "Why does it matter whether he's a member or not? It's personal data that they have collected about him.

          From a technical aspect, if he's a member it should make it easier for them to extract and collate the relevant data."

          I'm just thinking in terms of how this can play out. If he doesn't have an account FB can present a defence along the lines of "we don't know who he is". If he has an account this defence is less likely to succeed and if the case then exposes the amount of data collected off-platform it makes it less easy for them to defend against a subsequent claim by a non-account holder.

          1. djack

            Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

            Let's take the thought exercise a bit further ..

            They have a bunch of data that is classed as personal. You may even go so far as being able to deanonymise some of it making it potentially identifiable (don't ask me how, but the deanon crowd can be scarily inventive when they get a hold of big datasets).

            For any particular data element, they can say that they don't know who it is about. Therefore there is no way that they can evidence any informed consent for the collection and processing of said data. The individual is not (necessarily) a user of Facebook so there is no way that the data is collected as an essential part of any service provided to the individual. Therefore, as far as I can see, they would have no legal basis to keep hold of the data and should therefore delete it.

            That will probably save them megabucks in storage costs ;)

            1. big_D Silver badge

              Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

              djack, what you say is true. But does it outweigh the money made from selling targeted adverts?

              Until big fines start getting handed out, the cost of storing the data will always be minuscule compared to its possible use, whether that use is legal or not.

              1. djack

                Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

                But that's the thing with the GDPR, the potential fines are quite large.

                If someone has the will (and I'll admit it's probably quite a big if) then this sort of case could cause some massive changes of behaviour in the tracking and advertising industry. Probably just for European end users though. It will either cost a shed load in fines or a shed load in lawyers' fees (and then hopefully a shed load in fines on top! - Hey, I can but dream)

                1. Charles 9 Silver badge

                  Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

                  "But that's the thing with the GDPR, the potential fines are quite large."

                  Until some genius finds a legal way to weasel turnover numbers...

                  1. DCFusor Silver badge

                    Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

                    Charles, that was kinda my point above, which collected some downvotes...maybe some SJWs feewings were hurt or something.

                    Point is - it's obvious where the power lies here. Any fines of any transnational never amount to even a day's cleared profits, as the Reg writers themselves often point out. Which shows who is in actual control, and the rest is theater.

                    These days, if you want to keep your data - which has value, like other stuff, you have to earn it by perhaps blocking the collection of it...even if you don't have a FB or Google or whoever account -

                    You might have to lift a finger or spend a little skull sweat, as you can count on the fact that those who are making money aren't going to figure out how to defeat themselves for you with "that one weird trick".

                    Sorry if I come across as too cynical. Being an old fart in this world, and having touched matters of high finance, politics, and computer science, well, it'll get to ya if you keep your eyes open and "follow the money". Cui bono - except now you don't even have to buy it directly - you can be monetized without your own direct input. (taxes pay for it, the things you buy,...and so on)

                    1. Byham

                      Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

                      "Point is - it's obvious where the power lies here. Any fines of any transnational never amount to even a day's cleared profits, as the Reg writers themselves often point out. Which shows who is in actual control, and the rest is theater."

                      I would think that a fine for example of 30% of annual turnover would make even Facebook sit up and take notice. The levels of fines especially from the European Court systems as well as the levels of potential enforcement are not something that any transnational will take lightly.

            2. Lyndon Hills 1

              Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

              The individual is not (necessarily) a user of Facebook so there is no way that the data is collected as an essential part of any service provided to the individual.

              The 'service' is not being provided to the individual, it's being provided to advertisers.

              1. djack

                Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

                The 'service' is not being provided to the individual, it's being provided to advertisers.

                Hence they have no defence of it being stored as an essential part of the service to the subject.

            3. Anonymous Coward

              @Djack Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

              Sorry mate it doesn't work that way.

              They may anonymize and aggregate data that they sell, but the raw data that they capture and retain... isn't anonymous and is still kept because it has value.

              But keep trying.

          2. KLane

            Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

            If this stored data can't be queried efficiently, then what does Facebook use or archive it for? I suspect if a TLA asked for it, there would be no difficulty coming up with it.

            1. PurpleLace

              Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

              I think it's more the manner in which it's queried.

              In the sense that for an advertiser, you know that user ID "xyz" likes this and that and so can target adverts.

              But that doesn't necessarily mean that, for a Facebook user or non-Facebook user, you could very easily associate a user ID with an actual identity provided as part of the SAR. Especially if what they say about the two platforms being unrelated is true (if I remember the article correctly).

            2. Jack of Shadows Silver badge

              Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

              There's a huge financial difference between a few subject requests from the various TLAs and, potentially, millions of data subjects for Facebook. Then the question for them becomes, which would you rather pay? More for infrastructure to comply or 4% of turnover for the rest of the firm's existence? Now that's an interesting economic calculation in the realm of game theory right there!
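
              For the arithmetic behind that 4% figure: Article 83(5) of the GDPR caps the most serious fines at EUR 20 million or 4% of total worldwide annual turnover for the preceding financial year, whichever is higher. It is a ceiling on a fine, not a standing charge. A back-of-envelope sketch, where the function name and the turnover figures are purely hypothetical and not taken from anyone's accounts:

```python
FLAT_CAP_EUR = 20_000_000   # fixed ceiling in Article 83(5)
TURNOVER_SHARE_PERCENT = 4  # share of worldwide annual turnover


def max_gdpr_fine_eur(annual_turnover_eur):
    """Return the Article 83(5) ceiling: the higher of the flat cap and 4% of turnover."""
    return max(FLAT_CAP_EUR, annual_turnover_eur * TURNOVER_SHARE_PERCENT / 100)


# Hypothetical firms: one large, one small enough that the flat cap applies.
large_firm_cap = max_gdpr_fine_eur(10_000_000_000)
small_firm_cap = max_gdpr_fine_eur(100_000_000)
print(large_firm_cap, small_firm_cap)
```

              For the hypothetical EUR 10 billion firm the ceiling is EUR 400 million; for the EUR 100 million firm, 4% would only be EUR 4 million, so the flat EUR 20 million cap is the one that applies.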

              1. Charles 9 Silver badge

                Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

                "More for infrastructure to comply or 4% of turnover for the rest of the firms existence?"

                Ever heard of The Cost of Doing Business? If they can find a way to reduce their legal turnover (I don't think there's a fine in the world that can't be finagled--that's what lawyers are for, partially), they could just pay the fines so as to keep going.

          3. Anonymous Coward

            @Doctor Syntax Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

            Sorry but from a technical aspect, there is no defense.

            Regardless of your status as a member or not, FB captures and performs work on the data in order to build a profile. It's not until the later stages that they are able to match this information against a FB user.

            Think of it this way.

            You use Dr. Syntax here.

            You may have your favorite fetish site where you go by igor

            Your real name may be Christopher Robbins and on FB you go by the alias Chris McDougal.

            (I don't know I'm just making this example up ...)

            So even if they can't match Christopher Robbins to a FB user, there is still data on you and it has value.

            What they do with it is a mystery and under GDPR, it's still illegal because they didn't get explicit, informed consent.

            FB really doesn't have a strong leg to stand on in either case.

        2. DCFusor Silver badge

          Re: 'It's not clear whether he also has a FB account or whether he's a non-account'

          djack - of course they have no justification. But this isn't the 1950's - they have "feelings" and "are offended" they can't make money by selling whatever they *want* to collect about you to people who *want* to buy it - and after all, they did half the work finding that stuff out about you - you only did the unpaid other half by giving it to them....

          This is the new age - feelings matter more than what we used to think of as right and wrong. It is what it is. I don't have to like it, and neither do you, but that doesn't make people's lack of old-school morality revert to a good state. And might always meant right - it's just that these days, the might is more transparently NOT resident in governments - who are owned in fee simple by these big transnationals.


Biting the hand that feeds IT © 1998–2019