Little grouse on the prairie: IBM's AI facial-recognition training dataset gets it in trouble... in Illinois

A class-action lawsuit filed late last week has accused IBM of using photos of millions of people in Illinois without informing them to build a facial-recognition dataset.

Come on, feel the Illinois

As we mentioned before the weekend, the US Supreme Court last week dismissed Facebook's attempt to get the highest court to …

  1. Pascal Monett Silver badge
    Coat

    "using photos of millions of people in Illinois without informing them"

    Well, in IBM's defense, it got the photos, not the addresses . . .

    1. SonofRojBlake

      Re: "using photos of millions of people in Illinois without informing them"

      That's not a defense.

      "Oh, we haven't got an address to match to that photo? So we can't get permission to use it? Delete it from the database then..." should have been the default.

    2. msknight

      Re: "using photos of millions of people in Illinois without informing them"

      > Well, in IBM's defense, it got the photos, not the addresses . . .

      Not a defense IMHO. They know which accounts they scraped the photos from. They just didn't care to go to the bother of asking permission; they just scraped away.

    3. a_yank_lurker Silver badge

      Re: "using photos of millions of people in Illinois without informing them"

      The owner of the photograph probably never got permission (or may not even know the subjects) for using the photos in biometric scanning/testing. Depending on the shot, any people in the photo may just be unknown locals; good luck getting permissions or identifying them.

      The problem is Itsy Bitsy Morons goes off half-cocked too often. Grab a dataset (or photos) from the web without checking the law, copyright, etc. is a good way to enrich someone else at your expense. This is particularly stupid when grabbing photos from a site like Flickr without contacting the owner about whatever legal issues you need cleared up. How expensive and time consuming is it to send an email and wait for an answer, when this would be a good project for an intern or junior staff member?

      1. doublelayer Silver badge

        Re: "using photos of millions of people in Illinois without informing them"

        In all reality, asking for that much permission would not be in any way easy. You'd have to contact every account you scraped from, which could be done automatically, but then you'd need to read all the replies manually. With millions of photos involved, that'd be prohibitively difficult, and answering questions in those replies would be even worse. None of that is meant to absolve IBM of not asking for permission, but they can't feasibly collect a dataset like that and handle the permission problem. There's a reason consequence-free data collections are expensive. And why Google is unlikely ever to tell us what is in theirs.

        1. Someone Else Silver badge

          Re: "using photos of millions of people in Illinois without informing them"

          Somebody call a Waaah-mbulance for poor IBM or Google.

      2. Michael Wojcik Silver badge

        Re: "using photos of millions of people in Illinois without informing them"

        Grab a dataset (or photos) from the web without checking the law, copyright, etc

        In this case, IBM only scraped photos which explicitly had a liberal CC license applied.

        It seems to me BIPA and CC are in conflict here. I have no idea whether BIPA makes the right it establishes inalienable, or whether it can be waived by a license such as CC. (I haven't looked at the text of BIPA.) And I wouldn't want to even hazard a guess as to how courts would find. But I don't think this is a clear case of IBM deliberately violating the law, since a reasonable interpretation of some CC variants would allow what they did.

        1. jtaylor

          Re: "using photos of millions of people in Illinois without informing them"

          "It seems to me BIPA and CC are in conflict here."

          It does seem that way. Fortunately, private contracts cannot override statute law. It doesn't matter what IBM agreed to with Person X, that still doesn't exempt them from the requirement to get permission from Persons Y and Z before using their biometric data.

          This all falls on IBM: either they didn't ask permission, or they asked people (CC or the photographers or someone standing in line at the deli) who were obviously unable to give that permission.

  2. Doctor Syntax Silver badge

    The Creative Commons group responded to the reports, saying that "fair use allows all types of content to be used freely".

    It's an interesting question. As the BIPA has provision for a data subject to grant permission then a CC licence might well satisfy that for a photo posted by the subject. But I don't see how a CC licence could grant permission for an image of a third party. In fact, can Flickr itself, being a business, accept images of 3rd parties?

    1. Anonymous Coward

      Try telling YouTube that.

      There seems to be a lot of confusion in their camp about what constitutes 'fair use'.

    2. a_yank_lurker Silver badge

      @ Doctor Syntax - What Itsy Bitsy Morons did was not check the law for any restrictions on usage for biometrics. They claimed 'fair use' for academic research, which does not apply to the Illinois law. Fair use is a copyright issue and does not grant any other rights other than possible protection from copyright infringement. Given the scope of their slurping I am not sure fair use covers them anyway. A model release might restrict the usage of the image and the image is almost certainly released under full copyright not CC. Here again our slurp happy Morons are likely to be in possible legal hot water but not under the Illinois law. The Illinois law bans using biometric data without specific (written) consent of the person, which the morons obviously never got.

      So Itsy Bitsy Morons might actually face a variety of lawsuits for violations of copyright and various privacy laws. If they were smart, the Morons would have gotten written permission from the photographers to use the images in the project, as they are using a large number of images. They would have, in the communication with photographers, asked about locations where the photos were taken. If in a location that has restrictive legislation, do not use them or get the necessary permissions. None of these steps were apparently done. I hope many get a nice chunk of change from them for sloppiness.

  3. Imhotep

    Stop It Now

    I'm hoping that every plaintiff that brings these types of cases wins their suit.

  4. RegGuy1

    You need data if you want to understand it

    For many years IBM has been trying to take the high road and not abuse data, but simply create tooling that can manipulate others' data. Alas they've found that this doesn't really work. To get the tools to generate inferences from the data means you need data. Doh!

    Rather later than everyone else -- who doesn't give a toss about privacy -- they have realised they must be in the data collection game too. Remember all that hype about medical data and detecting cancer -- the end of doctors? Well that fell flat in part because they had no real data (or at least not enough good quality data) to work with.

    But I think they are finally coming round to what Google thought 20 years ago -- just give me data. Data, data, data and we'll see what we can use it for after.

    It's a sad world we live in, but hey, if you don't want anybody using your data don't put it up on a public site. It's not rocket science.

    [Hmm, a post making out IBM is the good guy, I wonder how many down votes that will generate. :-/ ]

    1. JohnFen Silver badge

      Re: You need data if you want to understand it

      I don't think your post made IBM out to be the good guy. It made IBM out to be as bad as the other actors in this space, in the end.

  5. FozzyBear Silver badge
    Devil

    Hope they win. $5K per photo. That should make that last quarter profit disappear.

  6. spold Silver badge

    Welcome to a new level of complexity and privacy ....

    Even if you bothered to read the lengthy linked privacy policy prior to clicking "I have read and understood" that enabled this transfer in the first place...

    You also likely never envisioned your data (image) being used in this way in the first place...

    Now once your data has been subsumed into the inky blackness of the facial recognition training process, then getting it out of there is nigh on impossible.... it is unlikely that your data is now identifiable, or even locatable, in any meaningful way to exhume it....

    HOWEVER, once the resulting trained facial recognition systems are sold and used in all sorts of imaginative, possibly nefarious ways, and said software is fed your image... guess who it is going to score a hit on!

    1. Anonymous Coward

      Re: Welcome to a new level of complexity and privacy ....

      re: "the inky blackness of the facial recognition training process"

      They'll offer up a competing captcha device, this one requiring punters to "click all the boxes with African-American faces", "click all the boxes containing faces with glasses", "click the boxes of faces with prison tattoos", and so on. Because no idea has ever been so bad that it can not somehow be made worse.

  7. MachDiamond Silver badge

    CC is not Copyright

    Creative Commons isn't codified in US law. It's mostly a scam. If you want to protect your creative works, you need to register them with your country's Copyright office. In the US, Copyright is the only right granted in the Constitution. All other rights are amendments to the Constitution. Most signatories to the Berne Convention on Copyright agree that the creator of a work has an automatic Copyright in that work provided it wasn't done for an employer or while under a Work Made for Hire contract. At least in the US, you can't bring a suit in court for infringement if you haven't registered the work.

    It will be interesting to see what courts make of this if the plaintiffs have the money to hire attorneys. The other thing with registering a Copyright in the US in a timely manner is that if you win a case, the defendant must pay your reasonable attorney's fees. This means that if your attorney thinks you have a good case, they'll often take it on contingency. Somebody believing in CC may have a hard time getting the same arrangement as a judge may just throw the whole case out.

    Are the images being used "commercially"? If IBM is generating the biometric data through scanning photos, wouldn't that data then belong to IBM? If the photo is not being published, is it really an infringement? This case will have a lot of questions to sort through. If I were betting, I'd put my money on IBM. They can afford better lawyers and lots of them.

    1. CRConrad

      Re: CC is not Copyright

      Judging from your headline and first paragraph, you seem to have fundamentally misunderstood what a license is.

Biting the hand that feeds IT © 1998–2020