Data scrapers used Amazon cloud to reap biz bods' CVs, wails LinkedIn

LinkedIn is still waging its battle against “scrapers”, who use software to automatically harvest publicly available personal information from the social network. And that fight has today wound up in a California court where the website's bosses are trying to unmask the miscreants who have reaped the site for users' employment …

COMMENTS

This topic is closed for new posts.
  1. frank ly

    crawling security

    "Other security measures the attackers circumvented included “Sentinel”, which limits the number of requests permitted from a single IP address, and the restrictions on crawling that are implemented in its robots.txt file."

    I thought that robots.txt was a polite request; hardly a security measure, especially against people who have a tendency to break your rules.

    1. Chris Miller

      Re: crawling security

      robots.txt is a polite request to automated web crawlers. It can also sometimes be a handy way of pointing the bad guys to where your crown jewels are located. Use thoughtfully.

      1. Vimes

        Re: crawling security @Chris Miller

        It's a good way though of picking out bots that need to be blocked - just add a line to deny access to a location that doesn't exist then periodically check the error log.
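A minimal sketch of the trap described above, assuming robots.txt carries a `Disallow: /bot-trap/` line (a hypothetical path that does not exist on the site) and that requests are logged in Common Log Format. The path, the sample log lines and the function name `trap_hits` are illustrative assumptions, not anything taken from a real setup. Any client that requests the disallowed path has ignored robots.txt, so it is counted as a candidate for blocking.

```python
import re
from collections import Counter

# Hypothetical trap path, listed under Disallow in robots.txt.
TRAP_PREFIX = "/bot-trap/"

# Common Log Format: client IP first, request line inside double quotes.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*"'
)

def trap_hits(lines, trap_prefix=TRAP_PREFIX):
    """Count requests per client IP that touched the disallowed trap path."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(trap_prefix):
            hits[match.group("ip")] += 1
    return hits

# Made-up log lines using documentation-only IP addresses.
sample_log = [
    '203.0.113.7 - - [07/Jan/2014:10:00:00 +0000] "GET /bot-trap/index.html HTTP/1.1" 404 209',
    '198.51.100.4 - - [07/Jan/2014:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '203.0.113.7 - - [07/Jan/2014:10:00:02 +0000] "GET /bot-trap/a HTTP/1.1" 404 209',
]

for ip, count in trap_hits(sample_log).most_common():
    print(ip, count)
```

In practice the lines would come from the server's own log file rather than a list, and the regular expression would need adjusting to whatever log format is configured.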

        Of course using something like iptables or 'deny from' entries in the .htaccess file is probably better if you really do want to block access. Personally speaking I've done my best to block all Amazon EC2 IP addresses from my site. After all, how often does anybody see a legitimate visit from 'the cloud'?
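For anyone doing the same check at the application level rather than in iptables or .htaccess, here is a rough sketch using Python's standard `ipaddress` module. The CIDR ranges below are placeholders from the documentation address space, not real EC2 ranges, and `is_blocked` is a hypothetical helper name. A real list would have to come from the provider's published ranges.

```python
import ipaddress

# Placeholder ranges from the documentation address space, NOT real EC2 ranges.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]

def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if the client address falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)

print(is_blocked("192.0.2.55"))   # inside the first placeholder range
print(is_blocked("203.0.113.9"))  # outside both placeholder ranges
```

Blocking at the firewall is still cheaper, since the request never reaches the web server at all.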

  2. Anonymous Coward

    You are welcome to scrape my LinkedIn accnt

    as it has nothing of relevance stored in it. I only use it so that I can check out potential hires who boast about their LinkedIn profile. It is amazing what lies you can uncover there. Like the one who said that he was employed at XXXX a year before XXXX even existed.

    Seriously though, is it any wonder that sites like this have IMHO gone well past their sell-by date?

    I made a policy about 10 years of using my real name on the internet whenever I couldn't realistically use an alias. So far this has been very successful indeed. Mind you, I don't use Twitter or Facebook.

    1. Anonymous Coward

      Re: You are welcome to scrape my LinkedIn accnt

      I made a policy about 10 years of using my real name on the internet whenever I couldn't realistically use an alias.

      So what's the story here ? You forgot ? Couldn't be arsed ? Or ??

    2. Jason Bloomberg Silver badge

      Re: You are welcome to scrape my LinkedIn accnt

      I made a policy about 10 years of using my real name on the internet whenever I couldn't realistically use an alias.

      Are you actually saying you will always use an alias if you can and if you cannot you will use your real name?

  3. taxman

    Publicly available PI

    So if they don't want 'scrapers' to access this *publicly available* data... just make it private.

    It's a fact of life for any public website that contains *useful* information: robot scrapers will be used to harvest it. And as far as I know accessing *publicly available* data is not an offence if it does not damage the data or the infrastructure it is held on, or deny access to others.

    1. Vimes

      Re: Publicly available PI

      Was it really entirely publicly available? If it was then why were fake accounts required to do the scraping?

      Members only != publicly accessible (IMO)

      And as far as I know accessing *publicly available* data is not an offence

      It may well be if it has been made clear that their presence is not welcome. See the Computer Misuse Act 1990, in particular sections 1(1)(b) and 1(1)(c).

      IANAL, but they would appear to have deliberately evaded limits put in place that define who can access it - members only in this case? - and limits stated in the robots.txt file. All this would make their actions far more questionable than it may first appear IMO.

    2. Psyx

      Re: Publicly available PI

      Although it's true that it's legal, would you -if you owned a shop perhaps- like someone to come in with a clipboard and write down every single price, measure everything and cull as much business intel from you as they possibly could, or would you sling them out and call them a c***?

      Amazon scrape the hell out of our site, and it's rather aggravating to see copy outright stolen, too.

      1. Steve 129

        Re: Publicly available PI

        That's the point though. If you owned a shop then you COULD throw them out.

        If your shop window allows the 'public' to see in, then you can't complain when they do so, even if it is just to copy your prices down.

        If you don't want them to see through the window then cover it up and block it.

  4. John Styles

    LinkedIn has butt-hut?

    I weep bitter bitter tears.

  5. heyrick Silver badge

    LinkedIn ?

    Is this the same LinkedIn that wanted me to sign up to block spam from some bloke (I don't know) called Kushnoor? Whose unsubscribe link doesn't do anything?

    DIE.

  6. Anonymous Coward

    Really? Fuze? Sentinel?

    This is really, really basic stuff.

    It's a little sad that they've bothered to give a codename to what is little more than obvious, old-fashioned flood control. Still, that's the IT industry for you. I'm surprised they didn't try to actually patent it.
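To show how basic this kind of flood control can be, here is a minimal sliding-window limiter keyed on client IP. It is a generic sketch only: nothing is publicly stated here about how "Sentinel" works beyond limiting requests per IP address, and the class name `PerIpLimiter`, its parameters and the sample limits are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque

class PerIpLimiter:
    """Allow at most max_requests per client IP within a sliding time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._recent = defaultdict(deque)  # client IP -> timestamps of allowed requests

    def allow(self, client_ip):
        now = self.clock()
        recent = self._recent[client_ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True

# Hypothetical limits: 3 requests per 60 seconds from one address.
limiter = PerIpLimiter(max_requests=3, window_seconds=60)
results = [limiter.allow("203.0.113.7") for _ in range(4)]
print(results)  # → [True, True, True, False]
```

Which also shows why it does little against a scraper spread across many cloud-hosted addresses: each new IP starts with a fresh allowance.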

  7. Anonymous Coward

    Linkedin to what?

    I looked at linkedin which was stated to be for professionals and it claimed a membership of IIRC 10 million for the UK. I immediately decided it was for posers and a waste of my time. I couldn't distinguish it from Facebook/myspace in terms of being worthwhile.

  8. Seanmon

    True

    I fucking hate linkedin. But even more, I hate the pimply-faced trendy-haired skinny-trousered pointy-shoed so-called "recruitment consultants" who seem to think if you're not on linkedin, you're not interested in the job.

  9. chris lively

    wtf?

    If it is "publicly available" then I think the courts ought to throw the whole thing out. Perhaps even beat LinkedIn with a stick by forcing them to provide a web API to make it easier to grab. Which, now that I think about it, sounds like something LinkedIn would want to do anyway...

    If it is data that is not "publicly available" then it seems to me that someone ought to sue LinkedIn for not doing enough to secure everyone's private data.

    All in all I'm failing to see the angle from which LinkedIn should have standing to sue. And, no, I don't give a rat's ass about their "terms of service" for data that can show up on a Google search.
