Data scrapers used Amazon cloud to reap biz bods' CVs, wails LinkedIn

LinkedIn is still waging its battle against “scrapers”, who use software to automatically harvest publicly available personal information from the social network. And that fight has today wound up in a California court where the website's bosses are trying to unmask the miscreants who have reaped the site for users' employment …

COMMENTS

This topic is closed for new posts.
Silver badge

crawling security

"Other security measures the attackers circumvented included “Sentinel”, which limits the number of requests permitted from a single IP address, and the restrictions on crawling that are implemented in its robots.txt file."

I thought that robots.txt was a polite request; hardly a security measure, especially against people who have a tendency to break your rules.

5
0
Silver badge

Re: crawling security

robots.txt is a polite request to automated web crawlers. It can also sometimes be a handy way of pointing the bad guys to where your crown jewels are located. Use thoughtfully.

2
0
Silver badge

Re: crawling security @Chris Miller

It's a good way though of picking out bots that need to be blocked - just add a line to deny access to a location that doesn't exist then periodically check the error log.
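As a rough illustration of the log check described above, here is a minimal Python sketch. The trap path `/no-such-dir/`, the sample log lines and the helper name `find_trap_hits` are all made up for the example, and it assumes common-log-format access lines rather than any particular server's error log.

```python
import re
from collections import Counter

# Hypothetical trap path: listed under "Disallow:" in robots.txt, never linked anywhere.
TRAP_PATH = "/no-such-dir/"

# Start of a common/combined log format line: client IP, two fields, [timestamp], "METHOD path
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)')

def find_trap_hits(log_lines, trap_path=TRAP_PATH):
    """Count, per client IP, the requests whose path starts with the trap path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_path):
            hits[match.group(1)] += 1
    return hits

# Invented sample lines using documentation-only IP addresses.
sample_log = [
    '203.0.113.7 - - [07/Jan/2014:10:00:00 +0000] "GET /no-such-dir/ HTTP/1.1" 404 209',
    '198.51.100.2 - - [07/Jan/2014:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '203.0.113.7 - - [07/Jan/2014:10:00:05 +0000] "GET /no-such-dir/a HTTP/1.1" 404 209',
]

for ip, count in find_trap_hits(sample_log).items():
    print(ip, count)
```

Any address that turns up here asked for something only robots.txt mentions, so it either ignored the file or read it as a map.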

Of course using something like iptables or 'deny from' entries in the .htaccess file is probably better if you really do want to block access. Personally speaking I've done my best to block all Amazon EC2 IP addresses from my site. After all, how often does anybody see a legitimate visit from 'the cloud'?
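For anyone who would rather do the range check in application code than in iptables or .htaccess, a minimal sketch using Python's standard `ipaddress` module might look like this. The ranges below are placeholder documentation addresses, not real Amazon EC2 ranges, and `is_blocked` is a name invented for the example.

```python
import ipaddress

# Placeholder ranges from the documentation address space (RFC 5737).
# These are NOT real Amazon EC2 ranges; substitute whatever list you maintain.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("203.0.113.0/24", "198.51.100.0/25")
]

def is_blocked(client_ip, blocked=BLOCKED_RANGES):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in blocked)

print(is_blocked("203.0.113.9"))     # inside the first range
print(is_blocked("198.51.100.200"))  # outside the /25, which ends at .127
```

A linear scan is fine for a handful of ranges; a long published list would probably want something smarter, or just the firewall.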

5
0
Anonymous Coward

You are welcome to scrape my LinkedIn accnt

as it has nothing of relevance stored in it. I only use it so that I can check out potential hires who boast about their LinkedIn Profile. It is amazing what lies you can uncover there. Like the one who said that he was employed at XXXX a year before XXXX even existed.

Seriously though, is it any wonder that sites like this have IMHO gone well past their sell-by date?

I made a policy about 10 years of using my real name on the internet whenever I couldn't realistically use an alias. So far this has been very successful indeed. Mind you, I don't use Twitter or Facebook.

0
3
Anonymous Coward

Re: You are welcome to scrape my LinkedIn accnt

I made a policy about 10 years of using my real name on the internet whenever I couldn't realistically use an alias.

So what's the story here ? You forgot ? Couldn't be arsed ? Or ??

2
0
Bronze badge

Re: You are welcome to scrape my LinkedIn accnt

I made a policy about 10 years of using my real name on the internet whenever I couldn't realistically use an alias.

Are you actually saying you will always use an alias if you can and if you cannot you will use your real name?

3
0
Facepalm

Publicly available PI

So if they don't want 'scrapers' to access this *publicly available* data... just make it private.

It's a fact of life for any public website that contains *useful* information: robot scrapers will be used to harvest it. And as far as I know, accessing *publicly available* data is not an offence if it does not damage the data or the infrastructure it is held on, or deny access to others.

5
2
Silver badge

Re: Publicly available PI

Was it really entirely publicly available? If it was then why were fake accounts required to do the scraping?

Members only != publicly accessible (IMO)

And as far as I know accessing *publicly available* data is not an offence

It may well be if it has been made clear that their presence is not welcome. See the Computer Misuse Act 1990, in particular sections 1(1)(b) and 1(1)(c).

IANAL, but they would appear to have deliberately evaded limits put in place that define who can access it - members only in this case? - and limits stated in the robots.txt file. All this would make their actions far more questionable than it may first appear IMO.

0
1
Silver badge

Re: Publicly available PI

Although it's true that it's legal, would you, if you owned a shop perhaps, like someone to come in with a clipboard and write down every single price, measure everything and cull as much business intel from you as they possibly could, or would you sling them out and call them a c***?

Amazon scrape the hell out of our site, and it's rather aggravating to see copy outright stolen, too.

0
0

Re: Publicly available PI

That's the point though. If you owned a shop then you COULD throw them out.

If your shop window allows the 'public' to see in, then you can't complain when they do so, even if it is just to copy your prices down.

If you don't want them to see through the window then cover it up and block it.

1
0

LinkedIn has butt-hurt?

I weep bitter bitter tears.

0
0
Silver badge

LinkedIn ?

Is this the same LinkedIn that wanted me to sign up to block spam from some bloke (I don't know) called Kushnoor? Whose unsubscribe link doesn't do anything?

DIE.

0
0
Anonymous Coward

Really? Fuze? Sentinel?

This is really, really basic stuff.

It's a little sad that they've bothered to give a codename to what is little more than obvious, old-fashioned flood control. Still, that's the IT industry for you. I'm surprised they didn't try to actually patent it.
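To show how basic per-IP flood control can be, here is a minimal sliding-window sketch in Python. It is not LinkedIn's Sentinel or Fuze, whose internals the article does not describe; the class name `FloodControl` and the limits used are invented for the example.

```python
from collections import defaultdict, deque

class FloodControl:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # client IP -> timestamps of allowed requests

    def allow(self, client_ip, now):
        """Return True and record the request if the client is under its limit."""
        recent = self._recent[client_ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True

# Hypothetical limit: 2 requests per 10 seconds per address.
limiter = FloodControl(max_requests=2, window_seconds=10.0)
for t in (0.0, 1.0, 2.0, 10.0):
    print(t, limiter.allow("203.0.113.7", now=t))
```

The timestamp is passed in so the logic can be exercised without waiting; a real deployment would feed it a clock and would still be defeated by anyone spreading requests across many addresses, which is what the article says happened.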

0
0
Anonymous Coward

Linkedin to what?

I looked at linkedin which was stated to be for professionals and it claimed a membership of IIRC 10 million for the UK. I immediately decided it was for posers and a waste of my time. I couldn't distinguish it from Facebook/myspace in terms of being worthwhile.

0
0

True

I fucking hate linkedin. But even more, I hate the pimply-faced trendy-haired skinny-trousered pointy-shoed so-called "recruitment consultants" who seem to think if you're not on linkedin, you're not interested in the job.

1
0

wtf?

If it is "publicly available" then I think the courts ought to throw the whole thing out. Perhaps even beat LinkedIn with a stick by forcing them to provide a web api to make it easier to grab. Which, now that I think about it, sounds like it would be something LinkedIn would want to do anyway...

If it is data that is not "publicly available" then it seems to me that someone ought to sue LinkedIn for not doing enough to secure everyone's private data.

All in all I'm failing to see the angle in which LinkedIn should have standing to sue. And, no, I don't give a rat's ass about their "terms of service" for data that can show up on a Google search.

1
0