Use of web archive was not hacking, says US court

The use of web archive The Wayback Machine did not constitute hacking in the case of a law firm which used the web archive to see pages which owners did not want it to see, a US court has ruled. The deliberate bypassing or evasion of the archive's protection measures could still be deemed hacking, though, said Judge Robert …

COMMENTS


This topic is closed for new posts.

Thursday 2nd August 2007 11:13 GMT Anonymous Coward

Robots.txt is a request, not a technical measure

No browser currently uses robots.txt as a method of preventing ordinary users from accessing web pages. It is only there to ask *robots* not to visit certain pages.

Equally, there is as far as I'm aware not even any *ability* for the end-user of archive.org to look up the robots.txt file, and certainly no compulsion on them to do so.

It might have been more plausible if the site owners had tried to sue archive.org for ignoring the robots.txt when they originally crawled the site (though I suspect the fault was probably the host site failing to deliver the robots.txt when requested). But to even consider suing an archive.org end user for accessing public domain material is ridiculous in the extreme and very strongly suggests the action of a drowning man clutching at a straw.

However, robots.txt is surely just a recommendation followed by many but not all robots, and is not a technical measure. If I put a number of different coloured notices on a board and another notice which says "please do not read any yellow notices" then that is surely just a request, not a technical measure, because there is absolutely nothing technical that is *stopping* me from reading the yellow notices. If, on the other hand, I had distinguished the yellow notices by putting them into sealed envelopes, that might indeed be viewed as a proper technical measure.

As I see it, robots.txt is only a "please do not read" notice, and if you really don't want people seeing your pages you ought to use a proper technical measure (such as password protection) rather than merely putting a "please do not read" notice on the site.
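The "request, not a technical measure" point can be illustrated with a small sketch. It assumes a made-up robots.txt and a made-up crawler name (`ExampleBot`), and uses Python's standard `urllib.robotparser`. The only thing the file can do is answer a question that a well-behaved robot chooses to ask; nothing in it intercepts a request.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path is an illustrative assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Report what robots.txt *asks* of a crawler.

    This is purely advisory: a client that never calls this function
    (a browser, or a robot that ignores the file) is not blocked by anything.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/private/page.html",
                "http://example.com/public/page.html"):
        print(url, "->", "allowed" if is_allowed("ExampleBot", url) else "please do not fetch")
```

A polite crawler would call something like `is_allowed` before each fetch and skip the page on a negative answer; an impolite one simply never asks.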

Thursday 2nd August 2007 13:22 GMT tim

Wot he sed

Summed up nicely

Thursday 2nd August 2007 14:32 GMT Dam

Like... htaccess anyone?

Anyone out there able to figure out why the *retards* didn't put up an htaccess to protect their screenies?

BEAM ME UP SCOTTIE, there's no intelligent life down there (c) BOFH

Thursday 2nd August 2007 15:10 GMT Jim Sissel

ID-Ten-Ts

If they don't want people reading the pages, the only sure way of doing that is to not put them on the Internet. The judge blew this one in his ruling that robots.txt is in any way a DMCA measure. His ruling should have been very clear and plain.

DON'T PUBLISH STUFF YOU DON'T WANT PEOPLE TO SEE!

Thursday 2nd August 2007 17:19 GMT Owen Carter

Is this really about revisionism?

I don't get it. The issue here seems to read as revisionism: someone is being sued for reading a past version of a document, which was once in the public domain. The technical details of the case (copyright, blah, hacking, blah) are just a vehicle being used by someone to try and suppress something that they no longer want public.

Imagine this same data had been published and distributed in a paper document. And then the authors subsequently decided that they did not want people to read it; so they gathered up all the copies they could find and published a new version of the same paper with different contents.

Would they be able to sue you if you found an original copy that someone had helpfully placed in a local library? Would it be a crime to read it? Would it be inadmissible in court as proof of anything?

Thursday 2nd August 2007 17:19 GMT Tom Haczewski

Completely agree

If these muppets don't know that when you publish your stuff to the web and don't password-protect it, people might read it, they should probably get a new, non-IT job.

DUH.

Thursday 2nd August 2007 17:19 GMT b shubin

Clueful judge

truly an uncommon animal within the US judicial system. most judges here, when given an opportunity to make sweeping, disruptive judgments on technology issues, jump in with both jackbooted feet.

the DMCA is, without question, one of the worst technology laws ever enacted. it is certainly in the top 5. be that as it may, this act has not been declared unconstitutional, and it has not been amended in any significant way.

the judgement conforms very narrowly to the letter of the DMCA, which is vague enough to include simple things like robots.txt under anti-circumvention measures. the opinion specifically excludes any other cases from using itself as a precedent (precedent is very powerful in US law, good to see someone treating it with care).

kudos to the judge for a wise, well-considered ruling (that is the first time i have written those words in almost 40 years).

Thursday 2nd August 2007 18:45 GMT Dillon Pyron

Timing

So, they put the robots.txt file on "the 7th or 8th". And the defendant viewed the archived copy on the 9th. Who knows when they were archived? It could have been a month previous.

As many others have pointed out, there is no obligation to observe a robots.txt file.

Thursday 2nd August 2007 21:44 GMT James

robots.txt

robots.txt has no place in law, and there is no compulsion to follow its suggestions.

If one places a curtain over one's bedroom window to keep passing law enforcement from peeking in and arresting the occupant while in the act of some sort of illegal fetish, and the wind blows the curtain aside long enough for said officers to get a look, then it's perfectly legal (in the US) for the officers to then proceed to act under the "plain view" doctrine. The simple fact of hanging the curtain does not protect the criminal. While the officers are forbidden from moving the curtain, themselves, if an "accident" disables the curtain's protections, then there is technically no curtain ... and no protection.

Regardless of whether the defendants had included robots.txt (the curtain), IA grabbed their site and offered it to the plaintiffs without extraordinary effort.

If anything, this should have been about the rights of IA to permanently store copyrighted content without the permission of its copyright owners ... which brings us to Google's cache ...

Friday 3rd August 2007 12:08 GMT Goldie

A barn door in the middle of a field?

It seems that the "experts" make no distinction between robots.txt and .htaccess files!! While the latter does indeed serve a function similar to a lock, the former is just a notice.

A real-life example for IT-illiterate lawyers: putting up a sign in a public building forbidding the taking of pictures (which is what web archives actually do) still does not put any technical obstacle in the way of the public using their bare eyes. What on Earth can prevent a determined counsel from ordering a few interns with browsers ("click on every hyperlink if it keeps you on that web site") to dig through the results?

Copyrighting a book does forbid making illegal copies of it but, AFAIK, does not prevent reading it (if one is literate enough)!
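For contrast with a notice like robots.txt, here is a rough sketch of what a lock looks like: the kind of check that HTTP Basic authentication (the mechanism commonly configured through .htaccess) performs on the server side. The username, password, and function name are invented for illustration, and this is not how any particular web server implements it.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "alice"
EXPECTED_PASSWORD = "s3cret"


def status_for(authorization_header):
    """Return the HTTP status a protected page would answer with.

    Unlike robots.txt, the decision is made by the server: without the
    right credentials the client gets 401 and never sees the page.
    """
    prefix = "Basic "
    if not authorization_header or not authorization_header.startswith(prefix):
        return 401
    try:
        decoded = base64.b64decode(authorization_header[len(prefix):], validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return 401
    user, sep, password = decoded.partition(":")
    # Compare as bytes in constant time to avoid leaking information via timing.
    ok = (
        sep == ":"
        and hmac.compare_digest(user.encode("utf-8"), EXPECTED_USER.encode("utf-8"))
        and hmac.compare_digest(password.encode("utf-8"), EXPECTED_PASSWORD.encode("utf-8"))
    )
    return 200 if ok else 401
```

A request with no `Authorization` header, or with the wrong password, is refused whether or not the client is "polite", which is the whole difference between the lock and the sign.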



The Register Biting the hand that feeds IT


Copyright. All rights reserved © 1998–2024