Use of web archive was not hacking, says US court

The use of web archive The Wayback Machine did not constitute hacking in the case of a law firm which used the web archive to see pages which owners did not want it to see, a US court has ruled. The deliberate bypassing or evasion of the archive's protection measures could still be deemed hacking, though, said Judge Robert …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward

    Robots.txt is a request, not a technical measure

    No browser currently uses robots.txt as a method of preventing ordinary users from accessing web pages. It is only there to ask *robots* not to visit certain pages.

    Equally, there is as far as I'm aware not even any *ability* for the end-user of archive.org to look up the robots.txt file, and certainly no compulsion on them to do so.

    It might have been more plausible if the site owners had tried to sue archive.org for ignoring the robots.txt when they originally crawled the site (though I suspect the fault was probably the host site failing to deliver the robots.txt when requested). But to even consider suing an archive.org end user for accessing public domain material is ridiculous in the extreme and very strongly suggests the action of a drowning man clutching a straw.

    However, robots.txt is surely just a recommendation followed by many but not all robots, and is not a technical measure. If I put a number of different coloured notices on a board and another notice which says "please do not read any yellow notices" then that is surely just a request, not a technical measure. Because there is absolutely nothing technical that is *stopping* me from reading the yellow notices. If on the other hand, I had distinguished the yellow notices by putting them into sealed envelopes, that might indeed be viewed as a proper technical measure.

    As I see it, robots.txt is only a "please do not read" notice, and if you really don't want people seeing your pages you ought to use a proper technical measure (such as password protection) rather than merely putting a "please do not read" notice on the site.
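To illustrate the "please do not read" point above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt contents, the `/screenshots/` path, the `ExampleBot` user agent and the `polite_crawler_may_fetch` helper are all made up for illustration; none of them come from the case. What it shows is that the check is something a crawler chooses to run on its own side. Nothing on the server enforces the answer, and a browser never asks.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the /screenshots/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /screenshots/
"""


def polite_crawler_may_fetch(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return what a well-behaved robot would decide after reading robots.txt.

    This is purely a client-side courtesy check: a robot that never calls it
    (or ignores the result) can still request the page.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/screenshots/a.png"):
        print(url, polite_crawler_may_fetch(url))
```

A robot that honours the file would skip the second URL; one that does not simply fetches it, which is why the file reads as a notice rather than a lock.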

  2. tim

    Wot he sed

    Summed up nicely

  3. Dam

    Like... htaccess anyone?

    Anyone out there able to figure out why the *retards* didn't put up an htaccess to protect their screenies?

    BEAM ME UP SCOTTIE, there's no intelligent life down there (c) BOFH

  4. Jim Sissel

    ID-Ten-Ts

    If they don't want people reading the pages the only sure way of doing that is to not put them on the Internet. The judge blew this one in his ruling that robots.txt is in any way a DMCA measure. His ruling should have been very clear and plain.

    DON'T PUBLISH STUFF YOU DON'T WANT PEOPLE TO SEE!

  5. Owen Carter

    Is this really about revisionism?

    I don't get it. The issue here seems to read as revisionism: someone is being sued for reading a past version of a document, which was once in the public domain. The technical details of the case (copyright, blah, hacking, blah) are just a vehicle being used by someone to try and suppress something that they no longer want public.

    Imagine this same data had been published and distributed in a paper document. And then the authors subsequently decided that they did not want people to read it; so they gathered up all the copies they could find and published a new version of the same paper with different contents.

    Would they be able to sue you if you found an original copy that someone had helpfully placed in a local library? Would it be a crime to read it? Would it be inadmissible in court as proof of anything?

  6. Tom Haczewski

    Completely agree

    If these muppets don't know that if you publish your stuff to the web and don't password protect it, people might read it, they should probably get a new, non-IT job.

    DUH.

  7. b shubin

    Clueful judge

    truly an uncommon animal within the US judicial system. most judges here, when given an opportunity to make sweeping, disruptive judgments on technology issues, jump in with both jackbooted feet.

    the DMCA is, without question, one of the worst technology laws ever enacted. it is certainly in the top 5. be that as it may, this act has not been declared unconstitutional, and it has not been amended in any significant way.

    the judgement conforms very narrowly to the letter of the DMCA, which is vague enough to include simple things like robots.txt under anti-circumvention measures. the opinion specifically excludes any other cases from using itself as a precedent (precedent is very powerful in US law, good to see someone treating it with care).

    kudos to the judge for a wise, well-considered ruling (that is the first time i have written those words in almost 40 years).

  8. Dillon Pyron

    Timing

    So, they put the robots.txt file on "the 7th or 8th". And the defendant viewed the archived copy on the 9th. Who knows when they were archived? It could have been a month previous.

    As many others have pointed out, there is no obligation to observe a robots.txt file.

  9. James

    robots.txt

    robots.txt has no place in law, and there is no compulsion to follow its suggestions.

    If one places a curtain over one's bedroom window to keep passing law enforcement from peeking in and arresting the occupant while in the act of some sort of illegal fetish, and the wind blows the curtain aside long enough for said officers to get a look, then it's perfectly legal (in the US) for the officers to then proceed to act under the "plain view" doctrine. The simple fact of hanging the curtain does not protect the criminal. While the officers are forbidden from moving the curtain, themselves, if an "accident" disables the curtain's protections, then there is technically no curtain ... and no protection.

    Regardless of whether the defendants had included robots.txt (the curtain), IA grabbed their site and offered it to the plaintiffs without extraordinary effort.

    If anything, this should have been about the rights of IA to permanently store copyrighted content without the permission of its copyright owners ... which brings us to Google's cache ...

  10. Goldie

    A barn door in the middle of a field?

    It seems that the "experts" make no distinction between robots.txt and .htaccess files!! While the latter indeed serves a function similar to a lock, the former is just a notice.

    A real-life example for IT-illiterate lawyers: putting up a sign in a public building forbidding the taking of pictures (which is actually what web archives do) still does not put any technical obstacle in the way of the public using their bare eyes. What on Earth can prevent a determined counsel from giving an order to a few interns with browsers ("click on every hyperlink if it keeps you on that web site") and digging through the results?

    Copyrighting a book does forbid making illegal copies of it but AFAIK does not prevent reading it (if one is literate enough)!
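The lock-versus-notice distinction drawn above can be sketched in a few lines of Python. This is not how Apache's .htaccess handling is implemented; it is only a toy model of server-side HTTP Basic authentication, and the `USERS` table, the `/private/` prefix and the `status_for` helper are hypothetical names invented for the example. The point is that the decision is made by the server, so a client cannot simply choose to ignore it the way it can ignore robots.txt.

```python
import base64
from typing import Optional

# Hypothetical credentials and protected path, for illustration only.
USERS = {"alice": "s3cret"}
PROTECTED_PREFIX = "/private/"


def status_for(path: str, authorization: Optional[str] = None) -> int:
    """Return the HTTP status a server enforcing Basic auth would send.

    Unprotected paths are served to anyone (200). Protected paths are
    refused (401) unless the Authorization header carries valid credentials.
    """
    if not path.startswith(PROTECTED_PREFIX):
        return 200
    if authorization and authorization.startswith("Basic "):
        try:
            decoded = base64.b64decode(authorization[len("Basic "):]).decode("utf-8")
        except ValueError:
            # Malformed base64 or non-UTF-8 bytes: treat as unauthenticated.
            return 401
        user, _, password = decoded.partition(":")
        if USERS.get(user) == password:
            return 200
    return 401


if __name__ == "__main__":
    token = base64.b64encode(b"alice:s3cret").decode("ascii")
    print(status_for("/index.html"))
    print(status_for("/private/shot.png"))
    print(status_for("/private/shot.png", "Basic " + token))
```

Listing `/private/` in robots.txt would change none of these results; only the credential check does.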
