If we're talking about just another bot like GoogleBot going out there to look at what it can find, then couldn't they be blocked via something like the .htaccess file?
Publishing ANYTHING on .uk? From now, Big Library gets copies
On the same day that thousands of public sector bods will go on strike in a row over pay, pensions and working conditions, new regulations will come into force at midnight tonight allowing the British Library to begin scraping content from UK websites. Under the rules - known as legal deposit - the country's biggest collector …
-
-
Friday 5th April 2013 11:15 GMT Vimes
Also: I hope they don't plan on using cloud-based services like the offerings provided by the likes of Amazon to do the scraping, and instead rely on their own servers.
I've already got Amazon on the naughty step thanks to hacking attempts coming from them, and the only other thing that would distinguish the legitimate scraper from the rest in the access log would be the user agent string - and this can be easily forged.
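Since the user agent string is trivially forged, one common way to tell a genuine crawler from an impostor is a forward-confirmed reverse DNS check: resolve the client IP to a hostname, check that the hostname sits under the operator's domain, then resolve that hostname back and confirm it yields the original IP. A minimal sketch in Python follows. The `.crawler.example.org` suffix is a placeholder assumption, not the British Library's real crawler hostname, and the two resolver arguments are only there so the logic can be exercised without network access.

```python
import socket


def is_verified_crawler(ip, allowed_suffix,
                        reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a client IP address.

    allowed_suffix should start with a dot (e.g. ".crawler.example.org",
    a hypothetical value) so that "evilcrawler.example.org" cannot match.
    """
    # Default to real DNS lookups (IPv4 for the forward step);
    # callers may inject stand-ins for testing.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False  # no PTR record: cannot be verified

    if not hostname.lower().rstrip(".").endswith(allowed_suffix):
        return False  # hostname is outside the operator's domain

    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False

    # The PTR record alone is controlled by whoever owns the IP block,
    # so only trust it if the forward lookup points back at the same IP.
    return ip in addresses
```

A request claiming to be the archive crawler but arriving from an address that fails this check could then be treated like any other unknown client.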
-
-
-
Friday 5th April 2013 13:26 GMT Vimes
Re: Errrr presumably these are excluded
Somebody has been paying too much attention to certain ideas put out as April fools jokes...
http://gizmodo.com/5777429/the-entire-internet-on-a-floppy-disk
Also found this when looking for a link like the one above:
http://www.w3schools.com/downloadwww.htm
:)
-
-
Friday 5th April 2013 11:27 GMT Anonymous Coward
What about our copyrights?
"Under the rules - known as legal deposit - the country's biggest collector of publications produced in the UK and Ireland will start harvesting what it described as "ephemeral materials like websites" to ensure that the content is "preserved forever"".
Yet what if I published something (put online) which I don't want to be preserved (yet) ?
And let's ignore the obvious "I own copyright on my work" issue but what about situations where I pre-publish stuff to appeal to the visitors while I'm still working on it? I'm doing that a lot with several tutorials I write (I'm passionate about sound synthesis & design and maintain my own hobby website) and as long as a version hasn't reached v1.0 status I wouldn't want to see it getting included with some big collection of stuff. Simply because some things could easily change, sometimes quite drastically.
Another issue: although it's very easy to point at Google, many people forget that, contrary to popular belief, something which gets slurped by Google can be removed again. And it's quite easy too, the keywords being webmaster tools. As others above have already pointed out, you can even prevent Google from indexing your site (or parts of it).
So what do these guys provide? Or are we now down to "We're the government, we decide, the end justifies the means, it's all for the common good, stop whining." ?
And some people still wonder why so many are losing faith rapidly when it comes to governments in combination with IT.
-
Friday 5th April 2013 11:37 GMT Pen-y-gors
Re: What about our copyrights?
Copyright isn't an issue, that's the whole point of copyright deposit - publishers of books are required to give a copy to each of the six copyright deposit libraries (if they want one) - it doesn't affect the authors' copyright. Ditto with online material - the copyright is unchanged, but they are now allowed to make a copy for archival purposes whether you want them to or not. If you want to keep it secret don't publish it on the web for everyone to see and download.
This greatly simplifies things - the National Library of Wales started to archive a number of 'important' Welsh websites several years ago, but had to contact the site owners of each one and get their permission in advance, and, if I remember rightly, the copies are not accessible outside the Library network - it's an archive for long term preservation, not a mirror site.
-
Friday 5th April 2013 12:01 GMT Anonymous Coward
Re: What about our copyrights?
Copyright isn't an issue
but are they just slurping text and ebook files? or are they taking music, movies, images and everything?
if you have a website where you've purchased the rights to display photographs (e.g. a celebrity fansite), the license only extends to your own site. Is the British Library purchasing a blanket license from the likes of GettyImages?
what if you've put up an mp3 of your favourite music on your site? it's not worth the music industry targeting you for your single infringement, but after this exercise, the British Library could be liable for millions of copyright infringements for non-book material.
-
Friday 5th April 2013 12:06 GMT David Dawson
Re: What about our copyrights?
Copyright is a legally granted monopoly given to the creator of a work.
It's not something that naturally exists; it's a collection of laws passed by HM Government.
So, if the Government of the day chooses to alter how copyright is assigned to allow the British Library to scrape the UK portion of the internet, it is perfectly legal for it to do that, as it created the entire concept of copyright in UK law in the first place.
-
Friday 5th April 2013 13:01 GMT Vimes
Re: What about our copyrights?
But where laws are concerned: whose laws take priority when more often than not sites cross national boundaries?
Take for example:
http://amazon.co.uk.ipaddress.com/
Hostname: amazon.co.uk, IP Address: 176.32.108.186, Organization: PROD DUB, ISP: Amazon Data Services Ireland Ltd, City: -, Country: Netherlands
As for the registrant's address for Amazon.co.uk:
65 boulevard G-D. Charlotte, Luxembourg City, Luxembourg, LU-1311, Luxembourg
-
Friday 5th April 2013 18:07 GMT Ken Hagan
Re: What about our copyrights?
I expect the UK government's attitude would be that (since the .uk namespace belongs to them, even if they do delegate its management to Nominet) if you publish under a .uk address, you are putting the material (and perhaps yourself) under UK law. If you are re-publishing stuff which you don't have the right to put under UK law, that would be a matter between you and the owner of the stuff you are re-publishing.
-
-
Saturday 6th April 2013 04:11 GMT Anonymous Coward
Re: What about our copyrights?
off-topic @David Dawson: what kind of answer is that? It's ok for the government to take things away since they created it?
If one day the UK is to be hit by a meteorite, and the UK government decided to suspend all telecommunications, air and cross-channel traffic to prevent panic and to only allow the "privileged" to safely escape the country, according to your reasoning, it's ok to do that since they created much of what modern society is made up of.
I didn't realise we're still a bunch of serfs under the feudal system.
-
Saturday 6th April 2013 11:10 GMT Anonymous Coward
Re: What about our copyrights?
"I didn't realise we're still a bunch of serfs under the feudal system".
Most people don't realise that. Congratulations on waking up and noticing reality. (Matrix, anyone?)
Consider. As Walter Bagehot said, Parliament can do anything except change a man into a woman. (And with modern technology I'm not sure that restriction applies any more). Parliament is an assembly of our elected representatives, which therefore expresses our collective will - right? Wrong. Parliament is an assembly of self-seeking jobsworths a majority of whom do exactly what the Prime Minister tells them to - if they want to go on enjoying their cushy lifestyle.
So far, we have established that David Cameron can do anything he wants, with the possible exception of sex changes. There is no essential difference between his power and that of a medieval king such as, perhaps, William the Conqueror or John. So yes, actually, we are serfs - except that serfs had more concrete and enforceable rights than we do. And didn't have to pay as much tax. (See, for example, https://sites.google.com/site/stevenburgauer/essay03).
-
-
Sunday 7th April 2013 09:17 GMT Anonymous Coward
Re: What about our copyrights?
"I'll believe in this relationship when Cameron dies from a surfeit of peaches and cider and Miliband is found drowned in the Butt of Ramsey. Or wherever he has been brownnosing this week".
Somewhere, Messrs Sellers and Yeatman are laughing heartily. They must have come up with some jokes the publishers wouldn't print.
But there is something in it. After all, wasn't King Gordon faced down by the banking barons?
-
-
-
-
-
Saturday 6th April 2013 21:32 GMT David Dawson
Re: What about our copyrights?
"off-topic @David Dawson: what kind of answer is that? It's ok for the government to take things away since they created it?
If one day the UK is to be hit by a meteorite, and the UK government decided to suspend all telecommunications, air and cross-channel traffic to prevent panic and to only allow the "privileged" to safely escape the country, according to your reasoning, it's ok to do that since they created much of what modern society is made up of.
I didn't realise we're still a bunch of serfs under the feudal system."
-----------
In this country, Parliament is sovereign, so yes, if the government chose to do that, then that would be legal, which is a different thing to 'ok'. Legal and moral/ ethical are separate concepts I'm afraid.
Sorry you had to find out this way. I wish they would teach this kind of thing in school.
"Er, and other governments. The UK government can pass laws overriding the copyright it grants, but not that granted by the USA, France, Germany, China..."
--------------
Only so far as the law in this country respects those other countries' laws. Which is what sovereign means. This is an important distinction! The UK has signed up to copyright treaties, so I imagine they would be respected...
-
Saturday 6th April 2013 22:06 GMT Anonymous Coward
Re: What about our copyrights?
<quote> The UK government can pass laws overriding the copyright it grants, but not that granted by the USA, France, Germany, China...</quote>
Yes it can, for the same reason that other countries like China get to ignore OUR copyright laws. We may end up violating some treaty, but at the end of the day, what then? Trade sanctions from the USA? No one cares about their entertainment being banned anyway.
-
-
-
-
-
Friday 5th April 2013 12:34 GMT Gav
Re: What about our copyrights?
It's not rocket science people. If you do not want people to take a copy of your website content then do not put it on a publicly accessible website. It's how browsers work. They have caches, they take copies.
Copyright has nothing to do with it, as nowhere is it said that the British Library will be re-publishing your website. It has a copy. It will let others see that copy, just like it already does for millions of books.
-
Saturday 6th April 2013 20:42 GMT Anonymous Coward
Re: What about our copyrights?
Agree entirely. What cheek!
I have no objection to services like Archive.org, which keep a record of valuable sites, and which any of us can access. But I object to this one. Why? Because the "archive" is purely for the benefit of staff at the British Library (and, yes, the relative handful of people who can physically walk there, if the staff decide to let them use it too). I don't put stuff on the web for the benefit of British Library staff; I do so for everyone.
As ever with the British Library, "one for us, and none for you".
-
-
-
Friday 5th April 2013 11:39 GMT Pen-y-gors
Re: Wayback machine
Yep, an existing private service that can be switched off tomorrow at the whim of the owner. That's not what I call secure long-term archiving and preservation. It's important for people in 200 years to have access to the day-to-day publications of the 21st century - will the Wayback Machine still be online in 5 years, 10 years, 20 years, 50 years?
-
Sunday 7th April 2013 09:20 GMT Anonymous Coward
Re: Wayback machine
"It's important for people in 200 years to have access to the day-to-day publications of the 21st century..."
If you believe for a single moment that Web sites scraped by copyright libraries and stored with today's technology will be legible in 200 years, I have $1 trillion worth of hybrid HD/SSDs to sell you.
-
-
-
Friday 5th April 2013 11:52 GMT Tom7
How far does this go?
I'm curious, though not curious enough to go look it up. Does this allow them to scrape your content, or does it force you to allow them to scrape it? What I'm getting at is, what if I detect the British Library robot and send it off to some obscure error page to prevent them archiving my site? Has that just become illegal? Or does it just indemnify the libraries from copyright claims if they happen to get to my content?
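The "detect the robot and send it to an error page" idea can be sketched as a small piece of WSGI middleware in Python. This is only an illustration of the mechanism and says nothing about whether doing so is permitted under the regulations. `ExampleArchiveBot` is a made-up user agent token (the thread does not give the crawler's real one), and `site` is a stand-in for the actual website.

```python
def block_user_agents(app, blocked_tokens):
    """Wrap a WSGI app; answer 403 when the User-Agent contains a blocked token."""
    tokens = [token.lower() for token in blocked_tokens]

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in tokens):
            body = b"Archiving crawlers are not served here.\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else gets the normal site.
        return app(environ, start_response)

    return middleware


def site(environ, start_response):
    """Stand-in for the real site (hypothetical)."""
    body = b"Hello, visitor.\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# "ExampleArchiveBot" is a placeholder token, not a real crawler name.
application = block_user_agents(site, ["ExampleArchiveBot"])
```

Because this matches on the User-Agent header alone, it only stops crawlers that identify themselves honestly.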
-
Friday 5th April 2013 12:00 GMT Kubla Cant
Beano
Although these libraries are entitled to receive and keep a copy of all copyright publications, I don't think they necessarily do so. I seem to recall that the Bodleian failed to produce back numbers of the Beano to help while away the long hours that should have been spent writing essays.
In the case of web content, the fact that a large and increasing proportion is produced dynamically must make things difficult. For many sites there's no such thing as a definitive copy, so there's nothing to keep.
And can anybody explain why an alien university such as Trinity College Dublin should benefit from the free handout of books?
-
Friday 5th April 2013 12:53 GMT Anonymous Coward
Re: Beano
Although these libraries are entitled to receive and keep a copy of all copyright publications, I don't think they necessarily do so.
However, many years ago I seem to recall reading an amusing report about a school somewhere in England that had discovered a dusty old tome in its library and after a bit of research came to the conclusion they had the only copy of this book in existence. They made a big deal of it with press releases etc ... and were then surprised when they got a letter from one of the copyright libraries saying that as they were entitled to a copy of the book and as this seemed to be the only copy available then they'd be sending someone to collect it!
-
Saturday 6th April 2013 11:15 GMT Anonymous Coward
Re: Beano
"...they got a letter from one of the copyright libraries saying that as they were entitled to a copy of the book and as this seemed to be the only copy available then they'd be sending someone to collect it!"
Think of all the fun they could have had by informing the other copyright libraries of the situation, and then watching them fight it out.
-
-
-
Friday 5th April 2013 12:05 GMT heyrick
Thoughts
My stuff is released under a sort of licence. Essentially it is reminding you of copyright, but it also expressly forbids the content being served by a third party system while my website is still "live" (when I'm gone, it's no longer my problem). Secondly it prohibits in any case the modification of content for any purpose other than translation (especially the practice of detecting keywords and linking them to adverts). Those are the terms of distribution. Accept them or piss off, basically.
Secondly, given that recently a person was guilty of libel for retweeting a lie; I presume if somebody libels on their site and this turns up in the copy, the British Library will also be equally liable.
Thirdly, any terms and conditions imposed by the library will be groundless; they want to come get our content and copy it, so good luck making a disclaimer stick...
Fourthly, I assume it will obey robots.txt; if not it'll get blocked by IP on principle (or maybe I'll redirect them to their own website?).
Did they think this through?
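For what "obeying robots.txt" means in practice, here is a small Python sketch using the standard library's `urllib.robotparser`, which is the kind of check a well-behaved crawler runs before fetching a page. The `ExampleArchiveBot` name and the `/drafts/` path are illustrative assumptions; nothing in this thread says what user agent the library's crawler sends or whether it honours these rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep one named crawler out of /drafts/,
# leave everything open to all other crawlers.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /drafts/

User-agent: *
Disallow:
"""


def may_fetch(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent, path in [
        ("ExampleArchiveBot", "/drafts/tutorial.html"),
        ("ExampleArchiveBot", "/index.html"),
        ("SomeOtherBot", "/drafts/tutorial.html"),
    ]:
        print(agent, path, may_fetch(agent, path))
```

robots.txt is purely advisory, so a crawler that ignores it would still have to be blocked server-side, by IP or otherwise.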
-
Friday 5th April 2013 12:43 GMT Anonymous Coward
Re: Thoughts
"Secondly it prohibits in any case the modification of content for any purpose other than translation (especially the practice of detecting keywords and linking them to adverts). Those are the terms of distribution. Accept them or piss off, basically."
You don't understand copyright. You have been *given* the right over copies on the condition that the BL is allowed to store the material regardless of what you want/like. If you don't like it, your choices are to take it off line or release it as public domain.
This is not new; it's just a clarification of existing law.
-
Friday 5th April 2013 17:04 GMT dssf
Re: Thoughts THEREin lies the problem
Well, for some people. OK, so I'm trying to "deconstruct" this to understand it...
Does a government think that it "created the concept of copyright"? Why cannot this be just a mere recognition that inventiveness deserves some protection?
If an author writes a story, publishes it, and people pay for it (assuming it is that good), then, if government wants a copy, does it then get in line and PAY for a copy just like anyone else paying for a legit copy? (Doesn't the author have to pay just to obtain recognition of the copyright?) Otherwise, TAKING a copy could be seen as tantamount to theft. (If I just said "arrestable words", then, kindly remind me not to dare set foot in the UK or on soil from where I can be extradited to the UK...) I could see huge problems in the future if the UK library law were allowed to perpetuate on distant human colonies. Colonists might throw an insurrection unless it is the PEOPLE who collectively say that it is an okay thing. I am not saying *hide* or deny the preservation of published materials of worth or note, but that each published work preserved by a library should first be done so with the permission of the author or rightful rights holder. Government doesn't publish fiction, cooking guides, comics, porn, or love stories. So, it doesn't deserve to "own" the copyright in those works. Fortunately, it seems, things are different in the states. Well, to an extent. Here, it's getting to the point where the public may end up paying to access court documents and "public records".
Would it be unreasonable to hear someone say "The real fact behind such a proclamation is that it allows powerful men to shut down and jail/imprison those it deems a threat"? If government "grants" rather than "recognizes" copyright of another's works, then it means government can shut down a voice it doesn't want heard. Copyright may be a "human construct", but it should not be a right for any damned government to think it can just take and shut down works.
Yes, I get it, the story is not about copyright and government profiting financially.
BTW, do I understand that in the UK, if a person in the UK publishes a work, and makes only ONE physical copy, and the government library system wants a copy, it can *demand* a copy? What if the author says, "You must pay me for time, materials, and labor", and marks it up above street price? Would that be legal? Can the English/UK library system demand the author provide a free copy? I am assuming that an author or publisher or copyright owner must pay at the government toll gate to initially get that piece of paper stating "you're the proud owner of this government-issued/authorized/revocable copyright"...
-
Friday 5th April 2013 18:40 GMT Steve Knox
Re: Thoughts THEREin lies the problem
Does a government think that it "created the concept of copyright"?
-
Saturday 6th April 2013 11:23 GMT Anonymous Coward
Re: Thoughts THEREin lies the problem
"If an author writes a story, publishes it, and people pay for it (assuming it is that good), then, if government wants a copy, does it then get in line and PAY for a copy just like anyone else paying for a legit copy?"
For the same reason that, if a government wants £100 billion to give to corrupt banksters or to render some remote country uninhabitable, it doesn't tell its ministers to roll up their sleeves and do some honest work to earn the money. It just passes a law compelling the rest of us to give it the money.
I'm always amused by people who maintain that "violence never solves anything" or that we live in an essentially peaceful society. Everything government does is founded solidly on an indispensable foundation of almost unlimited violence. If I don't wish to pay taxes, they will take them out of my bank account. If I withdraw the money and hide it under my bed, they will send policemen to take it from there. If I resist, the policemen will arrest me. If I won't let them, they will threaten me with weapons. If I defend myself with a weapon (using an appropriate level of force) they will, at some point, kill me - or wound me severely enough to stop me resisting. Then (if I survive) they will send me to prison and keep me there by more violence.
That is how government works. But don't take my word for it. Would you believe the first president of the freest, most democratic, most wonderful nation the world has ever seen?
"Government is not reason. It is not eloquence. It is a force, like fire: a dangerous servant and a terrible master".
- George Washington
-
Tuesday 9th April 2013 12:20 GMT Anonymous Coward
Re: Thoughts THEREin lies the problem
"If an author writes a story, publishes it, and people pay for it (assuming it is that good), then, if government wants a copy, does it then get in line and PAY for a copy just like anyone else paying for a legit copy? (Doesn't the author have to pay just to obtain recognition of the copyright?)"
The government does pay, in the form of the legal protection. And, no, the author doesn't have to pay for copyright in the UK.
-
-
Friday 5th April 2013 22:40 GMT heyrick
Re: Thoughts
"You don't understand copyright. You have been *given* the right over copies on the condition that the BL is allowed to store the material regardless of what you want/like."
In other words, "let's tweak a law so we can record a copy of everything without landing in trouble".
Consider this:
The acts restricted by copyright in a work.
(1) The owner of the copyright in a work has, in accordance with the following provisions of this Chapter, the exclusive right to do the following acts in the United Kingdom—
(a) to copy the work (see section 17);
(b) to issue copies of the work to the public (see section 18);
(ba) to rent or lend the work to the public (see section 18A);
(c) to perform, show or play the work in public (see section 19);
(d) to communicate the work to the public (see section 20);
(e) to make an adaptation of the work or do any of the above in relation to an adaptation (see section 21);
and those acts are referred to in this Part as the “acts restricted by the copyright”.
(2) Copyright in a work is infringed by a person who without the licence of the copyright owner does, or authorises another to do, any of the acts restricted by the copyright.
In other words, I as the author of something have the right to provide it, or not. On the terms of my choosing. This is part of national and international agreements that just can't be arbitrarily modified. So, no, I do not believe that I have been given the "right" on the condition that the BL copies everything regardless. [further complication: my material originates in France and is uploaded to a .co.uk domain hosted in the United States... <grin>]
Actually, I don't care if they copy, it is the (re)serving that I don't appreciate.
The above, by the way, is from the Copyright, Designs and Patents Act 1988 (Part I, Chapter II, section 16).
.
Now, you may believe that this is a big storm in a very small teacup (yes, it is), however it introduces interesting precedent. Either a government institution can copy and then publicly reproduce copyrighted content without giving a damn about said copyrights; or the government is quite willing to modify copyright laws to allow the above to happen. Funny, citizen copies something they shouldn't (as a copyright infringement), it's a whole different story...
-
-
-
Friday 5th April 2013 12:21 GMT Anonymous Coward
Frequency
How often are they going to trawl a site? Presumably they will want each update as a separate archive. Presumably Google merely update to the latest copy of a page.
It would be interesting to know what intelligent limits they are going to place on this. Keeping "everything" is not an option.
-
-
Friday 5th April 2013 17:15 GMT dssf
Re: UK-created websites in the non-.uk domain -- or blogspot.uk?
If your audience is in a certain country, and you use Blogger/Blogspot, and people in that country start accessing your content, even if it starts out as .com (say, in the USA), then that country's domain will be on a blogger/blogspot page of your audience. Google says this is to speed up access to the content. I don't completely buy it. If the site has a very small amount of dynamic content, and if most of it is text and small images, then probably javascript crap and browser settings conflicts will slow the page loading down more than its being 6,780 miles from the reader.
However, IIRC, you can ask Google/Blogger/Blogspot to disallow per-country domain appending or whatever it is they call it.
Also, you can set the page to be readable only by invited or white-listed people or email addresses. So, even if you ARE in the UK, if you publish content under TOCs that state the readers are special, private, invited, non-public guests, then that might legally be enough to disallow a grab-bagging/copyright-vacuuming library system from scarfing up content its author intends to keep in limited, private circulation. Of course, it would get nasty if a TOCs-violating subscriber/invitee just screen scrapes and then republishes the content on a legit .uk page that is harveste-- umm, archived before a takedown notice could be issued.
Also, if an author laces his or her content with ungainly wording, making it offensive to the public and not satisfactory to introduce to schools, then would the government redact/black out such words if the remaining content is somehow worthy of archiving and representation to the public? How nonsensical would an author need to become to virtually guarantee the UK copyright czars back off? (IIRC, UK parody, libel, and freedom of speech concepts are different than in the USA and some other countries...)
-
-
Friday 5th April 2013 12:33 GMT Sandpit
@heyrick
"Did they think this through?"
Did you think this through? This is enabled by an Act of Parliament. It is protected by law. Look up legal deposit legislation; it's been around for a long time. This new provision extends that to non-print material, something that was granted in 2003 but has only just gone through today, 10 years later.
Yes, this has been thought through, a lot!
And why is this being done at all? It's for YOU, to make everything that is published (in whatever form) available to the public and forever.