Copyright office embraces the liberation of the $86K copyright database

Carl sez,
Boing Boing readers may remember the headline from two weeks ago which read "Guerrilla librarians free the $86k Library of Congress copyright database."
"A couple of weeks ago, we wrote to Marybeth Peters, the Register of U.S. Copyrights, to ask why the copyright database had a copyright, and why it cost $86,000. On Friday, the Library of Congress blogged the issue, and dismissed the whole thing as a 'blogospheric brouhaha.' Well, the Library of Congress can diss our distinguished signatories all they want, but lucky thing is these are all public records, and we're making all 21 million of them available for download."

Well, we just received a nice letter from the Copyright Office saying that what we did is A-OK and it is just fine to harvest their records. In a phone call, the Register of Copyrights said she was happy their database was reaching new audiences. We'll continue to spider that database, and have just added a nice RSS feed of the latest copyright additions.

Link (Thanks, Carl!)

Discussion

The reason there's a fee attached is "cost recovery," or at least that's the case by law for the LOC's Cataloging Distribution Service (CDS), which distributes similar public records: in its case, the LOC records that document various holdings and other filed information, like CIP (Cataloging in Publication) data. It's not Books In Print, but it's useful.

The CDS is required by Congress to charge fees for data dissemination that reflect its costs in performing its tasks. They can't give it away. However, there's no copyright attached to the material, as is the case for much of what's created by the government.

The set of retrospective catalog information representing the sum of their holdings costs tens of thousands of dollars, and would be of great interest to those working on various library and book projects, especially for older books and to supplement Amazon and other Web services that allow ISBN-based retrieval.

If someone had the set of LOC information, they could distribute it at no cost, and, like the Copyright Office, there would be no complaint, because there can be none. I wonder if anyone has that dataset and wants to distribute it?

Glad to hear it. The people I've dealt with at the LOC through the years have always been really honest, logical, no-bullshit sorts (to the point of frustration sometimes, frankly). I figured there was something behind their fee other than "our database is proprietary information" thinking.

It was this sort of frustration with an increasingly complex world that fueled much of the wry humor in "Peanuts". I remember one strip where young Linus was complaining about the rigors of modern life in Kindergarten. You had to be able to say the ABCs, get a drink of water for yourself and cut with scissors. Brother, the things they ask of kids nowadays! Of course, putting ten-dollar words in little kids' mouths was that strip's stock in trade. It used to drive Schulz crazy when people would "correct" him for having Linus talk about his "ophthalmologist" when the services he was describing would be rendered by an optometrist. "I use 'ophthalmologist' because it's a funnier word!" he would exclaim, but there were always those who just didn't get it.

HELP!!

I REALLY want to download all the files from that repository, but I can't get the downloads to finish on either Mac or PC. They hang at 11, 42, 54 MB and so on; I think the most I ever got was 111 MB of the 1 GB file.

Anyone know of any (more stable) mirrors I can get the data from or a better way?

Thanks in advance and email responses to ctome3d@gmail.com

digininja
