August 27, 2009 4:00 AM PDT

Google Book Search? Try Google Library

by Tom Krazit

Is Google ready--or willing--to become a library?

Librarians, academics, and privacy advocates will gather Friday on the campus of the University of California at Berkeley to discuss the implications of Google's proposed settlement with publishers that, if implemented, will allow it to bring millions of books online.

At issue are concerns over privacy, quality, and Google's intent with the project, the only one of its kind in the U.S. to receive the legal authority to scan books that are out of print but under copyright protection--estimated by the Internet Archive to comprise 50 percent to 70 percent of all books published since 1923.

Google's Dan Clancy will have his hands full defending the Google Book Search settlement Friday at a conference.

(Credit: Tom Krazit/CNET News)

Almost from the day it was announced, the settlement has drawn scorn and scrutiny from authors, library groups, industry associations like the newly formed Open Book Alliance, and even the Department of Justice. Many are concerned that the settlement gives a private organization the sole right to essentially create and control a public good--a digital library--without explicit responsibilities to maintain that public good outlined in the settlement.

And, as UC Berkeley professor Geoffrey Nunberg put it, "this is the last library."

It's going to be extremely difficult for anyone else to create a similar digital library in the future, at least under the current laws. Any other organization that wanted to scan a large percentage of the world's books would likely have to go through a legal process similar to the one Google has followed for four years to gain access to those so-called "orphan works," a weighty expense even before you start counting the exorbitant costs of scanning the books themselves.

There's a sense among several of those planning to speak at Friday's conference that an Internet corporation--even one sworn to "do no evil"--does not necessarily share the same values and principles that librarians rabidly defend. And left unsaid, but by no means absent, is the growing scrutiny paid this year to Google's dominant position in the Internet search market and how that power squares with Google Books and the publishing industry.

Google's Dan Clancy plans to speak at the event, having been the point man for much of Google's outreach on the settlement. The American Library Association's Angela Maycock gave Clancy credit for listening to the concerns of library groups all year, but he's bound to get an earful Friday.

Big Brother concerns
Expect much of the debate at UC Berkeley to focus on privacy. Public libraries have long been considered anonymous places, where patrons can pursue their interests free from concerns about their browsing being tracked. The Internet, of course, is pretty much the complete opposite environment.

"Is Google going to provide the same kinds of guarantees that users expect, the ability to access books with relative anonymity? The legal document is silent on these concerns," said Michael Zimmer, a professor with the University of Wisconsin at Milwaukee. "I know the people at Google. I trust them, they are good people, but these are serious things."

Tom Leonard, university librarian at UC Berkeley, agrees. "We want users who use public libraries to feel very comfortable that their identities will be protected," he said.

But Leonard also sees the upside to the settlement, assuming all the concerns can be addressed.

"We're pretty excited about the fact that the world has changed, and that we can give access the way readers want it," he said. "They want to make full-text searches of everything we have in the libraries."

Universities do have an alternative in the HathiTrust, a digital library project that counts UC Berkeley and the University of Michigan--also a close partner of Google's--among its partners. That service lacks the scope of what Google is potentially entitled to scan, but it curates the material in a fashion that's better suited to the needs of the academic community.

That's good, because at the moment, Google Book Search is almost laughably unusable for serious research, UC Berkeley's Nunberg said. For example, he pointed out that the Charles Dickens classic "A Tale of Two Cities" is listed in Google Book Search as having been published in 1800; Dickens was born in 1812.

There are still a few kinks in the data attached to Google Book Search.

(Credit: Screenshot by Tom Krazit/CNET)

Nunberg plans to speak out on the quality issues with Google Book Search, although he readily concedes that the product was not designed for the needs of academics and scholars. But that only underscores the point: if Google Book Search is the only way to obtain a digital copy of a book 100 years into the future, scholars will have to depend on it for research, he said.

So what comes next? Friday, September 4, is the deadline for authors to decide if they wish to opt out of the settlement. It's also the deadline for interested parties to submit their comments regarding the settlement to the U.S. District Court for the Southern District of New York, which is overseeing the process.

Some groups, such as the Open Book Alliance, which will be represented Friday by Peter Brantley of the Internet Archive, would prefer to scrap the settlement and start over. "Google has a practice of executing innovative ideas far before the implications are visible," said Colin Evans, a "data wizard" at Metaweb and a panelist on Friday.

However, it sounds like most of those in attendance are willing to give Google a shot as the digital librarian of the future, so long as it adheres to the rules of the club.

"There's a lot of questions about how they will balance (their) mandate as a for-profit corporation and their mission to provide universal access to information," Maycock said. If it really wants to make the controversy over this settlement go away, Google needs to embrace "the ethical framework that libraries operate under," she said.

Tom Krazit writes about the ever-expanding world of Internet search, including Google, Yahoo, online advertising, and portals, as well as the evolution of mobile computing. He has written about traditional PC companies, chip manufacturers, and mobile computers, spending the last three years covering Apple.
(16 Comments)
by divisionbyzero August 27, 2009 4:47 AM PDT
Google took the time and paid the money to get this legal right that anyone can get assuming they are willing to put in as much effort. What's the problem here? If they change the law, then Google should be compensated for all of their legal fees and paid a bonus for paving the way for others.
by knowles2 August 27, 2009 9:06 AM PDT
One would think so, but it's clear that the big players, Yahoo, Microsoft, or Amazon, are either not interested in competing or are just trying to damage Google at any cost. I am going to say they are doing the latter, because if they are not interested in competing with Google, then that's just scary: they may then start to think that way about all their other products, and it becomes dangerous when there's no competition.
by HlLLARY CLITON August 27, 2009 4:55 AM PDT
Whatever happened to the idea of wanting people to read and have easier access to books?
by ddesy August 27, 2009 6:21 AM PDT
That concept has died with the increasingly corporate market. Sad, but true.
by gerrrg August 27, 2009 6:50 AM PDT
Having spent a little bit of time investigating this issue, I can say that too many groups and individuals (including the Open Book Alliance) have been obfuscating the facets of the settlement.

You've combined out-of-print books with orphaned works, which aren't equivalent to each other. Orphaned works are (generally) out-of-print works, but specifically those whose authors cannot be found or identified.

The law does not prevent anyone from digitizing out-of-print books. What it does require is that one obtain permission prior to use, and when a work is out-of-print, that copyright reverts back to the author if it was held by the publisher. Google ran afoul of the interpretation of this, and under the negotiation, will pay out $60 per book (and various lower amounts for snippets) that was scanned prior to this settlement.

The orphaned works issue is one of two sections of the settlement which may seem controversial.

Under this settlement, Google has set up (and seeded with $34.5M) an independent Book Rights Registry group that is tasked with sorting out a book's copyright status and paying out royalties. Another responsibility for this group is to find authors of orphaned works. Under good faith, if no one can find the author of an orphaned work, then Google is being granted the right to digitize the work and sell it. The profit from sales or advertisements that are displayed alongside the orphaned work is then transferred to the Book Rights Registry (ostensibly to be held in perpetuity until either the work becomes public domain or the author is found). Generally under copyright law, because you cannot use works without prior permission, some people find this section of the settlement to be controversial. Ultimately, however, copyright law provides relief to the copyright holder. Therefore, if Google were to digitize orphaned works - and because said rights remain in force - the copyright holder is still free to sue Google. Perhaps the entire issue will become moot if the US House would move on HR5889 - 2008 (the Senate already passed S2913 - 2008) and allow certain orphaned works to be used.

The other controversial portion of the settlement surrounds the assignment of the terms of any third-party contract that would otherwise place Google at an economic disadvantage (think Open Book Alliance signing a separate deal with Microsoft which proved to give Microsoft a better deal than Google's settlement). Whether or not that's a fair deal (or legal) is for others to decide; however, since this settlement is sitting before a judge, one would assume that said judge is aware of all points of the settlement.

Whew.
by TV James August 27, 2009 9:08 AM PDT
Thanks for taking the time to explain these details. They are sorely lacking from the original article.
by gefitz August 27, 2009 9:40 AM PDT
THANK YOU, gerrrg, for helping us understand the facts here more than any CNET writer has.
by gefitz August 27, 2009 9:36 AM PDT
I call B.S. on the "librarian commitment to privacy and open access to materials". I work in a public library, and decisions are made EVERY DAY to purchase or not to purchase materials for patrons. These decisions are necessarily prioritized (because of budgetary concerns, of course) using some librarian-speak voodoo. Direct requests from patrons are often not filled.

I know EVERY request cannot be filled, and a librarian may be "committed to full and open access to any material", but it just isn't the reality to say that a public library can provide everything.

Google can't either. But at least they are more able to expand the amount of material available. And to me, they are EQUALLY likely to withhold access to materials for some just-as-arbitrary reason as a public library.
by Techno Guy August 27, 2009 9:38 AM PDT
This article fails to explain the opposition to Google Book Search. The theory behind libraries is that they provide ready access to books, typically at no charge to patrons, and that they actively encourage the interchange between the two. Google is paying the cost to digitize millions of books that will become readily and freely available to anyone with Internet access, and libraries and academics are opposed?
The privacy claims to me seem very dubious: if you don't like Google's terms of service, don't use the service; your ability to access these works remains the same as it did prior to Google Books. Instead, I sense that the real objection is one of power and philosophy: Google Books has the potential to diminish the necessity for brick-and-mortar libraries and their gatekeeper role to printed works. Additionally, Google is a private entity. This article suggests, without ever fully explaining, that this in and of itself is a problem. But if one assesses the philosophical orientation of those groups opposing Google's effort, it takes little imagination to understand why Google's private status alone is sufficient to stir their resistance.
by InklingBooks August 27, 2009 2:09 PM PDT
Several posters are right. With only a few exceptions, the American media, including technology outlets such as CNET, have been clueless about the meaning and ramifications of the Google settlement. Most of what they've been saying seems borrowed from Google's FAQ and press releases.

Overseas, the media has for a long time done little more than echo the U.S. media. But that's changing, particularly in Germany and New Zealand. A New Zealand paper just published the best overview of the settlement that I've seen:

http://www.nzherald.co.nz/technology/news/article.cfm?c_id=5&objectid=10592332&pnum=0

I highly recommend it.

Notice how everyone is included in this 'discussion' at UC-Berkeley but the group with the most to lose: writers. It's 'screw the writers' time at Berkeley and at almost all the other so-called discussions on the topic here in the U.S. Particularly bizarre are the librarians, who seem to assume they own the copyright to any book in their collection that's not in print.

Especially notable by their absence from these discussions are foreign authors who, whether they know it or not, will find their treaty-granted U.S. copyrights gutted if this settlement is approved by the court. This settlement is nothing less than a gross violation of key provisions of the Berne Convention, which is binding on the U.S. and some 160 other countries. When that becomes common knowledge, things will get very unpleasant. Google employees will long to be as well liked as those at Microsoft.

I might add that Google's efforts to inform foreign authors have been so inadequate and the media's coverage so ill-informed, that I know an experienced copyright lawyer and Internet expert in New Zealand who only discovered that it applied to his country's authors a few weeks ago. Things are just that bad.

In the end, and whatever the court rules, this settlement will prove a waste of time. If accepted by the court, it'll be overturned by Berne after a lot of angry words. The problem isn't technology. It is that even when things were changing much more slowly than today, the Berne Convention was being 'revised' or 'completed' roughly every 11 years between 1886 and 1979. It hasn't been updated a single time in the thirty years since 1979. Think about that for a moment. The last time Berne was revised, 'personal computer' meant an Apple II with a 40-character, monochrome screen, and what became the Internet linked room-sized computer systems at a few dozen university and military locations. No wonder we have problems.

Google is ruthlessly exploiting that gap between technology and law for its own benefit. They want new rules, but rules that benefit primarily them and ignore the fact that copyright centers on protecting creators and not enriching one corporation or making a librarian's job easier.

The faster we shove aside this dreadful settlement, the quicker we can negotiate a revision to Berne that's fair to all parties.

--Michael W. Perry, Seattle
by gerrrg August 29, 2009 4:20 AM PDT
The Google settlement does not in any way destroy copyright protections. I strongly suggest you read the actual text of the settlement.

And if you have a complaint against loosening the reins of orphaned works, I suggest you look closely at the Berne Convention's own rules:

"In the case of anonymous or pseudonymous works, the term of protection granted by this Convention shall expire fifty years after the work has been lawfully made available to the public."1

US law, in fact, is stricter, providing protection of the author's life + 70 years. And with HR 5889 - 2008, the law would provide that the infringer negotiate with the copyright holder, in "good faith", for "reasonable compensation" and provide such compensation within a "reasonably timely manner".2

I don't buy one iota of this reactionary noise being passed on by people that are too lazy to investigate for themselves and provide actual proof of their argument.

1 - "Berne Convention for the Protection of Literary and Artistic Works" - WIPO Database of Intellectual Property, Articles 7, 15, http://www.wipo.int/treaties/en/ip/berne/index.html
2 - H.R.5889, "Orphan Works Act of 2008", Sec. 514, a.B, http://www.thomas.gov/cgi-bin/query/F?c110:1:./temp/~c110yDx5jM:e908:
by proudmonkey August 27, 2009 4:43 PM PDT
I am all for Google Library. This makes my job as a researcher so much easier. The obscure things I am able to discover and find through Google Books are awesome. I am sorry no one else stepped forward, but I am glad Google is doing it, because they will do it right and not abandon their project. Those afraid of someone getting too powerful sure aren't supporting their capitalist government, which they fight so hard to keep from "changing".
by Police_States_of_America August 27, 2009 11:20 PM PDT
Google is making impossible-to-find books possible to find, which is commendable. I seriously don't see them making much, if any, profit off a bunch of out-of-print books. If they were capable of being profitable, they would still be in print. It goes back to Google's original goal, organizing the world's information.
by untactical_theonlyone August 28, 2009 10:27 AM PDT
But what about books that are being donated to Google and don't fall under copyright because of their age?
Are we to destroy our donations to them, simply because one half of the bench disagrees with the other? But hasn't it always been the same? Here one company chose to make a difference and began the tedious task of implementing a save & rescue operation for the benefit of billions.

It doesn't matter who got in first between rival companies; the important aspect is to preserve every bit of literature that we can, or else all may be lost for those who lay claim to this wonderful reading library.
by Linkmoses August 31, 2009 12:26 AM PDT
Frankly, I think Google has done far more good for librarianship and libraries than bad. The web, and Google especially, have made the librarian more important than ever for resource vetting, algorithmic signals of trust, and searcher education/coaching.
by spinoza2 August 31, 2009 3:25 AM PDT
"Almost from the day it was announced, the settlement has drawn scorn and scrutiny from authors, library groups, industry associations like the newly formed Open Book Alliance, and even the Department of Justice."

This is the same kind of ludicrous hyperbole that we've seen elsewhere on CNET regarding Google Book Search. If CNET were honest, this is how they really should have reported on the issue:

"Around the world millions of researchers, scholars, and students are making unprecedented use of a huge repository of books that have until recently been extremely difficult to find or access and have been collecting dust on library shelves for decades. We here at CNET are not researchers, scholars, or students, and so we are kind of clueless as to how these books are being used. But being journalists looking for sensationalist headlines to sell our advertising, we are going to focus all of our reporting on this little group of activist opponents to the Google Book initiative--a relatively small group of technology-oriented competitors to Google, as well as some librarians, publishers, and writers, all of whom recognize a vested, if misguided, personal interest in drawing attention to their cause. Since the media love a fight, we're going to dramatize this as much as we can, just like we're doing by focusing on extreme Republicans in the health care debate. Who cares if 80% of all Americans want a national health care plan, or if the country is going bankrupt because of skyrocketing health care costs, we love a cat-fight and that, frankly, is a lot easier to report on than trying to accurately characterize the situation."