September 8, 2008 11:21 AM PDT

Google raising newspaper morgues from the dead

by Stephen Shankland

Updated 2:57 p.m. PDT with Google's commentary about ad revenue sharing and other details. Also, my colleague Rafe Needleman covered Google's launch of the newspaper digitization work at TechCrunch.

Google is making searchable, digital copies of old newspapers available online through partnerships with their publishers, the company said Monday.

Under the ad-supported effort, Google will digitize millions of pages of news archives, including photos, articles, headlines, and advertisements, Google said.

Google's newspaper archive search and display effort is supported by ads, visible on the right edge.

(Credit: CNET News)

"Around the globe, we estimate that there are billions of news pages containing every story ever written. And it's our goal to help readers find all of them, from the smallest local weekly paper up to the largest national daily," said product manager Punit Soni in a blog posting about the effort. "The problem is that most of these newspapers are not available online. We want to change that."

The effort is of particular interest to reporters such as myself who've made the jump from print journalism to online. When I started at CNET News a smidgen shy of 10 years ago, I was initially concerned that the online medium was more ephemeral than print.

But as soon as I saw that CNET's search box opened up our archive of work, I realized that online news is in many ways more permanent than a newspaper that's almost invariably recycled or thrown away within a day of its publication. Few have the time and money to visit a newspaper's archive of old papers, called the morgue, or flip through back issues in a state library's microfilm collection.

The results of Google's project initially will be available through the Google News Archive site, Soni said. "Over time, as we scan more articles and our index grows, we'll also start blending these archives into our main search results so that when you search Google.com, you'll be searching the full text of these newspapers as well," he said.

Google didn't reveal which publishers are partners except the Quebec Chronicle-Telegraph and two organizations, ProQuest and Heritage Microfilm. However, examples of the service showed pages from The Evening Independent of St. Petersburg, Fla., the St. Petersburg (Fla.) Times, The Tryon (N.C.) News, and the Pittsburgh Post-Gazette.

The project expands on an earlier partnership to digitize content from The New York Times and The Washington Post, Google said.

Google has tangled with news agencies before over who has rights to content. It settled a lawsuit with Agence France-Presse in 2007 and a similar suit from the Associated Press in 2006.

The profit motive
With Google, it's often hard to tell which projects are designed to contribute revenue directly and which are part of the larger corporate mission "to organize the world's information and make it universally accessible and useful." Pursuing that mission can sometimes make Google's search better, and therefore used more often, and therefore a better business.


The newspaper effort falls into this profit-and-loss gray area. Although the company is supporting it with advertisements, loftier goals were foremost in the mind of Adam Smith, the director of product management who oversees the newspaper effort, Google Book Search and related efforts.

"For us this is about improving the users' experience on the Web," Smith said. "Our objective is to bring all the world's historical newspaper information online in conjunction with our partners."

That's not to say money isn't involved. Google supplies advertisements on the right edge of the page that are based in part on the content in the newspapers, he said.

The majority of the ad revenue goes to the publishers, Smith said. (Update Sept. 12: Apparently I misheard Smith--it's only the majority of revenue, not the vast majority.)

And other revenue models are possible, he said. "There may be pay-per-view in the future, but we don't have anything to announce now," Smith said.

Although the project involves Heritage Microfilm and ProQuest, which both have microfilm archives, Google is doing the actual scanning of the film. The index has millions of articles so far, he added.

Currently the system shows only images of the newspapers. It doesn't show the text, as existing news archive partnerships do; those typically involve newspapers that have already digitized much of their content.

Dozens of publishers are involved in the effort, he said.

Originally posted at Digital Media
Stephen Shankland writes about a wide range of technology and products, but has a particular focus on browsers and digital photography. He joined CNET News in 1998 and since then also has covered Google, Yahoo, servers, supercomputing, Linux and open-source software, and science. E-mail Stephen, or follow him on Twitter at http://www.twitter.com/stshank.
(4 Comments)
by rccoffee September 8, 2008 12:16 PM PDT
This would be a great service to genealogists. I would love to be able to read the Elizabeth (N.J.) Daily Journal online. It went out of business decades ago and is only available on microfilm at the local library.
by Manhattan2 September 8, 2008 1:13 PM PDT
4Dplanet.com has been doing this with video. Very interesting results!
by Vegaman_Dan September 8, 2008 11:22 PM PDT
This would be awesome to see. Right now all those newspapers are only on microfiche and even those blue sheets are rapidly disappearing. What library even has a reader? Does anyone know what they are these days?


For research purposes, it would be fantastic. Even for hobbyists, it could be invaluable. I would hope they include the photos from the original papers as well.

by Greg_N September 24, 2008 6:19 PM PDT
A resource of interest might be The Nambour Chronicle & North Coast Advertiser, which was first published 31st July 1903 and continued as the local newspaper for the Sunshine Coast region (north of Brisbane, Australia) until 1983. The entire full-text run of this newspaper from 1903 to 1955 has been scanned from microfilm and made available in digital format.

Researchers are now able to search the entire paper by keyword or issue date and can download and print directly from the paper. This provides easy and convenient access to this valuable historical newspaper.

http://www.nambour-chronicle.com/index.php

Not on the same scale as the Google project but still worth a visit.