Webware

August 26, 2009 10:28 AM PDT

Open Book Alliance to oppose Google Book deal

by Tom Krazit
  • 8 comments

With less than two weeks remaining until a key deadline in the Google Books settlement, Google's opposition is circling the wagons.

The Open Book Alliance, a consortium that includes nonprofit author groups, library institutions, and Google rivals Amazon, Microsoft, and Yahoo, launched Wednesday to "insist that any mass book digitization and distribution effort be open and competitive." As reported last week by the Wall Street Journal, the group will be led by Peter Brantley of Internet Archive and veteran antitrust lawyer Gary Reback of Carr & Ferrell.

Google's proposed settlement with book rights holders last October gave it the sole legal authority to scan and distribute digital books that are still in copyright but out of print, and library groups and privacy activists have been up in arms ever since.

Some object to the unchecked publishing power granted to a single corporation, some are concerned that rights holders are not getting a fair shake under the deal, and some just don't like Google. On the other hand, there are some rights holders who are excited by the idea of gaining recognition and perhaps revenue for books long out of print.

Book rights holders have until next Friday, September 4, to decide if they want to opt out of the proposed settlement and prevent their books from being displayed in Google Book Search. The U.S. Department of Justice is also looking into the Google Books settlement to determine if "anticompetitive practices" were used in the formulation of the settlement.

That's perhaps where Reback comes in. Reback was instrumental in the DOJ's prosecution of Microsoft in the 1990s, and also attempted to argue an antitrust case by representing Peoplesoft against an eventually successful takeover bid from Oracle. He did not immediately return a call seeking comment on the Open Book Alliance.

Originally posted at Relevant Results
August 19, 2009 12:00 PM PDT

PositivePress: A heavy-duty DIY Web archive

by Josh Lowensohn
  • 1 comment

Web archiving service Iterasi is launching a new product late Wednesday called PositivePress. It lets users passively monitor and archive RSS feeds that are saved forever--even if a site disappears, or makes changes to its content. Users can compile pages they want to share into a single report, then send it off to others for review.

The service is aimed mainly at public relations firms, but it could also end up being a really versatile tool for historians, political sites, and Web archiving enthusiasts. It's also a distinct departure from Iterasi's original product (now called "Iterasi Personal"), which required users either to manually choose pages to save or to install a browser extension that could do so on a schedule of their choosing.

PositivePress simply saves pages as soon as an RSS feed is updated, which removes some of the need for taking scheduled snapshots. It can also archive fresh pages from search results on engines including Google, Yahoo, Bing, and Digg. In a meeting last week, Iterasi's CEO Pete Grillo explained to me that the scheduling feature would no longer be included in the free version since the mechanism that saves pages has moved to the cloud. One of the biggest positives about the new product is that you can now leave your computer off, or not have your browser running, and continue to have it archive.
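The feed-triggered approach described above can be illustrated with a rough sketch. This is not Iterasi's code: the function name `new_feed_links`, the sample feed, and the idea of tracking already-archived links in a set are all assumptions made for illustration. It only shows how a poller might decide which feed items are new and therefore need a snapshot.

```python
import xml.etree.ElementTree as ET


def new_feed_links(feed_xml, seen_links):
    """Return links of RSS items not archived yet, in feed order.

    Hypothetical helper: `seen_links` is a set of links already saved.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if link and link not in seen_links:
            fresh.append(link)
            seen_links.add(link)  # remember it so the next poll skips it
    return fresh


# Made-up feed used only for this example.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>A</title><link>http://example.com/a</link></item>
<item><title>B</title><link>http://example.com/b</link></item>
</channel></rss>"""

already_saved = {"http://example.com/a"}
print(new_feed_links(SAMPLE_FEED, already_saved))
```

A real service would fetch the feed on a timer and hand each new link to whatever component captures the page; both steps are left out here.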

PositivePress saves Web pages from RSS feeds or search queries. These can be viewed long after a site has gone offline.

(Credit: CNET)

There are four individual plans for PositivePress, ranging from $99 a month for the "pro" level, all the way up to $699 for the "platinum." There are also 5- and 10-user monthly licenses that run at $399 and $699 a month, respectively.

The main difference between all these plans is... Read more

Originally posted at Web Crawler
July 8, 2009 5:15 PM PDT

Archive your e-mail from almost any account

by Jessica Dolcourt
  • 6 comments

I have thousands of e-mail messages in my corporate Outlook in-box, and thousands more in Gmail and in my ancient Hotmail account. MailStore Home is a free program that can archive them all locally, and display those archives in an interface that reads like your Outlook in-box.

Why use it? You can clear away old messages and attachments, but easily search to find them again when that inevitable moment arrives. Until universal offline in-boxes like Yahoo's Zimbra Desktop start addressing consumers on a wider scale, MailStore Home is also a good way to read mail offline in areas of spotty Wi-Fi, or to use as a de facto message backup.

MailStore Home's search pane includes attachments and repeat queries.

(Credit: CNET/Screenshot by Jessica Dolcourt)

MailStore Home can archive a pretty impressive list of accounts and protocols, including Microsoft Outlook and Outlook Express, Microsoft Exchange, Thunderbird, SeaMonkey, Gmail, Windows Live Mail, IMAP, and POP3. It also supports .EML files. Its layout largely resembles Microsoft Outlook's, with a sidebar on the left--complete with folder tree and search field--and a large reading pane on the right. There are also some small navigational icons along the top that you can use to jump to archiving, burning archives to disk, advanced search, and tools.

The program's management is straightforward. Buttons on the start screen replicate the navigational icons up top, and there are also some stats, like your oldest and newest messages and the total size of your archive. When you archive an in-box, a wizard walks you through special configuration steps and lets you enter folders to archive or exclude if you want some backed up, but not all. MailStore Home skips your spam, trash, and junk folders by default, and it checks for duplicate messages while going about its business.
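To make the folder-skipping and duplicate-checking behavior concrete, here is a minimal sketch. It is not MailStore's implementation: the function `select_for_archive`, the folder names in `SKIPPED_FOLDERS`, and the choice of the Message-ID header as the duplicate key are assumptions for illustration only.

```python
from email import message_from_string

# Assumed folder names; a real client may use different labels.
SKIPPED_FOLDERS = {"spam", "trash", "junk"}


def select_for_archive(messages):
    """Pick messages to archive from (folder_name, raw_message_text) pairs.

    Skips excluded folders and drops repeats that share a Message-ID header.
    """
    seen_ids = set()
    kept = []
    for folder, raw in messages:
        if folder.lower() in SKIPPED_FOLDERS:
            continue
        msg_id = message_from_string(raw)["Message-ID"]
        if msg_id is not None:
            if msg_id in seen_ids:
                continue  # same message already selected from another folder
            seen_ids.add(msg_id)
        kept.append((folder, raw))
    return kept


# Made-up messages used only for this example.
first = "Message-ID: <1@example.com>\nSubject: Hello\n\nBody one"
second = "Message-ID: <2@example.com>\nSubject: Offer\n\nBody two"
third = "Message-ID: <3@example.com>\nSubject: Notes\n\nBody three"

mailbox = [("Inbox", first), ("Spam", second), ("Archive", first), ("Inbox", third)]
print([folder for folder, _ in select_for_archive(mailbox)])
```

Messages without a Message-ID header are always kept in this sketch, since there is nothing reliable to compare them on.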

E-mail search is one feature of note. Using the advanced search screen, you can drill down to specifics--dates, folders, even the contents of e-mail attachments. You can also search for messages with or without attachments, and save queries to rerun the report at a later time. MailStore Home supports Boolean search terms. When you've found your message, you'll have management options like opening, saving, and exporting. Search was speedy and accurate in our tests. Though processing took a few long seconds, we were able to reply to archived Gmail messages via Outlook.

The freeware version for consumers doesn't do it all. There's no auto-archiving or scheduling for starters, so archiving is a manual activity. Initial scanning also takes a long time, and subsequent archives of the same in-box (click "run" to rearchive) start over from scratch instead of offering you the option to pick up from the most recent message date. We'd like to see more, and more nimble, filters on that left sidebar, like to filter only e-mails with attachments. MailStore Home also restricts you to three account profiles, which isn't especially useful if you've got more active accounts than that. Despite these drawbacks, MailStore Home offers a fine free solution for storing e-mail from multiple in-boxes and searching through the archives.

Related story: Three killer Outlook add-ons for office workers

Originally posted at The Download Blog
March 22, 2009 10:59 AM PDT

SXSW thoughts on Twitter's past, present, future

by Tim Leberecht
  • 3 comments

AUSTIN, Texas--Someone blogged that South by Southwest Interactive is just like the Internet itself: disjointed, decentralized, scattered, fast, aggressive, random, fragmented, and so on.

In fact, the main commonality between the two may be that the number of attributes to describe them is infinite. Like the Internet, the annual tech conference here is an echo chamber of an echo chamber, a place where original thought and commentary get mixed up and mashed up in a highly self-referential meta conversation.

That was already the case before Twitter entered the scene at SXSW two years ago, but the microblogging service has certainly amplified the effect. It was both comical and frightening to see the uber-individualistic geeksters at SXSW captivated by the invisible rules of an ostentatious behavioral uniformity: within 1 mile of the convention center, you could observe the strange ritual of groups of people standing or sitting together, chained to their iPhones, twittering instead of talking: "SXSW. Twittering about SXSW."

The real conversation was often limited to a quick "What's your name?" or "Where's the next party?" just to have some input for the next tweet. It is indeed a read-write generation that is coming of age in the wake of an all-dominant present, with no particular loyalty to the past and maybe not even an interest in the future (see Peggy Orenstein's recent piece on "Growing up on Facebook" in The New York Times Magazine).

Yet the rise of the social digerati is unstoppable. New data from Nielsen Online shows that social-networking sites (which encompass social networks and blogs, by Nielsen's definition) are growing at twice the rate of any of the main destination sites (search, portals, PC software sites, and e-mail). The time spent on social networks and blogging sites is growing at more than three times the rate of overall Internet growth. Furthermore, social networks are gaining traction among new audiences.

... Read more
Originally posted at Matter/Anti-Matter
Tim Leberecht is frog design's vice president of marketing and communications and has worked in the media, entertainment, and high-tech industries. He is a member of the CNET Blog Network, and is not an employee of CNET.
September 8, 2008 11:21 AM PDT

Google raising newspaper morgues from the dead

by Stephen Shankland
  • 4 comments

Updated 2:57 p.m. PDT with Google's commentary about ad revenue sharing and other details. Also, my colleague Rafe Needleman covered Google's launch of the newspaper digitization work at TechCrunch.

Google is making searchable, digital copies of old newspapers available online through partnerships with their publishers, the company said Monday.

Under the ad-supported effort, Google will digitize millions of pages of news archives, including photos, articles, headlines, and advertisements, Google said.

Google's newspaper archive search and display effort is supported by ads, visible on the right edge.

(Credit: CNET News)

"Around the globe, we estimate that there are billions of news pages containing every story ever written. And it's our goal to help readers find all of them, from the smallest local weekly paper up to the largest national daily," said product manager Punit Soni in a blog posting about the effort. "The problem is that most of these newspapers are not available online. We want to change that."

The effort is of particular interest to reporters such as myself who've made the jump from print journalism to online. When I started at CNET News a smidgen shy of 10 years ago, I was initially concerned that the online medium was more ephemeral than print.

But once I saw that CNET's search box opened up our archive of work, I realized that online news actually is more permanent in many ways than a newspaper that's almost invariably recycled or thrown away within a day of its publication. Few have the time and money to visit a newspaper's archive of old papers, called the morgue, or flip through back issues in a state library's microfilm collection.

The results of Google's project initially will be available through the Google News Archive site, Soni said. "Over time, as we scan more articles and our index grows, we'll also start blending these archives into our main search results so that when you search Google.com, you'll be searching the full text of these newspapers as well," he said.

Google didn't reveal which publishers are partners except the Quebec Chronicle-Telegraph and two organizations, ProQuest and Heritage Microfilm. However, examples of the service showed pages from The Evening Independent of St. Petersburg, Fla., the St. Petersburg (Fla.) Times, The Tryon (N.C.) News, and the Pittsburgh Post-Gazette.

The project expands on an earlier partnership to digitize content from The New York Times and The Washington Post, Google said.

Google has tangled with news agencies before over who has rights to content. It settled a lawsuit with Agence France-Presse in 2007 and a similar suit from the Associated Press in 2006.

The profit motive
With Google, it's often hard to tell what project is designed to contribute revenue directly and what's part of the larger corporate mission "to organize the world's information and make it universally accessible and useful," which can have the effect sometimes of making Google's search better, therefore used more often, therefore a better business.


The newspaper effort falls into this profit-and-loss gray area. Although the company is supporting it with advertisements, loftier goals were foremost in the mind of Adam Smith, the director of product management who oversees the newspaper effort, Google Book Search and related efforts.

"For us this is about improving the users' experience on the Web," Smith said. "Our objective is to bring all the world's historical newspaper information online in conjunction with our partners."

That's not to say money isn't involved. Google supplies advertisements on the right edge of the page that are based in part on the content in the newspapers, he said.

The majority of the ad revenue goes to the publishers, Smith said. (Update Sept. 12: Apparently I misheard Smith--it's only the majority of revenue, not the vast majority.)

And other revenue models are possible, he said. "There may be pay-per-view in the future, but we don't have anything to announce now," Smith said.

Although the project involves Heritage Microfilm and ProQuest, which both have microfilm archives, Google is doing the actual scanning of the film. The index has millions of articles so far, he added.

Currently the system shows only images of the newspapers, not the text shown through existing news archive partnerships with newspapers, which typically have already digitized much of their content.

Dozens of publishers are involved in the effort, he said.

Originally posted at Digital Media
July 1, 2008 10:00 AM PDT

Iterasi getting public RSS feeds and widgets

by Josh Lowensohn
  • Post a comment

Web page archiving tool Iterasi is getting a small but important update Tuesday morning. Users can now share their stream of archived pages with others as an RSS feed, letting anyone view their saved items either directly in their browser or in a feed-capturing tool like Google Reader or desktop e-mail clients.

Also being introduced is a new widget that can be tacked onto your blog or favorite start page like iGoogle or My Yahoo. It will display a reverse chronological stream of the latest pages you've tucked away. Each item is just a thumbnail, but when users click on it they'll be taken to the fully archived version of the page, complete with working links. It's the same basic experience seen when the service launched its sharing feature.

"What's surprising is how many of our users were asking for RSS feeds," Iterasi CEO Pete Grillo told me. Grillo acknowledged that the current Iterasi user base is a bit on the early-adopter side, and he thinks the widgets will help open the service up to a wider audience.

He also expects more people to jump onboard as the platform expands to include Mac users, which should be happening in the next few weeks--right around the time the long-awaited auto-archiving feature makes its way into users' hands. "We're close to having it ready," Grillo said, "and RSS is going to make it far more useful than we originally intended." Once it's in place, users will be able to schedule when they want the service to take snapshots of their favorite pages. It will continue to do so as long as the computer where the extension is installed is running.

I've embedded an example of the new widget after the break. It'll continue to update as more pages are saved.

... Read more

May 6, 2008 11:00 AM PDT

Iterasi goes live with personal Web-archiving tool

by Josh Lowensohn
  • 1 comment

Web bookmarking tool Iterasi just launched the first version of its Firefox extension to people who have signed up for the beta. The service, which I wrote about in January, lets you capture a Web site in its entirety, complete with links, formatting, and a time stamp to help sort it out later.

The company was set to release the plug-in back in late February but has been busy for the past few months resolving some security issues, as well as tweaking usability with a small group of beta testers. One of the reasons for the delay was to ramp up the sharing feature, which now lets users embed a notarized page on their blog or Web site like I've done below. I've had a chance to use the service over the past day, and it's definitely got the makings of a really engaging bookmarking tool.

All your saved pages can be browsed and sorted quickly with Iterasi's dashboard (click to enlarge).

Once installed, Iterasi puts a small selection of buttons in your browser's toolbar. There's a button to skip to your notaries, as well as two ways to notarize whatever page you're on: either a full option that lets you add tags and put the page in a special folder before filing, or a quickie option that will save the page with one click. My immediate qualm is that all of them have identical little icons and letter shortcuts that aren't exactly intuitive; however, as soon as you've used them once, you'll know where everything is.

The service is on the slow side when it comes time to "notarize" pages (aka slurping up all the content), but once it's been captured it's incredibly snappy to browse through. Users must first wait for whatever page they're on to load completely, and then it will slurp it up and file it away. Creating folders and tags is a snap, and you can quickly amass a huge collection of pages you've captured, which can be sorted in about a half dozen ways.

Sadly missing at this time is the scheduling feature that lets users automatically capture snapshots of their favorite sites at whatever times they choose, something I was looking forward to setting up to capture the front pages of several news and social bookmarking sites. I'm told the scheduling feature will be in place in the next 30 days; the creators just wanted to get a simpler working version out to people to try out before ramping up the servers to scale with the influx of captured pages. Also worth noting is that your computer must be on or in standby mode for pages to be captured, as the capturing is done on your side and not Iterasi's--something that might change with the introduction of a Pro plan later down the line.

Iterasi is currently in a private beta, but you can sign up for it on this page or grab an account anytime someone has shared an Iterasi saved page with you. Below is a capture of the front page of Digg.com from yesterday. Since then, all of the stories have received more diggs and run off the front page, but this captured it like a live screenshot.

March 10, 2008 9:01 AM PDT

E-mail archive program gathers Gmail account information as well

by Robert Vamosi
  • 1 comment

In looking for a program to back up his Gmail account, programmer Dustin Brooks found a commercial program that instead copies username and password information, according to a blog post on CodingHorror.com.

Over the weekend, Brooks said in an e-mail to CodingHorror.com that he was looking for a program that would archive his Gmail account onto his local hard drive. He signed up for a program called G-Archiver distributed by Mate Media of Miami, Fla. Brooks says that after installing the program, it didn't do all he was looking for, so he decided to reverse engineer the source code using a program called Reflector for .Net.

Inside the source code, Brooks found the program author's e-mail address and account password for Gmail. Thinking that was a little strange, Brooks used the hardcoded information to open John Terry's Gmail account. There, Brooks alleges he found 1,777 messages, all of which had usernames and passwords for people who signed up for G-Archiver, including his own. In other words, whenever anyone signed up for the program, as Brooks had, a copy of his or her username and password was sent to John Terry's Gmail account.

Hardcoding e-mail addresses isn't new. In a presentation at Black Hat D.C. 2008 a few weeks ago, researchers Nitesh Dhanjani and Billy Rios reported that phishing site creators frequently hardcode e-mail addresses into the code in order to receive copies of the personal information submitted independent of where the Web form is being sent.
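As a rough illustration of how a hardcoded address can be spotted once source code is readable, the sketch below scans text for anything shaped like an e-mail address. It is not the method Brooks or the Black Hat researchers used: the function `find_hardcoded_emails`, the simplified pattern, and the sample snippet are assumptions for this example, and the pattern will miss unusual addresses and flag harmless ones.

```python
import re

# Simplified pattern for illustration; it does not cover every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_hardcoded_emails(source_text):
    """Return (line_number, address) pairs for address-like strings in source."""
    found = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        for match in EMAIL_PATTERN.finditer(line):
            found.append((line_number, match.group(0)))
    return found


# Made-up snippet standing in for decompiled code.
SAMPLE_SOURCE = 'string subject = "Account";\nstring recipient = "user@example.com";'

print(find_hardcoded_emails(SAMPLE_SOURCE))
```

A hit from a scan like this is only a prompt to look closer; plenty of legitimate programs embed a support address.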

Brooks says upon realizing what each of the e-mails contained, he then deleted all the mail and emptied the trash. He then changed the author's password, and reported jterry79@gmail.com's abuse to Google.

On the CodingHorror.com site this morning, Brooks wrote, "Granted my actions may have been a little quick and harsh, I was a little upset over the whole deal. I have a lot of personal info in my account along with a stored credit card for Google checkout. I very easily just could have changed my password and been done with it, but I didn't want more people compromising their accounts as well. The only e-mails in this account were usernames/passwords. This wasn't a personal account used for other things."

A number of sites have since removed G-Archiver from their download collection, including CNET Download.com. Attempts to contact Mate Media have so far gone unanswered.

Originally posted at Defense in Depth
March 5, 2008 12:05 PM PST

PhotoShelter adds Flickr import tool

by Phil Ryan
  • Post a comment

A new tool from PhotoShelter lets you import images from a Flickr account to your PhotoShelter Personal Archive.

(Credit: PhotoShelter)

Barely a week goes by when I don't see a story about someone's photo being stolen from Flickr. I guess I'm not the only one, because PhotoShelter today announced that they've added a tool to their customers' Personal Archive accounts that lets them import images from, or export images to, a Flickr Pro-level account. Ultimately, it's a pretty slick way for the company to capitalize on the fact that PhotoShelter's Personal Archive provides a more secure environment for photographers, since it doesn't allow unauthorized viewing or downloads, though photographers can set selected galleries as public if they want to allow non-password-protected viewing. Plus, PhotoShelter's system includes an e-commerce engine, so you can set prices and sell your images.

The new tool also preserves any keywords or descriptions previously added in Flickr, and since PhotoShelter's system automatically recognizes EXIF data, you shouldn't lose anything in the transfer, except the possibility of your image becoming the unwitting star of an international ad campaign without proper compensation. The tool also lets you transfer images from a Personal Archive account to a Flickr account in case you want to take advantage of that service's photo sharing capabilities. If you use both services, this new tool gives you a nifty way to add watermarks to your Flickr photos, since PhotoShelter's system has a tool to do just that. Isn't it great when two photo sharing services find a way to play nicely together?

Originally posted at Crave
January 28, 2008 4:01 AM PST

Iterasi makes social bookmarking timeless

by Josh Lowensohn
  • Post a comment

Iterasi is a new bookmarking tool previewing today at DEMO. I got a demo of the service in action a few weeks back, and am looking forward to getting my hands on it for a review when the beta begins within the next month. The basic premise of Iterasi is that you can save any page you're looking at for later. It's almost like a screenshot, except that it preserves links, formatting, and any content that was on the page when you were viewing it at that moment. The end result is a bookmark that you can share with others that retains what the page looked like at that point in time. The creators tell me this is especially handy if you want to show someone a page that's behind a security login or on a local intranet.

To begin saving bookmarks on Iterasi, users need to install a small browser plug-in that will let them "notarize" any page they're on for later retrieval. I told the creators the notarize moniker reminded me of getting legal documents signed, but they think it will grow on users, and that it made more sense than making up some word that just sounded nice. The notarize button resides in the top right-hand corner of your browser, and also lets you jump to your bookmark list with one mouse click.

(Credit: Iterasi.com)

To sort through all your notarized content, there's a home screen that lists everything in reverse chronological order and can be parsed quickly using any tags you've added. You can browse either by text links that look a little like the detailed file view in Windows Explorer, or by a list view, which shows each saved site as a thumbnail. The service has a built-in search tool that will sort through the tags, site names, and any content that was stored on each page. You can also put multiple items into folders, and send them off to other Iterasi users, or your contacts via e-mail.

One of the most interesting features, and one I'm really looking forward to getting my hands on, is the... Read more

About Webware

Say No to boxed software! The future of applications is online delivery and access. Software is passé. Webware is the new way to get things done.
