Google's ever-active search bots, which scour the Web constantly for new pages, have begun a new, more active phase of their indexing jobs.
In a blog post Friday, Jayant Madhavan and Alon Halevy of Google's crawling and indexing team said the company has begun an experiment in which its indexing software enters text in Web site forms to see what previously undiscovered pages may appear.
"In the past few months, we have been exploring some HTML forms to try to discover new Web pages and URLs that we otherwise couldn't find and index for users who search on Google," they wrote. "This experiment is part of Google's broader effort to increase its coverage of the Web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines."
The new Google indexing practice is limited to "high quality" Web sites and skips sites that use "robots.txt" files or other standard mechanisms for warding off indexing software.
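For readers unfamiliar with robots.txt, here is a minimal sketch of how any crawler might honor such a file before touching a form URL. It uses Python's standard `urllib.robotparser`; the bot name, the example rules, and the `may_crawl` helper are illustrative assumptions, not anything Google has described about its own software.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every crawler is asked to stay out of /search.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search
"""

if __name__ == "__main__":
    for url in ("https://example.com/search?q=x", "https://example.com/about"):
        print(url, may_crawl(EXAMPLE_ROBOTS, "ExampleBot", url))
```

A polite crawler would run a check like this for the form's target URL and skip the submission whenever the answer is no.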
To decide what words to "type" into the forms, the indexing software samples from among words on the Web page with the form, Google said.
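Google has not published how the sampling works, so the following is only a rough sketch of the general idea: pull the visible text out of the page that holds the form, then pick a few of its words at random as candidate form inputs. The `candidate_terms` function, the four-letter minimum, and the sample page are all assumptions made for illustration.

```python
import random
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, ignoring the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def candidate_terms(html, k, rng):
    """Sample up to k distinct words (4+ letters) from the page's visible text."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    words = re.findall(r"[a-z]{4,}", " ".join(extractor.chunks).lower())
    unique = sorted(set(words))  # sorted so a seeded rng gives repeatable picks
    return rng.sample(unique, min(k, len(unique)))


# Hypothetical page containing a search form.
PAGE = (
    "<html><body><h1>Used car listings</h1>"
    "<form action='/find'><input name='q'></form>"
    "<p>Search sedans, trucks and hybrids by city.</p>"
    "<script>var x = 'ignored';</script></body></html>"
)

if __name__ == "__main__":
    print(candidate_terms(PAGE, 3, random.Random(0)))
```

Each sampled word would then be submitted through the form, and any result pages that come back become candidates for indexing.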
The technology appears to be related to that of Transformic, a company Google acquired, according to a blog post by Anand Rajaraman, who was involved with the technology earlier in his career while working for Halevy.
The spreadsheet in Google Docs now supports independent form entry. That means that if someone wants to use a Google spreadsheet as a database, they can ask others to fill in data by putting information into a nice, compact form, instead of into the spreadsheet itself.
As is typical in Google Docs, this feature is simple and easy to use but somewhat underpowered. For example, the form cannot be easily embedded in a Web page, and there's no data validation on form entries. I still recommend WuFoo for online data collection, and there are other good online databases that allow embedded forms (and export data to a spreadsheet for quick processing).
A pretty sweet feature is that users can easily e-mail the Google form, if all they want to do is collect a bit of data from people they know. Also, if the spreadsheet is open on a computer, the data coming in via the form can be monitored in real time, which is, frankly, bad-ass. Try entering data in a form here.
I do expect this feature to evolve over time. Because of this evolution, I do not give good odds to the long-term survival of the other online databases. In fact, I fully expect Google to release a database application into Google Docs to go along with this bare-bones data-entry function.
One of the best examples of everything that's right about this whole Web 2.0 thing is WuFoo, the service that makes creating a Web form -- and collecting data from it -- about as simple as dreaming up a few questions. Recently I heard about a similar site, AskItOnline, that is being designed for the somewhat different job of collecting data from surveys.
I wasn't going to write about AskItOnline just yet since the site is still in closed beta and many necessary features haven't yet been built. But I saw it pop up on Del.icio.us, and it got a link on e-Hub. It turns out that a lot of people are interested in what this site promises to deliver.
I got access to the beta and communicated with its builder, Kaitlyn McLachlan. What I saw was a work in progress -- with much work yet to be done. The form designer is the most complete feature, and it's encouraging. It's very much like WuFoo's designer, although there are some specific question types that WuFoo doesn't have. For example, you can set up big grids of matrix questions like you often find on consumer surveys.
But there's more to a survey than forms. Specifically, running a good survey means getting the right people to answer it, and then doing analysis on the results beyond just counting replies.
I don't get the impression that AskItOnline will offer much in the way of helping you create "panels" of people distributed across whatever population you're trying to survey. "We'll cross that bridge once we get to it," McLachlan wrote to me. I hope she gets to it soon, since without this feature, respondents will be self-selected, potentially skewing survey results badly.
On the other hand, when it's released to the public, AskItOnline should offer solid reports, statistics, and analysis. McLachlan told me, "We will have everything from simple statistics to extremely advanced statistics and reports." I'm looking forward to trying those features.
I like how AskItOnline looks right now, and what's being built could be extremely useful. But running a truly representative and reliable survey is not a simple business, so I hope that the simple-above-all-else Web 2.0 design aesthetic doesn't trump good survey science.





