But if a Google could come out of nowhere to blast past Yahoo, why can't another unknown emerge to eat Google's lunch? Nobody at Spock is making that bold claim just yet--the invitation-only search service went live only this morning.
As it launches, Spock has more than 100 million people in its database, and the company plans to quickly add more by scouring other publicly available sites. While people-related search sites such as Wink, ZoomInfo.com and LinkedIn have had their 15 minutes of fame without upending the constellation of forces in the search arena, Spock takes a slightly different tack, offering meta-tag searching and Wikipedia-like tagging privileges to trusted users.
CNET News.com recently sat down with CEO and co-founder Jaideep Singh to find out more. By the way, Singh says the company name has nothing to do with the Vulcan science officer of the Starship Enterprise. It's an acronym for "single point of contact and knowledge."
Q: How many people has Spock indexed now?
Singh: A little over 100 million people.
And you're adding approximately how many each day?
Singh: There are two things: one is people, and the other is how many documents we're processing, because one person may have many documents. We're really crawling and indexing the entire Web and picking out documents and organizing those documents around people.
Can you explain exactly how the technology works?
Singh: If you're looking for some specific keyword, Google is great. The issue is that when you now search for people on Google, what you get is a bunch of documents about people. If you have a popular name like David Stern, who is the NBA commissioner, the first couple of pages are really about that person. So you really can't find the David Stern you met at the bar or from a business meeting.
That's a simple manifestation of the thing. It takes a lot more technology to do what we're doing, which is really trying to figure out the unique David Stern and organize documents and information and images and relationships--all those things--around a person.
How much harder is it to do that than a general search?
Singh: A lot harder. It's actually a different technology stack. The only thing that's common is crawling.
So where's the difference?
Singh: When we're done crawling, we go off in a different direction. Instead of just doing metadata extraction, we try to figure out who this document is about. We want to figure out the most relevant thing in that document. So, say there's a document about Charlie, and it says, "Jaideep likes to play tennis with Renee." That doesn't mean Charlie plays tennis or likes to play tennis. So you really have to understand language and understand what this document is all about, and that takes you to do things like natural language processing and other technologies.
Is there anything that you folks have come up with that's proprietary?
Singh: Absolutely. We have numerous patents. We have seven Ph.D.-type people in our company working on the algorithms for this thing. We have a lot of other outside help, including a lot of notable advisers from Stanford and from industry who are helping us really solve these problems. It's not just solving the problem, it's solving that scale for billions of Web documents. That's the largest-scale problem there is out there, so that's a challenge.
Is what we now see on the screen what the public will see when Spock opens up?
Singh: That's correct.
And one of the first questions they'll have is how is this different from Google.
Singh: Let me just step back a little bit. When users come to use the site, we think they're going to find it to be a very cool service because not only is it Google-esque in a way. You can give a query and type in a name or any keyword--you can say "Give me all the astronauts"--but when you do that, you get very well-organized results and see the picture of the person. You see the most relevant terms or words that define this person and you'll see where they are on the Web and their relationships.
I tested the service earlier and it pulled up a lot more personal information about me than I found with a Google search.
Singh: You raised a really good point. Let's talk about that for a second. One has to realize that what we're doing is identical to what Google is doing in terms of indexing the Web. We're going out to public documents and picking up content. One has to realize that there is a lot of stuff about you on the Internet. You may have blogged someplace but it's on the Web. You may have a MySpace profile and it's on the Web. What we're finding is our users--when they come on to Spock--can really find this valuable in terms of "Hey, what has Spock discovered that's on the Web about me?" So, just knowing that is valuable.
Can you go beyond a firewall?
Singh: We don't do that. Unless it's out on the public Web, we don't try to get inside.

Prepare To Beam Up Scotty! Wow.
You should have picked a name like Troll Hard that nobody else has like me. Nobody will sue me over that name.
You can claim that this information is already freely available (because it was found on the 'Net) but a lawyer might see it differently.
In short, do they screen/limit any of the data they collect?
So far it's not much; I used it for my name and came up with nothing; including none of the info available on a simple Google Search.
Depending on where they go with this, though, you could easily see Public records being added, like marriage, home ownership, phone numbers.
It would make the site more "Useful" if they did this, because they would have everything on the person you were seeking to find.
But talk about a frightening Privacy issue if they went that way. They wouldn't last with all of the lawsuits.
We'll see where they tread on this muddy ground...
When I do a search I expect the first returns to provide access to what I searched for, not to an article about what I searched for.
All the search engines need to place returns for news stories and articles at the end of the returns.
Second of all, an entry is an open book. You can upload pictures to an entry, even if they are totally unrelated to the entry in question. During beta testing this was demonstrated to the Spock team by someone uploading pictures of Daffy Duck to entries like Oprah Winfrey's. Was anything done to fix that situation? No.
Third of all, it's possible to vandalize an entry without hacking the entry. All one needs to do is upload any manner of photos to the entry as well as mess with the tags. Tagging is out of control, as any tag can be added whether it accurately describes the entry or not. These issues have also not been addressed.
Fourth of all, there is no way to lock down an entry if it is being repeatedly vandalized. Unlike a wiki, these entries are open to editing at all times.
Fifth, and probably most disturbing, is that it's been stated by the spock team on their own blog that the only way to get your name removed from their system is to remove ALL of your social network accounts. That means no myspace, no facebook, no anything! AND, should the information in your entry turn out to be inaccurate, AND you can't claim your entry, you have no recourse to fix the problem, WHATSOEVER.
So keep in mind, it may be a search engine, but its purpose is merely to index social network sites.
http://www.spock.com/
Currently when I just checked it was down due to some problems.
I hope they aren't stealing the data like stealing pictures off of web sites that are copyrighted as pictures of people you are looking for.
In this case, however, Paramount's lawyers probably won't be too amenable to a settlement. I predict that the SPOCK moniker will be gone by the end of the year.
Why is Cnet hyping this junk?
Oh wait, it is VC funded & it is in Silicon Valley area and has Stanford connections so of course this means that the Big media (Silicon Valley) machinery immediately will cover it & hype it to see if enough people will fall for it. Same as with 2nd life, TWeeter and other useless junk; the only thing that they all have in common is that they are VC funded & based in Silicon Valley (San Fran area). I can't believe people have not woken up to this Game yet!
- Will Spock be better than Spoke.com ?
- by Neotrope August 15, 2007 11:00 AM PDT
- The problem with all these automated company and personnel companies is that nobody seems to care if the info is correct, and some like Spoke.com provide no means to correct the info without becoming a registered customer/user. For instance, my company is Neotrope, and on one of our business sites we posted a press release about our client Team F1. Now, when you go to Spoke, it shows my client as the president of my company. Idiots. I've even contacted them with no reply. They're clearly asleep at the wheel, and spreading misinformation seems not to be a legal or business concern. Will Spock be different?
Here's my client listed as pres. of my company!
http://center.spoke.com/info/p6V50Dx/MukeshLulla
BAH! The goal for these kinds of sites seems to be a method of collecting then selling data, or allowing "members" or "subscribers" to use the data as marketing and lead generation tools. Why would I want my business info on any of these sites -- in fact, I don't. IP blocking is a great thing. :-)