July 2, 2001 4:00 AM PDT

Porn sneaks past search filters

Search companies are increasingly turning to censorware to court G-rated customers such as corporations, schools and parents, but they're still showing too much skin.

The shortcomings of porn filters were on display last week when Google launched a test version of a search engine for images with an optional filter for what it terms "inappropriate adult content." Even with the filter turned on, Google is serving a healthy dose of pornographic images, often for keywords with primarily nonsexual meanings.

"The filter removes many adult images, but it can't guarantee that all such content will be filtered out," Google acknowledges on its Web site. "There is no way to ensure with 100 percent accuracy that all adult content will be removed from image search results using filters."

Google is hardly alone in the uphill battle to filter pornographic and other sensitive images. Technology companies devoted to image recognition acknowledge that the state of the art is still crude, yielding inexact results at the cost of computing power.

While technologists struggle to improve their tools, the market for image filtering is the subject of dispute. Google cites the need to protect its "sensitive" users, while search destination AltaVista touts its own filter as indispensable.

"A picture says a thousand words, so we want to make sure that the image search is filtered by default," said AltaVista spokeswoman Kristi Kaspar. "We find that quite a few people are using the image search database for school. And what a huge turnoff if we're in an education market with a great product and we couldn't figure out how to provide a family filter."

In another demonstration of potential demand for better image-filtering technology, Lycos deemed the available technology so inadequate that the site's parental controls disable multimedia search altogether.

Some in the image-recognition business see a burgeoning corporate need to identify what kind of images their employees are downloading, while others extend the technology to e-commerce applications that can recognize a product such as an article of clothing and find similar examples for sale elsewhere.

But according to at least one image search provider, actual use has not lived up to perceived demand.

"Image filtering is something where we're investing a lot of (research and development) because we think it's going to be an essential feature," said Tom Wilde, vice president of marketing at Fast Search & Transfer, an Oslo, Norway-based company that is the search technology provider for Lycos.com and other Web portals. "But there's a difference between the perception of growing market demand and what's actually happening. At our All The Web portal, 98.6 percent of our visitors are using the image search without the content filter on."

Testing barriers
Regardless of demand for filtered image searching, several companies are struggling to get a handle on the problem.

Google noted that its image filter is still in beta and said engineers are working to improve the product. But company representatives acknowledged that they face a daunting task.

"It's a real challenge to do this effectively for a lot of different reasons," said Susan Wojcicki, product manager for Google search. "There is a lot of pornography out there on the Web. If all the porn were in one place, we could cut it out. But it's everywhere. Also, the definition of porn is not very clear."

Even with consensus on a pornography definition, technologists have their work cut out for them. Current techniques fall into three categories. The first attempts to filter images by analyzing the text that names and surrounds them on a Web page.

This method runs into several problems. For example, many words that belong to the pornographer's lexicon also appear in birders' dictionaries, guides to animal husbandry and hardware catalogs. As a result, text-based analysis turns up a high proportion of both false positives and false negatives, screening out wren tits and wood screws while admitting more salacious content.
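The weakness is easy to reproduce. Below is a minimal sketch of such a text-based filter (the keyword list and function name are hypothetical, purely for illustration): it flags any image whose surrounding words contain a blocked term as a substring, and promptly trips over birding guides and hardware catalogs.

```python
# Hypothetical blocklist; real filters of the era used much larger,
# weighted term lists, but the substring problem was the same.
BLOCKED_KEYWORDS = {"xxx", "tit", "screw", "porn"}

def is_flagged(surrounding_text: str) -> bool:
    """Flag an image if any blocked keyword appears as a substring
    of the words naming or surrounding it on the page."""
    words = surrounding_text.lower().split()
    return any(kw in word for word in words for kw in BLOCKED_KEYWORDS)

# False positives: innocent pages tripped by substring matches.
print(is_flagged("field guide to the wren tit"))  # True ("tit")
print(is_flagged("brass wood screws, 2 inch"))    # True ("screw")
# False negative: slang absent from the list slips through.
print(is_flagged("hot pics gallery"))             # False
```

This failure mode is the classic "Scunthorpe problem": substring matching cannot tell a blocked term from an innocent word that happens to contain it.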

More problems with the text-based approach accompany foreign-language pornography. For now, the Google filter works only on English-language pages.

After text filtering, the second avenue of attack screens out images gleaned from blacklisted Web addresses where pornography is deemed likely to turn up.

But pornography has proved a faster-moving target than such lists can track.

"Most of the firewalls have lists of URLs, but porno sites change their URLs regularly," said Bill Armitage, chief executive of Bulldozer Software, a Clinton, Mass.-based image-indexing and search technology provider that operates the Diggit search engine. "Those lists are always out of date. At any given time they're only 60 to 80 percent accurate. The remaining 40 to 20 percent of the time, you need another filtering mechanism to keep those things from coming in."

For that extra layer of protection, many search engines are pinning their hopes on the third and most complex method, which analyzes the image itself for "flesh" tones and body shapes. But this method returns its own share of false negatives--letting pornography in--and false positives, blocking more innocuous images.

"I'll tell you what slips through--baby pictures slip through," said J.J. Wallia, head of sales and business development for LookThatUp, a Paris-based company with offices in Burlingame, Calif. "That's a false positive. Babies tend to be showing a lot of skin. This is something the industry has just not been able to get around."

Perhaps more damning than the occasional excluded infant is the toll that image analysis exacts on central processing units (CPUs).

"The state of the art on image searching is such that there is no surefire pornography detection available," said Fast Search & Transfer's Wilde. "The big search engines have not yet done that because it's not scalable enough to keep up with the growth of the Internet. It's incredibly CPU-intensive to do image processing. We have 70 million images in our index. The image detection software that's available now gets absolutely crushed by that."

Wilde estimates that the image recognition industry is between six and 12 months away from providing an adequate product.

Even then, he warns, problems will remain.

"If you do some sort of flesh detector, what color is flesh?" Wilde asked rhetorically. "It's really that complex. And then what's pornographic? You have different sensitivities, especially internationally. Then there's hate, weapons and violence. It's a really, really difficult problem to solve."

5 comments

Transparent image to bypass image analysis
There is another problem with the image-analysis technique: if you put a transparent image over a porn image, image-analysis software won't be able to detect it.

Porn websites can easily embed this technique in their pages with a few lines of code.
Posted by ehsensiraj (1 comment)
It's amazing that this article is more than 7 years old and that it's still rattling around.

First, let me start by applauding Google on their work; being in the business of blocking porn and helping folks fighting porn addiction, I can attest to the fact that Google's "SafeSearch filtering" feature is one of the best in the industry. Our ability to "lock it" on makes a huge difference to our client base.

As far as "the uphill battle to filter pornographic and other sensitive images" goes, I'd suggest that the problem is not with the technology, but with what we are attempting to do. As Albert Einstein said:

"Perfection of means and confusion of goals seem - in my opinion - to characterize our age."

As a former CIO, I can tell you that nothing kills a concept or project faster than striving for perfection. We need to learn to live with "more than good enough" and bridge the technological gaps with process and education.

I'm not saying that we abandon all hope, but rather that our goal should be to provide a reasonably safe surfing environment for our children and ourselves, as Google has done, while we continue to incrementally improve on it.

Stop arguing about whether or not pornography, gambling, hate, weapons or any of the other sites are inherently bad; stop arguing about the fact that things are not perfect. Instead, help empower parents, guardians and teachers with the tools they need to protect our children from exposure to said material while giving them the ability to take advantage of all the benefits the internet offers.

Once again, I applaud Google for their continuous improvement, but in the end, we must all take responsibility for ourselves and our families.

Sincerely,
Carlos A. Mendoza
Founder, My Internet Doorman

www.myinternetdoorman.org
www.myinternetdoorman.com
Posted by MyInternetDoorman (1 comment)
Parents should have the ability to control the images reaching their children's brains, and I think pornography-as-it-is is bad for children, both because it distorts human sexuality and because it creates a _sameness_ that is disappointing (let them develop their _own_, idiosyncratic, less marketable perversions, for Darwin's sake). Parents should also have some say in just how desensitised to or fearful of violence children are - though cutting down on the absurd number of murders on TV every night might be better for this in the case of parents who don't wish their kids so desensitised or scared, as would making real life less violent. (Hint: legalise drugs, make drilling with a well-regulated militia a condition of gun ownership, stop glorifying violence done by The Good Guys.)

As an added benefit: once filters become really good at determining what's porn and what isn't, down to specific genres, those of us old enough to be notionally responsible could use their complements when pornography is what we want - sometimes I want to filter out all the 'safe' content, though at the same time I'd still like to give the overwhelming majority of the 'unsafe' the boot as well.
Posted by GeraldFnord (1 comment)
Why not the .xxx top-level domain? We could divert traffic to porn gateways, and there could be an anonymous but strong authorization method to allow those over 18 (or over 21, whatever the legal limit is) to access "classified flesh". Of course this is a silly idea, but it could perhaps make some difference in cultures where naked flesh is a taboo.

I mean, let's face it: porn is here to stay. It has always been that way. Google and the internet have only changed the availability - and only slightly! It's not a major shift. We - a friend of mine and I - got porn as 12-year-olds, when we found abandoned magazines in the woods near our home.

We sought out porn ourselves as 13-year-old kids. It was exciting, a new world, and I don't think it has done any damage to the brain. On the contrary.

Of course, what I mean is that exposure to pornography for minors may not be beneficial - this needs more research, and can never be settled for everyone. Families can ultimately decide what is appropriate, and the governing laws are to be applied.

Content filtering is a very complicated game, in which a lot of resources are spent on building gigantic systems that in the end reach no results at all. Just wait and see. Any system built can be routed around by hackers.

The probability that a kid sees, say, an image of a naked woman or man deemed indecent is directly proportional to the number of terminals available. The number of terminals is on the rise, with no end to that trend: public internet, TV-embedded net, phones, etc. This is like trying to dam a river with a dam made out of matches. You shouldn't do stupid things like waste time on the wrong kind of pornographic filters.

OK, the solution? Education. Honestly. And if you want to linger on the idea that your kid shouldn't see genitalia used in a sexual or pornographic frame of reference before your kid turns 18, then my best advice is to cut the child off from every possibility of touching the evil internet. Phones, computers, public terminals, TV, etc. It's called Luddism, or the Amish culture. You have several choices. :) And I am not saying that they're bad; they are just cultures.
Posted by jpaulin (1 comment)
 
