August 2, 2007 4:00 AM PDT

Please don't steal this Web content

Lorelle VanFossen is passionate. An author, travel writer and nature photographer, she also has a popular blog about, well, blogging. Her pet peeve is online plagiarism, which she encounters nearly every day.

"It's one of my favorite subjects," she said. "I make my living from my writing, and when people take it because they are ignorant of copyright laws--or think that because it's on the Internet, it's free--it makes me really mad. It's stealing content, in my mind."

VanFossen isn't referring to the kind of plagiarism in which a lazy college student copies sections of a book or another paper. This is automated digital plagiarism, in which software bots copy thousands of blog posts per hour and publish them verbatim on Web sites, where contextual ads placed next to the posts can generate money for the site owner.

Such Web sites are known among Web publishers as "scraper sites" because they effectively scrape the content off blogs, usually via the RSS (Really Simple Syndication) and other feeds that those blogs publish.

VanFossen's Lorelle on WordPress blog is an authority on the Internet for blogging dos and don'ts. One of the no-nos is using content from other sites without getting permission.

VanFossen has several ways of checking to see if other sites have scraped her posts. She puts full links in her posts to other articles of hers so that when one of her stories is posted on another Web site, it will link back to her story, and she can see the Trackback. Trackback is a "linkback" method Web publishers use to identify who is linking to or referring to their articles.

She has set up Google Alerts with her byline so that she will get notifications any time Google comes across a news site or blog with a reference to her. She also does a keyword search for her name on Google search, Google Blog Search and Technorati. In addition, she uses a WordPress plug-in that allows her to insert a digital fingerprint, a series of unrelated words, into her posts that she can search on in case her byline is stripped.
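
The fingerprint idea can be illustrated with a short sketch. This is not the WordPress plug-in VanFossen uses, whose internals the article does not describe; the word list, the secret, and the function names below are all hypothetical, chosen only to show how a repeatable series of unrelated words might be derived for a post and later looked for in a copied page.

```python
import hashlib

# Hypothetical word list; a real tool would presumably draw on a far larger one.
WORDS = ["harbor", "violet", "anvil", "tundra",
         "piccolo", "lantern", "quartz", "meadow"]


def make_fingerprint(secret: str, post_id: str, n_words: int = 4) -> str:
    """Derive a repeatable series of unrelated words from a private secret and a post ID."""
    digest = hashlib.sha256(f"{secret}:{post_id}".encode("utf-8")).digest()
    # Each of the first n_words digest bytes selects one word from the list.
    return " ".join(WORDS[b % len(WORDS)] for b in digest[:n_words])


def add_fingerprint(post_text: str, fingerprint: str) -> str:
    """Append the fingerprint to the post body before it is published."""
    return f"{post_text}\n\n{fingerprint}"


def contains_fingerprint(page_text: str, fingerprint: str) -> bool:
    """Check whether a page carries the fingerprint, ignoring case and whitespace changes."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())
    return normalize(fingerprint) in normalize(page_text)
```

In practice the same phrase would be pasted into a search engine as a quoted query; the `contains_fingerprint` check only stands in for that step when the text of a suspect page is already at hand.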

Invariably, VanFossen comes across her posts on other sites.

If she hasn't had a previous problem with a site, she will send the site publisher an e-mail asking them to not use her content without her permission. If she doesn't get a response, or she has had problems with the site in the past, she sends a "cease and desist" letter that informs the owners that they are violating her copyright and warns them she will take legal action under the Digital Millennium Copyright Act, or DMCA, unless they remove her content.

VanFossen also contacts the company that hosts the Web site, as well as advertisers on that site and search engines, providing the necessary evidence via mail or fax, as required. "The DMCA puts the onus on advertisers, Web hosts and search engines to remove copyright violations," she said. "I have a form letter I use."

In December, Michelle Leder, editor of Footnoted.org, used a cease-and-desist order to get her content taken off a site that was continuously republishing her posts. "Even the post I wrote about him stealing my content was posted on his site," she said with a laugh.

"It wasn't the issue of money," Leder added. "When other people's business model is based on stealing content, that's a significant problem."

One site that offers a free service for tracking copyrighted content online is CopyScape. About 200,000 Web site owners use the free service every month, and thousands pay for a higher-level service, said Gideon Greenspan, chief technology officer of Indigo Stream Technologies, which offers the service.

There are many aggregator Web sites that collect content from a variety of sources, often related to a specific topic area, like real estate or cars, around which they can serve contextual ads. While some of the sites reproduce entire blog posts or articles from other sites (CNET News.com included), others offer just headlines or the first paragraph or a few paragraphs. Many include attribution and a link back to the original article. But providing attribution does not preclude a copyright violation, experts say.


18 comments
Copyright infringement is the sincerest form of flattery
by DevTop August 2, 2007 6:26 AM PDT
A splog or "spam blog" is a blog that steals content from other web sites, then aggregates and republishes all or some of the content on its own blog.

Splogs are created to promote and increase search engine ranking of affiliated web sites, and/or to make money from ads shown on the splog. Typically splogs are automated, but they can also be manual copy & paste. A recent study indicated that 56% of all blogs are spam, and there are over 575 thousand splogs reported.

http://www.devtopics.com/splogs-spam-blogs-and-stolen-content/
This is the real theft
by ColdMast August 2, 2007 7:19 AM PDT
one's own ideas and thoughts, stolen & unreferenced.
Theft of Intellectual Property
by eyeswideoopen August 2, 2007 8:06 AM PDT
She's breaking my heart.

Let's consider outsourcing for a moment. What is that? It's theft of a career - theft of a living.

Take note Lorelle VanFossen:

There is an economic war on the American people - and you are not exempt. You'll find no sympathy from [American] programmers, engineers, and soon to be - doctors, nurses, teachers and any other professional whose job can be done cheaper in India or China - or that they can do when imported here on visa. (GATS - Free trade in human commodities).

Some advice: Learn to like cleaning toilets and mowing lawns because that is the future for the American worker.... Oh wait...NOPE, Sorry. Those are the jobs that Americans won't do. We have illegals from Mexico for those jobs.

I guess you'll just have to punt.
Linking to a site used to be an issue
by randombits August 2, 2007 8:39 AM PDT
I remember when tons of sites (many of them news portals) tried to make the case that it was illegal to simply link to their content and or site.

I got an email about 10 years ago from a publisher who asked permission whether they could link to my site because direct linking was the issue of the day. My response was "of course... and thank you!"

There's little question that outright plagiarism is a problem on some sites (complete copy of content with no credit to the original content producers) and wholly illegal.

Lorelle VanFossen seems to take an extreme and counterproductive view of her precious words where any sentence fragment including a link back to her original source is akin to theft (as opposed to advertising). This makes her an ego-centric nut who totally misses the point of the web.

I depend on the *summaries* I get from many blogs I visit. If I'm interested in reading the full article, I click on the original content link. The biggest issue I'm finding is that the "source" link is usually another blog with another summary. Sometimes these types of links can go for three or four levels before I find the original source. I think this is a far bigger issue.
Credit where credit is due
by DavidChartier August 2, 2007 9:09 AM PDT
For years you guys have been ripping off the blogosphere. Michael Arrington has gotten vocal about it, and once again you've ripped off Weblogs, Inc.'s Download Squad:

http://www.downloadsquad.com/2007/07/31/blog-pirates-on-the-horizon/

Please start citing sources and giving credit where credit is due. We link and cite you guys constantly - the least you can do is give back to the community that is supporting you.
Baby Boomer Bloggers
by RatherBCoding August 2, 2007 10:14 AM PDT
I completely agree that someone taking your content and claiming it as their own is a serious issue. But if we considered displaying unoriginal content on the same page as contextual ads a crime, Yahoo, MSN, Ask, and Google should be the first to go, and doesn't that sound a little extreme? Get with the program, baby boomer bloggers, and realize where your traffic truly comes from: search engines, social bookmarking, link sharing, trackbacks, and other sources. Your five friends aren't creating that kind of traffic. Oh, and there is this thing called Web 2.0 that allows you to extend your Internet reach beyond those 6 bookmarks you have in your browser; you should check it out sometime.
Plagiarism Detection for Bloggers
by Mark McCrohon August 2, 2007 1:29 PM PDT
I have developed an online plagiarism detection tool called DOC Cop (www.doccop.com) that can help bloggers identify whether or not their blog has been posted by someone else somewhere else on the web.
there's no new new thing
by azareus August 2, 2007 4:24 PM PDT
how much content is completely novel in the first place? newspapers do this all the time through 'syndication.'

Complete verbatim copying should be avoided everywhere. think for yourself

http://brain.com
There's an old definition:
by NoVista August 3, 2007 6:56 PM PDT
If you steal from one book, it's plagiarism; when you steal from a hundred, they call it research.

Major non-fiction authors may reference scores of government documents, FOIA requests, newspaper articles, other authors' books and not every snippet of information might be cited. Is this stealing? Sometimes pushing Intellectual Property rights can go too far.

I've looked at Lorelle's site and, well, I'm not quite sure what her business model is. OK, she advertises her own blogging book, probably a print-on-demand item, for which she receives payment.

So, other than ego or hubris that someone 'could steal her words without attribution,' what actual damage has she suffered? It's not as though her site is subscription only.

As a successful published writer, I prefer primary source material. I'm quite sure if I encountered a scraper site with some portion of her precious words, I could quickly find her site, even if there was no attribution.

I also know I've been ripped off in my time by editors and publishers, at a cost of time or money or both. I once provided a group of photographs to an editor on request for one project. Later, I learned they'd been used to illustrate someone else's article in that magazine.

Since I was never asked, my letter to said editor was less than cordial. Oh, 'he intended to tell me but was so busy ... ' and also forgot to pay. In the end, I got a check for what it'd cost to produce 'the 8x10 glossy prints with a paragraph on the back' [will Arlo Guthrie sue me for using those words? ha!]

No surprise I never sold another article to that magazine again.

There is no 'right not to be offended' but if plagiarism costs you money, or time, you do have a right to whine about it. Otherwise, STFU!

And the lawyerspeak of this statement: "the nature of the use should be noncommercial" re 'fair use' is just silly. Has this bozo never realized reviews of books and movies in print media by professionals are typically paid for?

Ah, America, the land of litigation. Shakespeare was right.
What I'm Worried About...
by theEternalFlame August 3, 2007 9:59 PM PDT
What's preventing me from posting my 'intellectual property' is the idea that someone can publish (in the 'real' world) what I write under their own name. How can I protect my ownership? I would like to one day publish...
Scraping Blog's ideas; harm later publication!
by Zeno77 August 4, 2007 9:47 PM PDT
Scraping Blog's ideas; harm later publication for profit!
this is not 100% true
by justinkadima August 5, 2007 5:02 AM PDT
If someone posts just a small portion of the original post (just an abstract or something like that) and also adds a link that points to the original source for the rest of the article, I do not see this as theft. It is more like advertisement for the original author and his blog.
Of course, if the whole article is reproduced, then it is a problem, but I do not see this behavior very often.
The reason why I do not support the opinions and actions of Lorelle on this matter is that I really do not want to see a general copyright mania on the Web where everybody is suing everybody. Let's not forget that the Web is basically about information and free access to it.
Actually, if you do not want people to take and use your content, put up a login box and restrict access to registered users only. Then you will have control.
I find it very ironic that a blogger who has RSS feeds is upset that users are aggregating their content.
I don't steal this web content
by amko_sa August 17, 2007 11:50 AM PDT
Maybe your text is stolen from other people, or from an idea or site. Every word is copyright. Some people respect other news and copy only part of that news with a source link. That is OK for me. Everything on the Internet is copyright. Most people on the Internet use other sources for their own stories. We are copyright. Sue all the world for copyright. If your story is very important, you can sell it on a marketplace. For me, the Internet is a free place for all people. If you sue some Web sites because part of your text is on those Web sites, and maybe I am here with the help of those sites, your story is not important for me. That site considers your story important and interesting.
Sorry about my English