Craigslist and the Revenge of Web 1.0
According to this posting at Search Engine Journal, "Craigslist has blocked the spidering and indexing of its classifieds sites from search engine robots."
This all began with the Craigslist-Oodle smackdown.
What the blocking means is that individual Craigslist pages should stop appearing in search results, although several searches I performed still returned specific Craigslist pages. Most "destination sites" would follow suit if they could, but since search is the "front door to the Internet" they cannot afford to lose the traffic and visibility.
It's like the old Pacific Bell/SBC (now at&t) YP ads (I'm paraphrasing from memory): "If it's not in there, maybe it doesn't exist."
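For readers curious about the mechanics: the quoted posting does not say how the block was implemented, but the usual way a site asks search engine robots to stay away is a robots.txt file. The sketch below is only an illustration under that assumption. The rules, the "ExampleBot" user agent, and the example.org URL are all hypothetical and are not taken from Craigslist's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules that ask every crawler to stay out of the
# whole site. This is NOT Craigslist's real file, just an illustration.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are made-up names for the demonstration.
    allowed = is_crawl_allowed(
        EXAMPLE_RULES, "ExampleBot", "https://example.org/listings/123"
    )
    print("crawl allowed:", allowed)
```

Note that robots.txt is a request, not an enforcement mechanism: a well-behaved crawler checks it before fetching, and pages indexed earlier can linger in results for a while, which may be why some Craigslist pages still turned up in my searches.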
Ultimately there's a longer, larger and more nuanced debate here about who owns the content, what is factual, what is proprietary and who has the right to publish it. Feist v. Rural is arguably the "controlling authority" (legal jargon), but it has yet to be tested in an Internet context. We may soon see such a test.
Quotes from Feist:
[F]acts are not copyrightable … compilations of facts generally are. … Factual compilations … may possess the requisite originality. The compilation author typically chooses which facts to include, in what order to place them, and how to arrange the collected data so that they may be used effectively by readers. These choices as to selection and arrangement, so long as they are made independently by the compiler and entail a minimal degree of creativity, are sufficiently original that Congress may protect such compilations through the copyright laws. [citation omitted] Thus, even a directory that contains absolutely no protectible written expression, only facts, meets the constitutional minimum for copyright protection if it features an original selection or arrangement.
This protection is subject to an important limitation. The mere fact that a work is copyrighted does not mean that every element of the work may be protected. Originality remains the sine qua non of copyright; accordingly, copyright protection may extend only to those components of a work that are original to the author.
… [C]opyright in a factual compilation is thin. Notwithstanding a valid copyright, a subsequent compiler remains free to use the facts contained in another's publication to aid in preparing a competing work, so long as the competing work does not feature the same selection and arrangement.
While one could forcefully argue that a search algorithm is proprietary and thus search results are a proprietary "compilation of facts," could a site or search engine crawl/scrape another's index, make some minor changes in the presentation of results and be perfectly legal in doing so?
It's a provocative and as yet unresolved question.
This Post Has One Comment
At the ILM-05 panel "The Future of Classifieds in a World of 'Free'", Classified Intelligence reported that Google was "making the rounds of classified advertising Web sites, requesting a direct feed of listings" and LiveDeal.com indicated it declined Google's invitation to provide a direct feed of LiveDeal's listings into Google Base. To date, Internet content aggregators and search engine services have succeeded in obtaining free access to content from content publishers in large part because the aggregators and search services did not repurpose the content; the content was analyzed to create abstract presentations to users with redirection to the publishers for the "real", complete content. The Google Base relationship, at present, differs in that Google is repurposing publishers' content (without payment to the publishers) to republish the "real" content on Google "landing pages", resulting in a lessened need for users to redirect to the publishers for access to the complete content. In addition to lessened traffic potential from such a relationship, publishers risk losing control of content, advertisers and brand.