Google vs. the media industry

This year, there have been a number of battlefield reports from the ongoing struggle between the traditional news industry and the search engine Google over the legal status of Google's reuse of materials automatically collected by its robots and shown as snippets on its SERP (Search Engine Results Page) or linked to in the so-called Google “cache”.

Below are those I've collected during the year.

Belgian Court puts a Halt on Google News

Google News indexes a large number of newspapers on the web, and publishes short abstracts with links to the source. Newspapers often present their articles for free in their online editions for a short time, then move them to a premium area where you have to pay to read them. However, after an article is moved, its headline, lead-in and often also scaled-down copies of accompanying photographs are still available via Google, and often the entire article is available for free in Google's so-called "cache". A sizable portion of online newspapers are not too happy about this. A complaint (reported by ChillingEffects.org and Expaticia) filed by the Belgian publishers' association Copiepresse, representing 17 French- and German-language newspapers in Belgium, was recognised in a summary hearing on September 5, 2006, and confirmed in an appeal hearing published on September 22. A new hearing, on November 24, resulted in a ruling announced on February 12, 2007. This ruling reduced the daily fine for non-compliance, but otherwise upheld the previous decisions. The ruling appears to be based mainly on the testimony of an expert witness, Luc Golvers (lecturer at the ULB, Université libre de Bruxelles, and president of CLUSIB, Club de la Sécurité Informatique Belge), about Google's so-called "cache":
Considering that his research has led him to prove that, while an article is still online on the site of the Belgian publisher, Google redirects directly, via the underlying hyperlinks, to the page where the article can be found, but as soon as the article can no longer be seen on the site of the Belgian newspaper publisher, it is possible to obtain the contents of it via the "cached" hyperlink which then goes back to the contents of the article that Google has registered in the "cached" memory of the gigantic data base which Google keeps within its enormous number of servers.
This led the court to the conclusion that:
[t]he way in which the Google News presently operates cause the publishers of the daily press to lose control of their web sites and their contents.
In the end, the court found that Google infringed upon the publishers' copyright, caused the publishers financial harm, and violated the conditions for use posted on the publishers' web pages. Google was ordered to:
  • remove all content belonging to the publishers represented by Copiepresse from the sites Google Nieuws België and Google.be, including the material stored in Google's so-called "cache";
  • post the court's ruling on Google's Belgian home pages for a period of five days.
As always, there is no mention in the ruling of the fact that robots.txt (the Robots Exclusion Standard) allows publishers to opt out of being indexed by search engines, but in an interview with Groklaw, Margaret Boribon, secretary-general of Copiepresse, explains why the association does not accept robots.txt as a solution. Quote:
This is not acceptable. No. No, no. We cannot choose between being dispossessed of our content or erased. It is not acceptable. It is not Google who can make the laws governing our content. That is not acceptable. And all the standards and techniques they use, as brilliant as they may be, are techniques which belong to them, but which have no legal value. None whatsoever. They are not standardized, they have no legal status, there is no law which says: if you are not opposed, it's normal that we take; there is no law which says that. [...] No. Their opt-out, it's their own strategy but it has no legal basis. [...] I mean, it's just something which is technologically proven, which works well.
Google's official response to the ruling has been posted on the Google Blog. Google says that it intends to appeal the decision, and adds:
We believe search engines are of real benefit to publishers because they drive valuable traffic to their websites. If publishers do not want their websites to appear in search results, technical standards like robots.txt and metatags enable them automatically to prevent the indexation of their content. These Internet standards are nearly universally accepted and are honored by all reputable search engines. In addition, Google has a clear policy of respecting the wishes of content owners. If a newspaper does not want to be part of Google News, we remove their content from our index - all the newspaper has to do is ask. There is no need for legal action and all the associated costs.
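For readers unfamiliar with the mechanisms Google refers to here: both are plain-text directives that a publisher places on its own site, and compliant robots check them before indexing. As a hypothetical sketch (the paths and site are invented for illustration), a publisher could either block crawling entirely, or stay in the index while opting out of the "cache" and snippets:

```
# robots.txt at the site root — keep all compliant robots out of the whole site
User-agent: *
Disallow: /

# …or only keep them out of a paid archive area:
# User-agent: *
# Disallow: /archive/
```

```
<!-- Per-page HTML metatags: remain indexed, but opt out of the cached copy -->
<meta name="robots" content="noarchive">
<!-- or opt out of snippets in result listings -->
<meta name="robots" content="nosnippet">
```

Whether such technical opt-outs carry any legal weight is, of course, exactly what Copiepresse disputes in the quote above.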
In a statement to Reuters, Copiepresse's Margaret Boribon, declared that she was pleased with the court's decision, and added that the association would still consider allowing Google to display extracts from the Belgian newspapers for a fee, although she said it was up to Google to initiate contact. I think Danny Sullivan's comment, Show Me The Money, Not The Opt-Out, Say Publishers (SearchEngineWatch.com), summarizes the present situation well. See also commentary at legal firm Pinsent Masons: Google will appeal Copiepresse decision and Why the Belgian court ruled against Google (Out-Law.com).

Google Cache is Legal

A district court in Nevada, in the case Field vs. Google, has ruled that it is OK for Google to download and store a local copy of copyrighted works, and to make this copy searchable and accessible to the public. The ruling is about the so-called “cache” – a rather oddly named local copy that Google keeps of most documents that Google's robot (or agent) pulls from the Internet.

This is by no means the definitive word on this. The ruling is from a lower court, and it invokes the concept of “fair use”. “Fair use” is an integral part of US copyright law, but there is no direct equivalent in European copyright law.

The most interesting aspect of the ruling, however, is that it asserts that failure to set the appropriate metatags in the HTML file, and/or directives in the “robots.txt” file, constitutes an implied license to copy, index and cache (store) the contents. These elements have not (so far) been recognized in similar court cases in Europe. In a case brought before the Copenhagen city court in 2002 (DDF vs. Newsbooster), the court imposed an injunction [document is in Danish] banning Newsbooster from crawling certain newspapers. Like Google's robot, Newsbooster's robot honoured standard metatags and robot protocols, but the plaintiffs did not make use of them (as described on pp. 7-8, 22-23 and 29 in the injunction). As far as I know, Field vs. Google is the first case that has properly recognized that the failure to use appropriate metatags or directives constitutes an “implied license” to make a local copy of copyrighted content, and to make this copy accessible to the public.
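To make concrete what “honouring the robot protocols” means in practice: a well-behaved robot parses the publisher's robots.txt and checks each URL before fetching it. A minimal sketch using Python's standard-library `urllib.robotparser` (the robots.txt content and URLs are hypothetical, invented for illustration):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve to keep Googlebot
# out of its paid archive while leaving the rest of the site crawlable.
robots_txt = """\
User-agent: Googlebot
Disallow: /archive/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A compliant robot asks before fetching; a Disallow match means "do not crawl".
print(parser.can_fetch("Googlebot", "https://example.com/archive/story.html"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/news/story.html"))     # True
```

The point of the implied-license reasoning in Field vs. Google is precisely that this check is cheap and standardized: a publisher who serves no such directives has, on that court's view, tacitly permitted the copying.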

Google not allowed to show thumbnails

According to a report in The Register, and Wikipedia, a preliminary ruling in a US District Court has found that Google Image Search infringed the copyright of porn site Perfect 10 by displaying thumbnails of the site's images when presenting search results. The ruling also says that Google's text links to the full-size images did not constitute infringement.

Following the court's decision, both sides cross-appealed to the United States Court of Appeals for the Ninth Circuit. As far as I know, the appeals are still pending.

Media industry challenges Google

The World Association of Newspapers (WAN) recently announced that “Newspapers want search engines to pay”. This stance on search engines indexing and storing (“caching”) copies of material made available on the web has also been reflected locally. In a recent interview (“Google bryter opphavsrettregler” – i.e. “Google violates copyright rules”, Dagens Medier, p. 5, issue 3, Feb. 24, 2006), the chief legal officer of the Norwegian Media Businesses' Association (MBL), Pernille Børset, states that the association believes the Google cache to be illegal, and intends to stop Google and a local company using Google's technology (i.e. Eniro's Kvasir) from breaking copyright law by “caching” articles collected by their robots from websites (mainly news sites) operated by members of the association.

There is no indication in the interview (or elsewhere – see, for instance http://wan-press.org/robots.txt) that the association is aware of opt-out conventions such as robots.txt to regulate how search engine robots make use of content.

Google allowed to archive Usenet

Google recently won a legal action brought by writer Gordon Roy Parker over a Usenet posting written by Parker that Google archived and partially displayed in search results as part of Google Groups.

Judge R. Barclay Surrick of the US District Court for the Eastern District of Pennsylvania found that Google's automatic archiving of Usenet postings and excerpting of websites in its presentation of search results did “not include the necessary volitional element to constitute direct copyright infringement”.

Parker also sued for defamation because Google archived negative comments made about the writer in other Usenet postings and websites, as well as invasion of privacy and negligence. The judge dismissed these because the US Communications Decency Act grants immunity to those who provide material on the internet written by others. He also dismissed other claims in respect of alleged racketeering and civil conspiracy on the grounds that they did not make sense.

Google agrees to share revenue with AP

Recently, the San Jose Mercury News (2006-07-30) reported that Google has entered into an agreement with the Associated Press about paying compensation for using content from AP. There are no details about the agreement, but it is probably based on sharing advertising revenue on a pay-per-click or pay-per-view basis.

Google has up to now been reluctant to share revenue with online content providers, and has as a consequence been sued by the Belgian newspaper consortium Copiepresse and by the Paris-based Agence France Presse for including images, headlines and story leads in search results without permission.

San Jose Mercury News speculates that the Google/AP agreement “could herald a major shift in the relationship between the old media and new Internet gatekeepers”.

Google removes images owned by Norwegian Media

When Google News launched its Norwegian site on Nov. 16, 2006, the Norwegian Media Businesses' Association (MBL) immediately threatened legal action unless Google refrained from republishing thumbnails of copyrighted images on the site. On Dec. 9, 2006, Google succumbed, and removed all images belonging to MBL member sites. Some images still appear, but they all originate from sources that are not members of the MBL.

Source: Steffen Fjaervik: Google News Norway Removes Images.