Jan 14, 2010

Seminar Paper: The Google Books Settlement Agreement

I'm publishing here a series of papers written by law students in my 'Property, Heritage and the Arts' seminar from the Fall of 2009. This paper was written by Matthew Detiveaux:

The Google Books Settlement Agreement:
A New Rubric for Cultural Heritage Management and Privatization?

With the digital age now at its heyday, everything changes. When we want to know something about a topic, we Google or Wikipedia it. When we want to see what something looks like, we perform an image search. When we want to hear an artist, we “download” them. When we want to read a book, Google hopes that we will use Google Books to find it, preview it, and gain access to it.

Using complicated algorithms similar to those that fuel what is arguably the world’s most sophisticated search engine, Google has unveiled Google Books[i]. Like Google’s other services, Google Books provides a simple, intuitive interface built around the user’s needs. Google has made its name on being the search engine of choice for just about everyone. Many turn to Google for e-mail (Gmail[ii]), driving directions (Google Maps[iii]), news (Google News[iv]), social interaction (Orkut[v]), streaming video (YouTube[vi]), and even cell phone service (the T-Mobile G1, powered by a Google operating system[vii]). Google’s new hat is not only that of a librarian; Google wants to be the largest, most comprehensive digital library in the world. This is right in accord with Google’s mission: “to organize the world’s information and make it universally accessible and useful.”[viii]

Google, Inc. was founded in 1998 by Larry Page and Sergey Brin, two Stanford Ph.D. candidates who developed algorithms to “data mine” the far reaches of the Internet and bring the most relevant and useful pages forward in a search tool.[ix] The patented PageRank[x] search formulas are highly complicated, with more than 500 million variables and 2 billion terms being used to run its search engine in 1999 alone. Not so confusing are the economics. On January 22, 2009, Google announced fourth-quarter revenues of $5.7 billion, an 18% increase over the fourth quarter of 2007.[xi] Google, repeatedly voted one of the best places to work in America, is a household name and a Goliath when it comes to the Internet.


“In the beginning, there was Google Books”.[xii]

“About Google Books” tells the story of two Stanford Ph.D. students who had the dream of a “future world in which vast collections of books are digitized,” and “[where] people would use a ‘web crawler’ to index the books’ content and analyze the connections between them, determining any given book’s relevance and usefulness by tracking the number and quality of citations from other books.”[xiii] These two doctoral students were Larry Page and Sergey Brin, the co-founders of Google. Citing the inspiration of other digitizing projects, the two formed a team of “Googlers” who worked hard to make their dream a reality. They visited libraries, universities, and publishing companies to clarify their vision. While visiting the Bodleian Library at Oxford University, they mused: “For the first time since Shakespeare was a working playwright, the dream of exponentially expanding the small circle of literary scholars with access to these books seems within reach.”[xiv] These Googlers are not just dreamers; they are negotiation experts. In 2004, “Google Print” was joined by Blackwell, Cambridge University Press, The University of Chicago Press, Houghton Mifflin, Hyperion, McGraw-Hill, Oxford University Press, Pearson, Penguin, Perseus, Princeton University Press, and more.[xv] Later that same year, the Google Library Project partnered with Harvard, the University of Michigan, the New York Public Library, Oxford, and Stanford. Today, Columbia, Cornell, the University of Texas, and others have joined the Google Library Project, an arm of Google Books.[xvi]

A Harvest in Need of Laborers

Cultural property is traditionally thought of as objects or art that belong to a group of people or those who share a cultural identity. With the Internet and technological advances expediting the process, knowledge, commerce, and even culture have gone global. With few restrictions, the world is quite literally at one’s fingertips. It is no surprise that purveyors of cultural property – museums, libraries, and archives – are seeking out ways to use technology and the Internet to accomplish their missions.[xvii]

There are predominantly two broad categories of things being preserved by museums, archives, and libraries today. First are the traditional, pre-digital things. These include graphic arts, sculpture, books, governmental documents, and pretty much anything that was not created and simultaneously published using computers and the Internet. The second category contains the rest, that with which we interact on a daily basis: everything that was “born digital”. The mission of the Internet Archive is to preserve the second class of information; it is trying to preserve the Internet itself as a cultural artifact.[xviii]

In mid-2008, Google hit a milestone when its Internet tracking software counted and databased 1 trillion unique uniform resource locators, or URLs (web addresses), on the web.[xix] The Internet Archive, founded in 1996, set out to archive all this information, literally keeping the files accessible for future generations to browse.[xx] The problem for the Internet Archive’s goal today is that the web is so big, due to the ease of website creation and the ubiquity of users. Additionally, the Archive requires the skill of a programmer to use the site.[xxi] So, although the Archive suggests that we have a “Right to Remember” our digital cultural property, that right comes with the obligation to know more than how to turn on a computer and functionally surf the web.

Regarding digital preservation, the question remains open as to what should be preserved. The Archive is the only visible organization out there with the primary goal of preserving the web experience itself for future generations.[xxii] Quite fun is using the “Wayback Machine” to access old GeoCities sites.[xxiii] The Archive claims there is social value in knowing what people were doing online at any given moment in the Internet’s history. But, because the Internet is the sum of all computers connected to it and the input of all users at those computers, the Internet is an intrinsically dynamic and ever-changing thing.[xxiv] Since not all the computing power of the machines connected to the Internet will ever be dedicated to preserving a history of the different interactions of these machines, there will never be a computer (or network) able to document the most ever-changing system known to man. Even if we choose to preserve a certain feature or niche of the Internet, other presumptions must be made. For instance, does eBay tell us more about commerce than people’s stock research? The Internet Archive raises many fundamental, still unanswered questions of digital cultural heritage: if the Internet is an accumulation of all its users, who and what is worth preserving?

Other preservation projects focus on preserving “pre-digital” or digitally independent works. These works include, but are not limited to, books, movies, art, photography, and documents.[xxv] Libraries and museums have struggled to “digitize” their collections for preservation and Internet publication for two specific reasons: money and law.[xxvi]

Because digitization is so new a phenomenon and so specific to what is being digitized, no one can say exactly what needs to be done. For example, where museum preservation techniques for a Venetian vase are pretty well established, no one is quite certain how to preserve a digital picture of that vase so that future generations can view it; intricate problems arise with the ever-changing types of media used to record our arts and humanities.[xxvii] In April of 2009, the University of Leicester School of Museum Studies opened an MA/MSc program specifically for digital heritage.[xxviii] Under the course details, it touts itself as “at the centre of, and helping to shape an emerging discipline – one of only a few courses which locate themselves specifically at the nexus of digital media and heritage.”[xxix]

But the digital heritage discussion is not just for academics. In 2000, Congress appropriated $100 million to create the National Digital Information Infrastructure and Preservation Program (NDIIPP), an organization whose mission is to save things “born digital”.[xxx] However, the program had $47 million of its funding rescinded and is threatened by the prospect of not receiving matching non-governmental funding. Already lost and gone are raw data from early satellite probes, including the Viking mission to Mars, websites covering federal elections, and other data sources covering natural disasters like Hurricane Katrina.[xxxi] In the digital age, the Internet is not only our source for news; it is our newspaper. Because of its dynamic, fundamental quality, the question remains: “Which data should we keep and how should we keep it? How can we ensure that we can access it in five years, 100 years or 1,000 years? And, who will pay for it?”[xxxii]

As usual, the problems of money and law do not operate on separate planes; instead, they are closely interrelated. Museums face various problems when they set out to digitize their works, mostly in the form of copyright law. “The interface of digital cultural preservation and copyright law confronts museums with challenges that are much more salient and crucial for the continuous cultural vitality of museums and their presence in digital domains.”[xxxiii] Aside from a lack of consensus on whether to digitize, how to digitize, and how to fund digitizing, museums that have affirmatively answered these questions run into issues of law. Copyright and licensing have been the key legal players in this realm.[xxxiv]

The Librarian of Congress convened a group to study the effects of Section 108 of the Copyright Act, the exception for libraries and archives, on museum digitization efforts and to propose legislative solutions to these problems.[xxxv] Traditionally, one major problem museums face is that they are not even eligible for the exception in Section 108(b) of the Copyright Act.[xxxvi] Further, the Act limits those who do fall under it to three copies of a work, and does not provide for digital copies for preservation efforts.[xxxvii] Most importantly, as Pessach notes, current Section 108 does not cover the provision of on-line, public access to digitally preserved materials.[xxxviii] Thus, museums wishing to digitize cannot forecast the legality of doing so. To date, courts have not had the opportunity to review the application of the Fair Use Doctrine to museums’ use and distribution of digital copies of their works.[xxxix] But, arguably, the courts have answered this question for a company called Google.

Pessach cites the famous Arriba Soft case (280 F.3d at 944) and Perfect 10, Inc. v. Google, Inc., 416 F. Supp. 2d 828 (C.D. Cal. 2006) in drawing a distinction between the societal roles of search engines and museums. “Search engines provide commercial functions that are different from the unique functions that cultural institutions may require.”[xl] Legal problems may arise when museums digitize works of art. American courts hold that works produced by digitization are not subject to their own copyright; thus, the only copyright that applies to a digitized copy is the copyright in the original work. Therefore, museums’ ability to digitally display any work they might digitize becomes a question of copyright infringement.[xli] In contrast, British law provides that the digitized work (the photo) is subject to a new copyright.[xlii] The British approach differentiates an “original” from a “new work”, based on a minimal standard of labor and skill used to create the new work.[xliii] Pessach concludes that fair use law should be liberally applied to museums.[xliv] Museums’ role, as public trusts and the purveyors of cultural knowledge and cultural heritage, necessarily conflicts with the underlying premise of copyright, that markets are the appropriate arena for producing and distributing cultural goods.[xlv][xlvi] This battle is not simply theoretical or academic, but is pertinent to the Google Books case being litigated in the courts today.

The Lawsuit

On December 1, 2009, the United States District Court for the Southern District of New York issued a Memorandum Decision denying the request of Amazon.com, a member of the plaintiff class, to reconsider “the Court’s order granting preliminary approval of the parties’ Amended Settlement Agreement.”[xlvii] Prior to this decision, on November 19, 2009, the Court had granted preliminary approval of the Amended Settlement.[xlviii]

The Google Books Settlement Agreement is the result of about five years of litigation, starting in 2004, when Google announced that with its Library Project, it would begin scanning the entire contents of the libraries of Harvard, the University of Michigan, the New York Public Library, Oxford, and Stanford. The combined collections are estimated to exceed 15 million volumes.[xlix] The suit was brought by the Authors Guild and McGraw-Hill Cos., Inc., to enjoin Google from scanning their books and publishing them on the Internet. Under the Settlement Agreement, Google will pay $125 million to authors and publishers, including $45 million to copyright holders whose works were digitized without permission.[l] Google will also create a Book Registry, which will be the clearinghouse for the funds raised through its Google Books services. Authors of “orphaned” works, those Google has determined have been abandoned or are without traceable proprietary rights, will have five years from the day that a book first generates revenue to come forward and claim their portion of the revenues. Otherwise, those revenues will become part of the Book Registry and be used to facilitate its operation. Those who have already had their books scanned by Google have the following options. They can opt out of the Google Books program, requiring Google to remove their book from its system. They can opt out of the Google Books lawsuit, removing themselves from the “member class” and gaining the ability to bring their own suit against Google Books. Or they can do nothing.[li]

On Monday, December 14, 2009, the Federalist Society of New York hosted a public debate regarding the legal issues surrounding Google Books. The debate was led by the renowned contracts scholar Richard A. Epstein and Jonathan Jacobson, a partner at Wilson Sonsini Goodrich & Rosati. Epstein argued against the book deal, while Jacobson argued for it.[lii] Jacobson: “We must stand up and applaud [Google], because this really is the new Library of Alexandria.”[liii]

During the Federalist Society debate, one woman repeatedly interjected her ideas on the matter. Her name is Lynn Chu, and she is a lawyer, a journalist, and the literary agent of David Brooks, Richard Epstein, Ken Starr, Clarence Thomas, and others.[liv][lv] Quite pertinently, a rhetorical question was put to the audience: why wouldn’t an author come forward and claim the profits gained from Google’s making his book available online? Chu responded, simply, that copyright law has never before made authors do such a thing to reap their profits.[lvi]

The legal questions surrounding the Settlement are profound and legion. First, there is the issue of whether this Settlement Agreement is really an agreement at all, or whether it instead creates a novel legal right and gives it to a private entity. Epstein argued (alongside the Department of Justice) that it would have been much simpler to have an “opt-in” scheme as opposed to an opt-out scheme.[lvii] In that way, authors, publishers, and those with intellectual property rights in books could choose to be a part of the program instead of being included by default and having to elect otherwise.[lviii] The Settlement Agreement leaves unanswered questions of what Google can and will do with the books, and the program has already made preparations for Google to sell advertising space within the books it creates. The counter to this argument is that without the opt-out format, Google will have no way of protecting and including “orphaned” books, whose societal value would otherwise be lost if they were not part of the Google Books repository and made available on the Internet.

The Department of Justice was, at the time of this paper, investigating the Settlement Agreement, but to the author’s knowledge, nothing beyond a memorandum in the interest of the United States of America and against the Settlement Agreement has been filed.[lix] The DOJ memorandum raises the potential for Google Books to become a monopoly, in violation of antitrust law. The Memorandum specifically mentions that Google might, by virtue of being the only entity with digital copies of these orphaned works, create a service unto itself that, via the Book Registry, would be immune to competition. Further, since under the Agreement Google would be able to act “only to the extent permitted by law,” the question remains open as to whether other corporate entities could in the future gain licenses from Google to use the books, since it is unclear whether the copyright in those scans is vested in Google or in the original author of the book.[lx]

The DOJ Memorandum also questions whether a class action is the appropriate vehicle, because it may not afford sufficient notice to all class members involved. “The Proposed Settlement is one of the most far-reaching class action settlements of which the United States is aware; it should be no surprise that the parties did not anticipate all the difficult legal issues such an ambitious undertaking might raise.”[lxi] Further, the DOJ asks whether the Agreement effectively settles in favor of all those plaintiffs purportedly represented, because, with its gaps, it does not specifically say what can and cannot happen in the future to those authors’ works.[lxii]

More globally, the DOJ memorandum cites policy concerns. “As a threshold matter, the central difficulty that the Proposed Settlement seeks to overcome – the inaccessibility of many works due to the lack of clarity about copyright ownership and copyright status – is a matter of public, not merely private, concern.”[lxiii] Finally, the DOJ forecasts problems with the potential of the Agreement to overbroadly sweep in foreign works, whose authors are not required to register under current U.S. copyright law.[lxiv]

Private Interests versus Public Gain

The profound thing about Google Books is that it assumes the role of a public trust while maintaining its identity as a private, for-profit company. The Google Books Settlement Agreement answers for a private, for-profit entity what no court has answered for the non-profit museums wishing to digitize: whether it is legal to take the tangible work of a living person, digitize it, and make it available to the public. Under the terms of the agreement, Google will retain more than thirty percent of the sales of the digitized copies of the books.[lxv] Additionally, Google will profit from advertisements within the Google Books search page, referrals to external bookstores for the purchase of a given book, and in-line advertisements in the digital books themselves. To date, museums have nothing like the Google Books Settlement Agreement to protect them from known and unknown copyright holders. Excluded from Section 108(b) of the Copyright Act, museums are left out in the cold.

In an early 2009 article, Michael Dunn argued that the question of fair use regarding the Google Books settlement has been intentionally side-stepped by Google and the court.[lxvi] “In reaching the settlement, the parties dodged the question of whether the digitization of a book, in whole or in part, would qualify as protected fair use. Although Google has answered the question for itself, the question remains for other digitizers.”[lxvii] These other digitizers include museums, whose traditional role has been to be the guardians of culture, art, and knowledge. What we are seeing in the Google Books case is the privatization of the role of Cultural Guardian, a role usually reserved to museums, public libraries, and the state. As the Department of Justice points out, the litigation-based agreement is vastly sweeping, so much so that it vests Google indefinitely with a privilege and a duty highly coveted: being the biggest and best library in the world.

Although the Google website touts partnerships with authors and printers, the Department of Justice’s concerns about the sweeping effects upon international literary copyright have become a reality. In a decision on December 19, 2009, a French court ordered Google to stop scanning the French books cited in a pending court case and to pay €300,000 in damages and interest to publishers.[lxviii] Google Books France’s attorney Alexandra Neri responded: “I don’t think anyone wins. This decision just holds back the progress of access to online information. Defense of author’s rights are a French tradition, but now France could be left behind, without access to its own culture.”[lxix] The court mandated that the decision be published on Google’s French “Google Books France” website, as well as in three newspapers.[lxx]

There is also upheaval in China. Mian Mian has filed a lawsuit against Google, claiming copyright infringement regarding her book “Acid Lover”.[lxxi] Although Google deleted the work from its website, the work still appears in searches. Ms. Mian is asking that Google remove all passages of her book and issue her a public apology, along with damages of about $8,800. She is the first individual writer in China to sue Google China.[lxxii]

Final Thoughts

Surely, Google is a good candidate for the scanning of libraries’ books; it is one of the most technologically advanced companies in the world, and it has the funding to make librarians’ dreams come true. Murky legalities and angry publicists aside, Google Books has become a success, and Google is proud to have prevailed in its quest.

“Today we're delighted to announce that we've settled that lawsuit and will be working closely with these industry partners to bring even more of the world's books online. Together we'll accomplish far more than any of us could have individually, to the enduring benefit of authors, publishers, researchers and readers alike.”[lxxiii]

Quite relevant is Google’s next digitization project, the Iraqi museum.[lxxiv] For this announcement and ceremony, not only did the CEO visit, he brought his daughter, journalists, bodyguards, and American embassy officials. In a public address, he said: “I can think of no better use of our time and our resources than to make the images and ideas from your civilization, from the very beginnings of time, available to billions of people worldwide.”[lxxv] U.S. Ambassador Christopher R. Hill touted the event as a good thing, being “part of an effort spearheaded by the State Department to bring technology to Iraq. We thought, what better way to do that than to bring Eric Schmidt to Iraq?” Another State Department official rejected the suggestion that the event was a government-sponsored infomercial, noting that other companies are involved.[lxxvi]

Curiously, Iraqi citizens were not invited to the event. The museum has been closed since the beginning of the occupation of Iraq by American troops, although three intermittent partial openings have taken place over the course of the American stay. Most of the museum’s treasures are still hidden from the public eye in secret storage, due to the strings of lootings that have taken place since the American occupation began.

At the same time, Nicolas Sarkozy has unveiled a €35 billion spending plan, geared in part toward digitizing French documents and preparing France for the “challenges of the future”.[lxxvii] Back home, the Library of Congress’s National Digital Information Infrastructure & Preservation Program moves along at an unknown pace, while Google and its lawyers fight vigorously for the right to become the new, universal library and, potentially, the new museum.

In “Trade Versus Culture in the Digital Environment: An Old Conflict in Need of a New Definition”, Mira Burri-Nenova discusses the UNESCO Convention on the Protection and Promotion of the Diversity of Cultural Expressions, its presumption that culture is distinct from trade, and how this assumption will render the law ineffective with respect to WTO agreements.[lxxviii] As she points out, “...cultural rights do not correspond to national boundaries”.[lxxix] “It is indeed odd that while the convention clearly acknowledges the dual (trade and culture) nature of cultural goods and services and celebrates their cultural side, no attempt is made to provide guidance on how states might reduce the trade distorting effects of cultural policy matters.”[lxxx]

As the Department of Justice quite clearly stated in its brief for the Google Books case, the threshold issue is one of public policy.[lxxxi] That policy regards cultural property and the question of whose it is. Is an accumulation of the world’s books, legally and morally, ours for the taking? And if so, is it responsible to give the books, the library itself, and the keys to the building to a private company? The Google Books case exemplifies the heart of Burri-Nenova’s argument: culture and trade are intrinsically and fundamentally intertwined, and international policy that does not fully acknowledge both, as interdependent, is inherently flawed. She writes,

“...[I]t is likely that most of the existent and conventionally applied cultural policy measures, which are only ‘analogue-based’, do not sufficiently take into account the changed regulatory environment, nor do they have the potency to address appropriately the new digital conditions. If such measures are maintained, we hold that they serve either protectionist interests or are the remnants of an ill-conceived (but politically accepted) perception of globalization and its effects upon culture.”

(Burri-Nenova, 40, 41) (emphasis added).

Google’s mission, “to organize the world’s information and make it universally accessible and useful,” is an ambitious one to say the least. As opposed to the case of pictures,[lxxxii] the goal of organizing the world’s books demands that Google take those books and digitize them as its primary step. As is clear, the legal implications of this are profoundly vague and worrisome, not only to authors and publishers, but also to any of us who hold copyright interests in anything, including art and other intellectual property. Could the next step be a patent search engine, using complex algorithms to link patents with similar components together? Whether we have a “Right to Know” and a “Right to Remember” raises a huge underlying question that Google has not answered or even acknowledged: does Google have the right to organize our world and sell it back to us?

“Finally, one should bear in mind that the new information environment is extremely dynamic and complex and exacerbated the interrelatedness of effects, making regulatory decisions precarious.”[lxxxiii] If not overturned, this court decision will have groundbreaking effects, not only on writers, publishers, and literary agents; it could also spur new law relevant to other private digitization ventures like the Google Iraq Museum venture. On one hand, it may provide a breakthrough in some museums’ efforts to digitize, breaking copyright barriers open. But on the other, it may advance other multinational corporations to key positions, allowing them unbridled access to unknown lodes of cultural property, the likes of which are owned by the Iraq museum. In this brave new chapter in preservation history, will the Google Books Settlement Agreement become a standard for privatization of cultural property stewardship? As of today, the Agreement is provisionally in effect and has not been appealed. We can only speculate as to what it will mean to the arts, humanities, and academic communities outside the literary sphere, especially to our museums.


