Wednesday, July 29, 2009

Google Books: Equal access to knowledge

The Institute for Intellectual Property and Social Justice at Howard University School of Law and Google

Equalizing Access to Knowledge

Keynote Address: David Drummond, Senior Vice President, Corporate Development and Chief Legal Officer, Google Inc.

Many underserved communities exist in America: minority groups, people with visual impairments, smaller educational institutions (historically black colleges have endowments one-third the average). Education spending and outcomes are disproportionate for black and Hispanic students; the visually impaired don’t have access to the same textbooks as other students. We have an unprecedented opportunity to increase access for everybody. Remove physical obstacles to acquiring knowledge; level the playing field.

Google’s founders worked on a digital library project at Stanford; they always wanted to bring libraries into the digital world. When Drummond showed up to work at Google, they told him they wanted to digitize the world’s books; he was a bit daunted. They figured out the logistical/technical challenges, and they felt they could navigate the fair use challenges.

Google Book Search: search across books like you search across the internet. We’ve scanned 10 million books and are just getting started. We have books in most of the languages Google is in; work directly with publishers. 1.5 million are public domain books. Old but valuable. The Library Project: a couple partner libraries include in-copyright books: scanned and indexed, but only show snippet. About 20% of books in these libraries are public domain, 75% out of print, and 5% in-copyright and in-print.

Settlement (for a big collection of settlement-related documents, check out James Grimmelmann’s Public Index): Allows collective license from rightsholders. Better than snippet view, plus free access for US libraries, plus new revenue stream reviving the commercial market for rightsholders. This settlement will also unlock millions of titles for the visually impaired, allowing text-to-speech and other techniques by libraries.

Panel Discussion:

Lateef Mtima, Howard Law: IP and social justice issues intertwine. Constitution starts with a social utility purpose: promoting progress. This is accomplished with rights to authors and rights to the public. Tech has pushed us to focus on the social utility of the exclusive rights, given the potential of new uses. Recall that the spread of player pianos and then record players increased access to music past those who’d been able to afford actual pianos and trained piano players before. So we need to link the digital divide with the purposes of the copyright law.

Wade Henderson, Leadership Conference on Civil Rights (LCCR): Voting rights and education were at the core of the civil rights movement. The last frontier: access to quality education and elimination of poverty. Education is the great equalizer. Thus he celebrates the Google settlement. People mired in poverty don’t have access to information, and we won’t get constitutional change targeting public education in his lifetime. Until we restructure education, we won’t succeed; this project is part of democratizing knowledge to create equality for people where they live today.

Civil rights 2.0: take this information and put it into a broader regime of support—this is a tool, not an answer. Incredibly valuable tool, though.

Charles Brown, Esq., Advisor to the President of the National Federation of the Blind: Incredibly excited by the settlement: the promise of approaching a level of equal access to information that was only dreamed of a generation ago. Equality of opportunity has been a key theme among the blind. Settlement will change the playing field forever for blind people and others with print disabilities; much of the structure is embodied in the settlement. We hope it gets approved.

As with any revolution, this doesn’t occur in a vacuum. By using screen readers and other access tech, the blind can largely get our hands on digitized text material, but we’re too often artificially blocked from doing so. Incredibly frustrating to encounter an ebook seller that will allow us to buy a digitized book and then block digital speech. We recognize and support IP rights. Rightsholders must be allowed to profit from creativity. As blind citizens, we must also be allowed to enjoy our property rights. Too often, tech features are available only visually, such as through poorly constructed websites and visual touchscreens. The law requires equal access, but the concept is lacking in actual practice, forcing us into costly litigation, and institutions, including universities, into costly retrofitting. It’s cheaper to build accessibility into the design than to squeeze it in later. Blind citizens will not allow themselves to be thought of as a squeezed-in afterthought. Google and cooperating libraries are taking the right approach.

Brent Wilkes, League of United Latin American Citizens: LULAC has historically been involved in education; it created the predecessor to Head Start. Google Book Search will provide equal access to all Americans who have internet access to books that only elite universities used to have. Especially important to underserved communities who have had difficulty getting access to relevant content, content in other languages, etc. Community colleges: about half of all Latino college students go to community college. Forecast: increasing numbers of Latino students; we can’t sustain high high-school dropout rates. The Google project provides some equalization of the educational experience. Scanning UT-Austin’s Benson Latin American collection: most students can’t do that, but that content is very relevant to the Latin experience in the US, which isn’t very well documented elsewhere. Eva Longoria can trace her family back 10 generations; lived on the same land grant under 5 different flags; but her history was never readily accessible, and Google may help with this. Also scanning Spanish works in Spanish universities, in the US and Latin America. Those books aren’t in B. Dalton or even major libraries. A great equalizer for students learning English who need inspiration in their native language.

LULAC has created 53 tech centers across the US for students—students can do their homework. We value any tool helping students with research projects, homework, college application—but LULAC is not a library, doesn’t have a book collection. Google Book Search will help students a lot, making the community tech center more useful as a model.

We are also privacy advocates, and believe it’s important that Google pay attention to privacy issues. We don’t want to see tracking based on reading habits.

Rhea Ballard-Thrower, Howard Law Library: Some believe that Book Search is the death of the book and the end of the world. She believes that the project makes the book better. The laws of library science: books should be used, every reader should have a book, every book should have a reader, it should be easy to access books, and libraries are changing and dynamic. These principles still apply here. We used to be trained to protect books from people.

Now we can protect the originals, but still give access to the content. Scanning satisfies the “books should be used” principle. Historically only the rich had access to the most books; the idea that a student can read the same book whether in an expensive private school or an underfunded public school is amazing. What about “every book should have a reader”? There are some books out there only appreciated by the author and his mother. But search means if you’re that third person, you can find it. Ease of access: libraries now IM people and do everything they can to enable access—ease increases knowledge.

As for change: this is where librarians are sensitive. People sit at home instead of going to the library, and librarians get nervous: will anyone come see us? But there are three essential parts to the library: the collection, the users, and the staff. And they are all always changing. The collection has changed a bunch. Our users have totally changed—general literacy is a change, historically; users now come in with their own abilities to locate information. What made libraries important was not being repositories—a warehouse is not a library. The people who provide the service are key, regardless of whether it’s face-to-face or via IM. Especially with snippets, people will be directed to the library via Worldcat.

Q from Steve Jamar: there is a privacy issue. Also, how much access will there really be given the dedicated terminal issue? Are people really going to leave their homes/offices to use this?

Henderson: Access can’t adequately be achieved through a library system, no matter how good it is. Digital divide is one problem that needs to be addressed; won’t come through Book Search per se, but this will increase pressure to improve access. Google will have to address the issue, going beyond the initial concept.

Drummond: Access is more than terminals. Remember, 20% of the book is a lot; we also expect to have purchase models, including purchase of parts of a book. There’s also an institutional subscription piece: we can license the entirety of works to institutions. And our pricing provision requires us to take into account interests in getting market return along with interests in broad access, allowing our library partners to hold our feet to the fire. We haven’t built the product yet; trying to get the settlement approved—we’re listening to lots of folks.

Wilkes: Most minority communities have only 30-40% broadband access; more relevant content should spur more demand for access. We are also concerned about privacy—going to a library or tech center can wipe out your trackability. And we encourage Google and others to anonymize information.

Brown: Aside from terminals, we want to get computers into everyone’s hands. Schools and classrooms need to be wired. This is a public policy choice: government needs to do it. Then the school has to provide speech/screen enlargement software. But Google isn’t on the hook for doing that. Internet accessibility should be seen as a public utility.

Let’s not go so crazy regulating Google that they can’t make money. The blind would like to have this product, and we are willing to pay in situations where we are getting value—with Google, we know we’ll be able to use the thing.

Q re orphan works.

Drummond: There are some things being missed in the discussion. Orphan works are usually defined as works that might be in copyright but whose authors can’t be found. Those are not the same as the out-of-print works. In the settlement, for the first time in history, there’s going to be a concerted effort to go find the rightsholders of these materials—a financial incentive for people to come forward. With this new incentive, we think we’ll find that a lot of works aren’t really orphaned. There are some, say pictures, where there’s no attribution and you don’t know where to start; books have an author, a publisher, usually a city—a lot of places to start. We believe the vast majority of books will be claimed over time. Many are likely to be in the public domain as well because of the renewal requirement, and we know that many owners didn’t renew, up to the early 60s. We’re spending a lot of money on a comprehensive family reunification program for books. The Book Rights Registry can license out all claimed books on any terms.

Q from someone working on a digital library of congressional black history for the Congressional Black Caucus: Say more about the terminal v. subscription.

Drummond: there will be both—we think most university and research libraries, and even public libraries, will want to subscribe. On top of that, we’ll provide a free terminal for walk-ins. 20,000 libraries will have the right to do this. (Free access to the product, but not free computer and not free internet access.) Drummond was on a NYPL panel yesterday and there was a lot of talk about how one terminal isn’t enough for a big library, but he cautions that the terminal is free. If they find that there’s way too much demand, or if libraries can’t afford the computers, we can rethink solutions.

Q: There are communities where libraries have closed; they don’t have access at all.

Drummond: we can think about expanding the program to community organizations. We can’t have a free terminal everywhere or else there’s no money in it.

Mtima: Other scholars make the same point: what are we going to do about the digital divide? Some of this should have been addressed earlier, and in different (non-Google) fora. Who is responsible for providing terminals? It would be great if Google rushed in. But others have social activist responsibility to make sure that other sectors of society don’t get off the hook.

Q: Library of Congress has 130 million items. Does Google have a goal to put that amount online?

Drummond: We’d like to digitize them all, and not just in the US. It will take a long time, and we won’t be the only ones doing it.

Q re subscription: will price be set by institution size?

Drummond: based on the ways other electronic databases are priced: the number of members of the university community, etc.

Q: How are non-author, non-publisher interests represented in the settlement (libraries, nonprofits)?

Drummond: library partners have been very important in structuring the settlement. Libraries provide the books; had to revise existing agreements with libraries to settle.

Brown: We were represented!

Q re pricing: for nonsubscription users, will there be a set price range? Will it depend on popularity?

Drummond: Rightsholder has option to set price. Some don’t want to charge anything; some want to charge a premium. Or Google can set the price algorithmically, based on popularity, length, genre, etc. Pricing bands range from $3 up to $29, with the median likely to be $6-7.

Victoria Espinel: A lot of concerns about access to IP content, but also concerns about access to IP ownership, particularly for minority communities which are vastly underrepresented. Look at patents: less than 3% owned by minorities. Federal research funding: under 2% goes to historically black colleges/universities with high Hispanic enrollment. Disadvantage to minority communities and to the US as a whole. If access is the driver of the next generation of creativity, the settlement can be a big step forward, but that’s only one part of a bigger issue of ensuring that all communities have the opportunity to participate equally in the economy.

Wilkes: Google may help minority authors get published; don’t need a physical publisher if you can sell on Google Books. We plan to create lists of recommended books, and link to places to buy or find the books.

Marc Rotenberg: Very interested in privacy as a means of social justice. NAACP v. Alabama protected privacy of membership records in order to protect political association. Many Muslims today are very concerned about access to their library records.

My Q: So are institutions third-party beneficiaries of the settlement, given that the settlement requires attention to access in setting prices?

Drummond: No, but we expect our library partners to be really good advocates for access; they’re passionate about it. (For the record, I find this answer persuasive.)

My other Q (this is not how I asked it, but how I’m trying to think about it): Prodded by my earlier misunderstanding of Google’s plans, I’m really curious about the effect of removing images from big chunks of the corpus. We’re moving to a more visual society, and the stuff that people will be getting from Book Search will be in a significant way much less useful, much less alive, than the physical copies. (Which is not to say that the physical copies are accessible, but that the project may not fully do what it claims. The work you get from Google is not the book.) So how level is that playing field?

Drummond: When you claim a book, you get to say whether or not you own the illustrations. (I must have overlooked this part when I was claiming rights for my grandfather’s books; the interface is not up to Google’s usability standards at present.) If you don’t, they’re not going to be searchable, which is the way Google Books works already now. We are interested in making images more available. For certain books, like children’s books, illustrations are so important that we have special provisions for them.

Brown: Our tech people are working on ways to deal with charts and graphs; we have a library we’d like to preserve, too, so the issue of images is salient to us.

Ballard-Thrower: It’s far from perfect. We have a large archive of African-American works, and right now if you don’t come here you’ll never see it—pictures of black lawyers from the 1860s. The beauty of this project is the partnership. There may be difficulties, but the bigger picture is the wealth of information that right now you need an airplane to see.

Mtima: Legally, Google can only get permission from copyright owners; if the publisher doesn’t own the photo, there’s nothing Google can do. (Yeah, except that that’s exactly what Google does do with respect to, say, image search on the web—it defaults to copying and relies on fair use and opt-out. So for Google to say that it is still defending fair use because it has to rely on fair use to scan the photos, as Macgillivray did in his talk, and then to exclude images from the corpus available to users, is a little disingenuous. And it contributes to the rhetoric of a permission society, as Mtima’s statement illustrates. There is something Google can do; it’s called asserting fair use rights, and everyone is aware that the problems of finding rightsholders/the orphan works problems are greater with respect to visuals, making the fair use argument more compelling even given the settlement with authors & publishers.) Don’t let other IP owners off the hook—Congressional activism is important. (Here we agree.)

There was some further discussion of the need for copyright owners to get paid and the relative uncertainty in the relevant communities about the scope of copyright law.

My other thought: I was struck by the power of the point, made by Siva Vaidhyanathan and others before, that Google is essentially being asked/volunteering to take on the role that should be played by government provision of services. In a privatized age, Google offers too many goodies to ignore. But shouldn’t that trouble us?
