Google Books

— CH. 1 · ORIGINS AND DEVELOPMENT —


In 1996, two Stanford University graduate students, Sergey Brin and Larry Page, began discussing a radical idea: a future in which vast collections of books were digitized and indexed by a web crawler. The concept would eventually become Google Books, which began inside Google in 2002 under the code name Project Ocean. The team visited existing digitization efforts, such as the Library of Congress's American Memory Project, to understand their methods. Page spoke with Mary Sue Coleman, then president of the University of Michigan, about scanning every volume in the university's library. Told that the university's own estimate for the job was one thousand years, he said Google could do it in six. In December 2004, Google announced partnerships with major institutions, including Harvard University and the University of Michigan, and said it planned to digitize approximately fifteen million volumes within a decade.
Google established designated scanning centers, and trucks transported the physical books there for processing. At each station, a custom-built mechanical cradle held the book and adjusted its spine while an array of lights and cameras captured the open pages. Two cameras were directed at each page. A LIDAR range finder projected a three-dimensional laser grid onto the paper to capture its curvature. Human operators turned the pages by hand and pressed a foot pedal to take each photograph, so the books did not have to be flattened or unbound. The system allowed scanning rates of up to six thousand pages per hour. Many books were scanned with customized Elphel 323 cameras at a rate of about one thousand pages per hour. A patent awarded to Google in 2009 revealed the design: two cameras and infrared light used together to correct for page curvature. De-warping algorithms used the LIDAR data to correct the curvature before optical character recognition software converted the raw images into text. Google omitted color information in favor of higher spatial resolution, since most out-of-copyright books contained no color.
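The de-warping step can be illustrated with a deliberately simplified sketch. It is not Google's algorithm, which the text above does not describe in detail. It assumes a single row of pixels and a hypothetical list of page heights, one per pixel column, of the kind a range finder could supply. The idea is to measure distance along the curved paper (arc length) and resample the row so that pixels are evenly spaced on the flattened page. The function names and the `dx` parameter are illustrative.

```python
import math

def flattened_positions(heights, dx=1.0):
    """Arc-length position of each sample measured along the curved page.

    heights: page height at each pixel column (hypothetical range-finder data).
    dx: horizontal spacing between columns as seen by the camera.
    """
    positions = [0.0]
    for prev, curr in zip(heights, heights[1:]):
        # Distance travelled along the paper between two neighboring columns.
        positions.append(positions[-1] + math.hypot(dx, curr - prev))
    return positions

def dewarp_row(pixels, heights, dx=1.0):
    """Resample one row of intensities onto an evenly spaced flat grid.

    Requires at least two samples; uses linear interpolation.
    """
    s = flattened_positions(heights, dx)
    n = len(pixels)
    step = s[-1] / (n - 1)  # even spacing on the flattened page
    out = []
    j = 0
    for i in range(n):
        target = i * step
        # Advance to the segment [s[j], s[j + 1]] that contains the target.
        while j < n - 2 and s[j + 1] < target:
            j += 1
        t = (target - s[j]) / (s[j + 1] - s[j])
        t = min(max(t, 0.0), 1.0)
        out.append(pixels[j] + t * (pixels[j + 1] - pixels[j]))
    return out

# A flat page is left unchanged; a curved page is stretched where it slopes.
flat = dewarp_row([10, 20, 30], [0.0, 0.0, 0.0])
curved = dewarp_row([10, 20, 30, 40], [0.0, 0.5, 1.5, 3.0])
```

A real system would work on the full two-dimensional surface and on both pages at once; the one-dimensional case is only meant to show why the curvature measurement has to come before text recognition.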
Users encounter four distinct viewing levels when searching Google Books. Full view applies to public domain books, which can be downloaded for free. In-print books acquired through the Partner Program may also be offered in full view if the publisher grants permission, though this is rare. Preview mode limits the number of viewable pages according to access restrictions and security measures set by the publisher, and a watermark reading "Copyrighted material" appears at the bottom of each preview page. Snippet view applies when the copyright owner has declined permission: it displays only two to three lines of text surrounding the search term, and Google limits how many snippets are shown so that users cannot view too much of the book. No-preview results show only metadata, such as title and author, for books that have not been digitized. These records function much like an online library card catalog. The service also automatically generates an overview page for each book, containing publishing details and a high-frequency word map.
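The four viewing levels described above amount to a small decision rule. The sketch below encodes that rule as the text states it. The flag names (`digitized`, `public_domain`, `publisher_allows_full`, `publisher_allows_preview`) are hypothetical and do not correspond to any Google Books API.

```python
from enum import Enum

class ViewLevel(Enum):
    FULL = "full view"
    PREVIEW = "preview"
    SNIPPET = "snippet view"
    NO_PREVIEW = "no preview"

def view_level(digitized, public_domain=False,
               publisher_allows_full=False, publisher_allows_preview=False):
    """Pick a viewing level from the (hypothetical) rights flags of a book."""
    if not digitized:
        # Only catalog-style metadata exists for books that were never scanned.
        return ViewLevel.NO_PREVIEW
    if public_domain or publisher_allows_full:
        return ViewLevel.FULL
    if publisher_allows_preview:
        return ViewLevel.PREVIEW
    # Scanned, in copyright, and no permission granted: a few lines only.
    return ViewLevel.SNIPPET

# A scanned in-copyright book without publisher permission.
level = view_level(digitized=True)
```

The order of the checks matters: whether a book was digitized at all is tested first, because none of the other levels can apply to a book that exists only as a metadata record.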
The Authors Guild filed a class-action lawsuit against Google on September 20, 2005. Five large publishers, together with the Association of American Publishers, filed a separate civil suit on October 19, 2005. In March 2011, a federal judge rejected the settlement that the industry and Google had reached. US District Judge Denny Chin ruled in favor of Google in November 2013, citing fair use. In October 2015, an appeals court unanimously sided with Google, finding no violation of copyright law. The US Supreme Court declined to hear the Authors Guild's final appeal in April 2016, which left the appeals court's ruling in place as a significant fair use precedent for digital libraries. In France, a Paris Civil Court awarded three hundred thousand euros in damages to the publisher La Martinière in December 2009 and ordered Google to pay ten thousand euros a day until it removed the publisher's books from its database. Also in December 2009, Chinese author Mian Mian sued Google for eight thousand nine hundred dollars over the scanning of her novel Acid Lovers.
Initial partners included Harvard University Library, which holds more than fifteen point eight million volumes. The University of Michigan had scanned five point five million volumes by March 2012. The New York Public Library offered its public domain books online in their entirety, free of charge. Stanford University Libraries joined as part of the Green Library initiative. The Bodleian Library at Oxford University also contributed to the global collection. In May 2007, Mysore University announced that Google would digitize more than eight hundred thousand books and manuscripts, including manuscripts written on palm leaves, some dating back to the eighth century. The Big Ten Academic Alliance committed to scanning ten million books over six years. The University of Texas at Austin partnered with Google to digitize about half a million Latin American volumes. As of March 2012, the University of Wisconsin-Madison had scanned approximately six hundred thousand volumes. These partnerships aimed to make millions of works discoverable worldwide.
Scholars frequently reported rampant metadata errors, including misattributed authors and erroneous publication dates. Linguist Geoffrey Nunberg found that a search for books published before 1950 and containing the word "internet" returned an unlikely five hundred twenty-seven results. Woody Allen was mentioned in three hundred twenty-five books ostensibly published before he was born. Google blamed the bulk of the errors on outside contractors who handled the data processing. Publication dates sometimes predated the author's birth: one hundred eighty-two works by Charles Dickens were dated before 1812, the year he was born. Incorrect subject classifications placed an edition of Moby Dick under computers and a biography of Mae West under religion. Ten editions of Whitman's Leaves of Grass were classified as both fiction and nonfiction. In some entries, the metadata of an 1818 mathematical work was attached to an entirely different book, a 1963 romance novel. Scanning errors included unreadable, upside-down, or crumpled pages, as well as pages obscured by operators' thumbs and fingers.
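Errors of the kind Nunberg describes, where a publication date precedes the author's birth, are easy to test for once birth years are available. The sketch below is a minimal illustration and not a description of any check Google runs. The record layout and the sample records are hypothetical, apart from the fact, stated above, that Dickens was born in 1812.

```python
def implausible_records(records, birth_years):
    """Return records whose publication year precedes the author's birth year.

    records: list of dicts with 'title', 'author', and 'year' keys (hypothetical layout).
    birth_years: mapping from author name to birth year.
    """
    flagged = []
    for rec in records:
        born = birth_years.get(rec["author"])
        # Authors with unknown birth years cannot be checked, so they are skipped.
        if born is not None and rec["year"] < born:
            flagged.append(rec)
    return flagged

# Hypothetical catalog entries; the second carries a deliberately bad date.
catalog = [
    {"title": "Great Expectations", "author": "Charles Dickens", "year": 1861},
    {"title": "Great Expectations", "author": "Charles Dickens", "year": 1800},
    {"title": "Leaves of Grass", "author": "Walt Whitman", "year": 1855},
]
suspect = implausible_records(catalog, {"Charles Dickens": 1812})
```

A check like this only catches dates that are impossible. A date that is wrong but plausible, such as a reprint listed under the year of the first edition, would pass unnoticed.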
Google marked fifteen years of Google Books in 2019, by which point it had scanned more than forty million titles. In 2010 the company had estimated that there were about one hundred thirty million distinct titles in the world. A 2023 study by scholars from UC Berkeley and Northeastern University found that digitization led to increased sales of the physical editions. The Ngram Viewer graphs word-usage frequency across the collection, giving historians and linguists insight into human culture. Critics argued that the disproportionate share of English-language works amounts to a form of linguistic imperialism that could skew future scholarship. Jean-Noël Jeanneney, former president of the Bibliothèque nationale de France, criticized the effort on these grounds. Google's digital bookstore, announced as Google Editions and launched as Google eBooks in December 2010, competed with Amazon and Apple. Although Google won the decade-long litigation, Wired reported in April 2017 that only a few employees were still working on the project. Scanning had slowed significantly since at least 2012, with librarians at partner institutions confirming a pace well below that of 2006.

Common questions

When did Google Books begin and what was its original name?

The project began inside Google in 2002 under the code name Project Ocean. It grew out of discussions between Stanford University graduate students Sergey Brin and Larry Page that started in 1996.

How does Google scan books without damaging them or unbinding the volumes?

Google uses custom mechanical cradles and dual-camera systems to capture images while human operators turn the pages by hand and press a foot pedal to take each photograph. A LIDAR range finder projects a three-dimensional laser grid onto the paper to capture its curvature, and de-warping algorithms then correct the page shape before text recognition.

What legal rulings determined the copyright status of the Google Books digitization project?

US District Judge Denny Chin ruled in favor of Google in November 2013 citing fair use protections after rejecting an earlier settlement in March 2011. An appeals court sided unanimously with Google again in October 2015 declaring no violation of copyright law, and the US Supreme Court declined to hear the Authors Guild's final appeal in April 2016.

Which universities partnered with Google to contribute millions of scanned volumes to the collection?

Initial partners included Harvard University Library, which holds more than fifteen point eight million volumes, and the University of Michigan, which had scanned five point five million volumes by March 2012. The New York Public Library offered its public domain books online in their entirety, free of charge, while Stanford University Libraries joined as part of the Green Library initiative.

What specific errors did scholars report regarding metadata and scanning quality in the database?

Scholars reported rampant errors, including misattributed authors and erroneous publication dates; Woody Allen, for example, was mentioned in three hundred twenty-five books ostensibly published before he was born. Incorrect subject classifications placed Moby Dick under computers. Scanning errors included unreadable, upside-down, or crumpled pages, as well as pages obscured by operators' thumbs and fingers.


