— Ch. 1 · Origins And Development —
Google Books.
In 1996, two Stanford University graduate students, Sergey Brin and Larry Page, began discussing a radical idea: a future in which vast collections of books were digitized and then indexed by a web crawler. The concept eventually became Google Books, known internally as Project Ocean when work on it officially began in 2002. The team visited existing digitization efforts, such as the Library of Congress's American Memory Project, to study their methods. Page also spoke with Mary Sue Coleman, then president of the University of Michigan, about scanning every volume in the university's library. Told that the current estimate for the job was one thousand years, Page claimed that Google could finish it in six. In December 2004, Google announced partnerships with major institutions, including Harvard University and the University of Michigan. The company planned to digitize approximately fifteen million volumes within a decade.
Scanning Technology And Process
Google set up dedicated scanning centers, and trucks carried physical books to them for processing. At each station, a custom-built mechanical cradle adjusted to the book's spine beneath an array of lights and cameras. Two cameras were directed at each page of the open book. A range-finding LIDAR overlaid a three-dimensional laser grid on the paper to capture its curvature. Human operators turned the pages by hand and pressed a foot pedal to take each photograph, so the books did not have to be flattened or unbound. The system allowed scanning rates of up to six thousand pages per hour. Many books were scanned with customized Elphel 323 cameras at a rate of one thousand pages per hour. A patent awarded to Google in 2009 described the dual-camera, infrared-light system. De-warping algorithms used the LIDAR data to correct page curvature, and optical character recognition software then converted the corrected images into text. Google omitted color information in favor of spatial resolution, since most out-of-copyright books contain no color.
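The de-warping step can be illustrated with a small sketch. It is not Google's algorithm, which the text does not describe in detail. It assumes a simplified one-dimensional model in which the range finder supplies one surface height per image column and the page curves only across its width. The function names arc_lengths and dewarp_row, and the parameter dx (the horizontal spacing between columns), are hypothetical. The idea is to measure distance along the curved paper and resample each pixel row at even steps of that distance, which stretches the foreshortened regions near the spine back to their true width.

```python
import math
from bisect import bisect_right


def arc_lengths(heights, dx=1.0):
    """Cumulative distance along the page surface at each sampled column.

    heights: surface height per column (assumed to come from a range finder).
    dx: horizontal spacing between neighboring columns, in the same units.
    """
    s = [0.0]
    for z0, z1 in zip(heights, heights[1:]):
        # Each segment is the hypotenuse of the horizontal step and height change.
        s.append(s[-1] + math.hypot(dx, z1 - z0))
    return s


def dewarp_row(row, heights, dx=1.0):
    """Resample one row of pixel intensities onto a uniform arc-length grid."""
    if len(row) != len(heights) or len(row) < 2:
        raise ValueError("row and heights need the same length, at least 2")
    s = arc_lengths(heights, dx)
    # Number of output samples spaced dx apart along the flattened page.
    n_out = math.floor(s[-1] / dx) + 1
    out = []
    for k in range(n_out):
        target = k * dx
        # Locate the segment [s[i], s[i + 1]] that contains the target distance.
        i = min(bisect_right(s, target) - 1, len(s) - 2)
        t = (target - s[i]) / (s[i + 1] - s[i])
        # Linear interpolation between the two neighboring source pixels.
        out.append((1 - t) * row[i] + t * row[i + 1])
    return out


if __name__ == "__main__":
    # A flat page is returned unchanged; a curved page comes out wider.
    print(dewarp_row([10, 20, 30, 40, 50], [0, 0, 0, 0, 0]))
    print(dewarp_row([0, 50, 100, 150, 200], [0, 3, 0, 3, 0], dx=4))
```

A real pipeline would work on a full two-dimensional surface and calibrated camera geometry. The sketch only shows why a depth measurement is enough to undo the distortion along one axis.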