This project was kicked off by an informal Discord chat earlier today (February 3, 2024).

Several large-scale free document repositories and projects have grown up over the past 10-20 years. Examples include Wikimedia Commons, The Wikipedia Library, Internet Archive, HathiTrust, Biodiversity Heritage Library and the Library of Congress. Several commercial efforts such as Google Books and Newspapers.com also provide some level of no-cost (but possibly license-encumbered) access to a variety of scanned materials. However, these efforts have only scratched the surface of the world's sequestered printed materials: some are professionally indexed and stored in climate-controlled libraries, others are stashed haphazardly in basements and garages. In addition, as copyrights expire and the public domain grows every year, an ever-expanding body of knowledge becomes available for legal uploading without restrictive licensing or paywalls.

Getting these materials digitized and made available online is one of the big tasks humanity faces over the next several decades. At some point, even the most dedicated archivists will no longer be able to justify the cost of keeping all this paper around, and it will be discarded. If it has not been digitized, the information it contains will be permanently lost to humanity.

Wikimedia New York City ("the chapter") rents space in Prime Produce. It is proposed that the chapter purchase a high-quality book scanner, install it at Prime Produce, and set it up as a public amenity for scanning books. A requirement of using the scanner will be that the scanned materials be uploaded to a free archive such as Wikimedia Commons or Internet Archive; it must therefore be ensured that scanned material is suitable for such an upload, i.e., in the public domain or under an otherwise free license. The chapter will hold scan-a-thons to which the public will be invited to bring materials to be scanned and receive help with scanning and uploading.

The initial feedback from the chapter is positive, and suggests that funding would probably be available for this. Initial feedback from Prime Produce is also positive; other resident organizations have previously considered the same idea. At this point, nothing has been decided beyond a general, informal agreement that this seems like a good idea. The purpose of this page is to work out what makes sense to buy and how much it costs, and to firm up commitments.

Hardware

The two basic hardware routes seem to be a dedicated book scanner or a camera on a copy stand. Some things to consider:

Internet Archive: How the IA scans material

Scanning Services

Digitizing Print Collections with the Internet Archive: Open and free online discovery and access, long-term storage and engineering file management, and unlimited downloads.

Tweet (or post) re: scanning

At the Internet Archive, this is how we digitize a book.

Blog post about the above: Meet Eliza Zhang, Book Scanner and Viral Video Star

A (re)Introduction to Book Digitization at the Internet Archive

49-minute video: A (re)Introduction to Book Digitization at the Internet Archive

From this 2021 webinar: A (re)Introduction to Book Digitization at the Internet Archive

Equipment at the Internet Archive: Table Top Scribe System

Description, specifications, and documentation for the IA's Scribe Station, which the IA uses to scan materials: Table Top Scribe System

Center for Jewish History

I visited the Center for Jewish History on 16th Street and saw their archives and scanning equipment firsthand, but can't find detailed info. I will check in with them and find out details. --CmdrDan (talk) 22:21, 21 February 2024 (UTC)

Book2net

Book2net.net: a manufacturer of some serious-looking hardware.

"book2net is your reliable partner for all aspects of cultural heritage digitization."

Book2net Case Studies:

Integrating university knowledge

The proposal at Special:Permalink/1211465891#Proposal for Integrating University Knowledge into Wikipedia, while unlikely to go anywhere, may be of interest. RoySmith (talk) 20:09, 2 March 2024 (UTC)