ABSTRACT
The selection, acquisition, and management of digital data are now part and parcel of the work librarians handle on a day-to-day basis. While much thought goes into this work, little consideration may be given to the long-term preservation of the collected data. Digital data cannot be retained for the future in the same way paper-based materials have traditionally been handled. Specific issues arise when archiving digital data and especially geospatial data. This article will discuss some of those issues, including data versioning, file size, proprietary data formats, copyright, and the complexity of file formats. Collection development topics, including what to collect and why, will also be explored. The work underlying this article is being done as part of an award from the Library of Congress's National Digital Information Infrastructure and Preservation Program (NDIIPP).
INTRODUCTION
Digital geospatial data are now routinely found in libraries that carry cartographic data, geologic information, social science datasets, and other materials in support of disciplines using Geographic Information Systems (GIS) in their research and work. Over the years, the data have been received on floppy disks, CD-ROMs, DVDs, and hard drives, or have been made available for free or for a fee over the Internet. In the paper world, ensuring the longevity of items means creating ideal conditions in which to store collections. Materials will last longer if they are kept in a cool, dimly lit space at the correct humidity and handled as seldom as possible.
The same is not true for digital data. As Clay Shirky (of New York University's Interactive Telecommunications Program) pointed out in July 2005 at the bi-annual meeting of the National Digital Information Infrastructure and Preservation Program (NDIIPP), digital materials must be touched and manipulated on a regular basis if they are to survive. Leaving digital data alone will certainly cause them to be lost, and the time frame may be surprisingly short. Technology is changing at such a rapid pace that it can now be a challenge to find a machine that will read floppy disks, much less the obsolete program with which the data were supposed to be used. Web sites can be and are removed at a moment's notice. This is especially frustrating for the federal depository libraries that formerly received paper copies of government information now available only in digital formats. Clearly, librarians must begin thinking about long-term preservation of their digital collections, from deciding what to collect to ensuring that it is preserved with the same thoughtfulness and care that is given to hardcopy materials.
THE LIBRARY OF CONGRESS AND THE NDIIPP AWARDS
In December 2000, Congress appropriated nearly $100 million to underwrite the cost of studying the issues related to the long-term preservation of digital data. The program was to be administered by the Library of Congress and was named the National Digital Information Infrastructure and Preservation Program (Library of Congress, 2006a). Conference Report H. Rept. 106-1033 stated that
The overall plan should set forth a strategy for the Library of Congress, in collaboration with other Federal and non-Federal entities, to identify a national network of libraries and other organizations with responsibilities for collecting digital materials that will provide access to and maintain those materials.... In addition to developing this strategy, the plan shall set forth, in concert with the Copyright Office, the policies, protocols, and strategies for the long-term preservation of such materials, including the technological infrastructure required at the Library of Congress. (Library of Congress, 2006b)
The goal of the program was to create a network of committed partners willing to work on the policies, protocols, and architectures needed to build a series of archives to house digital materials.
The first round of major funding was announced in September 2004, with eight projects receiving a total of $13.8 million over a three-year period. Two of these projects focused specifically on geospatial data. The North Carolina State University Libraries partnered with the North Carolina Center for Geographic Information and Analysis to create a model for archiving…