Slide 1
Continuity of Resources in the Electronic Age:
Preservation Issues in the Digital Library
Josip Mocnik
ASDAL 20th Annual Conference
Andrews University, July 14, 2000
Slide 2
Into the future…
If William Shakespeare had written Hamlet on a word processor,
If Thomas Jefferson had saved his drafts of the Declaration of
Independence with a computer text editor,
If Alexander Graham Bell had documented his experiments with the
telephone on floppy disks,
If Leonardo da Vinci had used a computer graphics system to create
the Mona Lisa…
Would their great achievements still be available to us today?
Slide 3
Outline of Talk
- Why bother?
- Terminology
- Research
- Solutions?
- The future
- and questions, questions…
Slide 4
Why bother?
It is essential to try to anticipate what the future holds for librarians and other information professionals in this time of rapid technological change. Everyone is affected, because we are living in a digital world.
[image missing]
Slide 5
I bother because…
- Part of graduate study and research at University College London focused on preservation of and access to digital material
- Attended Digital Reality II: Preserving Our Electronic Heritage (June 5, 2000, JFK Library, Boston, MA) – thank you JWL
Slide 6
Digital Reality II
- Presenters: Tim Berners-Lee (invented the World Wide Web in 1989; named by Time in May ’99 one of the 100 greatest scientists), Walt Crawford (Research Libraries Group), Paul Conway (Yale University), Fynnette Eaton (Smithsonian Institution), Jeff Rothenberg (RAND Corporation, CA)
- http://www.nelinet.net/conf/pres/pres00/digital.htm
Slide 7
Terminology
- Digital library – “provides the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities”
Don Waters
Slide 8
Terminology
- Digital material – extremely versatile but fragile material that may or may not have a print equivalent and is normally stored on magnetic media (tapes, floppy diskettes, hard drives) or optical media (CD-ROM or DVD), and/or may be available on the Internet
Slide 9
Terminology
Digital material is vulnerable due to:
- media obsolescence
- physical, chemical, magnetic effects, etc.
- format incompatibilities
- dependence on incompatible or obsolete software
- dependence on obsolete hardware, etc.
Slide 10
Terminology
Preservation – “Enable reliable, authentic, meaningful and accessible records to be carried forward through time within and beyond organizational boundaries for as long as they are needed for the multiple purposes they serve”
Sue McKemmish
Slide 11
Terminology
- Preservation of digital material – “creation of digital products worth maintaining over time” Paul Conway
- Main preservation issues: longevity (storage); quality (image and text data); integrity (metadata); accessibility (persistent functionality)
Slide 12
Life Expectancy of Digital Material?
[image missing]
Paul Conway
Slide 13
Terminology
- “Digital documents last forever – or five years, whichever comes first.” Jeff Rothenberg
[image missing]
Slide 14
Examples of lost digital documents
- U. S. Census information for 1960 (partly recovered, though widely reported as lost)
- Department of Health and Human Services (hundreds of reels of tape)
- Military files (Combat Area Casualty file for the Vietnam War; herbicide data needed to analyze the impact of Agent Orange)
- Scientific data (NASA mission tapes), etc.
Slide 15
Research
- British Library Digital Library Program (its Digital Library System, DLS, will try to guarantee continuing access to digital material)
- Digital Libraries Initiative (DLI and DLI-2; after 1998 places more emphasis on preservation and not only on digitization)
- NSF/JISC (Joint Information Systems Committee) International Libraries Initiative – deals with standards for preservation and access
Slide 16
Research
- PANDORA Project (National Library of Australia) – access and preservation were essential from the outset of the project
- Universal Preservation Format, UPF – deals with access and preservation of media material
- The Digital Rosetta Stone, DRS – US Air Force conceptual model project with the objective of maintaining access to digital material in its original format – sounds promising
Slide 17
Research
- InterPARES (International Research on Permanent Authentic Records in Electronic Systems)
- Attempts to develop theoretical and methodological knowledge essential for permanent preservation of electronic records
- Four domains of investigation
- Seeks to identify the elements of digital records that are necessary to establish authenticity
- Seeks to identify appraisal criteria for different digital material
- Deals with preservation and its implementation
- Works on formulating policies and standards for each individual institution
Slide 18
Research
- InterPARES
- American team focuses on digital records in government and universities by testing a wide range of digital records
- http://www.interpares.org
- http://is.gseis.ucla.edu/us-interpares/index.html
Slide 19
Research
- San Diego Supercomputer Center – tries to maintain access to digital material until better technology evolves
- “Collection-Based Persistent Digital Archives” (creation of a collection of 1 million e-mail messages) includes the best of:
- Archival storage technology
- Data grid technology
- Information models
- Preservation models
Slide 20
Research
- Open Archival Information System (OAIS)
- Developed within the Consultative Committee for Space Data Systems (CCSDS), an international body in which NASA takes part
- A conceptual framework for an archival system to preserve and maintain access to digital material
- Relatively widely accepted (CEDARS, PANDORA, NEDLIB…)
- http://www.ccsds.org
Slide 21
Solutions?
- Technology preservation
- Museum of dead technologies – keeps every piece of hardware and software that ever existed!
Slide 22
Solutions?
- Migration – “periodic transfer of digital materials from one hardware/software to a subsequent generation” Margaret Hedstrom and Sheon Montgomery
- Change media – moving from highly software-dependent formats to less software-dependent ones, or copying to stable hard copy/microform – refreshing/recopying
- Backward compatibility with older versions
- Interoperability of different application packages
- Conversion to standard formats
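The refreshing/recopying step above can be sketched as follows. This is a minimal illustration, not a method from the talk: the function names `sha256_of` and `refresh_copy` are hypothetical, and comparing SHA-256 checksums is an assumed way to confirm that the copy on the newer medium is bit-identical to the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source: Path, target: Path) -> str:
    """Copy source to target (standing in for newer media) and verify the copy.

    Hypothetical helper: raises ValueError if the two checksums differ,
    otherwise returns the checksum so it can be stored with the file.
    """
    before = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    after = sha256_of(target)
    if before != after:
        raise ValueError(f"copy of {source} does not match the original")
    return after
```

Note that a verified copy only protects against media decay. It leaves the file format unchanged, so it does nothing about obsolete software or hardware, which is what the remaining migration strategies try to address.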
Slide 23
Solutions?
- Migration
- Expensive
- Full functionality may or may not be preserved
- Labor-intensive
- Standards are evolving – not realistic to expect universal standards in the foreseeable future
- Only partially successful, and thus ultimately infeasible as a long-term solution, but do we have anything better?
Slide 24
Solutions?
- Emulation
- “process that performs the function of something else” Jeff Rothenberg
- Mimicking in software a piece of hardware/software to provide the closest possible capabilities to the actual original
- Not the same as simulation (e.g. airplane simulators don’t leave the ground)
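To make the idea concrete, here is a toy sketch of emulation. Everything in it is an assumption made for illustration: the "obsolete platform" is an invented stack machine with four opcodes (`PUSH`, `ADD`, `MUL`, `PRINT`), and `run_legacy_program` is a hypothetical name. The point is that the old program is left untouched, and only the emulator is written for the current platform.

```python
from typing import List, Tuple, Union

# One instruction of the invented machine: an opcode plus an optional operand.
Instruction = Union[Tuple[str], Tuple[str, int]]


def run_legacy_program(program: List[Instruction]) -> List[int]:
    """Execute a program for the hypothetical stack machine; return its output."""
    stack: List[int] = []
    output: List[int] = []
    for instruction in program:
        opcode = instruction[0]
        if opcode == "PUSH":
            stack.append(instruction[1])
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == "PRINT":
            # Record the value on top of the stack without removing it.
            output.append(stack[-1])
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return output


# An unmodified "legacy" program: computes (2 + 3) * 4 and prints the result.
legacy_program: List[Instruction] = [
    ("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("PRINT",),
]
print(run_legacy_program(legacy_program))  # prints [20]
```

A real emulator has to reproduce an entire hardware and software platform rather than four opcodes, which is where the expense and expertise come in, but the division of labor is the same: the preserved material stays in its original form.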
Slide 25
Solutions?
- Emulation
- Expensive
- Time consuming
- Considerable expertise needed
- Yet, it needs just one emulator specification per old/obsolete platform
Slide 26
Migration vs. Emulation
Migration:
- Needs intervention
- Evolving functions
- Possible corruption
- Uncertainty
- Depends on value and use
Emulation:
- Constant
- Fixed functions
- More stable
- Greater certainty
- Independent of value and use
Slide 27
The future
How can libraries, archives and other similar institutions cope with the increasing volume of digital information?
Will future generations be able to access our digital information at least in the same way we can access ancient manuscripts?
Slide 28
And the ultimate solution is…
“Exercise is always the best preserver. Major religious works are kept impressively persistent because each age copies, analyzes and uses them”
Stewart Brand
Slide 29
Thank you…
[image missing]