AALL 2006 - A4: Preservation of Digital Information

Submitted by Tom Boone on July 9, 2006 - 10:26am.

A4: Preservation of Digital Information: Global Trends in Digital and Analog Archiving
Stephane Cottin, Constitutional Council of France
Pascal Petitcollot, General Secretariat of the French Government
Russell J. Burkel, Analog Imaging LLC
Terrence McCormack (Moderator), University of Buffalo (SUNY)
Jerry Dupont, Law Library Microform Consortium

Stephane Cottin - Status and best practices for digital preservation in Europe

With 25 member countries, keeping track of European trends is a complex task. The EU has passed regulations concerning the formatting of data, and it finances only those archive projects that meet these norms.

In France, many different archiving norms are in use. Only one organization is particularly concerned with standards, and that organization itself uses many different norms. Because there is a legal obligation to preserve digital information concerning legal texts, a lot of preservation is taking place, but with no standardized method. Thus, there are many methods.

The government of France has called for contributions to create best practices to support interoperability in digital archives, and the contributions are all being collected via an online wiki.

Pascal Petitcollot - Preservation of legal digital information in Europe

Complete preservation of all available legal information in France (no matter how old) is desired in order to guarantee a fair application of the law, but digital versions need the same permanence and accessibility that traditional print resources have.

France requires that digital legal texts be electronically signed and guaranteed to be complete. The French legal Gazette is published simultaneously in both print and electronic versions. In isolated situations, however, some information is available only in print and some only in electronic form. The government also requires that the electronic edition be accessible for free. To ensure permanence, a DVD of the e-Gazette is deposited in the national archives each month.
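One small piece of "guaranteed to be complete" can be illustrated with a fixity check: record a cryptographic digest when a file is archived, then recompute it later to confirm nothing has changed. The sketch below is only an illustration under stated assumptions. It uses a SHA-256 digest, which is not an electronic signature and is not known to be the method the French government uses, and the function names `sha256_of` and `is_unchanged` are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large archive files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Hypothetical check: does the file still match the digest recorded at deposit?"""
    return sha256_of(path) == recorded_digest.lower()
```

A digest like this only shows that a copy is bit-for-bit identical to what was deposited. Proving who issued the text would additionally require a real signature scheme.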

Standards for storing these documents are still needed, and cooperation between all the players is essential. In addition, mirroring and archiving are necessary. The Gazette currently uses XML for storing metadata. Since it is anticipated that the print Gazette will eventually cease to exist, common rules for archiving must be developed as quickly as possible. The cost will be high, and it still must be determined how that cost will be divided among all the involved parties.

Russell Burkel - Digital Amnesia

Digital versions are not always superior to analog ones, because platforms and standards keep changing. Within 16 years of its creation, the digital version of the Domesday Book had become unreadable, yet the analog original, created more than 900 years ago, is still perfectly readable.

Vendors blame cost for not developing standards that ensure long-term readability. No vendor wants to be the first to abandon its own proprietary format, and even when standards exist, they tend to live a short life.

Most digital storage media have a short life span. Microform and linen rag paper, on the other hand, have a life expectancy of 500 years, which seems to be long enough to satisfy most everyone's desire for permanence.

So is Optical Character Recognition (OCR) the solution for creating digital versions of analog archives? Traditional OCR systems can be as much as 99.5% accurate, but anything less than 100% requires extensive editing.
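To see why 99.5% is not as good as it sounds, here is a rough back-of-the-envelope sketch. The page and volume sizes are assumptions chosen for illustration (they were not given in the session), and `expected_ocr_errors` is a hypothetical helper name.

```python
def expected_ocr_errors(char_count: int, accuracy: float) -> float:
    """Expected number of misread characters at a given per-character accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return char_count * (1.0 - accuracy)


# Assumed figures for illustration only.
CHARS_PER_PAGE = 2_000
PAGES_PER_VOLUME = 300

per_page = expected_ocr_errors(CHARS_PER_PAGE, 0.995)
per_volume = expected_ocr_errors(CHARS_PER_PAGE * PAGES_PER_VOLUME, 0.995)
print(f"about {per_page:.0f} errors per page, {per_volume:.0f} per volume")
```

Under these assumptions, 99.5% accuracy leaves roughly 10 errors on every page, or about 3,000 in a single volume, each of which someone has to find and correct.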

OCR-B is a viable alternative. It is a typeface designed to be easily read by both humans AND machines. With OCR-B as a standard, even future digital replacement systems could use the data. (Few currently consider that something will eventually replace our current digital systems.)

So why aren't we taking advantage of this simple, straightforward preservation solution?

Jerry Dupont - Digital Preservation in the U.S. legal profession

There are two separate and distinct areas in the world of digital archiving: digital conversion of existing print materials and the preservation of "born-digital" materials.

Americans have always been happy with reprints instead of originals, so long as they are reliable.

Digital archives make materials far more accessible than analog ones, but the life span of digital storage media is significantly shorter than that of analog media. Thus, an analog archive is mandatory. LLMC makes this part of its own process.

Everyone is excited by hardware that eases the conversion of print to digital. But the most important hardware is that which converts digital to analog, because it ensures that the user-friendly digital archive will remain available thanks to the analog backup.

Another mundane yet essential task taken on by LLMC: keeping track of WHERE analog archives exist. To this end, LLMC has created an informational database so that libraries planning to discard a print publication can first make sure their copy is not the last known one.
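The "last known copy" check is simple enough to sketch in a few lines. This is a toy illustration, not a description of how LLMC's database actually works; the class `HoldingsRegistry`, its methods, and the library names are all hypothetical.

```python
from collections import defaultdict


class HoldingsRegistry:
    """Hypothetical registry of which libraries hold a given print title."""

    def __init__(self) -> None:
        # Maps a title to the set of libraries known to hold a copy.
        self._holders = defaultdict(set)

    def record(self, title: str, library: str) -> None:
        """Note that a library holds a copy of the title."""
        self._holders[title].add(library)

    def withdraw(self, title: str, library: str) -> None:
        """Note that a library no longer holds the title."""
        self._holders[title].discard(library)

    def is_last_known_copy(self, title: str, library: str) -> bool:
        """True if this library is the only known holder of the title."""
        holders = self._holders.get(title, set())
        return holders == {library}


registry = HoldingsRegistry()
registry.record("Session Laws 1850", "Library A")
registry.record("Session Laws 1850", "Library B")

# Library A checks before discarding: another copy is known to exist.
print(registry.is_last_known_copy("Session Laws 1850", "Library A"))
```

Once Library B withdraws its copy, the same check for Library A returns True, which is the signal not to discard.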