I attended a conference in Chicago, “Persistence of Memory: Sustaining Digital Collections,” on the topic of digital preservation, co-sponsored by the Society of American Archivists and the Northeast Document Conservation Center. I was there in December when sleet and Governor Blagojevich were falling.
My delay in blogging this conference is in no way an indication of the importance of the topic or of my interest in it. The information is timely today and will remain so for some time. The conference addressed the question of digital longevity for collections of digitized and born-digital resources. It highlighted evolving best practices for digital preservation to help with the life-cycle management of our digital collections.
Upon settling into my room that first afternoon, I was stunned to see a gorgeous view of Lake Michigan from my 15th-floor window. I was staying in the Intercontinental Hotel on North Michigan Avenue, right in the middle of the “Magnificent Mile.” Lights adorned Michigan Avenue and every shop window. Nordstrom’s and Eddie Bauer were across the street from the Intercontinental’s entrance and Bloomingdale’s, Saks, and Macy’s were just down the way. I did a lot of window shopping and a bit of buying!
Before the conference began, I quickly made my way to the Art Institute (AIC), my second visit in more than 10 years. The museum’s majestic entrance, adorned by two huge lions, made my heart beat a little faster as I anticipated the treasures within. Enjoying a “busman’s holiday,” I went first to the Rare Books Reading Room of AIC’s Ryerson and Burnham Libraries. On exhibition was “Art through the Pages: Library Collections at the Art Institute of Chicago.” The exhibition displayed artists’ books, ephemera from the archives, examples of fine printing, and much more, including a copy of the Nuremberg Chronicle (printed by Anton Koberger, 1492). We also have a 1492 Nuremberg Chronicle in our own Rare Books Collection at ZSR and it is on display in our Reading Room!
Next I went to “Of National Interest: Photographs from the Collections.” Pulling from the Art Institute’s Julien Levy Collection (Levy was the legendary gallery director who assembled the first exhibition of Cartier-Bresson’s photographs in the United States), this exhibition provided a rare glimpse into the early stages of Cartier-Bresson’s career. Work by his painting instructor André Lhote parallels Cartier-Bresson’s early photographs, as does that of Salvador Dalí. Also included were works by Giorgio de Chirico, Henri Matisse, Piet Mondrian, and Pablo Picasso that relate to photographs by Brassaï, André Kertész, and other photographers active in Paris between the World Wars.
Another favorite of mine was “Walt Disney and Bill Peet: The Storymen.” After a 27-year career working as Walt Disney’s principal animator and main “storyman,” Bill Peet devoted himself full-time to writing and illustrating children’s books. Along with sketches and storyboards from his Disney days, this exhibition featured original works of art from 14 of Peet’s 34 published books, including Buford the Little Bighorn, The Caboose Who Got Loose, Capyboppy, Chester the Worldly Pig, Cowardly Clyde, Ella, How Droofus the Dragon Lost His Head, Kermit the Hermit, Pamela Camel, The Wump World, and the Caldecott Honor Book Bill Peet: An Autobiography.
The conference began the next morning with the keynote address, “Preservation in the Age of Google,” by Paul Conway, professor at University of Michigan School of Information. The address focused on the question of how the cultural heritage community can embrace the innovative aspects of the digital world while preserving the values that have motivated stewardship efforts for generations. He reviewed the past decade’s transformation in the uses of digital technologies and highlighted some dilemmas for the community that may only resolve through a fundamental shift in how we conceive, manage, and fund preservation activities.
Conway discussed HathiTrust, a digital repository for the nation’s great research libraries, bringing together the immense collections of partner institutions. Hathi (pronounced hah-tee) is the Hindi word for elephant, an animal highly regarded for its memory, wisdom, and strength. In combination, the words “hathi” and “trust” convey the key benefits researchers can expect from a first-of-its-kind shared digital repository.
HathiTrust was conceived as a collaboration of the thirteen universities of the Committee on Institutional Cooperation and the University of California system to establish a repository for these universities to archive and share their digitized collections. The Committee on Institutional Cooperation (CIC) is the academic equivalent of the Big Ten. It includes the University of Illinois, the University of Chicago, University of Illinois at Chicago, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Penn State University, Purdue University and University of Wisconsin-Madison.
The University of Michigan, Indiana University, and the University of California system, all highly regarded for their digital libraries and project management, are leading the partnership effort through their expertise and financial commitment. All members of the CIC are founding partners.
Partnership is open to all. Researchers benefit from the reputable curation and consistent access long associated with research libraries, but instead of having to search across each institution’s repository, they can search a single shared collection. All content to date has been supplied by the University of Michigan and the University of Wisconsin, both leaders in mass digitization efforts. The University of California, the University of Virginia, Indiana University and Purdue University will soon be contributing their digital materials.
HathiTrust complements Google’s massive undertaking to digitize the world’s library collections. While both systems offer digitized books via the Internet, it is likely that HathiTrust will provide some content Google will not, such as digital collections unique to each institution, works from institutional repositories, and native born-digital materials.
HathiTrust is making bibliographic records for its public domain materials available so that institutions around the world can load them into their online catalogs, alerting users to the availability of these digitized volumes. Currently there are 2,678,060 digitized volumes comprising 937,321,000 pages (32 miles and 2,176 tons of material), of which 410,586 volumes (about 15% of the total) are in the public domain.
For more on HathiTrust, see http://www.hathitrust.org.
Conway continued by noting certain dilemmas in preservation of digital materials, such as the quality of reformatting. Do we establish separate preservation standards for all media or do we let access drive digitization quality standards? Digitization and digital preservation are ever increasing in technical complexity. Do we retrain/recruit or outsource?
Conway’s first recommendation was to distinguish digitization from digital preservation. The realm of digitization is very local and highly collaborative, while digital preservation is (currently) the province of “designated communities” at a national and international level. Another recommendation was to establish new community priorities by: 1) building environmental facilities, since the need for print will not go away no matter what we do with digitization and digital preservation; 2) building capacity to digitize audio, video, visual, graphic, and film materials; 3) selecting collections for impact and distinctiveness.
His final recommendations for institutional digital preservation were: 1) know who you serve and support; 2) choose for use, build for reuse; 3) choose your partners carefully; 4) use other people’s solutions.
Bernard F. Reilly, President of the Center for Research Libraries, a partnership of 245 U.S. and Canadian universities, colleges and independent research libraries, spoke on “Digital Collections that Persist: Learning from the Corporate Sector.” He noted that the survival of much data will rely upon actions taken by consortia of scientists and researchers, institutes, and corporations like ProQuest, LexisNexis, and the Associated Press. “It’s not just libraries,” he said. “There are much larger stakeholders and we need to learn together.” He reminded us that transparency is critical, as is functionality, realistic timeframes, and diversification of products.
Robin L. Dale, Associate University Librarian at the University of California, Santa Cruz, spoke on “Trusted Digital Repositories: What You Need to Know Beyond the ‘Alphabet Soup’ of Standards.” “Trusted digital repositories (TDR),” she said, “are part unicorn, part leprechaun, and part pot o’gold at the end of the rainbow.” She said right off the bat that UC Santa Cruz should not build a digital repository, for it is far too small to do so.
Dale stated that the repository scene is semantically confusing, what with digital repositories, institutional repositories, digital archives, data archives, digital object management systems, and digital asset management systems. To understand and explain what a given repository is, one must consider: 1) orientation; 2) coverage; 3) object management; and 4) preservation.
Trustworthy Repositories: Audit & Certification (TRAC) can serve as a guide and is helpful for: 1) digital repository planning; 2) digital repository audit/evaluation; 3) evaluating third party service provider capabilities. The German counterpart is nestor’s Criteria Catalogue for Trusted Repositories. Dale said that trustworthiness comes only through transparency and only a very few institutions are willing to go through certification since folks are not readily willing to “have their laundry exposed.”
She went on to say that, in order to move principles into practice, an institution should: 1) ground principles locally; 2) identify institutional digital preservation needs; 3) identify resources available; 4) identify what sustainable services the institutions want to and can provide locally. She emphasized the need to fight the “one size fits all” mentality. Repositories will differ but good practices and reliable technical infrastructure are universal.
The 5 Maori Principles for a trusted repository are:
1. Receive information with utmost accuracy
2. Store information with integrity beyond doubt
3. Retrieve the information without amendment, non-corrupted
4. Apply appropriate judgment in the use of information
5. Pass information on appropriately and accurately
Dale gave HathiTrust as an example of a trusted collaborative digital repository with shared, transparent leadership; sustainable funding by 24 major academic institutions; a 5-year plan; persistent, high availability storage; and documented long-term preservation commitment. In 5 more years, Dale hopes that not many more TDRs will exist. She closed by urging us to minimize silos and maximize repository sustainability by working with partners while maintaining local, stable, and secure access services.
Andy Kolovos, Archivist at Vermont Folklife Center, discussed “Audio Preservation Digitization: Best Practices and Smaller-Scale Solutions,” providing an introduction to current best practices for audio preservation digitization in the archival context. He covered activities from the playback of analog source material to the storage and management of digital audio files. He provided insight into the nature of analog and digital audio, explaining what digital standards mean, and provided suggestions for preservation assessment, vendor selection, and scalable, sustainable approaches for the archival storage of digital audio files. He mentioned George Blood as a vendor that transfers older audio recordings for smaller libraries that do not own the necessary equipment. (George Blood is a vendor we have used in Special Collections for audio file transfers for our audio collections in Rare Books and Archives.)
Standard rules of digital preservation are:
2. Redundancy: storing files in more than one place and creating multiple back-ups
3. Migration: shifting files before old media and formats are no longer supported
4. Documentation: tracking metadata
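For readers who like to see such rules in concrete form, here is a minimal Python sketch of the redundancy and documentation rules. It is my own illustration, not something Kolovos presented: the function names, the choice of SHA-256 checksums, and the storage folders are all assumptions. It copies a file to several locations, records a checksum as a small piece of metadata, and later audits each copy against that checksum.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_redundantly(source: Path, storage_dirs: list[Path]) -> dict:
    """Copy `source` into every storage directory and record its checksum.

    The directories are hypothetical stand-ins for separate storage locations.
    """
    checksum = sha256_of(source)
    copies = []
    for directory in storage_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)
        # Verify each copy immediately after writing it.
        if sha256_of(target) != checksum:
            raise IOError(f"copy to {target} failed verification")
        copies.append(str(target))
    # This record is the "documentation": keep it with the collection metadata.
    return {"file": source.name, "sha256": checksum, "copies": copies}


def audit(record: dict) -> list[str]:
    """Return the copies that are missing or no longer match the checksum."""
    damaged = []
    for copy in record["copies"]:
        path = Path(copy)
        if not path.exists() or sha256_of(path) != record["sha256"]:
            damaged.append(copy)
    return damaged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "interview.wav"
        original.write_bytes(b"placeholder audio data")
        record = store_redundantly(original, [root / "site_a", root / "site_b"])
        print(record["sha256"], audit(record))
```

Running an audit on a schedule is what turns a pile of back-ups into something closer to preservation: a damaged copy can be replaced from a good one before the damage spreads to every location.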
Practical approaches for small scale institutions:
1. Use a high quality CD-R burner (e.g., Plextor)
2. Use high quality CD-R media (e.g., MAM-A 74 min/650 MB gold in jewel case)
3. Burn at no slower than 8x and no faster than 16x
4. Write on plastic inner ring only; no labels
5. Remember: DVD is not considered an archival storage medium, only short term use.
Sarah Stauderman, Preservation Manager of the Smithsonian Institution Archives where she oversees the care of paper, book, photograph, moving image, and recorded sound materials, spoke on “Magnetic Videotape Recordings: Preservation, Assessment, and Migration.”
As preservation manager at the Smithsonian, Sarah has the opportunity to see, restore, and preserve many, many things. Her telling of preserving “panda porn” was hilarious as she reminded us of the months and months of panda dating before a baby panda came to be. Of course, the zoologists were video recording all for posterity so the world could see and learn. Maybe the pandas too!
Stauderman, author of the web site “Video Format Identification Guide,” shared assessment tools to determine which video collections to migrate: 1) date of tape; 2) content of tape; 3) known storage needs; 4) obsolescence rate; 5) level of risk.
Basic preservation guidelines to follow are: 1) replace tapes every 10-30 years; 2) examine visually to assess quality of playback. Agents of deterioration are: 1) heat; 2) light; 3) excessive moisture; 4) extreme mechanical stress; 5) dust (clean is very, very important!). Basic housekeeping requires: 1) dust free environment; 2) grounded metal shelves; 3) upright, like books; 4) wound position.
Appropriate climate for storage is: 10-year storage = 46-73 degrees F and 15-50% relative humidity (RH); 50-year storage = 51 degrees F and 50% RH, held constant. Never place magnetic media under 46 degrees F.
She developed and shared with us a helpful “Preservation Priority Worksheet for Videotape Collections” to evaluate video collections for preservation and reformatting as well as a matrix to determine priorities.
Richard Rinehart, digital media artist and Digital Art Curator at the University of California, Berkeley Art Museum/Pacific Film Archive, spoke on “Preserving Digital Art: New Medium and Social Memory.” Richard manages research projects in the area of digital culture, including the NEA-funded project “Archiving the Avant Garde,” a national consortium of museums and artists distilling the essence of digital art in order to document and preserve it. Rinehart’s papers, projects, and more can be found at www.coyoteyip.com.
Rinehart elaborated that works of digital and Internet art represent some of the most compelling and significant artistic creations of our time. These works constitute a history of alternative artistic practice, but because of their ephemeral, technical, or variable natures, they also present significant obstacles to accurate documentation, access, and preservation. Rinehart’s talk outlined the problems and latest innovative approaches to preserving new media art, including the Variable Media Initiative and the Media Art Notation System where metadata can be supplied.
He presented and explored ways to come to a better understanding of media art forms. He showed us the Rhizome ArtBase (http://rhizome.org/art/?tag=folksonomy), founded in 1999. ArtBase is an online archive of new media art containing some 2470 art works, and growing. The ArtBase encompasses a vast range of projects by artists all over the world that employ materials such as software, code, websites, moving images, games and browsers to aesthetic and critical ends. Rhizome also supports Creative Commons licenses, which allow creators to shift the terms of copyright from “All Rights Reserved” to “Some Rights Reserved,” therefore enabling authors to mark their creative works. Rhizome’s hope is that through the use of these licenses, artists will have greater access to each other’s work in furtherance of their goals.
Richard also shared with us information regarding the University of California, Berkeley Art Museum and Pacific Film Archive (BAM/PFA)’s Open Museum, the first “open-source” museum collection. The Open Museum, currently in the early stages of planning and development, will be:
- A preservation repository for born-digital art works
- A public website that allows unprecedented access to these works
- An innovative legal, economic, and cultural framework for the digital arts
Building on the foundation of research resulting from past digital media projects pioneered by BAM/PFA – including an NEA-funded national consortium project to preserve digital art – the Open Museum project will develop systems that will preserve and provide open access to “born digital” art, that is, art that was created and exists in a digital format. What underlies all digital art is code, code that can be distributed online via the Open Museum. The digital art in the Open Museum will be open source in the sense that, with the permission of participating artists, source code for art works will be free for others to download, study, or re-mix into new works. See more at http://openmuseum.berkeley.edu/.
Rinehart also mentioned Franklin Furnace Archives, Inc. for Artists (http://www.franklinfurnace.org). Franklin Furnace is a small New York City arts organization that has been at the forefront of contemporary site-specific and performance art for the past thirty years. Over the years, Franklin Furnace has maintained archival records documenting its activities. Today the curation of these archives, along with continued grants for new performance and digital art, constitute Franklin Furnace’s main activities in pursuit of its goal to “make the world safe for avant-garde art.”
Other sites he mentioned concerning digital art were “Forging the Future: New Media at the University of Maine” (http://newmedia.umaine.edu/feature.php?id=685) and “The Long Now Foundation” (http://www.longnow.org/), a California think tank established in 1996 to foster “creativity and long term thinking for the next 10,000 years”! (Forget the 5-year plan; let’s go for 10,000!!)
Such a conference cannot be without a lecture on copyright!
Ken Withers, Director of Judicial Education and Content, The Sedona Conference for trial lawyers, in-house counsel, and judges (www.thesedonaconference.org), delivered “Electronic Collections and the Law: Atticus Finch Visits the Digital Archives.” Withers said that the information explosion is affecting the law and litigants who used to bury each other in paper and are now burying each other in electrons, bringing into question the nature of information, evidence, and truth. Judges are rendering opinions on such diverse issues as the appropriate format for digital preservation of evidence, the value of metadata, and the efficacy of Boolean search methodologies for text retrieval.
Suddenly, he said, the legal profession has discovered the value of good digital archiving practices. This means, as a result, that there is a market for research, development, and the application of new digital preservation techniques that reach far beyond museum and libraries, and that there is the prospect of partnerships between academia, government, the corporate world, and the legal profession to advance mutual interests in digital preservation.
“Today,” he said, “a young person graduating from law school and joining a large firm in one of our major cities can look forward to perhaps three or four years of doing nothing but sitting in front of a computer screen reviewing e-mail and other electronic documents for litigation.”
Digital e-mail evidence must stand up to certain questions to determine admissibility in court:
1. Is the email message relevant?
2. Is the email message authentic?
3. Is the email message hearsay?
4. If it is hearsay, does an exception to the hearsay rule apply?
5. Does the email message satisfy the “original writing” rule?
In an age discrimination case, Williams v. Sprint (D. Kansas, Sept. 29, 2005), a reduction in work force resulted in age discrimination charges. Excel spreadsheets were requested by the court; .pdf files were provided. The court ultimately found the .pdf format inadequate and ordered production of materials “in native format.”
Withers said the University of Arizona Digital Information Management Certificate Program is one to consider for those seeking further education in the field of digital preservation.
The next speaker was Shelby Sanett, Visiting Assistant Professor at Pratt Institute School of Information Studies in New York where she teaches the Conservation and Preservation course. She also works at the National Archives and Records Administration (NARA) in College Park, MD as a Management and Program Analyst Team Leader in the Office of Space and Security Management.
Sanett discussed “Building a Successful Digital Preservation Program.” Sanett recently completed a seven-year study of the digital preservation program at the National Archives of Australia. The study focused on three core areas of practice in the emerging digital preservation program: staffing, costs, and policy.
She said to consider, before establishing a digital preservation program, the following:
1. Why is the program needed?
2. Is it mandated by law, the organizational mission or something else?
3. Can it be scalable to the organization’s needs?
4. What is the timing for this project to be initiated, the pilot run and the decision to go forward?
5. Who will lead this project and what qualifications are necessary?
6. What skills are needed?
7. Where in the organization will this program be located?
8. What position does the program director report to?
9. Does the department where the program will reside share resources and staff with
Katherine Skinner, Digital Projects Librarian at Emory University and author of “Strategies for Sustaining Digital Libraries” (2008), spoke on “Collaborative Adventures in Digital Preservation: Creating and Sustaining External Partnerships.” Her presentation addressed some of the major opportunities and challenges presented by cross-institutional collaborative activities in the field of digital preservation. Some emerging organization models include MetaArchive Cooperative; HathiTrust (discussed earlier); Arizona Persistent Digital Archives and Library System (PeDALS); and Alabama Digital Preservation Network (ADPN). Emerging standards include OAIS Reference Model; Preservation Metadata (PREMIS); Trustworthy Repositories Audit and Certification (TRAC); and Digital Repository Audit Method Based on Risk Assessment (DRAMBORA = U.K. model).
NEDCC’s Stewardship of Digital Assets 2007-2008 surveys indicated that while 94.7% of respondents were engaging in backup strategies, only 21% employed off-site storage of backups. Another 16.7% reported that they were creating no metadata for their digital collections. Only 13.6% had a digital preservation plan and 12% reported operating a digital preservation solution. The goal of all digital preservation, Skinner said, is the accurate rendering of authenticated content.
The MetaArchive Cooperative, begun by Emory University, is an independent, unincorporated, international membership association. Its purpose is to support, promote, and extend the MetaArchive approach to “distributed digital preservation” practices (http://www.metaarchive.org). “Distributed digital preservation” is the distribution, management, and maintenance of digital information over a wide geographical area and over a long period of time, maintaining its viability, authenticity, and accessibility across changing technologies, formats, and user expectations.
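To make the idea of geographically distributed copies a little more tangible, here is a small Python sketch of one way such copies could check on each other. This is purely my own illustration and not a description of the MetaArchive’s software, which Skinner’s talk did not detail at this level: the site names, the function names, and the simple majority-vote rule are all assumptions. Each site holds a copy; the digest held by a strict majority is treated as authentic, and dissenting copies are replaced from a good one.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a copy's content."""
    return hashlib.sha256(content).hexdigest()


def find_consensus(replicas: dict[str, bytes]) -> tuple[str, list[str]]:
    """Return (agreed_digest, dissenting_sites) for copies held at several sites.

    Raises ValueError when no digest is held by a strict majority of sites,
    because in that case the sketch cannot tell which copy is authentic.
    """
    if not replicas:
        raise ValueError("no replicas to compare")
    digests = {site: digest(content) for site, content in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority among replicas")
    dissenters = sorted(site for site, value in digests.items() if value != winner)
    return winner, dissenters


def repair(replicas: dict[str, bytes]) -> dict[str, bytes]:
    """Return a new mapping in which dissenting copies are replaced by a majority copy."""
    winner, dissenters = find_consensus(replicas)
    good_copy = next(c for c in replicas.values() if digest(c) == winner)
    return {
        site: (good_copy if site in dissenters else content)
        for site, content in replicas.items()
    }
```

For example, given three hypothetical sites where one copy has been altered, `find_consensus` names that one site as the dissenter and `repair` returns three identical copies. The point of the design is that no single site has to be trusted as the master copy.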
To date, nine sustaining members make up the MetaArchive Cooperative with 7 others under contract:
- University of Louisville
- Florida State University
- University of Hull (U.K.)
Membership is available at varying levels, from Sustaining Members ($5000/year; 3-year term, with representation on the Steering Committee) to Preservation Members to Contributing Members, who do not host the infrastructure but need to preserve their materials and are contributing content ($200/year; 3-year term). (Wake Forest is a very good candidate for the Contributing Member level.) Services for members include:
- Investing in a commonly owned solution, not purchasing a service
- Sharing technological development and organizational tasks
- Decentralization of preservation with a shared commitment to preservation
- Decreasing dependence on third party solutions
- Increased capacity for acting as a community of cultural stewards
Examples of MetaArchive’s materials:
- Born digital and digitized collections
- Digital image, sound, and video files
- Datasets and databases
- Electronic theses and dissertations (ETDs)
MetaArchive can ingest digital objects and their metadata from the web, OAI, CONTENTdm, DSpace, and Fedora. Collection-level metadata is required for retrieval purposes; item level is not.
There are 3 “archives” to date:
1. Southern Digital Culture (which all members are a part of)
2. Electronic Theses and Dissertations
3. History of the Slave Trade
More than 200 collections are preserved to date. New archives are established at member requests and curatorial decisions are made by the contributing institutions.
Simon Tanner, Director of King’s Digital Consulting Services (KDCS), King’s College, London, addressed “Making Digital Preservation Affordable: Values and Business Models.” Tanner discussed strategies for effectively financing digital preservation. Users and patrons, he said, ultimately define the economic factors by which digital information is valued, used and retained. In preparing to finance digital preservation, he shared a number of different issues to consider including business planning, risk management, possible revenue streams and a clear cost benefit relationship.
Tanner declared, “Time is the scarce resource in the information age” and it is time we are trading for an “attention economy.” To our benefactors, access is valued more than preservation, and materials selected for preservation should reflect an understanding of future significance. Items of high consequence for the future must be preserved no matter the risk, according to Tanner. For understanding users, he quoted Glen Campbell (for those of you old enough to remember him!!) and his song “Wichita Lineman”: “I want you more than need you and I need you for all time.”
Tanner referenced Princeton University’s “Global Consciousness Project” (http://noosphere.princeton.edu/) and also Global Voices Online (http://globalvoicesonline.org/about/). The Global Consciousness Project’s purpose is to examine subtle correlations that may reflect the presence and activity of consciousness in the world. The Global Consciousness Project, also called the EGG Project, is an international, multidisciplinary collaboration of scientists, engineers, artists and others. Global Voices Online is a leading participatory media news room for voices from the developing world. Begun in 2005 as a simple blog hosted at the Berkman Center for Internet and Society at Harvard University, Global Voices has grown into a vibrant global community of more than 150 active volunteer authors and translators and more than 20 freelance part-time regional and language editors.
Tanner encouraged us to pitch our own digital projects by finding ways to express interest, excitement and activity. He said to find ways to “lever” our products in order to receive funding and find benefactors, understanding that preservation is synonymous with long term access.
The final speaker, David Liroff, Senior Vice President, Corporation for Public Broadcasting, addressed “Toward an Emerging Global Consciousness.” He noted that Marshall McLuhan famously popularized the idea of “the global village,” a world of simultaneous happening. Less well known is the fact that McLuhan’s mentor was a Jesuit paleontologist named Pierre Teilhard de Chardin who – more than a half century ago – anticipated the Internet when he envisioned a “membrane of information enveloping the world.” Liroff said it is fully appropriate, when we address the challenges of persistence of memory and sustaining digital collections, to do so in the context of an emerging global consciousness in which every object, every memory, has its place in a greater order.