Last Wednesday I traveled with Rebecca and Tanya to CurateGear 2014 in Chapel Hill, NC. In its third year, CurateGear is a day-long event that showcases tools that facilitate digital curation. The three tools I found most interesting were MetaArchive, a TRAC review tool, and BitCurator.
MetaArchive
MetaArchive is a co-op of university libraries and independent research libraries that work together to preserve their digital content. Each member institution contributes a secure, closed-access preservation server to the MetaArchive LOCKSS network. After an institution ingests content to its own preservation server, six other servers in the network replicate that content. Servers are assigned to content so as to maximize geographic distribution. New or changed content is stored alongside the original; in fact, this support for versioning is a huge advantage of MetaArchive’s preservation strategy. The seven servers check in with each other periodically to perform fixity checks and verify that all seven copies remain identical. If a mismatch is identified, the servers reach consensus about which copy is “correct” and repair the mismatch. The repair is treated as a version and stored alongside the original.
The co-op model offers economies of scale, and membership in MetaArchive seems very reasonable. The knowledge community of MetaArchive strikes me as an appealing alternative to preservation-as-a-service vendors such as DuraCloud and Preservica.
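For readers who like to see the idea in code, the fixity check and consensus repair described above can be sketched roughly as follows. This is a simplified illustration, not the actual LOCKSS polling protocol: the server names, the function names, the choice of SHA-256, and the plain majority vote are all my own assumptions.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest used here as the fixity value (an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def audit_copies(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Compare digests across servers and report which copies disagree.

    Returns the majority digest and the sorted names of servers whose copy
    does not match it. Raises ValueError if no strict majority exists.
    """
    if not copies:
        raise ValueError("no copies to audit")
    digests = {server: sha256_digest(data) for server, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        raise ValueError("no majority digest; manual review needed")
    damaged = sorted(s for s, d in digests.items() if d != winner)
    return winner, damaged


def repair_copies(copies: dict[str, bytes], damaged: list[str]) -> dict[str, bytes]:
    """Return a new mapping in which damaged copies are replaced by an agreed one.

    The input is left untouched, loosely mirroring the idea that a repair
    is stored as a new version rather than overwriting the original.
    """
    good = next(data for server, data in copies.items() if server not in damaged)
    return {s: (good if s in damaged else data) for s, data in copies.items()}


if __name__ == "__main__":
    # Seven hypothetical servers, one of which holds a corrupted copy.
    copies = {f"server{i}": b"original content" for i in range(1, 8)}
    copies["server3"] = b"original c0ntent"
    digest, damaged = audit_copies(copies)
    print("servers needing repair:", damaged)
    repaired = repair_copies(copies, damaged)
    print("all identical after repair:", len(set(repaired.values())) == 1)
```

With seven copies, a strict majority needs at least four matching digests; if the copies split so that no digest reaches that threshold, the sketch refuses to guess and raises an error instead.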
TRAC review tool
Acronyms abound in our profession, and for those who aren’t familiar, TRAC stands for Trustworthy Repositories Audit and Certification: Criteria and Checklist, which formed the basis for ISO 16363. Essentially, TRAC is a method for demonstrating that a digital repository meets certain criteria for trustworthiness. There are 88 criteria on the checklist, and they fall into three categories:
- Organizational Infrastructure – e.g. mission statement, succession plans, professional development, financial stability
- Digital Object Management – e.g. metadata templates, persistent unique identifiers, registries of formats ingested, preservation planning
- Technologies, Technical Infrastructure, and Security – e.g. detecting bit corruption, migration processes, off-site backup
While TRAC is designed for repositories to become certified as trustworthy, many institutions simply use it as a self-assessment tool. Developed by Nancy McGovern, the Head of Curation and Preservation Services at MIT Libraries, the TRAC review tool enables the assessor to provide evidence of how well a repository meets a TRAC criterion and rate its compliance on a five-point scale:
- 4 = fully compliant – the repository can demonstrate that it has comprehensively addressed the requirement
- 3 = mostly compliant – the repository can demonstrate that it has mostly addressed the requirement and is working toward full compliance
- 2 = half compliant – the repository has partially addressed the requirement and has significant work remaining to fully address the requirement
- 1 = slightly compliant – the repository has something in place, but has a lot of work to do in addressing the requirement
- 0 = non-compliant or not started – the repository has not yet addressed the requirement or has not started the review of the requirement
Of course, knowledge of whether a repository meets all of these 88 criteria isn’t the purview of one person, and another benefit of the TRAC review tool is that it enables the lead assessor to assign certain criteria to other people (such as admin or tech team), making the whole process of assessing repository activities more transparent across an organization.
Technically speaking, the TRAC review tool is simply a Drupal instance with a page for each TRAC criterion, so it’s very lightweight and easy to begin using after download!
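To make the rating scale and the per-person assignments concrete, here is a small sketch of how review results could be summarized. It is not how the Drupal tool stores or reports its data; the `Review` record, the criterion IDs, and both helper functions are hypothetical names I made up for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# The five-point scale, as listed above.
RATING_LABELS = {
    4: "fully compliant",
    3: "mostly compliant",
    2: "half compliant",
    1: "slightly compliant",
    0: "non-compliant or not started",
}


@dataclass
class Review:
    """One assessed criterion (hypothetical record layout)."""
    criterion: str   # placeholder ID, not an official TRAC number
    category: str    # one of the three TRAC categories
    assignee: str    # person or team asked to gather the evidence
    rating: int      # 0 through 4
    evidence: str = ""

    def __post_init__(self) -> None:
        if self.rating not in RATING_LABELS:
            raise ValueError(f"rating must be 0-4, got {self.rating}")


def average_by_category(reviews: list[Review]) -> dict[str, float]:
    """Average the ratings within each category."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for review in reviews:
        grouped[review.category].append(review.rating)
    return {category: mean(ratings) for category, ratings in grouped.items()}


def open_items_by_assignee(reviews: list[Review]) -> dict[str, list[str]]:
    """List the criteria rated below 4, grouped by who is responsible."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for review in reviews:
        if review.rating < 4:
            grouped[review.assignee].append(review.criterion)
    return dict(grouped)


if __name__ == "__main__":
    reviews = [
        Review("C1", "Organizational Infrastructure", "admin", 4),
        Review("C2", "Organizational Infrastructure", "admin", 2),
        Review("C3", "Digital Object Management", "tech team", 3),
    ]
    print(average_by_category(reviews))
    print(open_items_by_assignee(reviews))
```

A plain average per category is only one way to summarize a self-assessment; a real review would also want the evidence behind each rating.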
BitCurator
BitCurator bundles open-source digital forensics tools to help memory institutions manage born-digital materials and perform tasks such as:
- acquiring disk images of floppies, hard drives, laptops, or desktops
- generating technical metadata for the disk images
- identifying and redacting sensitive information such as SSNs, credit card information, etc.
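As a rough illustration of the last task in that list, the sketch below finds and masks strings shaped like U.S. Social Security numbers in plain text. It is not BitCurator's implementation; the pattern, the function names, and the placeholder text are my own assumptions, and a pattern this simple will produce both false positives and misses.

```python
import re

# Matches the common NNN-NN-NNNN layout only (an assumed, deliberately naive pattern).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_candidates(text: str) -> list[tuple[int, str]]:
    """Return (offset, match) pairs for every SSN-shaped string in the text."""
    return [(m.start(), m.group()) for m in SSN_PATTERN.finditer(text)]


def redact_ssns(text: str, placeholder: str = "[REDACTED]") -> str:
    """Return a copy of the text with SSN-shaped strings replaced."""
    return SSN_PATTERN.sub(placeholder, text)


if __name__ == "__main__":
    sample = "Contact: 123-45-6789, phone 555-1234"
    print(find_ssn_candidates(sample))
    print(redact_ssns(sample))
```

Reporting the offsets separately from the redaction lets an archivist review the candidate hits before anything is masked.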
Most of the tools that BitCurator is adapting for use by memory institutions originate in the law enforcement world, where the purposes are very different from our own. BitCurator repurposes these tools for the curation tasks of special collections and archives. For example, capturing a disk image (rather than copying file by file) not only preserves the environment in which the creator worked, but also in a certain sense preserves the “original order” of the records. Last summer I attended a BitCurator hackathon hosted by the Open Planets Foundation, where my main output was a detailed draft of a workflow for ingesting born-digital materials. At CurateGear 2014, I was pleased to hear about some updates to BitCurator 0.5.8 and pleased, too, that my draft workflow doesn’t yet need revision!