As part of Erik’s and my grant to explore the subject of data and data sets, we also wanted to explore what is being done to preserve the data that is being generated. So, we both attended CurateCamp, an “Unconference” that is held periodically to discuss issues pertaining to data curation. Wikipedia defines data curation as “the active and on-going management of data through its life cycle of interest and usefulness to scholarship, science, and education. Data curation activities enable data discovery and retrieval, maintain its quality, add value, and provide for re-use over time, and this new field includes authentication, archiving, management, preservation, retrieval, and representation.”
As more researchers generate data sets in their work, and as funding agencies start to require that these data sets be preserved and made available, data curation has become a hot topic in library circles. This year’s two-day unconference was held at Stanford University and attracted over 100 attendees, mostly from academic libraries. The attendees ranged from administrators to programmers to user experience professionals. The unconference format is becoming popular for technology-focused conferences because sessions are not determined until you arrive. This makes room for topics and ideas that have only recently appeared on the horizon, rather than sessions whose topics were set in stone 6-12 months before the conference happens. However, this also means that it is (in my opinion) sort of a crapshoot. You don’t really know what you are going to be able to learn or whether it will cover subjects of interest at all.
The first half day was spent on introductions to discover folks’ areas of interest (relating to data curation!), and then people pitched their ideas for sessions to be held. As consensus was reached on each topic, it went into the schedule on a Post-it note (as the picture above shows). The variety of topics was wide and, since the sessions were all in a discussion format, the conversations started with the main topic but veered off at will as new ideas were introduced. Some of the broad data curation themes that I was introduced to were: provenance of digital objects, institutional storage capacities, version control methods, the Hydra Project, digital forensics, and faculty outreach.
During the two days of the conference, I found myself focusing on topics that might have current or future applicability at Wake Forest. The idea of instituting a data curation program has been brought forth by a few different people on campus, but it really hasn’t taken off as a priority at the University level. What I heard at the conference confirmed my thoughts that this is currently more of a priority at very large institutions that are generating a great deal of research and/or have sophisticated record management/archival programs (how much of the University business is born digital these days?). However, Wake still has the same issues, just on a smaller scale. The libraries at the conference agreed that establishing policies, workflows and compliance is a much larger issue than can be handled by the library alone. It should be an institutional initiative that includes the resources to do curation correctly.
One other main topic that aligned with what is happening at WFU concerned how institutions are handling born-digital video. All sorts of issues were discussed, from problems with transfer rates (slow network), to the lack of standards for video capture, to difficulties in conversion, to long-term storage capacity. Nobody seemed to have agreed upon “best standards.” All are trying to figure out how to get the videos that professors are putting out on such sites as YouTube and Vimeo, so that they can be archived properly. At least it made me feel better about the status of video capture and archiving at Wake. We recognize a need to come up with a long-term solution, and different parties on campus are working toward that as video gains a more prominent role in teaching and learning.
Overall, the two days of being introduced to the field of data curation were quite valuable. They generated plenty of ideas for what might be feasible for us to do at ZSR Library and in partnership with other units in the University.