Professional Development

Leslie at MLA 2016

Monday, March 14, 2016 8:08 pm

This year’s meeting of the Music Library Association was held in Cincinnati, where, during breaks and receptions, we enjoyed 1920s tunes performed by members of the Cincinnati Opera, and by MLA’s own big band, in the Netherland Plaza Hotel’s beautifully restored 1930 Art Deco ballroom.

DIVERSITY

It has long been recognized that America’s conservatories and orchestras remain overwhelmingly white (less than 5% of students in music schools are non-Asian minorities). While administrators of these institutions are currently struggling to rectify the situation, libraries (it was noted at the MLA meeting) have a chance to be an exemplar. In a joint project with ARL called the Diversity & Inclusion Initiative, MLA has supported internships and fellowships for MLIS students with music backgrounds to work in music libraries. The diversity aimed for includes not just race/ethnicity, but also gender, marital status, disabilities, etc. In the opening plenary session, we heard from some of the former fellows. Benefits that were particularly appreciated included the visibility and recognition acquired while a student, which subsequently opened doors to professional opportunities; peer mentors (previous fellows) who provided ongoing support with entry into the profession, and after; and help with the hidden costs of college (additional fees, textbooks, etc.) for which first-generation students are often unprepared. Difficulties encountered included locating sources of help – one fellow reported “cold calling” random MLA members before discovering the DII program. This prompted a discussion, during the Q&A, on how the program could be better publicized.

On a similar outreach note, MLA (whose membership encompasses North America – U.S. and Canada) plans to invite Latin American colleagues to next year’s meeting in Orlando, billing it a Pan-American conference.

LINKED DATA

MLA’s initiatives in this field:

  • Two new thesauri have been published in the past year — for medium-of-performance terms (LCMPT) and for music genre/form terms (LCGFT) — along with best-practices documents for both.
  • Involvement in LD4L (Linked Data for Libraries), a collaborative project of Cornell, Harvard, and Stanford.
  • The NACO Music Project, working on authority data.
  • A Bibframe Task Force, which is undertaking various projects to enhance the new encoding schema to meet music users’ needs.

We heard about other projects that member libraries have done to enhance discoverability of special collections:

The Linked Jazz Project, best known for its visualizations, is based on data extracted from oral-history transcripts in numerous jazz archives. The data is then converted to RDF triples reflecting relationships between jazz artists (x talks about y; y knows of x). The data is enhanced via crowdsourcing. The developers hope others will use the LJ data to build additional linked-data sets: mashing LJ data with performances at Carnegie Hall is one such project; another is unearthing female jazz artists (neglected in traditional jazz histories) by enriching LJ data with other sources such as DBpedia, MusicBrainz, and VIAF (the international authority file).
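
To make the triple idea concrete, here is a minimal sketch (in Python with rdflib, not the Linked Jazz code itself) of how an “x talks about y / y knows of x” relationship pair might be expressed; the namespace, URIs, and the talksAbout predicate are hypothetical stand-ins.

    # A minimal sketch, not the Linked Jazz code: two directed RDF triples
    # expressing "x talks about y" and "y knows of x". The namespace, URIs,
    # and the talksAbout predicate are hypothetical stand-ins.
    from rdflib import Graph, Namespace, URIRef
    from rdflib.namespace import FOAF

    LJ = Namespace("http://example.org/linkedjazz/")  # hypothetical namespace

    g = Graph()
    g.bind("foaf", FOAF)
    g.bind("lj", LJ)

    mary_lou = URIRef("http://example.org/artist/mary_lou_williams")
    dizzy = URIRef("http://example.org/artist/dizzy_gillespie")

    g.add((mary_lou, LJ.talksAbout, dizzy))   # x talks about y
    g.add((dizzy, FOAF.knows, mary_lou))      # y knows of x

    print(g.serialize(format="turtle"))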

Colleagues at Michigan State used Discogs (a crowdsourced, expert-community-reviewed database of metadata on pop music recordings) to process a gift collection of 1200 LPs of Romani music, which also included pop music containing Gypsy stereotypes. They hope to use this collection as a pilot to develop a process for a much larger corporate gift of 800,000 pop recordings and videos. They were able to extract data directly from the Discogs website using Discogs’ API (which outputs in JSON – they used Python to convert the JSON to XML and then MARCXML). Cataloging challenges included: dealing with usage differences between Discogs’ “release” and RDA’s “manifestation”; similarly, between Discogs’ “roles” for artists and RDA’s “relationship designators”; and mapping Discogs’ genres and subgenres to LC’s genre/form terms and medium-of-performance terms, supplementing with LC subject headings as needed. Discogs’ strengths: expertise in languages (from its international contributor community) and in obsolete formats; and the ability to link to the Discogs entry from the library catalog. Our presenters plan to propose to the Discogs community indexing the UPC (universal product code, the barcodes on CDs); a similar resource, MusicBrainz, does this.
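
For a sense of what that pipeline looks like, here is a rough, hedged sketch (not the Michigan State workflow) of pulling one release from the Discogs API and emitting a skeletal MARCXML record in Python; the 245/100/264 field mapping shown is an assumed simplification for illustration only.

    # Rough sketch, not the MSU workflow: fetch a Discogs release as JSON and
    # emit a skeletal MARCXML record. The field mapping is an assumption.
    import requests
    import xml.etree.ElementTree as ET

    MARC_NS = "http://www.loc.gov/MARC21/slim"

    def discogs_release_to_marcxml(release_id: int) -> str:
        url = f"https://api.discogs.com/releases/{release_id}"
        data = requests.get(url, headers={"User-Agent": "ExampleApp/0.1"}).json()

        ET.register_namespace("marc", MARC_NS)
        record = ET.Element(f"{{{MARC_NS}}}record")

        def datafield(tag, code, value):
            df = ET.SubElement(record, f"{{{MARC_NS}}}datafield",
                               {"tag": tag, "ind1": " ", "ind2": " "})
            ET.SubElement(df, f"{{{MARC_NS}}}subfield", {"code": code}).text = value

        datafield("245", "a", data.get("title", ""))                       # title
        if data.get("artists"):
            datafield("100", "a", data["artists"][0].get("name", ""))      # primary artist
        if data.get("year"):
            datafield("264", "c", str(data["year"]))                       # year of release

        return ET.tostring(record, encoding="unicode")

    print(discogs_release_to_marcxml(249504))  # any Discogs release ID works here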

A third project, at Cornell, was ultimately unsuccessful, but also illustrates the variety of data resources and tools that people are trying to link up. For a collection of hip-hop flyers, they constructed RDF triples using data from MusicBrainz, ArtStor, and Cornell’s existing metadata on the related events etc. They chose Bibframe for their encoding schema, and compiled an ontology from Getty’s AAT vocabulary, various music and event ontologies, and Schema.org. Reconciliation of names from all these sources was done using the open-source analytics tool OpenRefine. The problems developed as they came to feel that Bibframe did not meet their test for describing flyers; they decided to abandon it in favor of LD4L. Reconciliation of names also proved more problematic than expected.

DISCOVERY

In a session on music discovery requirements, colleagues noted two things that current ILSs and discovery layers are not good at: showing hierarchies (for instance, making available additional search terms in thesauri, ontologies, etc.); and mapping multiple physical formats to one title (for multi-media items, such as a book issued with a disc, or a score with a recording, or a CD with a DVD – in most interfaces, the content of the second piece will not be retrieved under a format-facet search).

A presenter from Stanford proposed facet displays that include drop-down menus showing a relevant thesaurus, allowing users to further narrow to a subgenre, for instance. For music, the newly-developed medium-of-performance thesaurus, if displayed with multiple search instances, could enable musicians to enter all the instruments in their ensemble, and retrieve music for that specific combination of instruments. Also discussed were domain-specific search interfaces, such as the ones done by UVA for music and videos. Needless to say, there are potential applications for other disciplines.
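
A hypothetical sketch of the kind of query that proposal implies, run against a Solr-based discovery layer: intersecting several medium-of-performance facet values so an ensemble retrieves music for its exact combination of instruments. The Solr URL and the field name medium_of_performance_facet are assumptions, not an existing configuration.

    # Hypothetical sketch: intersect several medium-of-performance facet
    # values in a Solr query so that an ensemble retrieves scores for its
    # exact instrumentation. The URL and field name are assumptions.
    import requests

    SOLR_URL = "http://localhost:8983/solr/catalog/select"  # assumed index

    ensemble = ["flute", "viola", "harp"]  # e.g., Debussy's sonata ensemble
    params = {
        "q": "*:*",
        "fq": [f'medium_of_performance_facet:"{inst}"' for inst in ensemble],
        "facet": "true",
        "facet.field": "medium_of_performance_facet",
        "wt": "json",
    }

    results = requests.get(SOLR_URL, params=params).json()
    print(results["response"]["numFound"], "scores for this combination")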

Colleagues at East Carolina have made use of Blacklight to map multiple physical formats to the same title.

Lauren at ALA 2015 in San Francisco

Thursday, July 2, 2015 5:13 pm

It probably seemed like everyone was talking about linked data because that was the focus of most of the sessions I attended.

One of the more interesting ones was the Library of Congress BIBFRAME Update Forum, because in addition to Sally McCallum and Beacher Wiggins of LC, they had speakers from Ex Libris, Innovative Interfaces, SirsiDynix, Atlas Systems (think ILLiad and Ares), OCLC, and Zepheira. At this stage, I think they were all trying to reassure clients that they will keep up with change. I took more notes on Ex Libris than the others since we’re a current customer: After some prologue on revolution vs. evolution, Ido Peled, VP of Solutions and Marketing, said that moving to a native linked data catalog is more revolutionary, and that Ex Libris is more comfortable with evolution. But I thought he gave more concrete evidence of readiness for linked data than the others, because he said Alma was built to support MARC and Dublin Core already and that Primo Central is already in RDF format, using JSON-LD. He also emphasized the multi-tenant environment and said, “Technology isn’t the focus. The focus is outcomes.” Because linked data involves relying on the data of others and interlinking it with your own, the “multi-tenant” concept suddenly made sense and helped me understand why I keep hearing about groups moving to Alma, like Orbis Cascade. I’ve also heard from individuals that it hasn’t been easy, but when is a system migration ever easy?

I also attended “Getting Started with Linked Open Data: Lessons from UNLV and NCSU.” They each worked on their own linked data projects, figuring out tools to use (like OpenRefine) and workflows. Then they tested on each other’s data to help them refine the tools for use with different future projects and for sharing them broadly in the library community. They both said they learned a lot and made adjustments to the tools they used. I got a much better sense of what might be involved in taking on a linked data project. Successes and issues they covered reminded me of our work on authority control and RDA enhancement: matches and near matches through an automated process, hits and non-hits against VIAF, cleaning up and normalizing data for extra spaces, punctuation, etc. In fact this session built well on “Data Clean-Up: Let’s Not Sweep it Under the Rug,” which was sponsored by the committee I’m on with Erik Mitchell, the ALCTS/LITA Metadata Standards Committee. I got a good foundation in using MarcEdit and OpenRefine to normalize data by eliminating extra spaces and punctuation. While I knew regular expressions were powerful, I finally learned what they can do. In one example, punctuation stemming from an ampersand in an organization name caused the data to be parsed incorrectly, breaking apart the name of the organization every one of the thousands of times it appeared. A regular expression can overcome this problem in an automated way — there’s no need to fix each instance one by one. (Think of how macros save work.)
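
To illustrate the kind of fix described there, here is a hedged sketch: the exact garbling in the presenters’ data wasn’t specified, so this assumes organization names were split across lines at the ampersand, and one regular expression rejoins every occurrence at once.

    # Hedged sketch of the regex idea from the session. The garbling pattern
    # (names split across a line break at the ampersand) is an assumed
    # example, not the presenters' actual data.
    import re

    raw = "Barnes &\nNoble Education, Inc.\nWilson &\nSons Publishing\nOxford University Press"

    # One substitution repairs every split name instead of fixing them one by one.
    fixed = re.sub(r"&\s*\n\s*", "& ", raw)
    print(fixed)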

The ALCTS President’s Program: Three Short Stories about Deep Reading in the Digital Age featured Maryanne Wolf, Director, Center for Reading and Language Research and John DiBiaggio Professor of Citizenship and Public Service, Tufts University. It was interesting to learn from her that brains weren’t designed for reading — think about cave men and their primary goals, which didn’t include reading. She gave a great overview of the development of language and reading and incidentally showed that those who operate in CJK languages have different parts of the brain lighting up than those of us who operate in other languages. This was all foundation leading up to how the brain operates and the effects of reading on the screen. The way we read on a screen results in the loss of certain abilities like reflection and creating connections. Her research shows that it takes time to regain those abilities, too. She isn’t by any means anti-electronic though — she’s doing interesting work in Ethiopia with kids learning by using tablets. We’ll have to get her forthcoming book when it is finished!

I also attended committee meetings, met with vendors, networked, and got to catch up with former colleagues Erik Mitchell and Lauren Pressley over a dinner that Susan organized. (Thanks, Susan!) I especially enjoyed catching up with former colleagues Charles Hillen and Ed Summers, both dating back to my days at ODU in Norfolk, Virginia. Charles now works for YBP as Director of Library Technical Services and Ed just received the Kilgour Award from LITA/OCLC. Thanks to Ed, I got to meet Eric Hellman, president of the company that runs Unglue.it. And thanks to WFU Romance Languages faculty member Alan Jose, who mentioned the idea, I went Monday afternoon with Derrik and Carolyn to visit the Internet Archive offices, where we met Brewster Kahle. The volume the organization handles is mind-blowing! Kahle says they only collect about 40 TV channels right now and it is not enough. They have designed the book digitization equipment they are using (and selling it at a reasonable price too). They have people digitizing reels of films, VHS, and audio, but Kahle says they’ve got to come up with a better method than equipment using magnetic heads that are hard to find. Someone is working on improving search right now too. Some major advice offered was to learn Python!

 

Lauren at ALA Midwinter 2015 (aka Chicago’s 4th Biggest Blizzard)

Thursday, February 5, 2015 5:59 pm

My notes on: IPEDS, ebook STLs and video, our vendors, linked data, BIBFRAME, OCLC and Schema.org, ALCTS/LITA Metadata Standards Committee, advocacy

At the ARL Assessment Forum, there was much complaining about the contradictory instructions for IPEDS collection counts and circulation. Susan and I had the luck of chatting in the hallway with Bob Dugan from UWF, who turned out to be libraryland’s main official channel to the person responsible for the library section of IPEDS. Bob is also the author of a LibGuide with clarification info from the IPEDS help desk. Bob seems hopeful that changes in definitions for gathering the info (but not the numbers/form) could happen in time for the next cycle. My main specific takeaways from the various speakers:

  • the only figure that will be checked between the current IPEDS survey and the previous survey is total library expenditures (not just collection expenditures);
  • in spite of the language, the physical circulation part of the survey seems to focus on lending, not borrowing, and may duplicate the ILL info section;
  • some libraries are thinking of using COUNTER BR1 and BR2 reports for ebook circulation and footnoting which vendors use which report type (BR1 or BR2).

The ALCTS Technical Services Managers in Academic Libraries Interest Group discussed a wide range of current issues, and it was both reassuring and annoying that no matter the library size, public or private, right now everyone has the same problems and no great answers: high-cost ebook STLs, difficulties with video, etc. I inferred that our tactic of explaining prices and options to faculty (e.g. explaining a mediation message about an EBL ebook, or that the producer of a desired video is requiring libraries to pay significantly more than the advertised individual pricing) produces greater customer satisfaction than setting broad restrictive rules to stay within budget.

Jeff, Derrik, and I had a good meeting with a domestic vendor regarding ebooks and I discussed some specific needs with a foreign vendor. All felt like we made progress.

Linked data in libraries is for real (and will eventually affect cataloging). I attended several relevant sessions and here is my distillation: LD4L, and VIVO as a part of LD4L, are the best proof-of-concept work I’ve heard about. When starting to learn about linked data, there is no simple explanation; you have to explore it and then try to wrap your brain around it. Try reading the LD4L Use Cases webpages to get an understanding of what can be achieved, and try looking at slide #34 in this LD4L slideshow for a visual explanation of how this can help researchers find each other. Here’s a somewhat simple explanation of VIVO from a company that helped start it and now is the “first official DuraSpace Registered Service Provider for VIVO.” OCLC is doing a lot of groundwork for linked data, using Schema.org, and that effort plays into the work being done by LD4L. While OCLC has been using Schema.org, the Library of Congress has invested in developing BIBFRAME. I’m looking forward to reading the white paper about the compatibility of the two models, released just before the conference. The joint ALCTS/LITA Metadata Standards Committee (which replaced MARBI) is naturally interested in this topic and it was discussed at the Committee meeting. The Committee also gathered input from various groups on high-level guidelines (or best practices) for metadata that Erik Mitchell, a committee member, originally drafted.

I also attended the meeting of the ALCTS Advocacy Committee, which has a liaison to the ALA Advocacy Coordinating Group. I understand that advocacy will be emphasized in ALA’s forthcoming strategic plan. If you’re not familiar with the Coordinating Group, it has a broader membership than just ALA division representation, but does include ACRL, LITA, and APALA in addition to ALCTS. I believe ZSR is well-represented in these groups and thus has some clear channels for advocacy!

The Ellers Visit the In-Laws; Charleston 2014

Wednesday, November 12, 2014 12:00 pm

Eleven-day-old daughter and sleep-deprived wife in tow, I attended the 2014 Charleston Conference, arguably flying in the face of reason. I had the advantage of a free place to stay: my parents-in-law live out on James Island, a 15-minute drive to the Francis Marion Hotel where the conference is held. Given this fact and the conference’s unique focus on acquisitions, it makes sense for this meeting to become an annual excursion for me.

The opening speaker, Anthea Stratigos (apparently her real last name) from Outsell, Inc. talked about the importance of strategy, marketing, and branding the experience your library provides. She emphasized that in tough budgetary times it is all the more important to know your target users and to deliver the services, products, and environment they are looking for rather than mindlessly trying to keep up with the Joneses and do everything all at once. “Know your portfolio,” advised Ms. Stratigos. I would say that we at ZSR do a good job of this.

At “Metadata Challenges in Discovery Systems,” speakers from Ex Libris, SAGE, Queens University, and the University of Waterloo discussed the functionality gap that exists in library discovery systems. While tools like Summon have great potential and deliver generally good results, they are reliant on good metadata to function. In an environment in which records come from numerous sources, the task of normalizing data is a challenge for library, vendor, and system provider alike. Consistent and rational metadata practices, both across the industry and within a given library, are essential. To the extent that it is possible, a good discovery system ought to be able to smooth out issues with inconsistent/bad metadata; but the onus is largely on catalogers. I for one am glad that we are on top of authority control. I am also glad that at the time of implementation I was safely 800 miles away in Louisiana.

In a highly entertaining staged debate over the premise that “Wherever possible, library collections should be shaped by patrons instead of librarians,” Rick Anderson from Utah and David Magier from Princeton contested the question of how large a role PDA/DDA should play in collection development in an academic context. Arguing pro-DDA, Mr. Anderson claimed that we’ve confused the ends with the means in providing content: the selection process by librarians ought properly to be seen simply as a method for identifying needed content, and if another more automated process (DDA) can accomplish the same purpose (and perhaps do it better), then it ought to be embraced. Arguing the other side, Mr. Magier emphasized DDA’s limitations, eloquently comparing over-reliance on it to eating mashed potatoes with a screwdriver just because a screwdriver is a useful tool. He pointed out that even in the absence of DDA, librarians have always worked closely and directly with patrons to answer their collection needs. In truth, both debaters would have agreed that a balance of DDA and traditional selection by librarians is the ideal model.

One interesting program discussed the inadequacy of downloads as a proxy for usage, given the amount of resource-sharing that occurs post-download. At another, librarians from UMass-Amherst and Simmons College presented results of their Kanopy streaming video DDA (PDA to them) program, similar to the one we’ll be rolling out later this month; they found that promotion to faculty was essential in generating views. On Saturday morning, librarians from Utah State talked about the importance of interlibrary loan as a supplement to acquisitions budgets and collection development policies in a regional consortium context. On this point, they try to include in all e-resource license agreements a clause specifying that ILL shall be allowed “utilizing the prevailing technology of the day” – an attempt at guaranteeing that they will remain able to loan their e-materials regardless of format, platform changes, or any other new technological developments.

Also on Saturday, Charlie Remy of UT-Chattanooga and Paul Moss from OCLC discussed adoption of OCLC’s Knowledge Base and Cooperative Management Initiative. This was of particular interest as we in Resource Services plan on exploring use of the Knowledge Base early next year. Mr. Remy shared some of the positives and negatives he has experienced: among the former, the main one would be the crowdsourcing of e-resource metadata maintenance in a cooperative environment; among the negatives were slow updating of the knowledge base, especially with record sets from new vendors, along with the usual problem of bad vendor-provided metadata. The final session I attended was about link resolvers and the crucial role that delivery plays in our mission. As speakers pointed out, we’ve spent the past few years focusing on discovery, discovery, discovery. Now might be a good time to look again at how well the content our users find is being delivered.

ALA Annual 2014 Las Vegas – Lauren

Thursday, July 3, 2014 4:08 pm

Three segments to my post: 1) Linked Data and Semantic Web, 2) Introverts at Work, and 3) Vendors and Books and Video — read just the part that interests you!

1. Linked Data and Semantic Web (or, Advances in Search and Discovery)

Steve Kelley sparked my interest in the Semantic Web and Linked Data with reports after conferences over the past few years. Now that I’ve been appointed to the joint ALCTS/LITA Metadata Standards Committee and attended a meeting at this conference, I’ve learned more:

Google Hummingbird is a recent update to how Google searching functions, utilizing all the words in the query to provide more meaningful results instead of just word matches.

Catalogers and Tech Team take note! Work is really happening now with Linked Data. In Jason Clark’s presentation, “Schema.org in Libraries,” see the slide with links to work being done at NCSU and Duke (p. 28 of the posted PDF version).

I’m looking forward to working with Erik Mitchell and other Metadata Standards Committee members in the coming year.

2. Introverts at Work!

The current culture of working in meetings (such as brainstorming) and reaching quick decisions in groups or teams is geared towards extroverts while about 50% of the population are introverts. Introverts can be most productive and provide great solutions when given adequate time for reflection. (Extrovert and introvert were defined in the Jung and MBTI sense of energy gain/drain.) So says Jennifer Kahnweiler, the speaker for the ALCTS President’s Program and author of Quiet Influence. Another book discussing the same topic is Quiet: The Power of Introverts in a World That Can’t Stop Talking by Susan Cain. Many ZSRians attended this session!

3. Vendors and Books and Video

I spent a lot of time talking with vendors. Most notable was the meeting that Derrik, Jeff, and I attended with some of the publishers that are raising DDA short-term loan prices. This will affect our budget, but our plan is to watch it for a bit, to develop our knowledge and determine appropriate action. It was helpful to learn more from the publishers. Some publishers are able to switch to print on demand, while others cannot because traditional print runs are cheaper than print on demand and their customers still want print. Print-driven publishers have to come up with a sustainable model to cover all of the costs, so they are experimenting with DDA pricing. DDA overall is still an experiment for publishers, while librarians have already come to think of it as a stable and welcome method of providing resources.

Derrik and I also started conversing with ProQuest about how we will manage our existing DDA program given the addition of ebrary Academic Complete to NC LIVE.

“The combined bookshops of Aux Amateurs de Livres and Touzot Librarie Internationale will be called Amalivre effective July 1, 2014.”

Regarding video, Mary Beth, Jeff, Derrik and I attended a presentation by two Australian librarians from different large universities (QUT and La Trobe, each with FTE in the tens of thousands). They reported on their shift to streaming video with Kanopy and here are a few bullets:

  • Among the drivers for change were the flipped classroom and mobile use
  • 60% of the DVD collection had fewer than 5 views, while streaming video titles licensed through Kanopy averaged over 50 views
  • 23% and 15% of DVDs (at the two universities) have never been viewed
  • The true cost of DVD ownership came to 1.7 and 1.8 times the purchase price (at the two universities)
  • They have a keyboard accessibility arrangement for the visually impaired
  • Usage is growing for both PDA and non-PDA titles in Kanopy [reminds us of our experience with e-books]
  • Discovery of the streaming videos came largely through faculty embedding videos in the CMS
  • Other discovery is not good for video, so they had ProQuest add a radio-button option for video to Summon to help promote discovery [can we do this?]
  • They concluded that, because of greater use, online video is the better value for the money spent

 

Contributing ZSR Digital Collections to the DPLA!

Friday, October 25, 2013 4:07 pm

Tanya, Craig, and Vicki all mentioned the keynote about the DPLA (Digital Public Library of America) at the Tri-State Archivists’ Conference. Before Emily Gore of the DPLA headed to Greenville, SC to deliver her keynote, she was in Greensboro, NC meeting with digital collection managers. I attended the meeting to learn more about the nitty gritty how-to of contributing ZSR’s digital collections to the DPLA.

For those who aren’t familiar, the DPLA aggregates metadata from the digital collections of libraries, archives, and museums across the United States. In addition to providing a slick search interface at dp.la, the DPLA also makes its API open to developers and encourages the building of apps on top of this platform. By contributing our metadata to the DPLA, we will expose our collections to a national audience. In addition, we will drive traffic to our site from both the dp.la site and apps built on top of the DPLA API.
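
As a taste of that API, here is a small hedged sketch in Python: the items endpoint at api.dp.la/v2 and the API-key requirement match how the DPLA described its API, but treat the exact response fields used below as assumptions.

    # Small hedged sketch of querying the DPLA items API. The endpoint and
    # api_key parameter follow the DPLA's published API; the response fields
    # read below (count, docs, sourceResource.title, provider.name) should be
    # treated as assumptions.
    import requests

    API_KEY = "YOUR_DPLA_API_KEY"  # request a key from the DPLA

    params = {"q": "Wake Forest", "page_size": 5, "api_key": API_KEY}
    resp = requests.get("https://api.dp.la/v2/items", params=params).json()

    print("total matches:", resp.get("count"))
    for doc in resp.get("docs", []):
        title = doc.get("sourceResource", {}).get("title")
        provider = doc.get("provider", {}).get("name")
        print("-", title, "| provider:", provider)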

DPLA App Library

At DPLAfest 2013, the North Carolina Digital Heritage Center was recognized as one of three new service hubs that will aggregate metadata from their regions and serve as a conduit to the DPLA. Over 120,000 records from North Carolina institutions are currently available at dp.la, including records from the State Library of North Carolina, State Archives of North Carolina, and the libraries at the University of North Carolina at Chapel Hill, East Carolina University, and the University of North Carolina at Greensboro in addition to all the records made available by the North Carolina Digital Heritage Center itself at digitalnc.org.

When an institution contributes collections to the DPLA via a service hub such as the North Carolina Digital Heritage Center, they share an item’s metadata as well as its thumbnail.

The DPLA record recognizes both the service hub (in this example, the North Carolina Digital Heritage Center) and the contributing institution (Transylvania County Library). Clicking on either the item’s thumbnail or “View Object” takes the user to the item as it appears on the original site, in this case digitalnc.org.

One more interesting thing to note about the DPLA’s approach to aggregating digital collections is that metadata shared with the DPLA is made available under a CC0 license. By participating in the DPLA, we agree that others may re-use our metadata. However, it’s important to recognize that metadata rights are not equal to digital object rights. Rather, the digital objects we make available via Wake Space remain available under whatever terms we determine.

The North Carolina Digital Heritage Center is currently in the process of evaluating our feeds before adding selected collections to the DPLA. Feel free to contact me if you have any questions!

Steve at NASIG 2012

Thursday, June 14, 2012 5:03 pm

Last Thursday, Chris, Derrik and I hopped in the library van and drove to Nashville for the NASIG Conference, returning on Sunday. It was a busy and informative conference, full of lots of information on serials and subscriptions. I will cover a few of the interesting sessions I attended in this post.
One such session was called “Everyone’s a Player: Creation of Standards in a Fast-Paced Shared World,” which discussed the work of NISO and the development of new standards and “best practices.” Marshall Breeding discussed the ongoing development of the Open Discovery Initiative (ODI), a project that seeks to identify the requirements of web-scale discovery tools, such as Summon. Breeding pointed out that it makes no sense for libraries to spend millions of dollars on subscriptions, if nobody can find anything. So, in this context, it makes sense for libraries to spend tens of thousands on discovery tools. But, since these tools are still so new, there are no standards for how these tools should function and operate with each other. ODI plans to develop a set of best practices for web-scale discovery tools, and is beginning this process by developing a standard vocabulary as well as a standard way to format and transfer data. The project is still in its earliest phases and will have its first work available for review this fall. Also at this session, Regina Reynolds from the Library of Congress discussed her work with the PIE-J initiative, which has developed a draft set of best practices that is ready for comment. PIE-J stands for the Presentation & Identification of E-Journals, and is a set of best practices that gives guidance to publishers on how to present title changes, issue numbering, dates, ISSN information, publishing statements, etc. on their e-journal websites. Currently, it’s pretty much the Wild West out there, with publishers following unique and puzzling practices. PIE-J hopes to help clean up the mess.
Another session that was quite useful was on “CONSER Serials RDA Workflow,” where Les Hawkins, Valerie Bross and Hien Nguyen from Library of Congress discussed the development of RDA training materials at the Library of Congress, including CONSER serials cataloging materials and general RDA training materials from the PCC (Program for Cooperative Cataloging). I haven’t had a chance yet to root around on the Library of Congress website, but these materials are available for free, and include a multi-part course called “Essentials for Effective RDA Learning” that includes 27 hours (yikes!) of instruction on RDA, including a 9 hour training block on FRBR, a 3 hour block on the RDA toolkit, and 15 hours on authority and description in RDA. This is for general cataloging, not specific to serials. Also, because LC is working to develop a replacement for the MARC formats, there is a visualization tool called RIMMF available at marcofquality.com that allows for creating visual representations of records and record-relationships in a post-MARC record environment. It sounds promising, but I haven’t had a chance to play with it yet. Also, the CONSER training program, which focuses on serials cataloging, is developing a “bridge” training plan to transition serials catalogers from AACR2 to RDA, which will be available this fall.
Another interesting session I attended was “Automated Metadata Creation: Possibilities and Pitfalls” by Wilhelmina Randtke of Florida State University Law Research Center. She pointed out that computers like black and white decisions and are bad with discretion, while creating metadata is all about identifying and noting important information. Randtke said computers love keywords but are not good with “aboutness” or subjects. So, in her project, she tried to develop a method to use computers to generate metadata for graduate theses. Some of the computer talk got very technical and confusing for me, but her discussion of subject analysis was fascinating. Using certain computer programs for automated indexing, Randtke did a data scrape of the digitally-encoded theses and identified recurring keywords. This keyword data was run through ontologies/thesauruses to identify more accurate subject headings, which were applied to the records. A person needs to select the appropriate ontology/thesaurus for the item(s) and review the results, but the basic subject analysis can be performed by the computer. Randtke found that the results were cheap and fast, but incomplete. She said, “It’s better than a shuffled pile of 30,000 pages. But, it’s not as good as an organized pile of 30,000 pages.” So, her work showed some promise, but still needs some work.
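
A toy sketch of the general approach she described: count recurring keywords in a thesis and run them against a thesaurus to suggest controlled subject headings. The mini-thesaurus, sample text, and threshold here are invented for illustration; choosing the right vocabulary and reviewing the output still takes a person.

    # Toy sketch of automated subject analysis: recurring keywords are mapped
    # through a (tiny, invented) thesaurus to suggested headings. A person
    # still selects the vocabulary and reviews the results.
    import re
    from collections import Counter

    thesis_text = (
        "Wetland hydrology and nutrient cycling in coastal wetland systems: "
        "wetland restoration, hydrology monitoring, nutrient loads, and hydrology models."
    )

    THESAURUS = {                      # hypothetical keyword -> heading map
        "wetland": "Wetland ecology",
        "hydrology": "Hydrology",
        "nutrient": "Nutrient cycles",
    }

    counts = Counter(re.findall(r"[a-z]+", thesis_text.lower()))
    suggested = {THESAURUS[t] for t, n in counts.items() if t in THESAURUS and n >= 2}
    print(sorted(suggested))
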
Of course there were a number of other interesting presentations, but I have to leave something for Chris and Derrik to write about. One idea that particularly struck me came from Rick Anderson during his thought-provoking all-conference vision session on the final day: "To bring simplicity to our patrons means taking on an enormous level of complexity for us." That basic idea has been something of an obsession of mine for the last few months while wrestling with authority control and RDA and considering the semantic web. To make our materials easily discoverable by the non-expert (and even the expert) user, we have to make sure our data is rigorously structured, and that requires a lot of work. It's almost as if there's a certain quantity of work that has to be done to find stuff, and we either push it off onto the patron or take it on ourselves. I'm in favor of taking it on ourselves.
The slides for all of the conference presentations are available here: http://www.slideshare.net/NASIG/tag/nasig2012 for anyone who is interested. You do not need to be a member of NASIG to check them out.

Steve at ALA Annual 2011

Tuesday, July 5, 2011 5:33 pm

I’m a bit late in writing up my report about the 2011 ALA in New Orleans, because I’ve been trying to find the best way to explain a statement that profoundly affected my thinking about cataloging. I heard it at the MARC Formats Interest Group session, which I chaired and moderated. The topic of the session was “Will RDA Be the Death of MARC?” and the speakers were Karen Coyle and Diane Hillmann, two very well-known cataloging experts.

Coyle spoke first, and elaborated a devastating critique of the MARC formats. She argued that MARC is about to collapse due to its own strange construction, and that we cannot redeem MARC, but we can save its data. Coyle argued that MARC was great in its day; it was a very well developed code for books when it was designed. But as other materials formats were added, such as serials, AV materials, etc., additions were piled on top of the initial structure. And as MARC was required to capture more data, the structure of MARC became increasingly elaborate and illogical. Structural limitations of the MARC formats required strange work-arounds, and different aspects of MARC records are governed by different rules (AACR2, the technical requirements of the MARC format itself, the requirements of ILS’s, etc.). The cobbled-together nature of MARC has led to oddities such as publication dates and language information being recorded in both the (machine-readable) fixed fields of the record and the (human-readable) textual fields of the record. Coyle further pointed out the oddity of the 245 title field in the MARC record, which can jumble together various types of data: the title of a work, the language, the general material designation, etc. This data is difficult to parse for machine processing. Although RDA needs further work, it is inching toward addressing these sorts of problems by allowing for the granular recording of data. However, for RDA to fully capture this granular data, we will need a record format other than MARC. In order to help develop a new post-MARC format, Coyle has begun a research project to break down and analyze MARC fields into their granular components. She began by looking at the 007/008 fields, finding that they have 160 different data elements, with a total of 1,530 different possible values. This data can be used to develop separate identifiers for each value, which could be encoded in a MARC-replacement format. Coyle is still working on breaking down all of the MARC fields.

After Karen Coyle, Diane Hillmann of Metadata Management Associates spoke about the developing RDA vocabularies, and it was a statement during her presentation that really struck me. The RDA vocabularies define a set of metadata elements and value vocabularies that can be used by both humans and machines. That is, they provide a link between the way humans think about and read cataloging data and the way computers process cataloging data. The RDA vocabularies can assist in mapping RDA to other vocabularies, including the data vocabularies of record schemas other than the MARC formats. Also, when RDA does not provide enough detailed entity relationships for particular specialized cataloging communities, the RDA vocabularies can be extended to detail more subproperties and relationships. The use of RDA vocabulary extensions means that RDA can grow, and not just from the top-down. The description of highly detailed relationships between bibliographic entities (such as making clear that a short story was adapted as a radio play script) will increase the searching power of our patrons, by allowing data to be linked across records. Hillmann argued that the record has created a tyranny of thinking in cataloging, and that our data should be thought of as statements, not records. That phrase, “our data should be thought of as statements, not records,” struck me as incredibly powerful, and the most succinct version of why we need to eventually move to RDA. It truly was a “wow” moment for me. Near the end of her presentation, Hillmann essentially summed up the thrust of her talk, when she said that we need to expand our ideas of what machines can and should be doing for us in cataloging.

The other session I went to that is really worth sharing with everybody was the RDA Update Forum. Representatives from the Library of Congress and the two other national libraries, as well as the chair of the PCC (Program for Cooperative Cataloging), discussed the results of the RDA test by the national libraries. The national libraries have requested that the PCC (the organization that oversees the RDA code) address a number of problems in the RDA rules over the next eighteen months or so. LC and the other national libraries have decided to put off implementing RDA until January 2013 at the earliest, but all indications were that they plan to adopt RDA eventually. As the PCC works on revising RDA, the national libraries are working to move to a new record format (aka schema or carrier) to replace the MARC formats. They are pursuing a fairly aggressive agenda, intending to, by September 30 of this year, develop a plan with a timeline for transitioning past MARC. The national libraries plan to identify the stakeholders in such a transition, and want to reach out to the semantic web community. They plan for this to be a truly international effort that extends well beyond the library community as it is traditionally defined. They plan to set up communication channels, including a listserv, to share development plans and solicit feedback. They hope to have a new format developed within two years, but the process of migrating their data to the new format will take at least several more years after the format is developed. Needless to say, if the library world is going to move post-MARC format, it will create huge changes. Catalogs and ILS systems will have to be completely re-worked, and that’s just for starters. If some people are uncomfortable with the thought of moving to RDA, the idea of moving away from MARC will be truly unsettling. I for one think it’s an exciting time to be a cataloger.

Leslie at MLA 2011

Monday, February 14, 2011 2:08 am

I’m back from another Music Library Association conference, held this year in Philadelphia. Some highlights:

Libraries, music, and digital dissemination

Previous MLA plenary sessions have focused on a disturbing new trend involving the release of new music recordings as digital downloads only, with licenses restricting sale to end users, which effectively prevents libraries either from acquiring the recordings at all, or from distributing (i.e., circulating) them. This year’s plenary was a follow-up featuring a panel of three lawyers — a university counsel, an entertainment-law attorney, and a representative of the Electronic Frontier Foundation — who pronounced that the problem was only getting worse. It is affecting more formats now, such as videos and audio books — it’s not just the music librarian’s problem any more — and recent court decisions have tended to support restrictive licenses.

The panelists suggested two approaches libraries can take: building relationships, and advocacy. Regarding relationships, it was noted that there is no music equivalent of LOCKSS or Portico: Librarians should negotiate with vendors of audio/video streaming services for similar preservation rights. Also, libraries can remind their resident performers and composers that if their performances are released as digital downloads with end-user-only licenses, libraries cannot preserve their work for posterity. The panelists drew an analogy to the journal pricing crisis: libraries successfully raised awareness of the issue by convincing faculty and university administrators that exorbitant prices would mean smaller readerships for their publications. On the advocacy side, libraries can remind vendors that federal copyright law pre-empts non-negotiable licenses: a vendor can’t tell us not to make a preservation copy when Section 108 says we have the right to make a preservation copy. We can also lobby state legislatures, as contract law is governed by state law.

The entertainment-law attorney felt that asking artists to lobby their record labels was, realistically speaking, the least promising approach — the power differential is too great. Change, the panelists agreed, is most likely to come through either legislation or the courts. Legislation is the more difficult to affect (there are too many well-funded commercial interests ranged on the opposing side); there is a better chance of a precedent-setting court case tipping the balance in favor of libraries. Such a case is most likely to come from the 2nd or 9th Circuit, which have a record of liberal rulings on Fair Use issues. One interesting observation from the panel was that most of the cases brought so far have involved “unsympathetic figures” — individuals who blatantly abused Fair Use on a large scale, provoking draconian rulings. What’s needed is more cases involving “sympathetic figures” like libraries — the good guys who get caught in the cross-fire. Anybody want to be next? :-)

Music finally joins Digital Humanities

For a couple of decades now, humanities scholars have been digitizing literary, scriptural, and other texts, in order to exploit the capabilities of hypertext, markup, etc. to study those texts in new ways. The complexity of musical notation, however, has historically prevented music scholarship from doing the same for its texts. PDFs of musical scores have long been available, but they are neither searchable texts nor encoded as structured data, so they can’t be manipulated in the same way. Now there’s a new project called the Music Encoding Initiative, jointly funded by the National Endowment for the Humanities and the German Deutsche Forschungsgemeinschaft. MEI (yes, they’ve noticed it’s also a Chinese word for “beauty”) has just released a new digital encoding standard for Western classical musical notation, based on XML. It’s been adopted so far by several European institutions and by McGill University. If, as one colleague put it, it “has legs,” the potential is transformative for the discipline. Whereas critical editions in print force editors to make painful decisions between sources of comparable authority — the other readings get relegated to an appendix or supplementary volume — in a digital edition, all extant readings can be encoded in the same file, and displayed side by side. An even more intriguing application of this concept is the “user-generated edition”: a practicing musician could potentially approach a digital edition of a given work, and choose to output a piano reduction, or a set of parts, or modernized notation of a Renaissance work, for performance. Imagine the savings for libraries, which currently have to purchase separate editions for all the different versions of a work.

http://music-encoding.org

Music and metadata

In a session titled “Technical Metadata for Music,” two speakers, from SUNY and a commercial audio-visual preservation firm respectively, stressed the importance of embedded metadata in digital audio files. Certain information, such as recording date, is commonly included in filenames, but this is an inadequate measure from a long-term preservation standpoint: filenames are not integral to the file itself, and are typically associated with a specific operating system. One speaker cited a recent Rolling Stone article, “File not Found: the Recording Industry’s Storage Crisis” (December 2010), describing the record labels’ inability to retrieve their backfiles due to inadequate filenames and lack of embedded metadata. Metadata is now commonly embedded by many popular consumer devices, such as digital cameras and smartphones.

For music, embedded metadata can include not only technical specifications (bit-depth, sample rate, and locations of peaks, which can be used to optimize playback) but also historical context (the date and place of performance, the performers, etc.) and copyright information. The Library of Congress has established sustainability factors for embedded metadata (see http://digitizationguidelines.gov). One format that meets these requirements is Broadcast Wave Format, an extension of WAV: it can store metadata as plain text, and can include historical context-related data. The Technical Committee of ARSC (Association of Recorded Sound Collections) recently conducted a test wherein they added embedded metadata to some BWF-format audio files, and tested them with a number of popular applications. The dismaying results showed that many apps not only failed to display the embedded metadata, but also deleted it completely. This, in the testers’ opinion, calls for an advocacy campaign to raise awareness of the importance of embedded metadata. ARSC plans to publish its test report on its website (http://www.arsc-audio.org/). The software for embedded metadata that they developed for the test is also available as a free open-source app at http://sourceforge.net/projects/bwfmetaedit.
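
As a minimal illustration of technical metadata, here is a sketch using Python’s standard-library wave module to read a file’s technical parameters; the BWF bext chunk that holds the descriptive and historical metadata is not exposed by this module and would need chunk-level parsing or a tool like the BWF MetaEdit app mentioned above.

    # Minimal sketch: read technical parameters from a WAV file with the
    # standard-library wave module. Descriptive BWF metadata (the "bext"
    # chunk with dates, performers, etc.) is not exposed here and would need
    # chunk-level parsing or a tool such as BWF MetaEdit.
    import wave

    def wav_technical_metadata(path: str) -> dict:
        with wave.open(path, "rb") as w:
            return {
                "channels": w.getnchannels(),
                "bit_depth": w.getsampwidth() * 8,          # bytes -> bits
                "sample_rate_hz": w.getframerate(),
                "duration_sec": w.getnframes() / w.getframerate(),
            }

    print(wav_technical_metadata("interview_1975-06-12.wav"))  # hypothetical file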

Music cataloging

A pre-conference session held by MOUG (Music OCLC Users Group) reported on an interesting longitudinal study that aimed to trace coverage of music materials in the OCLC database. The original study was conducted in 1981, when OCLC was relatively new. MOUG testers searched OCLC for newly-published music books, scores, and sound recordings, as listed in journals and leading vendor catalogs, along with core repertoire as listed in ALA’s bibliography Basic Music Library, and assessed the quantity and quality of available cataloging copy. The study was replicated in 2010. Exact replication was rendered impossible by various developments over the intervening 30 years — changes in the nature of the OCLC database from a shared catalog to a utility; more foreign and vendor contributors; and the demise of some of the reference sources used for the first sample of searched materials, necessitating substitutions — but the study has nevertheless produced some useful statistics. Coverage of books, not surprisingly, increased over the 30 years to 95%; representation of sound recordings also increased, to around 75%; but oddly, scores have remained at only about 60%. As for quality of the cataloging, the 2010 results showed that about 20% of sound recordings have been cataloged as full-level records and about 50% as minimal records; about a quarter of scores get full-level treatment, about 50% minimal. The study thus provides some external corroboration of long-perceived music cataloging trends, and also a basis for workflow and staffing decisions in music cataloging operations.

A session titled “RDA: Kicking the Tires” was devoted to the new cataloging standard that the Library of Congress and a group of other libraries have just finished beta-testing. Music librarians from several of the testing institutions (LC, Stanford, Brigham Young, U North Texas, and U Minnesota) spoke about their experiences with the test and with adapting to the new rules.

All relied on LC’s documentation and training materials, recording local decisions on their internal websites (Stanford has posted theirs on their publicly-accessible departmental site). An audience member urged libraries to publish their workflows in the Toolkit, the online RDA manual. It was generally agreed that the next step needed is the development of guidelines and best practices.

None of the testers’ ILSs seem to have had any problems accommodating RDA records in MARC format. LC has had no problems with their Voyager system, corroborating our own experience here at WFU. Some testers reported problems with some discovery layers, including Primo (fortunately, we haven’t seen any glitches so far with VuFind). Stanford reported problems with their (un-named) authorities vendor, mainly involving “flipped” (changed name order) entries. Most testers are still in the process of deciding which of the new RDA data elements they will display in their OPACs.

Asked what they liked about RDA, both the LC and Stanford speakers cited the flexibility of the new rules, especially in transcribing title information, and in the wider range of sources from which bib info can be drawn. Others welcomed the increased granularity, designed to enhance machine manipulation, and the chance this affords to “move beyond cataloging for cards” towards the semantic web and relation-based models. It was also noted that musicians are already used to thinking in FRBR fashion — they’ve long dealt with scores and recordings, for instance, as different manifestations of the same work.

Asked what they thought “needed fixing” with RDA, all the panelists cited access points for music (the LC speaker put up a slide displaying 13 possible treatments of Rachmaninoff’s Vocalise arranged for saxophone and piano). There are other areas — such as instrument names in headings — that the RDA folks haven’t yet thought about, and the music community will probably have to establish its own practice. Some catalogers expressed frustration with the number of matters the new rules leave to “cataloger’s judgment.” Others mentioned the difficulty of knowing just how one’s work will display in future FRBRized databases, and of trying to fit a relational structure into the flat files most of us currently have in our ILSs.

What was most striking about the session was the generally upbeat tone of the speakers — they saw more positives than negatives with the new standard, assured us it only took some patience to learn, and were convinced that it truly was a step forward in discoverability. One speaker, who trains student assistants to do copy-cataloging by telling them “When in doubt, make your best guess, and I’ll correct it later,” observed that her students’ guesses consistently conformed to RDA practice — some anecdotal evidence suggesting that the new standard may actually be more intuitive for users, and that new catalogers will probably learn it more easily than those of us who’ve had to “unlearn” AACR2!

Sidelights

Our venue was the Loews Philadelphia Hotel, which I must say is the coolest place I’ve ever stayed in. The building was the first International Style high-rise built in the U.S., and its public spaces have been meticulously preserved and/or restored, to stunning effect. The first tenant was a bank, and so you come across huge steel vault doors and rows of safety-deposit boxes, left in situ, as you walk through the hotel. Definitely different!

Another treat was visiting the old Wanamaker department store (now a Macy’s) to hear the 1904 pipe organ that is reputed to be the world’s largest (http://www.wanamakerorgan.com/about.php).

Vufind updates

Wednesday, February 9, 2011 12:05 am

JP already talked about Vufind but I thought I would add in my notes from the Vufind talk today. Demian Katz (Villanova) took some time in the afternoon to talk about Vufind and its growing support for metadata standards other than MARC. The update centered on how Vufind had been re-tuned to be more agnostic with regards to metadata standards and encoding models. The redesign made use of “Record Drivers” to take control of both screen display functionality and data retrieval processes, OAI harvesters to gather data and XSL importing tools to facilitate metadata crosswalks and full text indexing.

Demian talked at some length about basic features of the metadata indexing toolkit. At the Vufind 2.0 conference he talked a bit about his ability to use the MST from the Extensible Catalog project, and I wonder (no answer, just a question) how the toolkit development with Vufind matches up with the XC project. Demian reported on the OAI-PMH harvester that will gather records remotely and load them into Vufind. I have used an early version of this tool to successfully harvest and import HathiTrust records and am encouraged to see that development has continued. Demian also mentioned a new XSLT importer tool that enables mapping an XML document into an existing Solr configuration.
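
For anyone curious what the harvester is doing under the hood, here is a hedged sketch of the kind of request OAI-PMH defines; the verb and the oai_dc metadata prefix are standard parts of the protocol, but the repository URL below is just a placeholder for something like a DSpace instance’s OAI endpoint.

    # Hedged sketch of a basic OAI-PMH ListRecords request. The verb and
    # oai_dc prefix are standard OAI-PMH; the endpoint URL is a placeholder.
    import requests
    import xml.etree.ElementTree as ET

    OAI_ENDPOINT = "https://example.edu/oai/request"  # placeholder endpoint
    OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
    DC_NS = "{http://purl.org/dc/elements/1.1/}"

    resp = requests.get(OAI_ENDPOINT,
                        params={"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    root = ET.fromstring(resp.content)

    for record in root.iter(f"{OAI_NS}record"):
        for title in record.iter(f"{DC_NS}title"):
            print(title.text)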

This represents an interesting step forward for Vufind as it will allow ZSR to think about harvesting and indexing data from our Dspace instance as well as other sources that support OAI harvesting. All of these features are going to come in the Vufind 1.1 release on March 21st! More to come on this as we get our test instance of Vufind running.

