Library Gazette

Improving the Format Facet in the Catalog

Saturday, July 28, 2012 11:33 am

by Kevin Gilbertson and Carolyn McCallum

Earlier this year, Carolyn and I embarked on an ambitious project to revise VuFind’s format facet. This facet – Book, eBook, DVD, etc. – powers the main search box on the library’s homepage and provides enhanced browsing in the catalog itself. While the immediate reason for the project was a request to identify streaming videos in the catalog, the need for a significant revision and the awareness of its importance had been growing for some time. That is, with the increasing number of electronic materials we were adding to the catalog, it was clear that the then-current format mappings were limited, often inconsistent, and wholly ignorant of the nuances in new format designations.

To resolve VuFind’s format mapping issues, we delved into MARC’s fixed-field elements and the 007 field (physical characteristics of non-print items). The coding of the fixed-field elements and of the variable 007 field in a MARC record is critical to how VuFind determines an item’s format. Based on our review of these MARC codings, we adjusted VuFind’s mapping algorithm, re-indexed the catalog several times, and checked each round of changes in a test version of VuFind.
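As a rough illustration of the kind of mapping described above, here is a minimal Python sketch. It is not VuFind’s actual code or our production rules: the function name `guess_format`, the facet labels it returns, and the order of the checks are all assumptions made for the example. It relies only on standard MARC 21 positions: Leader/06 (type of record), Leader/07 (bibliographic level), 007/00 (category of material), 007/01 (specific material designation), and 007/04 (videorecording format).

```python
def guess_format(leader: str, fields_007: list[str]) -> str:
    """Return a coarse format label from a MARC leader and its 007 fields.

    Hypothetical sketch: the labels and rule order are illustrative only.
    """
    record_type = leader[6] if len(leader) > 6 else " "  # Leader/06
    bib_level = leader[7] if len(leader) > 7 else " "    # Leader/07

    # 007/00 = 'c' (electronic resource) with 007/01 = 'r' (remote)
    is_remote = any(f[:2] == "cr" for f in fields_007)

    if record_type == "g":  # projected medium, e.g. video
        if is_remote:
            return "Streaming Video"
        # 007/00 = 'v' (videorecording) with 007/04 = 'v' (DVD)
        if any(f[:1] == "v" and len(f) > 4 and f[4] == "v" for f in fields_007):
            return "DVD"
        return "Video"
    if record_type in ("i", "j"):  # nonmusical / musical sound recording
        return "Streaming Audio" if is_remote else "Audio"
    if record_type == "a":  # language material
        if bib_level == "s":  # serial
            return "eJournal" if is_remote else "Journal"
        if bib_level == "m":  # monograph
            return "eBook" if is_remote else "Book"
    # Anything not handled above falls into a catch-all label.
    return "Electronic" if is_remote else "Unknown"
```

For example, `guess_format("00000nam a2200000 a 4500", ["cr |||||||||||"])` follows the monograph branch and returns `"eBook"`. The final catch-all line shows how a generic ‘Electronic’ bucket can fill up when the more specific rules are missing.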

As we worked, we came across many unexpected format assignments. For example, during one of these reviews, we noticed the inclusion of a university press book in the ‘GovDoc’ facet. After inspecting the coding, we discovered that state university press publications are coded as government publications in MARC records (the fixed-field GPub element) and therefore map to the ‘GovDoc’ facet in VuFind. According to OCLC’s Bibliographic Formats and Standards, libraries are to “treat an item published by an academic institution as a government publication if the government created or controls the institution. For example, publications of state university presses in the United States are government publications at the state level.” While our mapping was technically correct, we thought most users would expect to find a book published by a university press under ‘Book’ and not under ‘Government Document’. Each time we encountered an unexpected result like this one, we reviewed the MARC codings and adjusted VuFind’s mapping algorithm.
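One hypothetical way to express an exception like the university press case is sketched below in Python. The post does not spell out the actual rule, so the publisher-name check, the function name `facet_for_book`, and the facet labels are assumptions for illustration only. The code values are the MARC 21 008/28 (GPub) codes that mark a government publication at some level.

```python
# 008/28 (GPub) codes that indicate a government publication at some level.
# Blank, 'u' (unknown), and '|' (no attempt to code) are deliberately excluded.
GOVERNMENT_CODES = set("acfilmosz")


def facet_for_book(gpub_code: str, publisher: str) -> str:
    """Pick 'GovDoc' or 'Book' for a print monograph (illustrative rule only)."""
    is_government = gpub_code in GOVERNMENT_CODES
    # Assumed heuristic: treat anything naming a university press as a plain book.
    is_university_press = "university press" in publisher.lower()
    if is_government and not is_university_press:
        return "GovDoc"
    return "Book"
```

With this rule, `facet_for_book("s", "Louisiana State University Press")` returns `"Book"` even though the record is coded as a state-level government publication, while a federal document from a government printer still returns `"GovDoc"`. A name-based check like this would miss presses that do not use the phrase “university press”, which is one reason questionable formats can persist.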

Another example of what we addressed was the ‘Electronic’ facet. When we began our project, the catalog showed 615,320 items as ‘Electronic’. While this facet may have been accurate given an item’s coding, in use it was problematic: it lacked adequate differentiation and became a catch-all that hid any item not handled elsewhere in the assignment process. So, while some ebooks were ‘eBooks’, others were simply (and only) ‘Electronic’. In our last test version, we had reduced the ‘Electronic’ facet to just 569 items. Where did the other 614,751 items go? The bulk of them went to the ‘eBook’ facet (23,267 ebooks became 487,633 ebooks), and over 2,000 items were added to the ‘Streaming Video’ facet. The remaining items were distributed among other new electronic format facets, including Streaming Audio, eGovDocs, and eJournals.
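The arithmetic behind these figures can be checked with a few lines of Python. The counts come straight from the paragraph above; the variable names are ours, and the last two values assume that all of the growth in the ‘eBook’ facet came out of the old ‘Electronic’ facet, which the post implies but does not state outright.

```python
electronic_before = 615_320
electronic_after = 569
ebooks_before = 23_267
ebooks_after = 487_633

# Items that left the 'Electronic' facet.
moved_out = electronic_before - electronic_after   # 614,751

# Growth of the 'eBook' facet.
ebook_gain = ebooks_after - ebooks_before          # 464,366

# Assumption: the whole eBook gain came from 'Electronic'. What is left
# covers Streaming Video (over 2,000) and the other new electronic facets.
other_facets = moved_out - ebook_gain              # 150,385

# Share of the moved items that became eBooks under the same assumption.
ebook_share = ebook_gain / moved_out

print(f"{moved_out:,} moved; {ebook_gain:,} to eBook ({ebook_share:.1%}); "
      f"{other_facets:,} to other facets")
```

Under that assumption, roughly three quarters of the reassigned items became eBooks.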

We pushed our changes into production in March and have been watching how they perform over the past few months. It was not easy work, and you may continue to see items with questionable formats. There are limits to what the format mapping algorithm can achieve with the MARC codings we have.

With the recent OCLC reclamation project and the authority control work, there is a healthy confluence of effort to improve our data and its representation in the catalog, and we wanted to share a before-and-after view of our improvements. If you see areas that need further improvement, please let us know.

4 Responses to “Improving the Format Facet in the Catalog”

  1. Carolyn and Kevin,

    Thanks for your hard work on this. Those of us who use VuFind every day at work certainly appreciate it!

  2. This is a great improvement! (and an interesting project.) thanks for sharing.

  3. Algorithms, format mapping, coding, authority control: all these things are great in concept. You guys have shown how important an actual librarian is in the process.

  4. I agree with the other comments! This is a fascinating and worthwhile project. Great work!
