Professional Development

Carolyn at ALA Annual 2012

Monday, July 9, 2012 7:29 pm

Early Saturday morning, I attended a four-hour panel discussion on linked data (LD) and next generation catalogs. I wanted to gain a better understanding of what exactly linked data is, since that term is batted about frequently in the literature. I will try to explain it to the best of my ability, but I still have much to learn. So here it goes.

A uniform resource identifier (URI) is a string of characters used to identify, or name, a “thing”. Specifically, HTTP URIs should be used so that people are able to look up those names. Useful information should be provided with URIs, as well as links to other URIs, so that individuals can discover even more useful things. Per Corey Harper, NYU’s Metadata Services Librarian, we need to start thinking about metadata as a graph rather than as strings, which is how most of our data is currently stored. Typed “things” are named by URIs, and relationships between “things” are also built on URIs. LD allows users to move back and forth between information sources where the focus is on identification rather than description.
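
To make the graph idea a little more concrete, here is a minimal sketch in Python that uses nothing but plain tuples. Every URI under example.org is a made-up placeholder, and the relationship names are my own invention rather than a real vocabulary. The point is only that both the “things” and the relationships between them are named by URIs, and that you can follow the links in either direction.

```python
# Each statement is a (subject, predicate, object) triple.
# All example.org URIs are hypothetical placeholders, not real identifiers.
EX = "http://example.org/"

triples = [
    (EX + "person/mark-twain", EX + "rel/wrote", EX + "work/huckleberry-finn"),
    (EX + "person/mark-twain", EX + "rel/wrote", EX + "work/tom-sawyer"),
    (EX + "work/tom-sawyer", EX + "rel/hasSubject", EX + "place/mississippi-river"),
    (EX + "work/huckleberry-finn", EX + "rel/hasSubject", EX + "place/mississippi-river"),
]

def outgoing(node):
    """Return (predicate, object) pairs for statements whose subject is node."""
    return [(p, o) for s, p, o in triples if s == node]

def incoming(node):
    """Return (subject, predicate) pairs for statements whose object is node."""
    return [(s, p) for s, p, o in triples if o == node]

# Move "forth" from the author to the works, then "back" from a place to the works.
works = [o for p, o in outgoing(EX + "person/mark-twain")]
about_river = [s for s, p in incoming(EX + "place/mississippi-river")]
```

Starting from the author or from the place leads to the same two works, which is the “move back and forth” behavior described above.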

Mr. Harper provided several examples of LD sites available on the Web, to some of which individuals and institutions may contribute data. Google-owned Freebase is a community-curated collection of RDF data about some 21 million “things”. Freebase provides a link to Google Refine that allows individuals to dump their metadata, clean it up, and then link it back to Freebase. Thinkbase displays the contents of Freebase as a mind map for exploring millions of interconnected topics.

Phil Schreur, who is the head of the Metadata Department for Stanford University libraries, talked about shattering the catalog, freeing the data, and linking the pieces. Today’s library catalogs are experiencing increased stressors such as:

  • Pressure to be inclusive–the more is better approach as seen with Google
  • Loss of cataloging–the acceptance and use of vendor bulk records; by genericizing our catalogs, we are weakening our ties to our user/collection community
  • Variations in metadata quality
  • Supplementary data–should the catalog just be an endless supply of links
  • Bibliographic records–catalogers spend lots of time tinkering with them
  • Need for a relational database for discovery–catalogs are domain silos that are unlinked to anything else
  • Missing or hidden metadata–universities are data creation powerhouses (e.g. reading lists, course descriptions, student research/data sets, faculty collaborations/lectures); these are often left out of catalog, and it would be costly to include them

Linked open data is the solution, for several reasons:

  • It puts information on the Web and eliminates Google as our users’ first choice
  • Expands discoverability
  • Opens opportunities for creative innovation
  • Continuous improvement of data
  • Creates a store of machine-actionable data–semantic meaning in MARC record is unintelligible to machines
  • Breaks down silos
  • Provides direct access to data based in statements and not in records–less maintenance of catalog records
  • Frees ourselves from a parochial metadata model to a more universal one

Schreur proceeded to discuss 4 paradigm shifts involving data.

  1. Data is something that is shared and is built upon, not commodified. Move to open data, not restricted records.
  2. Move from bibliographic records to statements linked by RDF. One can reach into documents at chapter and document level.
  3. Capture data at point of creation. The model of creating individual bibliographic records cannot stand. New means of automated data will need to be developed.
  4. Manage triplestores; not adding more records to catalog. The amount of data is overwhelming. Applications will need to be developed to bring in data.
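
The shift from records to statements can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the field names, the example.org URIs, and the chapter title are invented, and a real triplestore would use RDF vocabularies and a query language rather than a Python set.

```python
# A flat, record-style description (field names are illustrative only).
record = {
    "id": "http://example.org/book/1",
    "title": "Adventures of Huckleberry Finn",
    "creator": "http://example.org/person/mark-twain",
    "chapter": "http://example.org/book/1/chapter/1",
}

def record_to_statements(rec):
    """Turn one record into independent (subject, predicate, object) statements."""
    subject = rec["id"]
    return [(subject, field, value) for field, value in rec.items() if field != "id"]

store = set(record_to_statements(record))
# A statement about a chapter can be added without editing the original record.
store.add(("http://example.org/book/1/chapter/1", "title", "Chapter 1"))

def match(statements, s=None, p=None, o=None):
    """Return statements matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in statements
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

titles = match(store, p="title")
```

Here `match(store, p="title")` returns the title statements for both the book and its chapter, which hints at how statements let one reach into a document below the level of the whole record.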

He closed by stating the notion of authoritative is going to get turned on its head. The Web is already doing that. Sometimes Joe Blow knows more than the national library. This may prove difficult for librarians and catalogers to accept since our work has revolved around authoritative sources and data.

OCLC’s Ted Fons spoke about WorldCat.org’s June 20, 2012 adoption of schema.org descriptive mark-up in its database. Schema.org is a collaboration between Bing, Google, Yahoo, and the Russian search engine Yandex, and is an agreed ontology for harvesting structured data from the Web. The reasons behind doing this include:

  • Makes library data appear more relevant in search engine results
  • Gain position of authority in data modeling in a post-MARC era
  • Promote internal efficiency and new services
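
For readers who have not seen schema.org mark-up, here is a rough sketch of what a book description built from schema.org terms can look like. `Book`, `Person`, `name`, `author`, and `inLanguage` are real schema.org terms, but the values are my own example, and I have chosen the JSON-LD syntax purely for readability. WorldCat’s actual mark-up is not reproduced in my notes and may well use a different syntax.

```python
import json

# A hypothetical book description using schema.org terms.
# The values and the choice of JSON-LD syntax are illustrative assumptions.
book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "Adventures of Huckleberry Finn",
    "author": {"@type": "Person", "name": "Mark Twain"},
    "inLanguage": "en",
}

def to_json_ld_script(data):
    """Wrap the description in the script tag that HTML pages use to embed JSON-LD."""
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"

markup = to_json_ld_script(book)
```

A search engine that understands the shared vocabulary can read `markup` as structured data about a book and its author instead of guessing from the page text.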

Jennifer Bowen, Chair of the eXtensible Catalog Organization, believes LD can help libraries assist and fulfill new roles in the information needs of our users. Scholars want their research to be findable by others, and they want to connect with others. Libraries are being bypassed not only by Google and the Web, but users are also going to tailored desktops, mobile, and Web apps. Libraries need to push their collections to mobile apps and LD allows us to do just that. Hands-on experience with LD to understand its potential and to develop LD best practices is needed. We need to create LD for our local resources (e.g. Institutional Repository) to showcase special collections. Vendors need to be encouraged to implement LD now! Opportunities for creative innovation in digital scholarship and participation can be fostered by utilizing LD.

A tool that will enable libraries to move from their legacy data to LD is needed. The eXtensible Catalog (XC) is open source software for libraries that provides a discovery system and a set of tools available for download. It provides a platform for risk-free experimentation with metadata transformation and reuse. RDF/XML, RDFa, and SPARQL were presented as 3 methods of bulk creating metadata. XC converts MARC data to FRBR entities and enables us to produce more meaningful LD. Reasons to use FRBR for LD include:

  • User research shows that users want to see the relationships between resources, etc. Users care about relationships.
  • Allows scholars to create LD statements as part of the scholarly process. Vocabularies are created and managed. Scholars’ works become more discoverable.
  • Augments metadata.

The old model of bibliographic data creation will continue for some time. We are at the beginning of the age of data, and the amount of work is crushing. Cataloging skills are what is needed in this new age, but a recasting of what we do and use is required. We are no longer the Cataloging Department but the Metadata Department. The tools needed to create data and make libraries’ unique collections available on the Web will change, and catalogers should start caring more about the context and curation of metadata and learning LD vocabulary.

While this was my second visit to Anaheim, CA to attend ALA’s Annual Conference, it was my first time ever presenting at a national conference. On Sunday morning starting at 8 am, Erik Mitchell and I hosted and convened the panel discussion, Current Research on and Use of FRBR in Libraries. The title of our individual presentation was FRBRizing Mark Twain.

We began the session with a quick exploration of some of the metadata issues that libraries are encountering as we explore new models including FRBR and linked open data. Erik and I discussed our research, which explored metadata quality issues that arose when we applied the FRBR model to a selected set of records in ZSR’s catalog. Our research questions were twofold:

  1. What metadata quality problems arise in application of FRBRization algorithms?
  2. How do computational and expert approaches compare with regards to FRBRization?

So in a nutshell, this is how we did it:

  1. Erik extracted 848 catalog records on books either by or about Mark Twain.
  2. He extracted data from the record set and normalized text keys from elements of the metadata.
  3. Data was written to a spreadsheet and loaded into Google Refine to assist with analysis.
  4. Carolyn grouped records into work-sets and created a matrix of unique identifiers.
  5. Because of metadata variation, Carolyn performed a secondary analysis using book-in-hand approach for 5 titles (approx. 100 books).
  6. Expert review found 410 records grouped in 147 work-sets with 2 or more expressions and 420 records grouped into 420 single-expression work-sets. Lost, missing, or checked-out books were not examined, which is why these totals (830 records) fall short of the 848 records in the record set.
  7. Metadata issues encountered included the need to represent whole/part or manifestation to multiple work relationships, metadata inconsistency (i.e. differences in record length, composition, invalid unique identifiers), and determining work boundaries.
  8. Utilizing algorithms, Erik performed a computational assessment to identify and group work-sets.
  9. Computational and expert assessments were compared to each other.
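
To give a flavor of the computational side, here is a minimal sketch of grouping records into work-sets by a normalized author/title key. It is not the algorithm Erik actually ran. The normalization rules, the `work_key` helper, and the three sample records are all my own assumptions for illustration.

```python
import re
import unicodedata
from collections import defaultdict

def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())

def work_key(author, title):
    """Build a hypothetical work key from normalized author and title strings."""
    return normalize(author) + "/" + normalize(title)

# Made-up sample records: (record id, author, title).
records = [
    ("r1", "Twain, Mark, 1835-1910.", "Adventures of Huckleberry Finn"),
    ("r2", "Twain, Mark, 1835-1910", "Adventures of Huckleberry Finn."),
    ("r3", "Twain, Mark, 1835-1910.", "The adventures of Tom Sawyer"),
]

# Records that share a key are treated as belonging to the same work-set.
work_sets = defaultdict(list)
for rec_id, author, title in records:
    work_sets[work_key(author, title)].append(rec_id)

multi = {k: v for k, v in work_sets.items() if len(v) >= 2}
single = {k: v for k, v in work_sets.items() if len(v) == 1}
```

In this toy set, the two Huckleberry Finn records differ only in trailing punctuation, so they collapse into one work-set, while the Tom Sawyer record stands alone. Real catalog records vary in far messier ways, which is where the metadata quality issues in step 7 come in.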

Erik and I were really excited to see that computational techniques were largely as successful as expert techniques. We found, for example, that normalized author/title strings created highly accurate keys for identifying unique works. On the other hand, we also found that MARC metadata did not always contain the metadata needed to identify works entirely. Our detailed findings will be presented at the ASIS&T conference in October. Here are our slides:

Current Research on and Use of FRBR in Libraries

Our other invited speakers included:

  • OCLC’s Chief Scientist Thom Hickey, who spoke about clustering OCLC’s database, which is under 300 million records, at the FRBR entity 1 work level, and clustering within work-sets by expression using algorithm keys; FRBR algorithm creation and development; and the fall release of GLIMIR, which attempts to cluster WorldCat’s records and holdings for the same work at the manifestation level.
  • Kent State’s School of Information and Library Science professors Drs. Athena Salaba and Yin Zhang discussed their IMLS (Institute of Museum and Library Services) funded project, a FRBR prototype catalog. Library of Congress cataloging records were extracted from WorldCat to create a FRBRized catalog. Users were tested to see if they could complete a set of user tasks in the library’s current catalog and in the prototype.
  • Jennifer Bowen, Chair of the XC organization and Assistant Dean for Information Management Services at the University of Rochester, demonstrated the XC catalog to the audience. The XC project didn’t set out to see whether people liked FRBR, but to learn what our users are trying to do with the catalog’s data. According to Ms. Bowen, libraries are, or should be, moving away from assuming we know what users need and toward asking what users need to do in their research. How do users keep current in their field? In regard to library data, we need to ask our users, “What would they do with a magic wand?” and continue to ponder, “What will the user needs of the future be?”

Following our session, I joined a packed room of librarians eager to hear more about the Library of Congress’ (LC) Bibliographic Framework Transition Initiative (BFI), which is looking to translate the MARC21 format, a 40-year-old standard, to an LD model. LC has contracted with Zepheira to help accelerate the launch of BFI. By August/September, an LD working draft will hopefully be ready to present to the broader library community.

ALA Anaheim with Lynn

Wednesday, June 27, 2012 10:22 am

I like Anaheim as a conference site. While it doesn’t have big city stores and museums, the weather was perfect for four days, the Convention Center was like a garden, and the hotels and programs were all close. It was a nice change.

One of the things I enjoyed most about the conference was attending presentations by ZSR people. Carolyn teamed up with former colleague Erik Mitchell to coordinate a panel and give the paper “Current Research on and Use of FRBR in Libraries.” I know very little about FRBR but they presented the information very clearly and what I got out of it was that in a pilot competition to analyze the works of Mark Twain, Carolyn went up against The Machine and won. Go Carolyn!

I also saw Molly more than hold her own on a distinguished panel of copyright experts analyzing the Georgia State e-reserves case. Molly was the only non-lawyer on the panel but she was every bit as knowledgeable and insightful as the other panelists. A person from Georgia State was in the audience and contributed quite a bit to the discussion. It must be strange to hear your own library discussed at every library conference in such great detail!

Sarah did a poster session for the Science and Technology Section of ACRL that put together all of her experiences in serving science students at Wake Forest. I remember reading and hearing her talk about these things individually, but when put all together, it was quite impressive. Her poster was very well received, and I even had to stand in line to talk to her about it!

Due to conflicts, I missed Hu’s presentation of Susan’s paper on embedded librarianship but I ran into three different people afterward who told me how good it was! And Roz reported that there were about 400 people present in a standing room only crowd to hear her presentation “Critical Thinking & Library Instruction: Fantasyland or Adventureland?” Did I miss any other presentations? It is so wonderful to see ZSR and Wake Forest shine so brightly in a national forum!

One of the main reasons I attended Annual this year was to co-chair the Cyber Zed Shed Committee for the 2013 ACRL National Conference. I attended three sets of meetings related to this task as we prepare to receive and judge the submissions this fall and winter. Roz is on the Committee with me.

The other main reason I came to Anaheim was to meet with groups of WFU alums in southern California. Angela Glover did a great job at setting up events where I could meet with people and tell them what we are doing in the Library. On Saturday afternoon, Angela, Roz and I drove up to Pacific Palisades to meet with about 25 people in the gorgeous backyard of a former student of Roz and Hu’s. I talked for about 10 minutes and then answered lots of good questions from people who were genuinely interested in books, reading, libraries and, of course, Wake Forest! The event made it to the WFU Alumni blog, which you can read here.

Then, Angela drove me down to San Diego Monday afternoon, where I had dinner with a group of 10 alums who are just about the most fun group of people I have ever met! I never did get around to giving my speech because we were having too much fun, but I did talk a lot about how ZSR has changed through the years, how our mission helps the university advance its mission, etc. They said that not too many WFU administrators take the time to come all the way out to San Diego, so they were thrilled to have me and made me promise to come back, which I will!

All in all, a great conference!

