Professional Development

Carolyn at ALA Annual 2012

Monday, July 9, 2012 7:29 pm

Early Saturday morning, I attended a four-hour panel discussion on linked data (LD) and next generation catalogs. I wanted to gain a better understanding of what exactly linked data is, since that term is batted about frequently in the literature. I will try to explain it to the best of my ability, but I still have much to learn. So here goes.

A uniform resource identifier (URI) is a string of characters used to identify, or name, a “thing”. Specifically, HTTP URIs should be used so that people are able to look up those names. Useful information should be provided with URIs, as well as links to other URIs, so that individuals can discover even more useful things. Per Corey Harper, NYU’s Metadata Services Librarian, we need to start thinking about metadata as a graph rather than as strings, which is how most of our data is currently stored. Typed “things” are named by URIs, and relationships between “things” are also built on URIs. LD allows users to move back and forth between information sources where the focus is on identification rather than description.
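To make the graph idea a little more concrete for myself, here is a minimal sketch in Python. It is only an illustration: every URI below is a made-up example.org address (none of them identify real resources), and the helper names `objects` and `subjects` are my own, not part of any LD tool. Each statement is a subject, relationship, object triple, and all three parts are URIs.

```python
# Hypothetical URIs under example.org, used purely for illustration.
EX = "http://example.org/"

HUCK_FINN = EX + "work/huckleberry-finn"
TOM_SAWYER = EX + "work/tom-sawyer"
MARK_TWAIN = EX + "person/mark-twain"
CREATOR = EX + "vocab/creator"  # the relationship is itself named by a URI

# Each statement is a (subject, predicate, object) triple.
triples = {
    (HUCK_FINN, CREATOR, MARK_TWAIN),
    (TOM_SAWYER, CREATOR, MARK_TWAIN),
}


def objects(subject, predicate):
    """Follow a typed link forward: subject --predicate--> ?"""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(predicate, obj):
    """Follow a typed link backward: ? --predicate--> object"""
    return {s for s, p, o in triples if p == predicate and o == obj}


# Start at one work, hop to its creator, then hop back out to every
# work that points at the same creator.
creators = objects(HUCK_FINN, CREATOR)
related_works = subjects(CREATOR, MARK_TWAIN)
```

Because both works point at the same identifier for the author, rather than each carrying its own text string for his name, moving from one work to the other is just a matter of following links.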

Mr. Harper provided several examples of LD sites available on the Web, to some of which individuals and institutions may contribute data. Google-owned Freebase is a community-curated collection of RDF data describing about 21 million “things”. Freebase provides a link to Google Refine, which allows individuals to dump their metadata, clean it up, and then link it back to Freebase. Thinkbase displays the contents of Freebase as a mind map for exploring millions of interconnected topics.

Phil Schreur, who is the head of the Metadata Department for Stanford University libraries, talked about shattering the catalog, freeing the data, and linking the pieces. Today’s library catalogs are experiencing increased stressors such as:

  • Pressure to be inclusive–the more is better approach as seen with Google
  • Loss of cataloging–the acceptance and use of vendor bulk records; by genericizing our catalogs, we are weakening our ties to our user/collection community
  • Variations in metadata quality
  • Supplementary data–should the catalog just be an endless supply of links
  • Bibliographic records–catalogers spend lots of time tinkering with them
  • Need for a relational database for discovery–catalogs are domain silos that are unlinked to anything else
  • Missing or hidden metadata–universities are data creation powerhouses (e.g. reading lists, course descriptions, student research/data sets, faculty collaborations/lectures); these are often left out of catalog, and it would be costly to include them

Linked open data is the solution, for several reasons:

  • It puts information on the Web and eliminates Google as our users’ first choice
  • Expands discoverability
  • Opens opportunities for creative innovation
  • Continuous improvement of data
  • Creates a store of machine-actionable data–semantic meaning in a MARC record is unintelligible to machines
  • Breaks down silos
  • Provides direct access to data based in statements and not in records–less maintenance of catalog records
  • Frees ourselves from a parochial metadata model to a more universal one

Schreur proceeded to discuss 4 paradigm shifts involving data.

  1. Data is something that is shared and is built upon, not commodified. Move to open data, not restricted records.
  2. Move from bibliographic records to statements linked by RDF. One can reach into documents at chapter and document level.
  3. Capture data at point of creation. The model of creating individual bibliographic records cannot stand. New means of automated data will need to be developed.
  4. Manage triplestores rather than adding more records to the catalog. The amount of data is overwhelming. Applications will need to be developed to bring in data.

He closed by stating the notion of authoritative is going to get turned on its head. The Web is already doing that. Sometimes Joe Blow knows more than the national library. This may prove difficult for librarians and catalogers to accept since our work has revolved around authoritative sources and data.

OCLC’s Ted Fons spoke about WorldCat.org’s June 20, 2012 addition of schema.org descriptive mark-up to its database. Schema.org is a collaboration between Bing, Google, Yahoo, and the Russian search engine Yandex, and is an agreed ontology for harvesting structured data from the Web. The reasons for doing this include:

  • Make library data appear more relevant in search engine results
  • Gain position of authority in data modeling in a post-MARC era
  • Promote internal efficiency and new services
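For anyone wondering what schema.org mark-up looks like, here is a rough sketch in Python. It assumes the JSON-LD syntax, which is one of several ways to express schema.org terms; the session did not cover which syntax WorldCat.org uses, so this is not a picture of OCLC's actual output. The record itself is a made-up example.

```python
import json

# A made-up description of one book using schema.org terms.
# "@context" says where the vocabulary comes from; "@type" says
# what kind of "thing" is being described.
book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "Adventures of Huckleberry Finn",
    "author": {"@type": "Person", "name": "Mark Twain"},
    "inLanguage": "en",
}

# Serialize to the text that would be embedded in a Web page
# for search engines to harvest.
markup = json.dumps(book, indent=2)
print(markup)
```

The point is that a search engine reading this no longer has to guess which string is the title and which is the author; the vocabulary tells it.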

Jennifer Bowen, Chair of the eXtensible Catalog Organization, believes LD can help libraries take on new roles in meeting the information needs of our users. Scholars want their research to be findable by others, and they want to connect with others. Libraries are being bypassed not only by Google and the Web; users are also going to tailored desktop, mobile, and Web apps. Libraries need to push their collections to mobile apps, and LD allows us to do just that. Hands-on experience with LD is needed to understand its potential and to develop LD best practices. We need to create LD for our local resources (e.g. Institutional Repository) to showcase special collections. Vendors need to be encouraged to implement LD now! Opportunities for creative innovation in digital scholarship and participation can be fostered by utilizing LD.

A tool that will enable libraries to move from their legacy data to LD is needed. The eXtensible Catalog (XC) is open source software for libraries and provides a discovery system and set of tools available for download. It provides a platform for risk-free experimentation with metadata transformation/reuse. RDF/XML, RDFa, and SPARQL are 3 methods of bulk-creating metadata. XC converts MARC data to FRBR entities and enables us to produce more meaningful LD. Reasons to use FRBR for LD include:

  • User research shows that users want to see the relationships between resources, etc. Users care about relationships.
  • Allows scholars to create LD statements as part of the scholarly process. Vocabularies are created and managed. Scholars’ works become more discoverable.
  • Augments metadata.

The old model of bibliographic data creation will continue for some time. We are at the beginning of the age of data, and the amount of work is crushing. Cataloging skills are what is needed in this new age, but a recasting of what we do and use is required. We are no longer the Cataloging Department but the Metadata Department. The tools needed to create data and make libraries’ unique collections available on the Web will change, and catalogers should start caring more about the context and curation of metadata and learning LD vocabulary.

While this was my second visit to Anaheim, CA to attend ALA’s Annual Conference, it was my first time ever presenting at a national conference. On Sunday morning starting at 8 am, Erik Mitchell and I hosted and convened the panel discussion, Current Research on and Use of FRBR in Libraries. The title of our individual presentation was FRBRizing Mark Twain.

We began the session with a quick exploration of some of the metadata issues that libraries are encountering as we explore new models including FRBR and linked open data. Erik and I discussed our research, which explored metadata quality issues that arose when we applied the FRBR model to a selected set of records in ZSR’s catalog. Our research questions were twofold:

  1. What metadata quality problems arise in application of FRBRization algorithms?
  2. How do computational and expert approaches compare with regard to FRBRization?

So in a nutshell, this is how we did it:

  1. Erik extracted 848 catalog records on books either by or about Mark Twain.
  2. He extracted data from the record set and normalized text keys from elements of the metadata.
  3. Data was written to a spreadsheet and loaded into Google Refine to assist with analysis.
  4. Carolyn grouped records into work-sets and created a matrix of unique identifiers.
  5. Because of metadata variation, Carolyn performed a secondary analysis using a book-in-hand approach for 5 titles (approx. 100 books).
  6. Expert review found 410 records grouped in 147 work-sets with 2 or more expressions and 420 records grouped into 420 single-expression work-sets. Lost/missing or checked out books were not looked at, which is why these numbers do not add up to the 848 records in the record set.
  7. Metadata issues encountered included the need to represent whole/part or manifestation to multiple work relationships, metadata inconsistency (i.e. differences in record length, composition, invalid unique identifiers), and determining work boundaries.
  8. Utilizing algorithms, Erik performed a computational assessment to identify and group work-sets.
  9. Computational and expert assessments were compared to each other.
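To give a flavor of what a normalized text key looks like, here is a simplified sketch in Python. It is not the code used in our study; the normalization rules, the function names, and the three sample records are all made up for illustration. The idea is to strip away punctuation, case, and spacing differences so that records for the same work end up with the same author/title key.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def work_key(author, title):
    """Build one author/title key for a record (hypothetical key format)."""
    return normalize(author) + "/" + normalize(title)


def group_into_work_sets(records):
    """Map each key to the list of record ids that share it."""
    work_sets = defaultdict(list)
    for rec in records:
        work_sets[work_key(rec["author"], rec["title"])].append(rec["id"])
    return dict(work_sets)


# Made-up sample records with the kind of punctuation variation
# seen in catalog data.
records = [
    {"id": "r1", "author": "Twain, Mark, 1835-1910.",
     "title": "Adventures of Huckleberry Finn /"},
    {"id": "r2", "author": "Twain, Mark, 1835-1910",
     "title": "Adventures of Huckleberry Finn"},
    {"id": "r3", "author": "Twain, Mark, 1835-1910.",
     "title": "The adventures of Tom Sawyer."},
]

work_sets = group_into_work_sets(records)
```

Here r1 and r2 fall into one work-set despite their differing punctuation, while r3 gets its own. A key this simple will still split a work whose title varies between records (a leading article, a subtitle), which is the sort of work-boundary problem noted in step 7.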

Erik and I were really excited to see that computational techniques were largely as successful as expert techniques. We found, for example, that normalized author/title strings created highly accurate keys for identifying unique works. On the other hand, we also found that MARC metadata did not always contain the metadata needed to identify works entirely. Our detailed findings will be presented at the ASIS&T conference in October. Here are our slides:

Current Research on and Use of FRBR in Libraries

Our other invited speakers included:

  • OCLC’s Chief Scientist Thom Hickey, who spoke about clustering OCLC’s database, which is under 300 million records, at the FRBR entity 1 work level, and clustering within work-sets by expression using algorithm keys; FRBR algorithm creation and development; and the fall release of GLIMIR, which attempts to cluster WorldCat’s records and holdings for the same work at the manifestation level.
  • Kent State’s School of Information and Library Science professors Drs. Athena Salaba and Yin Zhang discussed their IMLS (Institute of Museum and Library Services) funded project, a FRBR prototype catalog. Library of Congress cataloging records were extracted from WorldCat to create a FRBRized catalog. Users were tested to see if they could complete a set of user tasks in the library’s current catalog and in the prototype.
  • Jennifer Bowen, Chair of the XC organization and Assistant Dean for Information Management Services at the University of Rochester, demonstrated the XC catalog to the audience. The XC project didn’t set out to see if people liked FRBR, but to learn what our users are trying to do with the catalog’s data. According to Ms. Bowen, libraries are/should be moving away from thinking we know what users need and toward asking what users need to do in their research. How do users keep current in their field? With regard to library data, we need to ask our users, “What would they do with a magic wand?” and continue to ponder “What will the user needs of the future be?”

Following our session, I joined a packed room of librarians eager to hear more about the Library of Congress’ (LC) Bibliographic Framework Transition Initiative (BFI), which is looking to translate the MARC21 format, a 40-year-old standard, to a LD model. LC has contracted with Zepheira to help accelerate the launch of BFI. By August/September, an LD working draft will hopefully be ready to present to the broader library community.

Chris at the 2010 NASIG Conference- Day 1

Sunday, June 27, 2010 8:41 pm

This year, NASIG celebrated its 25th anniversary at its conference in Palm Springs, California. Since I was not part of the conference planning committee, I was able to be an “attendee” once again and learn more about the latest challenges for serials and other continuing resources. These are the highlights for the sessions I attended on the first day.

Vision Session #1: Eric Miller of Zepheira, LLC on Linked Data and Librarians

With linked data becoming the latest trend in computing, I was glad that I attended Erik’s session before going to the conference! Linked data allows users to pull information that had been previously inaccessible on the “front end” of websites and makes it available for users to connect it to other data points across the Internet. Miller went further to explain that this does not involve bringing this data together into one database: rather, applications and similar programs would manipulate the data without harvesting it locally.

Where does this leave libraries? Miller suggested that libraries can participate by contributing their expertise in specific areas such as controlled vocabulary and data portability. Sites such as the BBC and The New York Times have made their information available to users on the back end, but creating standards for that data would be the next possible step. As with so many other emerging technologies, libraries may have an advantage in bringing eventual order to the initial chaos.

Strategy Session- Not for the Faint of Heart! A New Approach to “Serials” Management

This session was presented by two members of OCLC about developing new approaches to managing the workflows required for serials in electronic format. Working as a partner with several libraries, OCLC has begun to develop a user-driven product that can respond to the specific needs of a particular institution. Core portions of the electronic management workflow have been outlined already: selecting and ordering, negotiation and licensing, receiving and maintenance, and payment and invoicing. Combining these with several “pain points” that can create potential bottlenecks in the workflow, OCLC hopes to aid libraries by making this process as routine and painless as possible.

The results of this study by OCLC are expected to be released later this year, and the presenters sought feedback from the audience as to any information that they may have missed. Although the title and description of this presentation did not correspond with what was presented, it was interesting nonetheless. It demonstrated that others are attempting to grapple with the issues of electronic serials management.

Tactics Session- Don’t Pay Twice! Leveraging Licenses to Lower Student Costs

UCLA relies heavily on printed course readers that supplement the textbooks that students are required to purchase for their classes. In 2008, several student organizations approached the library about how to reduce the costs for these readers, which were usually assembled using articles and other materials that had been licensed by the library. Two librarians approached this dilemma by examining every aspect of a course pack, from the license negotiations for journals all the way to the costs associated with the campus copy center. As a result, the library was able to reduce the costs for the readers by as much as $42,000 over three quarters (depending on the discipline, emphasis on journals over monographs, and so forth) as well as hundreds of dollars in copying fees. In the end, the library was not only able to gain more from its license negotiations, but it was able to leverage its campus connections to create successful partnerships with student organizations.

Moving forward, the librarians considered other possibilities: developing potential partnerships with the bookstore, analyzing the pros and cons of an annual license with the Copyright Clearance Center, assessing whether relying on fair use, with its potential risk, would be viable and sustainable, examining other options such as the public domain and Creative Commons, and supporting license portals. The question of developing electronic course readers that could be placed behind course management software has also emerged, and that may reduce costs further. With the program successfully marketed through student organizations, its continued growth and success seem assured. This library service can continue to progress as the licensing process evolves in the coming years.

Tactics Session- Licensing Electronic Journals through Non-Subscription-Agent “Go Betweens”

Subscription agents have long served an essential function in serials management, serving as intermediaries between libraries and publishers. However, there are areas around the world where subscription agents have neither a significant presence nor a relationship with the local publishers. This is where non-agents can play a role. Non-agents function as either for-profit or non-profit entities that work between libraries and publishing agencies, particularly society presses and small agencies, in foreign countries. The cost of their business is not passed to libraries, and the invoices for purchased items come directly from those publishers.

This is a business model of which I was unaware before the conference. As the curriculum of the university continues to build an international focus, the usefulness of these non-agents becomes clear. I believe that it could have possibilities for subscriptions that cannot be secured by any other method, and it could have a similar benefit for monographs. Two organizations that serve in this capacity are Accucoms and FASEB.

* * *

Here is a photo taken from the flight on the way to Palm Springs. More to come in Day 2!

code4lib 2009 – Vufind

Monday, February 23, 2009 2:03 pm

This morning kicked off code4lib 2009 with a series of pre-conferences. Both Kevin and I attended the Vufind preconference session, which included an overview of Vufind, an install exercise, and a Q&A session on Vufind features and issues.

I documented lots of notes & tips on our Vufind project page in the library wiki, and once we get back to WS we will be able to make some good progress on ironing out some of the features that we are interested in implementing.

Following an interesting lunch with some folks from Equinox (and a cold walk back to the hotel) I am getting ready for the Linked Data session. . .

