Collecting and evaluating large volumes of bibliographic metadata aggregated in the WorldCat database: a proposed methodology to overcome challenges


The Electronic Library, Volume 39 (3): 18 – Nov 4, 2021

Publisher
Emerald Publishing
Copyright
© Emerald Publishing Limited
ISSN
0264-0473
eISSN
0264-0473
DOI
10.1108/el-11-2020-0316

Abstract

Purpose
This paper aims to discuss the challenges encountered in collecting, cleaning and analyzing a large data set of bibliographic metadata records in machine-readable cataloging (MARC 21) format. Possible solutions are presented.

Design/methodology/approach
This mixed-methods study relied on content analysis and social network analysis. The study examined subject representation in MARC 21 metadata records created in 2020 in WorldCat, the largest international database of “big smart data.” The methodological challenges that were encountered and their solutions are examined.

Findings
In this general review paper with a focus on methodological issues, the discussion of challenges is followed by a discussion of solutions developed and tested as part of this study. Data collection, processing, analysis and visualization are addressed separately. Lessons learned and conclusions related to challenges and solutions for the design of a large-scale study evaluating MARC 21 bibliographic metadata from WorldCat are given. Overall recommendations for the design and implementation of future research are suggested.

Originality/value
There are no previous publications that address the challenges and solutions of data collection and analysis of WorldCat’s “big smart data” in the form of MARC 21 data. This is the first study to use a large data set to systematically examine MARC 21 library metadata records created after the most recent addition of new fields and subfields to the MARC 21 Bibliographic Format standard in 2019, based on Resource Description and Access (RDA) rules. It is also the first to focus its analyses on the networks formed by subject terms shared by MARC 21 bibliographic records in a data set extracted from the heterogeneous centralized database WorldCat.
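The abstract does not describe how the subject-term networks were built. As a rough illustration only, the following sketch shows one possible reading of "networks formed by subject terms shared by MARC 21 bibliographic records": two subject terms are linked when they appear in the same record, and the link weight counts the records they share. The sample records, the function names `subject_cooccurrence` and `degree`, and the choice of co-occurrence as the linking rule are all assumptions made for this example, not the authors' method.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: record ID -> subject terms, such as might be
# taken from MARC 21 subject access (6XX) fields. Not from the study.
records = {
    "rec1": ["Metadata", "Cataloging", "Libraries"],
    "rec2": ["Metadata", "Linked data"],
    "rec3": ["Cataloging", "Metadata"],
}


def subject_cooccurrence(records):
    """Count, for each pair of subject terms, the records containing both."""
    edges = Counter()
    for terms in records.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        unique_terms = sorted(set(terms))
        for a, b in combinations(unique_terms, 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct neighbouring terms for each term in the network."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {term: len(linked) for term, linked in neighbours.items()}


edges = subject_cooccurrence(records)
degrees = degree(edges)

# List the weighted edges, heaviest first.
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: shared by {weight} record(s)")
```

In this toy data set, "Metadata" has the most neighbours because it appears in every record. At the scale the abstract describes, a dedicated graph library and a visualization tool would likely replace the plain dictionaries used here.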

Journal

The Electronic Library, Emerald Publishing

Published: Nov 4, 2021

Keywords: Metadata; Data analysis; Cataloguing; Data processing; Data collection; Linked data; Information; Information retrieval; Information science
