Data Wrangling from Socio-Academic Web-Space: Designing a Meta Model
Keywords: Data Wrangling, Data Carpentry, OpenRefine, ODbL Databases, Socio-Academic Data, REST/API
Data carpentry is an emerging field in the domain of LIS and has opened new possibilities for information professionals to survive in the age of data-intensive information services. However, library professionals face information overload because freely available data is growing in both quantity and variety. The role of library professionals is shifting from tech-savvy to data-savvy. This research discusses the possibilities of ODbL-based data sources that offer freely accessible data through API calls, and proposes a meta-model for fetching and extracting datasets from these databases using the open-source data wrangling tool OpenRefine. Further, it discusses how data wrangling techniques can be applied to diverse sources in libraries, and how information professionals can use openly available data to provide value-added information services to users. The practical implications of an array of databases are illustrated through two case studies: Case Study I measures the productivity of individual institutions through different metrics, and Case Study II compares coverage among the ODbL-based citation and Altmetric databases. This meta-model will aid in understanding the potential application of data wrangling techniques to an array of library services.
Copyright (c) 2023 Journal of Information and Knowledge
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The copyright of all articles published in Journal of Information and Knowledge is held by the Publisher. Sarada Ranganathan Endowment for Library Science (SRELS), as the publisher, requires its authors to transfer the copyright prior to publication. This permits SRELS to reproduce, publish, distribute and archive the article in print and electronic form, and also to defend against any improper use of the article.