Data Wrangling from Socio-Academic Web-Space: Designing a Meta Model


  • Department of Library and Information Science, University of Kalyani, Kalyani-741235, WB



Keywords: Data Wrangling, Data Carpentry, OpenRefine, ODbL Databases, Socio-Academic Data, REST/API


Data carpentry is an emerging field in the domain of LIS and has opened new possibilities for information professionals to survive in the age of data-intensive information services. However, library professionals face information overload because freely available data keeps growing in both quantity and variety. The role of library professionals is shifting from tech-savvy to data-savvy. This research discusses the possibilities of ODbL-based data sources that offer freely accessible data through API calls, and proposes a meta-model for fetching and extracting datasets from these databases using an open-source data wrangling tool (OpenRefine). It further discusses how data wrangling techniques can be applied to diverse sources in libraries, and how information professionals can use openly available data to provide value-added information services to users. The practical implications of an array of databases are illustrated through two case studies: Case Study I measures the productivity of individual institutions through different metrics, and Case Study II compares coverage among the ODbL-based citation and altmetric databases. The meta-model is intended to aid in understanding the potential application of data wrangling techniques to an array of library services.
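The fetch-and-extract pattern that the abstract describes can be sketched in a few lines of Python. This is only an illustration of the general idea, not the authors' OpenRefine workflow: the endpoint template `api.example.org`, the function names `build_url` and `extract_fields`, the field names, and the sample response are all hypothetical, and a real ODbL source would define its own URL scheme and JSON layout.

```python
import json
from urllib.parse import quote

# Hypothetical endpoint template (assumption); each real database
# documents its own URL scheme and parameters.
API_TEMPLATE = "https://api.example.org/works/{doi}"


def build_url(doi: str) -> str:
    """Build the per-record API URL for one DOI, percent-encoding the DOI."""
    return API_TEMPLATE.format(doi=quote(doi.strip(), safe=""))


def extract_fields(raw_json: str) -> dict:
    """Pull a few illustrative fields out of a JSON response body."""
    record = json.loads(raw_json)
    return {
        "title": record.get("title"),
        # Field name is an assumption; default to 0 when it is absent.
        "citation_count": record.get("citation_count", 0),
    }


# Offline sample response, invented purely for illustration.
sample_response = '{"title": "Example record", "citation_count": 7}'

url = build_url("10.1000/xyz123")
row = extract_fields(sample_response)
```

In a tool such as OpenRefine, the first step corresponds roughly to deriving a URL column from an identifier column, and the second to parsing the fetched JSON into new columns; the sketch keeps both steps offline so it can be run without network access.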








How to Cite

Nath, A., & Jana, S. (2023). Data Wrangling from Socio-Academic Web-Space: Designing a Meta Model. Journal of Information and Knowledge, 60(2), 113–125.