Semantic Annotator for Knowledge Graph Exploration: Pattern-Based NLP Technique


  • Associate Professor, DRTC, Indian Statistical Institute, Bangalore - 560059, Karnataka
  • DRTC, Indian Statistical Institute, Bangalore – 560059, Karnataka



Application, Automated Annotation, Entity Annotation, Knowledge Graph Exploration, NLP, Semantic Annotation, Semantic Annotation Platform, Thing Annotation, Thing Spotting.


Semantic Annotator for Knowledge Graph Exploration, abbreviated as SAGE, is a “Thing” annotation system. Here, a “Thing” is any concept, named individual (i.e., entity), entity relation, or attribute. The system is built on the idea of “string to thing”, where the “string” is any text supplied by the user as input (e.g., the abstract of an article). To annotate, the system draws on one or more knowledge graphs. SAGE can be used by anyone to annotate Things and to exploit them on the Web. Things are annotated through exact and partial matches. For exact matches, the system states explicitly the name of the knowledge graph from which each match is sourced. It also shows the type hierarchies of the matched named entities. In this work, we describe the SAGE annotation system, which is designed on pattern-based NLP techniques, along with its features, its various uses, and the experimental results.
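The “string to thing” idea with exact and partial matches can be illustrated with a minimal Python sketch. This is not the SAGE implementation: the toy knowledge graph, the names `Thing`, `TOY_KG`, `tokens`, and `annotate`, and the token-overlap rule used for partial matches are all assumptions made for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    label: str        # surface label of the Thing
    source: str       # name of the knowledge graph it comes from
    type_path: tuple  # type hierarchy, most general type first


# Hypothetical toy knowledge graph; labels, sources, and types are invented.
TOY_KG = [
    Thing("COVID-19", "ToyDiseaseKG", ("Thing", "Disease", "ViralDisease")),
    Thing("knowledge graph", "ToyCSKG", ("Thing", "DataStructure")),
    Thing("semantic annotation", "ToyCSKG", ("Thing", "Process")),
]


def tokens(s):
    """Lowercase a string and split it into word tokens, keeping inner hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", s.lower())


def annotate(text, kg):
    """Return (exact, partial) lists of Things found in the input text.

    Exact: the label's tokens occur contiguously in the text.
    Partial (assumed rule): at least one label token occurs in the text.
    """
    text_tokens = tokens(text)
    exact, partial = [], []
    for thing in kg:
        label_tokens = tokens(thing.label)
        n = len(label_tokens)
        is_exact = any(
            text_tokens[i:i + n] == label_tokens
            for i in range(len(text_tokens) - n + 1)
        )
        if is_exact:
            exact.append(thing)
        elif set(label_tokens) & set(text_tokens):
            partial.append(thing)
    return exact, partial


if __name__ == "__main__":
    sample = "A knowledge graph helps study COVID-19 through annotation."
    exact, partial = annotate(sample, TOY_KG)
    for t in exact:
        # Exact matches report the source knowledge graph and type hierarchy.
        print("exact:", t.label, "|", t.source, "|", " > ".join(t.type_path))
    for t in partial:
        print("partial:", t.label)
```

In this sketch, each exact match carries the name of its source knowledge graph and its type hierarchy, mirroring what the abstract says the system displays; a real system would query actual knowledge graphs rather than an in-memory list.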







How to Cite

Dutta, B., & Das, P. (2023). Semantic Annotator for Knowledge Graph Exploration: Pattern-Based NLP Technique. Journal of Information and Knowledge, 60(1), 49–62.



Invited Paper