Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. On this page, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science.
The Computer Science Ontology is a large-scale ontology of research areas that was automatically generated using the Klink-2 algorithm on a dataset of about 16 million publications, mainly in the field of Computer Science. In the rest of the paper, we will refer to this corpus as the Rexplore dataset.
The current version of CSO includes 14,164 topics and 162,121 semantic relationships. The main root is Computer Science; however, the ontology also includes a few secondary roots, such as Linguistics, Geometry, and Semantics.
CSO presents two main advantages over manually crafted categorisations used in Computer Science (e.g., 2012 ACM Classification, Microsoft Academic Search Classification). First, it can characterise higher-level research areas by means of hundreds of sub-topics and related terms, which makes it possible to map very specific terms to higher-level research areas. Second, it can be easily updated by running Klink-2 on a set of new publications.
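The first advantage, mapping very specific terms to higher-level research areas, can be illustrated with a minimal sketch. It assumes a toy hierarchy stored as a plain dictionary; the topic names, the parent links, and the helper names `broader_areas` and `map_terms` are hypothetical and are not taken from CSO or from any CSO tooling.

```python
from collections import deque

# Hypothetical toy hierarchy: specific topic -> broader (parent) topics.
# These links are illustrative only and are NOT taken from CSO itself.
BROADER = {
    "sparql": ["semantic web"],
    "linked data": ["semantic web"],
    "semantic web": ["world wide web", "artificial intelligence"],
    "world wide web": ["computer science"],
    "artificial intelligence": ["computer science"],
}


def broader_areas(topic):
    """Return every area reachable by following parent links upwards."""
    seen = set()
    queue = deque([topic])
    while queue:
        current = queue.popleft()
        for parent in BROADER.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def map_terms(terms):
    """Map each specific term to the sorted list of its broader areas."""
    return {term: sorted(broader_areas(term)) for term in terms}


if __name__ == "__main__":
    for term, areas in map_terms(["sparql", "linked data"]).items():
        print(term, "->", areas)
```

With a hierarchy of this kind, a paper tagged only with a narrow term can also be found under each of its broader areas. A real ontology would hold many more topics and more than one kind of relationship.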
The CSO Classifier is an application for automatically classifying academic papers according to the rich taxonomy of topics from CSO. The aim is to facilitate the adoption of CSO across the various communities engaged with scholarly data and to foster the development of new applications based on this knowledge base.
Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 26K topics and 226K semantic relationships. It was created by applying the Klink-2 algorithm on a very large dataset of 16M scientific articles.
A couple of months ago, my team and I attended the Springer Nature HackDay (here is the post). Recently, Springer Nature released a short video featuring us. It also summarises my interview, in which I discuss the advantages of making scholarly datasets, such as SciGraph, available to anyone. Other media Building on the success […]
The Smart Book Recommender (SBR) is a semantic application designed to support the Springer Nature editorial team in promoting their publications at Computer Science venues. It takes as input the proceedings of a conference and suggests books, journals, and other conference proceedings that are likely to be relevant to the attendees of the conference in question. It […]
“Supporting Springer Nature Editors by means of Semantic Technologies” is a research paper accepted to the Industry Track at the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria. Authors: Francesco Osborne, Angelo Salatino, Thiviyan Thanapalasingam, Aliaksandr Birukou and Enrico Motta Abstract: The Open University and Springer Nature have been collaborating since 2015 […]
“Smart Book Recommender: A Semantic Recommendation Engine for Editorial Products” is a poster paper that will be presented at the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria. Authors: Francesco Osborne, Thiviyan Thanapalasingam, Angelo Salatino, Aliaksandr Birukou and Enrico Motta Abstract: Academic publishers, such as Springer Nature, need to constantly make informed decisions […]
“Smart Topic Miner: Supporting Springer Nature Editors with Semantic Web Technologies” is a poster paper presented at the Poster and Demo session [D45] on Wednesday 19th October 2016 at the 15th International Semantic Web Conference in Kobe, Japan. Authors: Francesco Osborne, Angelo Antonio Salatino, Aliaksandr Birukou and Enrico Motta Abstract: Academic publishers, such as Springer Nature, annotate scholarly products […]
“Automatic Classification of Springer Nature Proceedings with Smart Topic Miner” is a conference paper presented on Friday 21st October 2016 at the 15th International Semantic Web Conference in Kobe, Japan. Authors: Francesco Osborne, Angelo Antonio Salatino, Aliaksandr Birukou and Enrico Motta Abstract: The process of classifying scholarly outputs is crucial to ensure timely access to knowledge. However, this […]