“The Computer Science Ontology: A Comprehensive Automatically-Generated Taxonomy of Research Areas” is a journal paper submitted to the Special Issue on Best Resources papers of the Data Intelligence Journal (MIT Press).
Angelo A. Salatino (1), Thiviyan Thanapalasingam (1), Andrea Mannocci (1), Aliaksandr Birukou (2), Francesco Osborne (1), Enrico Motta (1)
(1) Knowledge Media Institute, The Open University, MK7 6AA, Milton Keynes, UK
(2) Springer-Verlag GmbH, Tiergartenstrasse 17, 69121 Heidelberg, Germany
Abstract
Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 14K topics and 162K semantic relationships. It was created by applying the Klink-2 algorithm on a very large dataset of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO, we have also released the CSO Classifier, a tool for automatically classifying research papers, and the CSO Portal, a web application that enables users to download, explore, and provide granular feedback on CSO. Users can use the portal to navigate and visualise sections of the ontology, rate topics and relationships, and suggest missing ones. The portal will support the publication of and access to regular new releases of CSO, with the aim of providing a comprehensive resource to the various research communities engaged with scholarly data.
Paper Download
Download from ORO: http://oro.open.ac.uk/66268/
Download from the Data Intelligence Journal website: http://www.data-intelligence-journal.org/p/62/
Download from MIT Press website: (available soon)