Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. On this page, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science.
Tag: Bibliographic Data

Awesome Scholarly Data Analysis
Awesome Scholarly Data Analysis is a curated collection of resources that can support Scholarly Data analytics. This list ranges from: Datasets, which includes different corpora of papers, citations, authors and others, as well as taxonomies and ontologies of research concepts; Tools for collecting and classifying research papers, information extraction, and visualization; and Venues, Summer Schools, […]

Computer Science Ontology
The Computer Science Ontology is a large-scale ontology of research areas that was automatically generated using the Klink-2 algorithm on a dataset of about 16 million publications, mainly in the field of Computer Science. In the rest of the paper, we will refer to this corpus as the Rexplore dataset.
The current version of CSO includes 14,164 topics and 162,121 semantic relationships. The main root is Computer Science; however, the ontology also includes a few secondary roots, such as Linguistics, Geometry, and Semantics.
CSO presents two main advantages over manually crafted categorisations used in Computer Science (e.g., 2012 ACM Classification, Microsoft Academic Search Classification). First, it can characterise higher-level research areas by means of hundreds of sub-topics and related terms, which makes it possible to map very specific terms to higher-level research areas. Second, it can be easily updated by running Klink-2 on a set of new publications.
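As a rough illustration of the first advantage, the sketch below shows how a topic hierarchy lets a very specific term be mapped to the broader areas above it. It is only a toy example: the `BROADER` dictionary, its topic names and links, and the `broader_areas` function are all hypothetical and are not taken from the actual CSO data or from any CSO tool.

```python
from collections import deque

# Hypothetical toy fragment of a topic hierarchy (NOT real CSO data):
# each topic maps to the list of its directly broader topics.
BROADER = {
    "convolutional neural networks": ["neural networks"],
    "neural networks": ["machine learning"],
    "machine learning": ["artificial intelligence"],
    "artificial intelligence": ["computer science"],
    "ontology matching": ["semantic web"],
    "semantic web": ["artificial intelligence", "world wide web"],
    "world wide web": ["computer science"],
}


def broader_areas(topic, broader=BROADER):
    """Return every broader area reachable from `topic`, nearest first.

    Uses a breadth-first walk so that a topic with several parents
    is handled and each area is reported only once.
    """
    found = []
    queue = deque(broader.get(topic, []))
    while queue:
        current = queue.popleft()
        if current in found:
            continue  # already reached through another parent
        found.append(current)
        queue.extend(broader.get(current, []))
    return found


if __name__ == "__main__":
    for term in ("convolutional neural networks", "ontology matching"):
        print(term, "->", broader_areas(term))
```

In this sketch a paper mentioning only a narrow term could still be associated with each broader area returned by `broader_areas`, which is the kind of mapping the paragraph above describes.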

Invited Talk – Early detection of Research Topics
On 2nd August 2018, I was invited by Boris Veytsman, Principal Research Scientist at Chan Zuckerberg Initiative (formerly Meta), to give a talk about my PhD work. Unlike in my previous talk to the ORNL group, I had the opportunity to describe my doctoral work more comprehensively. More specifically, I initially showed what is available […]

Invited Talk – AUGUR: Forecasting the Emergence of New Research Topics
On 30th July 2018, I was invited by Dasha Herrmannova, former PhD student at KMi, to give a talk at the “Machine Learning and Graph Mining for Big Scholarly Data” workshop organised for the Computational Data Analytics Group at Oak Ridge National Laboratory (ORNL). In this talk, named “AUGUR: Forecasting the Emergence of New […]

Supporting Editorial Activities at Springer Nature
The project aims to foster Springer Nature editorial activities by supporting them with a variety of smart solutions leveraging artificial intelligence, data mining, and semantic technologies. In particular, the KMi team will support the Springer Nature editorial team in classifying proceedings and other editorial products, making informed decisions about their marketing strategy, and improving their internal classification.
Springer Nature video
A couple of months ago, my team and I attended the Springer Nature HackDay (here is the post). Recently, Springer Nature released a short video featuring us. The video also summarises my interview, in which I discuss the advantages of making scholarly datasets, such as SciGraph, available to anyone. Other media Building on the success […]

SpringerNature Hackday – London
On 29th November 2017, two KMi colleagues (Andrea Mannocci and Thiviyan Thanapalasingam) and I attended the second edition of the SpringerNature HackDay in London (at the SpringerNature Campus). Aliaksandr Birukou, Executive Editor of Computer Science at Springer Nature and collaborator of our research team at the Knowledge Media Institute, also joined our group at the HackDay. The whole […]

Smart Book Recommender
The Smart Book Recommender (SBR) is a semantic application designed to support the Springer Nature editorial team in promoting their publications at Computer Science venues. It takes as input the proceedings of a conference and suggests books, journals, and other conference proceedings that are likely to be relevant to the attendees of the conference in question. It […]

Supporting Springer Nature Editors by means of Semantic Technologies
“Supporting Springer Nature Editors by means of Semantic Technologies” is a research paper accepted to the Industry Track at the International Semantic Web Conference (ISWC) 2017, 21-25 October 2017, Vienna, Austria. Authors: Francesco Osborne, Angelo Salatino, Thiviyan Thanapalasingam, Aliaksandr Birukou and Enrico Motta. Abstract: The Open University and Springer Nature have been collaborating since 2015 […]