Classifying research papers according to their research topics is an important task: it improves their retrievability, assists the creation of smart analytics, and supports a variety of approaches for analysing and making sense of the research environment. In this repository, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science. The CSO Classifier takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology. The approach was evaluated on a gold standard of manually annotated articles, yielding a significant improvement over alternative methods.
Installation using PIP
- Ensure you have Python 3.6 or above installed (download the latest version if needed).
- Use pip to install the classifier:
pip install cso-classifier
- Download the English package for spaCy using:
python -m spacy download en_core_web_sm
Installation using GitHub
- Ensure you have Python 3.6 or above installed.
- Install the necessary dependencies by executing the following command:
pip install -r requirements.txt
- Download the English package for spaCy using:
python -m spacy download en_core_web_sm
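After either installation route, it can be useful to confirm that the environment matches the steps above. The following is a minimal sketch, not part of the classifier itself: the helper names `python_ok` and `missing_modules` are hypothetical, and the importable module name `cso_classifier` is an assumption that applies only to the pip route. It checks the Python version and whether spaCy and its English package can be found, without importing them.

```python
import importlib.util
import sys

# Minimum Python version stated in the installation steps.
MIN_PYTHON = (3, 6)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "spacy" and "en_core_web_sm" come from the steps above;
    # "cso_classifier" is an assumed module name for the pip install.
    required = ["spacy", "en_core_web_sm", "cso_classifier"]
    print("Python version OK:", python_ok())
    print("Missing modules:", missing_modules(required))
```

If the script reports a missing module, rerunning the corresponding installation step is the first thing to try. For the GitHub route, drop `cso_classifier` from the list, since the code is run from the cloned directory rather than installed as a package.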