Materials for 'Analysing Semantic Textual Similarity of University Module Catalogue Entries for Linking Modules Comparable in Content'
This collection contains fitted models and the dataset used in the named paper.
Models: Fitted PyTorch and Gensim models from the analysis in the 'Analysing Semantic Textual Similarity of University Module Catalogue Entries for Linking Modules Comparable in Content' paper. The models generate document embeddings and perform topic modelling. They were used with the corresponding dataset, but can also be applied to other document datasets.
Dataset: An export of the SQL database backing the Newcastle University module catalogue (Module Catalogue - Global Opportunities - Newcastle University, ncl.ac.uk), used in the 'Analysing Semantic Textual Similarity of University Module Catalogue Entries for Linking Modules Comparable in Content' paper. The export contains 27 tables, of which modules.csv is used directly in the analysis. modules.csv contains metadata for the module catalogue entries, including module names, semantic descriptions and tabular data. Additionally, test_pairs_labelled.txt gives a set of paired module codes with their corresponding semantic similarities.
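As a starting point for working with these two files, the following is a minimal Python sketch, not the code used in the paper. The function names load_table and load_labelled_pairs are hypothetical. The UTF-8 encoding and the layout of test_pairs_labelled.txt (two module codes followed by a numeric similarity, separated by whitespace, one pair per line) are assumptions that should be checked against the actual files before use.

```python
import csv
from pathlib import Path


def load_table(path, encoding="utf-8"):
    """Read one exported CSV table into a list of dicts keyed by its header row.

    The encoding is an assumption; adjust it if the export uses another one.
    """
    with Path(path).open(newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))


def load_labelled_pairs(path, delimiter=None, encoding="utf-8"):
    """Parse labelled pairs as (code_a, code_b, similarity) tuples.

    ASSUMED line layout: <code_a> <code_b> <similarity>. With delimiter=None
    the line is split on any whitespace; pass e.g. "," for another separator.
    """
    pairs = []
    with Path(path).open(encoding=encoding) as handle:
        for line in handle:
            fields = line.strip().split(delimiter)
            if len(fields) != 3:
                continue  # blank line or a line in a different layout
            code_a, code_b, score = fields
            try:
                pairs.append((code_a, code_b, float(score)))
            except ValueError:
                continue  # third field is not numeric, e.g. a header row
    return pairs
```

Calling load_table("modules.csv") and inspecting the keys of the first row shows which columns the export actually provides; no column names are assumed here.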
The corresponding code repository is lukekaye/sts-university-modules on GitHub (github.com).