Newcastle University
Files (9):
README (1.53 kB)
Example Images.png (709.97 kB)
metadata (0.09 kB)
train_x.npy (7.23 GB)
train_y.npy (308.7 kB)
valid_x.npy (925.69 MB)
valid_y.npy (38.7 kB)
test_x.npy (925.69 MB)
test_y.npy (38.7 kB)

Myofibre - Binary Classification Dataset

Dataset posted on 2024-09-12, 09:33, authored by David Towers, Linus Ericsson, Andrew Stephen McGough, Amir Atapour-Abarghouei, Elliot J Crowley

The Myofibre dataset is a constructed dataset derived from the NCL_SM dataset (doi.org/10.25405/data.ncl.24125391), which contains high-quality images of human tissue.

This dataset is one of the three hidden datasets used by the 2024 NAS Unseen-Data Challenge.

The images include around 50,000 manually segmented muscle fibres (myofibres). These myofibres have been split into four categories: Analysable Myofibre, Non-Transverse Myofibre, Freezing Artefact Myofibre, and Folded Region.

We took each of the cells that the original authors manually labelled, rescaled them to a uniform (3, 128, 128) data shape, and used these images to create a binary-classification task.

We converted the task into a binary classification task because of the imbalance between the four original labels: there are 30,794 AM myofibres, but only 18,102 NTMs, 1,538 FAMs, and 405 FRs. While an imbalance remains, it is far less drastic, and since the dataset was created from real-world data, we hope it is representative of the imbalance found in practice.
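The effect of merging the labels can be checked with a little arithmetic. The sketch below uses the per-class counts quoted above and assumes that all three non-AM categories are folded into the single NAM class; the helper name `imbalance_ratio` is illustrative only and is not part of the dataset.

```python
# Per-class counts as quoted in the description above.
original_counts = {
    "AM": 30_794,   # Analysable Myofibre
    "NTM": 18_102,  # Non-Transverse Myofibre
    "FAM": 1_538,   # Freezing Artefact Myofibre
    "FR": 405,      # Folded Region
}

# Assumption: every non-AM category is merged into the NAM class.
binary_counts = {
    "AM": original_counts["AM"],
    "NAM": sum(n for name, n in original_counts.items() if name != "AM"),
}


def imbalance_ratio(counts):
    """Ratio of the largest class to the smallest class (hypothetical helper)."""
    return max(counts.values()) / min(counts.values())


print(binary_counts)                                # {'AM': 30794, 'NAM': 20045}
print(round(imbalance_ratio(original_counts), 1))   # 76.0
print(round(imbalance_ratio(binary_counts), 2))     # 1.54
```

Under that assumption, the largest-to-smallest class ratio drops from roughly 76:1 to roughly 1.5:1.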

The data is in a channels-first format with a shape of (n, 3, 128, 128) where n is the number of samples in the corresponding set (39,497 for training, 4,937 for validation, and 4,937 for testing).
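As a minimal sketch of what the channels-first layout means in practice, the snippet below builds the expected shape for each set from the sample counts above and converts a batch to channels-last, which some plotting and deep-learning libraries expect. The names `expected_shape` and `to_channels_last` are hypothetical helpers, and the random batch is a stand-in for real data (the float32 dtype is an assumption).

```python
import numpy as np

# Sample counts per set, as quoted in the description.
SPLIT_SIZES = {"train": 39_497, "valid": 4_937, "test": 4_937}
SAMPLE_SHAPE = (3, 128, 128)  # channels-first: (channels, height, width)


def expected_shape(split):
    """Shape the image array of a given set should have: (n, 3, 128, 128)."""
    return (SPLIT_SIZES[split],) + SAMPLE_SHAPE


def to_channels_last(batch):
    """Reorder (n, C, H, W) to (n, H, W, C), e.g. for plotting."""
    return np.transpose(batch, (0, 2, 3, 1))


# Stand-in batch of four random "images"; not real dataset content.
rng = np.random.default_rng(0)
fake_batch = rng.random((4, *SAMPLE_SHAPE), dtype=np.float32)

print(expected_shape("train"))             # (39497, 3, 128, 128)
print(to_channels_last(fake_batch).shape)  # (4, 128, 128, 3)
```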

As this is a binary classification task, there are two classes in the dataset, with approximately 30,000 AM images and 20,000 NAM images, randomly distributed across the three sets.

The two classes and their corresponding numerical labels are as follows:
Analysable Myofibre (AM): 0
Not Analysable Myofibre (NAM): 1
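The label mapping above can be captured in a small lookup table. The following is a sketch that assumes each label file holds a one-dimensional array of the integers 0 and 1; `LABEL_NAMES` and `class_counts` are illustrative names, and the example labels are made up.

```python
import numpy as np

# Mapping from numerical label to class name, as listed above.
LABEL_NAMES = {
    0: "Analysable Myofibre (AM)",
    1: "Not Analysable Myofibre (NAM)",
}


def class_counts(labels):
    """Count how many samples fall into each class (hypothetical helper).

    Assumes `labels` contains only the values 0 and 1.
    """
    labels = np.asarray(labels).astype(np.int64).ravel()
    counts = np.bincount(labels, minlength=len(LABEL_NAMES))
    return {LABEL_NAMES[i]: int(c) for i, c in enumerate(counts)}


# Made-up labels for illustration only.
example_labels = np.array([0, 1, 0, 0, 1])
print(class_counts(example_labels))
```

Running the same function on each label array is a quick way to confirm the AM/NAM split within a set.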

NumPy (.npy) files can be opened with the NumPy Python library by passing the path to the file to the `numpy.load()` function. The metadata file contains some basic information about the dataset and can be opened in most text editors, such as vim, nano, Notepad++, or Notepad.
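A minimal loading sketch follows. It assumes the `<split>_x.npy` / `<split>_y.npy` file naming used by this dataset's files, and `load_split` is a hypothetical helper, not something shipped with the data. Because the training images are several gigabytes, the sketch offers `numpy.load`'s `mmap_mode="r"` option, which reads from disk on demand instead of loading the whole array into memory. The demo at the bottom writes tiny stand-in files to a temporary directory so that it runs without the real download.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_split(data_dir, split, mmap=True):
    """Load the images and labels of one set ("train", "valid" or "test").

    With mmap=True the image array is memory-mapped read-only rather than
    read fully into RAM.
    """
    data_dir = Path(data_dir)
    x = np.load(data_dir / f"{split}_x.npy", mmap_mode="r" if mmap else None)
    y = np.load(data_dir / f"{split}_y.npy")
    if x.shape[0] != y.shape[0]:
        raise ValueError("images and labels have different sample counts")
    return x, y


# Demo with tiny stand-in files (zeros, not real data).
with tempfile.TemporaryDirectory() as tmp:
    np.save(Path(tmp) / "valid_x.npy", np.zeros((2, 3, 128, 128), dtype=np.float32))
    np.save(Path(tmp) / "valid_y.npy", np.array([0, 1]))
    x, y = load_split(tmp, "valid", mmap=False)
    demo_shape, demo_labels = x.shape, y.tolist()

print(demo_shape)   # (2, 3, 128, 128)
print(demo_labels)  # [0, 1]
```

With the real files, a call such as `load_split("path/to/download", "train")` would return a memory-mapped image array and the matching labels; the directory path here is a placeholder.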

