| Project Accession: | IBIAP_1000000021 |
| Title: | C-NMC: B-lineage acute lymphoblastic leukaemia (B-ALL): A blood cancer dataset |
| Description: | Development of computer-aided cancer diagnostic tools is an active research area owing to advances in deep learning. Such tools can provide affordable and easily deployable diagnostics. Leukaemia, or blood cancer, is one of the leading cancers, causing more than 0.3 million deaths every year. To aid the development of such AI-enabled tools, we collected and curated a microscopic image dataset, C-NMC, of more than 15000 high-resolution cell images for B-Lineage Acute Lymphoblastic Leukaemia (B-ALL). The dataset is prepared at the subject level and contains images from both healthy subjects and cancer patients. To date, this is the largest curated dataset on B-ALL in the public domain. C-NMC is also available at The Cancer Imaging Archive (TCIA), USA, and can help the research community worldwide develop B-ALL diagnostic tools. The dataset was used in an international medical imaging challenge held at the ISBI 2019 conference in Venice, Italy. The published article gives a detailed description of the dataset and its challenges, along with benchmarking results for all the methods applied to it so far. |
| Publications: | https://doi.org/10.1016/j.medengphy.2022.103793 |
| Funding agency: | Ministry of Communication and IT, Govt. of India and Department of Science and Technology (DST), Govt. of India. |
| Grant Number: | 1(7)2014-ME&HI and EMR2016006183 |
| Ethics Statement: | Download |
| Any Other Information : | In the directory named "C-NMC_test_final_phase_data", all image files that originally had the .bmp extension have been renamed so that .bmp is now replaced with "_final.bmp". The original version of this dataset is available at The Cancer Imaging Archive (TCIA; https://www.cancerimagingarchive.net/collection/c-nmc-2019/). The TCIA citation is: Mourya, S., Kant, S., Kumar, P., Gupta, A., & Gupta, R. (2019). ALL Challenge dataset of ISBI 2019 (C-NMC 2019) (Version 1) [dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/tcia.2019.dc64i46r |
| Additional File: | Download |
| Acknowledgments: | The authors gratefully acknowledge research funding support (Grant Number: 1(7)2014-ME&HI) from the Ministry of Communication and IT, Govt. of India, and research grant funding (Grant No.: EMR2016006183) from the Department of Science and Technology (DST), Govt. of India, for this research work. AG and SG thank the Infosys Centre for Artificial Intelligence, IIIT-Delhi, and SG thanks the University Grants Commission (UGC), Govt. of India, for the UGC-Senior Research Fellowship. |
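
The renaming noted under "Any Other Information" above (in "C-NMC_test_final_phase_data", `.bmp` became `_final.bmp`) matters when matching files in this copy against the original TCIA release. The following is a minimal sketch of one way to recover the original file name. The helper name `original_name` and the example file name `12_final.bmp` are hypothetical, and the sketch assumes the directory name appears as one component of the path.

```python
from pathlib import PurePosixPath

# Directory and suffix as described in the record; everything else is illustrative.
FINAL_DIR = "C-NMC_test_final_phase_data"
SUFFIX = "_final"


def original_name(path: str) -> str:
    """Return the file name an image would have had before the rename.

    Only .bmp files inside the final-phase directory whose stem ends in
    "_final" are changed; any other path keeps its file name as-is.
    """
    p = PurePosixPath(path)
    in_final_dir = FINAL_DIR in p.parts
    if in_final_dir and p.suffix.lower() == ".bmp" and p.stem.endswith(SUFFIX):
        return p.stem[: -len(SUFFIX)] + p.suffix
    return p.name


# Hypothetical final-phase file: the "_final" marker is stripped.
print(original_name("C-NMC_test_final_phase_data/12_final.bmp"))
# A preliminary-phase file is left untouched.
print(original_name("PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1037.bmp"))
```

Files outside the final-phase directory pass through unchanged, so the helper can be applied to a mixed file listing.
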
| Sr.No | First name | Last name | Email | Organization | Designation |
|---|---|---|---|---|---|
| 1 | Ritu | Gupta | drritugupta@gmail.com | Laboratory Oncology Unit, Dr. B.R.A.IRCH, AIIMS, New Delhi, India | Principal Investigator |
| 2 | Shiv | Gehlot | shivg@iiitd.ac.in | SBILab, Department of ECE, IIIT-Delhi, Delhi, India | Research Scholar |
| 3 | Anubha | Gupta | anubha@iiitd.ac.in | SBILab, Department of ECE, IIIT-Delhi, Delhi, India | Principal Investigator |
| Study Accession: | HISTOS_1000000025 |
| Title: | An image dataset of B-lineage acute lymphoblastic leukaemia (B-ALL) and healthy hematogones |
| Imaging Type: | Histopathology (HISTO) |
| Imaging Sub-type: | Diagnostic Pathology |
| Summary: | This study provides a dataset of a white blood cell cancer, B-Lineage Acute Lymphoblastic Leukaemia (B-ALL), along with healthy hematogones. The dataset has been split at the subject level into training and test sets. Specifically, the training set contains 12528 cell images: 8491 cancer lymphoblasts and 4037 healthy blasts (also called hematogones). The cancer cells belong to 60 cancer patients, while the normal cells belong to 41 subjects. The test set contains 2586 cell images belonging to 8 healthy (control) subjects and 9 cancer patients. The training and test sets have no subjects in common. The dataset was released during the IEEE ISBI 2019 medical imaging challenge in three phases: 1) initial train phase, 2) preliminary test phase, and 3) final test phase. In the initial train phase, the dataset was released to all registered participants for training their classification networks. In the preliminary test phase, a preliminary test set was released so that participants could test the performance of their models. The top participants in this phase were shortlisted for the next phase of the challenge and were also given the ground truth (GT) of the preliminary test set to improve their performance in the next round. Hence, the participants had the GT of the initial training data and of the preliminary test data. Participants used both sets together to train their models, which were then tested on the final test data released in the final test phase to decide the ranking on the leaderboard. The dataset, arranged in these three phases, was subsequently released publicly for use by future researchers. The GT of the training and preliminary test data has been released, while that of the final test data has not. |
| Keywords: | Acute lymphoblastic leukaemia; Cancer dataset; Image database; Computer aided diagnosis; Microscopic image; Cancer diagnostics |
| Additional / Any Other Information: | Download |
| Release Date: | Aug. 25, 2025 |
| Access Licence Type: | Open Access |
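
The counts quoted in the Summary above can be cross-checked with simple arithmetic. The sketch below only restates figures from this record (the variable names are illustrative) and confirms that the class counts add up to the training total and that the training and test totals add up to the 15,114 images reported for the full dataset.

```python
# Figures copied from the Summary and the Experimental Design Summary.
train_cancer_cells = 8491
train_healthy_cells = 4037
train_total = 12528
test_total = 2586
dataset_total = 15114

# Class counts should add up to the training-set size.
assert train_cancer_cells + train_healthy_cells == train_total
# Training and test sets together should give the full dataset.
assert train_total + test_total == dataset_total

# Subjects per split (the splits share no subjects, so totals simply add).
train_subjects = 60 + 41  # cancer patients + normal subjects
test_subjects = 9 + 8     # cancer patients + healthy controls
total_subjects = train_subjects + test_subjects

# Class balance of the training set, useful when choosing a loss or sampler.
cancer_share = train_cancer_cells / train_total
print(f"training images: {train_total}, cancer share: {cancer_share:.1%}")
print(f"subjects: {train_subjects} train + {test_subjects} test = {total_subjects}")
```

The training set is imbalanced towards cancer cells, which is worth keeping in mind when evaluating a classifier on it.
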
| Sample Type ID | Organism | Taxon ID | Biological Entity | Laterality | Source Tissue | Source Cell/Cell-line | Cell Organelle |
|---|---|---|---|---|---|---|---|
| HISTOSMT_10000000053 | Homo sapiens | 9606 | Blood and Bone | Not Applicable | Bone marrow | Cancer lymphoblasts | N/A |
| HISTOSMT_10000000054 | Homo sapiens | 9606 | Blood and Bone | Not Applicable | Bone marrow | Hematogones | N/A |
| HISTOSMT_10000000055 | Homo sapiens | 9606 | Blood and Bone | Not Applicable | Bone marrow | N/A | N/A |
| Sample Type ID | Sample ID | Method used for Sample Collection | Cell Phenotype Studied | Data Collection Duration | ICD-11 Code (patient health condition) | Image category/label | Sample Source | Subject type |
|---|---|---|---|---|---|---|---|---|
| HISTOSMT_10000000055 | HISTOSM_10000285578 | Bone marrow aspiration | N/A | Two years | N/A | N/A | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | N/A |
| HISTOSMT_10000000053 | HISTOSM_10000270095 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270096 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270097 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270098 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270099 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270100 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270101 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270102 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000270103 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000271527 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000271528 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000271529 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000271530 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000271534 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000271535 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000272144 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000272145 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000272146 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| HISTOSMT_10000000053 | HISTOSM_10000272147 | Bone marrow aspiration | N/A | Two years | 2B33.3&XH81V3 | B-ALL | Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India | B-ALL |
| Experiment Type ID | Instrument Name | Instrument Type | Manufacturer | Model |
|---|---|---|---|---|
| HISTOET_10000000022 | Microscope | Digital Microscope | Nikon | Nikon DS-5M |
| Experimental Design Summary (HISTOET_10000000022) |
|---|
| The dataset was prepared at the Laboratory Oncology Unit, Dr. B.R.A IRCH, All India Institute of Medical Sciences (AIIMS), New Delhi, India. The slides were prepared from the subjects' bone marrow aspirate. Slides containing normal cells were prepared from control subjects, while those containing cancer cells were prepared from subjects newly diagnosed with B-ALL. Slide preparation involved staining with the Jenner-Giemsa stain to highlight the cells of interest. The slides were then placed under a NIKON microscope fitted with a NIKON DS5M camera to capture microscopic images of size 2560 × 1920 pixels in BMP format. Because the dataset was prepared over a period of two years, the microscopic images showed considerable stain color variability from subject to subject. Hence, the images were stain color normalized with the GCTI-SN method, using a reference image, to counter this variability. Next, cell images were segmented from the microscopic images using our in-house segmentation pipeline. Cells lying in clusters were also successfully segmented into separate cell images and stored. Since the cell images are of different sizes, every image was brought to a constant size of 350 × 350 by centering the cell in its image and padding with rows and columns of zero intensity. The presented dataset, containing 15,114 cell images, is so far the largest cell imaging dataset in the public domain for the B-ALL classification problem. |
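
The final step described above, bringing every segmented cell to a constant size by centering it and padding with zero-intensity rows and columns, can be illustrated as follows. This is a minimal sketch using NumPy, not the authors' pipeline: the function name `center_pad` is hypothetical, and it centers the cell's bounding box, which is an assumption about what "aligning every cell at the center" means.

```python
import numpy as np


def center_pad(cell: np.ndarray, size: int = 350) -> np.ndarray:
    """Place a cell crop at the center of a size x size zero canvas.

    `cell` is an (H, W) or (H, W, C) array. Centering uses the crop's
    bounding box (an assumption); odd leftovers go to the bottom/right.
    """
    h, w = cell.shape[:2]
    if h > size or w > size:
        raise ValueError(f"crop {h}x{w} does not fit in {size}x{size}")
    top = (size - h) // 2
    left = (size - w) // 2
    # Zero canvas with the same dtype and channel layout as the crop.
    canvas = np.zeros((size, size) + cell.shape[2:], dtype=cell.dtype)
    canvas[top:top + h, left:left + w] = cell
    return canvas


# Synthetic 100 x 80 RGB crop standing in for a segmented cell.
crop = np.full((100, 80, 3), 255, dtype=np.uint8)
padded = center_pad(crop)
print(padded.shape)
```

The target size is a parameter so the same helper can be reused if the stored images turn out to have different dimensions.
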
| Acquired Images Annotation Description (HISTOET_10000000022) |
|---|
| The cells of interest were marked in the microscopic images by an expert onco-pathologist. Note that multiple types of cells, including lymphoblasts, plasma cells, and red blood cells, are visible in a microscopic image captured from a slide of bone marrow aspirate or peripheral blood smear. Since B-ALL is caused by lymphoblasts, only these cells need to be marked and segmented to determine whether they are healthy or cancerous. At this stage, it is fairly easy for an expert onco-pathologist to distinguish the different cell types. Hence, the lymphoblasts were marked by a single expert onco-pathologist. |
| Sample ID | Experiment Type ID | Experiment ID | Image type (Original / Derived / Unknown) | Any Other Information | Staining Type | Images Magnification | Tissue / Tumor Fixative Used | Camera Used to Capture Images | Data Repository Name (If already deposited in another repository) | Dataset Split Type (Training / Validation / Test) | Licence Type (original source) | Stain Normalization Method |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HISTOSM_10000273075 | HISTOET_10000000022 | HISTOE_10000246759 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273186 | HISTOET_10000000022 | HISTOE_10000246864 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273296 | HISTOET_10000000022 | HISTOE_10000246969 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273407 | HISTOET_10000000022 | HISTOE_10000247074 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273518 | HISTOET_10000000022 | HISTOE_10000247179 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273629 | HISTOET_10000000022 | HISTOE_10000247284 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273740 | HISTOET_10000000022 | HISTOE_10000247389 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273851 | HISTOET_10000000022 | HISTOE_10000247494 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000273961 | HISTOET_10000000022 | HISTOE_10000247599 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274072 | HISTOET_10000000022 | HISTOE_10000247704 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274183 | HISTOET_10000000022 | HISTOE_10000247809 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274294 | HISTOET_10000000022 | HISTOE_10000247914 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274405 | HISTOET_10000000022 | HISTOE_10000248019 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274516 | HISTOET_10000000022 | HISTOE_10000248124 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274626 | HISTOET_10000000022 | HISTOE_10000248229 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274737 | HISTOET_10000000022 | HISTOE_10000248334 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274848 | HISTOET_10000000022 | HISTOE_10000248439 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000274959 | HISTOET_10000000022 | HISTOE_10000248544 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000275070 | HISTOET_10000000022 | HISTOE_10000248649 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| HISTOSM_10000275181 | HISTOET_10000000022 | HISTOE_10000248754 | Derived | N/A | Jenner-Giemsa | N/A | N/A | NIKON DS5M | The Cancer Imaging Archive (TCIA) | Initial Train Set | CC BY 3.0 | GCTI-SN |
| Experiment ID | Image File Name (with path) | Image Preview | Image Size |
|---|---|---|---|
| HISTOE_10000251145 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1037.bmp | Download Image | 596K |
| HISTOE_10000251146 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1038.bmp | Download Image | 596K |
| HISTOE_10000251147 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1040.bmp | Download Image | 596K |
| HISTOE_10000251148 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1041.bmp | Download Image | 596K |
| HISTOE_10000251149 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1042.bmp | Download Image | 596K |
| HISTOE_10000251150 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1044.bmp | Download Image | 596K |
| HISTOE_10000251151 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1045.bmp | Download Image | 596K |
| HISTOE_10000251152 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1046.bmp | Download Image | 596K |
| HISTOE_10000251153 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1047.bmp | Download Image | 596K |
| HISTOE_10000251154 | PKG-C-NMC_2019/C-NMC_test_prelim_phase_data/1049.bmp | Download Image | 596K |