Machine learning predicts putative haematopoietic stem cells within large single-cell transcriptomics datasets

2019 
Abstract Haematopoietic stem cells (HSCs) are an essential source and reservoir for normal haematopoiesis, and their function is compromised in many blood disorders. HSC research has benefitted from the recent development of single-cell molecular profiling technologies; single-cell RNA-sequencing (scRNA-seq) in particular has rapidly become an established method for profiling HSCs and related haematopoietic populations. The classical definition of HSCs relies on transplantation assays, which have been used to validate HSC function for cell populations defined by flow cytometry. However, many new high-throughput scRNA-seq methods do not provide flow cytometry information for single cells, which creates an urgent need for alternative ways to pinpoint the likely HSCs within large scRNA-seq datasets. To address this, we tested a range of machine learning approaches and developed a tool, hscScore, that scores single-cell transcriptomes from murine bone marrow based on their similarity to gene expression profiles of validated HSCs. We evaluated hscScore on scRNA-seq data from different laboratories, which allowed us to establish a robust method that functions across different technologies. To facilitate broad adoption of hscScore by the wider haematopoiesis community, we have made the trained model and example code freely available online. In summary, hscScore provides fast identification of mouse bone marrow HSCs from scRNA-seq measurements and represents a broadly useful tool for the analysis of single-cell gene expression data.
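The idea of scoring a transcriptome by its similarity to validated HSC profiles can be illustrated with a deliberately simple sketch. This is not the hscScore model, whose algorithm and features are not described in the abstract; it swaps in a plain centroid-plus-cosine-similarity score as a stand-in. The function names, the four-gene toy counts, and the normalisation scale of 10,000 are all assumptions made for illustration.

```python
import math


def log_normalise(counts, scale=10_000):
    """Scale raw counts to a common library size, then apply log1p."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale) for c in counts]


def centroid(profiles):
    """Gene-wise mean of a list of equally long expression vectors."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_cells(reference_counts, query_counts):
    """Score each query cell by similarity to the mean reference profile."""
    reference = centroid([log_normalise(c) for c in reference_counts])
    return [cosine_similarity(log_normalise(c), reference) for c in query_counts]


# Toy data (hypothetical): rows are cells, columns are four unnamed genes.
reference_hsc = [[50, 40, 2, 1], [45, 38, 3, 0]]  # stand-ins for validated HSCs
query = {"cell_a": [48, 41, 1, 2], "cell_b": [2, 1, 60, 55]}

scores = dict(zip(query, score_cells(reference_hsc, list(query.values()))))
ranked = sorted(scores, key=scores.get, reverse=True)  # most HSC-like first
```

In this toy example `cell_a`, whose counts resemble the reference cells, ranks above `cell_b`. A trained model of the kind the abstract describes would learn which genes matter from validated data rather than weighting all genes equally as this sketch does.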