Sparse Representation Based Genetic Biomarker Evaluation for Congenital Heart Defects

2016 
Background: Congenital heart defects (CHD) are the most common type of birth defect, affecting approximately 8 in 1,000 newborns. Hundreds of genes have been reported as CHD candidate genes. Nevertheless, each patient or patient group may demonstrate unique etiologic characteristics requiring personalized treatment.

Methods: We proposed a sparse representation-based variable selection (SRVS) approach to select disease-related genetic markers from a large pool of disease candidate genes acquired from the ResNet relation database. The proposed approach was used to evaluate 167 CHD candidate genes and was followed by validation on a microarray expression data set. Pathway enrichment analysis (PEA), sub-network enrichment analysis (SNEA), and network connectivity analysis (NCA) were conducted to study the functional profile of the variables selected by SRVS and to compare them with previously reported genetic markers.

Results: A disease prediction accuracy of 81.40% was obtained (permutation p-value < 0.0002) using the top 24 SRVS-selected genes, which were enriched within multiple pathways and sub-networks previously implicated in CHD. Using the most frequently reported genes among the 167 CHD candidate genes, the highest accuracy obtained was 69.77% (permutation p-value = 0.017). Enrichment analysis and NCA showed that the top genes selected by the proposed SRVS approach were strongly related to the frequently reported CHD genes, although functional differences were present.

Conclusion: Our study suggests that SRVS is an effective method for data-driven variable selection in CHD and that frequently reported CHD candidate genes may not be the best biomarkers for a specific CHD patient or patient group.
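The permutation p-values quoted in the Results can be illustrated with a minimal sketch of a label-permutation test on classification accuracy. Everything here is an assumption made for illustration: the nearest-centroid classifier is a simple stand-in rather than the SRVS pipeline, the leave-one-out scheme, the toy data, the function names, and the "+1" correction in the p-value are not taken from the study.

```python
import random


def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Hypothetical stand-in for whatever classifier a study actually uses.
    """
    n = len(labels)
    correct = 0
    for i in range(n):
        # Build class centroids from every sample except the held-out one.
        centroids = {}
        for cls in set(labels):
            rows = [features[j] for j in range(n) if j != i and labels[j] == cls]
            if rows:
                centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        # Predict the class whose centroid is closest (squared Euclidean distance).
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(features[i], centroids[c])),
        )
        correct += pred == labels[i]
    return correct / n


def permutation_p_value(features, labels, n_perm=999, seed=0):
    """Return (observed accuracy, empirical p-value) under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(features, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between features and labels
        if loo_accuracy(features, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never exactly 0.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): two well-separated groups, two "genes" per sample.
features = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1],
    [5.0, 5.1], [5.2, 5.0], [5.1, 5.3], [5.3, 5.2], [5.0, 5.4], [5.4, 5.1],
]
labels = [0] * 6 + [1] * 6

accuracy, p_value = permutation_p_value(features, labels)
print(f"accuracy={accuracy:.2f}, permutation p-value={p_value:.4f}")
```

A small p-value means that few random relabellings reach the observed accuracy, which is the sense in which an accuracy can be called significant under this kind of test.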