Filtering Variables for Supervised Sparse Network Analysis

2020 
Motivation: We present a dimension-reduction method that filters out variables or features, such as genes and transcripts, that are irrelevant to a downstream analysis aimed at detecting supervised gene networks in sparse settings. Filtering before network analysis can improve interpretability for a variety of analysis methods, and it applies in settings where the downstream analysis may include sparse canonical correlation analysis.

Results: Filtering methods designed specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. Our proposed method performs favorably, eliminating irrelevant features while maintaining important biological signal under a variety of signal settings. We show that the speed and accuracy of methods such as sparse canonical correlation analysis increase after filtering, which greatly improves the scalability of these approaches.

Availability: Code for the gene filtering algorithm described in this manuscript is available in the geneFiltering R package on GitHub at https://github.com/lorinmil/geneFiltering. The package provides functions to filter genes and to simulate a network system. For access to the data used in this manuscript, contact the corresponding author.