Detection of non-structural outliers for microarray experiments

2014 
Outliers are unavoidable in many experiments, for reasons ranging from limited equipment resolution to data contamination. The presence of outliers in microarray gene expression data can affect the quality of gene selection and ranking. This effect is severe when a microarray gene expression dataset contains too few samples. We classify outliers occurring in microarray gene expression data as structural or non-structural. Structural outliers are gene-dependent or sample-dependent (or both), whereas non-structural outliers are independent of both gene and sample. Non-structural outliers are uninformative about differential expression, but they can cause a differentially expressed gene to be misclassified as non-differentially expressed. While there are algorithms for detecting structural outliers, a different strategy is required for detecting non-structural outliers. We show the impact of non-structural outliers on gene selection/ranking and false discovery rate control. We also show that existing outlier detection algorithms are unsuitable for detecting non-structural outliers. We propose a new algorithm for detecting non-structural outliers. It models the consecutive differences of ordered gene expressions as exponentially distributed. We use simulated and real data to demonstrate the efficacy of the proposed algorithm in correcting for non-structural outliers and improving gene selection/ranking and false discovery rate control.
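The idea of treating consecutive differences of ordered expression values as exponentially distributed can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details are not given in the abstract. The function name `flag_gap_outliers`, the median-based rate estimate, the Bonferroni-style cutoff controlled by `alpha`, and the rule of flagging every value beyond an improbably large gap are all assumptions made for illustration.

```python
import math
from statistics import median


def flag_gap_outliers(values, alpha=0.01):
    """Return indices of values cut off from the bulk by an improbably large gap.

    Illustrative assumption: gaps between consecutive sorted values follow an
    exponential distribution whose rate is estimated from the median gap.
    """
    # Sort while remembering each value's original position.
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if not gaps:
        return []

    # The median of an exponential distribution is ln(2) / rate, so the
    # median gap gives a rate estimate that a single huge gap cannot distort.
    med = median(gaps)
    if med <= 0:
        return []
    rate = math.log(2) / med

    # Bonferroni-style cutoff across all gaps (an assumption, not from the paper).
    cutoff = alpha / len(gaps)
    mid = len(xs) // 2

    flagged = set()
    for k, gap in enumerate(gaps):
        # Exponential tail probability P(G > gap) = exp(-rate * gap).
        if math.exp(-rate * gap) < cutoff:
            if k >= mid:
                # Large gap in the upper half: flag everything above it.
                flagged.update(order[k + 1:])
            else:
                # Large gap in the lower half: flag everything below it.
                flagged.update(order[:k + 1])
    return sorted(flagged)


expression = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 9.0]
outlier_indices = flag_gap_outliers(expression)
```

Here `outlier_indices` holds the position of the value 9.0, which sits far above an otherwise evenly spaced set of measurements. Because the test looks only at the spacing of the pooled values and not at which gene or sample they belong to, it matches the gene- and sample-independent character of non-structural outliers described above.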