Modeling RNA-binding protein specificity in vivo by precisely registering protein-RNA crosslink sites

2018 
RNA-binding proteins (RBPs) regulate post-transcriptional gene expression by recognizing short and degenerate sequence elements in their target transcripts. Despite the expanding list of RBPs with in vivo binding sites mapped genome-wide using crosslinking and immunoprecipitation (CLIP), defining precise RBP binding specificity remains challenging. We previously demonstrated that exact protein-RNA crosslink sites can be mapped at single-nucleotide resolution using CLIP data, and observed that crosslinking frequently occurs at specific positions in RBP motifs. Here we developed a computational method, named mCross, to jointly model RBP binding specificity while precisely registering the crosslinking position in motif sites. We applied mCross to 112 RBPs using ENCODE eCLIP data and validated the reliability of the resulting motifs by genome-wide analysis of allelic binding sites also detected by CLIP. We found that the prototypical SR protein SRSF1 recognizes GGA clusters to regulate splicing in a much larger repertoire of transcripts than previously appreciated.