Simulated annealing algorithm for the multiple sequence alignment problem : The approach of polymers in a random medium

2005 
We proposed a probabilistic algorithm to solve the Multiple Sequence Alignment problem. The algorithm is a Simulated Annealing (SA) that exploits the representation of the Multiple Alignment between $D$ sequences as a directed polymer in $D$ dimensions. Within this representation we can easily track the evolution in the configuration space of the alignment through local moves of low computational cost. At variance with other probabilistic algorithms proposed to solve this problem, our approach allows for the creation and deletion of gaps without extra computational cost. The algorithm was tested aligning proteins from the kinases family. When D=3 the results are consistent with those obtained using a complete algorithm. For $D>3$ where the complete algorithm fails, we show that our algorithm still converges to reasonable alignments. Moreover, we study the space of solutions obtained and show that depending on the number of sequences aligned the solutions are organized in different ways, suggesting a possible source of errors for progressive algorithms.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    13
    References
    12
    Citations
    NaN
    KQI
    []