SnakeLines: integrated set of computational pipelines for sequencing reads.

Jaroslav Budiš, Werner Krampl, Marcel Kuchařík, Rastislav Hekel, Adrian Goga, Michal Lichvár, David Smolak, Miroslav Böhmer, Andrej Baláž, František Ďuriš, Juraj Gazdarica, Katarína Šoltys, Ján Turňa, Jan Radvanszky, Tomáš Szemes

2021

Background: With the rapid growth of massively parallel sequencing technologies, more and more laboratories are using sequenced DNA fragments for genomic analyses. Interpretation of sequencing data is, however, strongly dependent on bioinformatics processing, which is often too demanding for clinicians and researchers without a computational background. Another problem is the reproducibility of computational analyses across separate computational centers with inconsistent versions of installed libraries and bioinformatics tools.

Results: We propose an easily extensible set of computational pipelines, called SnakeLines, for processing sequencing reads, including mapping, assembly, variant calling, viral identification, transcriptomics, metagenomics, and methylation analysis. Individual steps of an analysis, along with the methods and their parameters, can be readily modified in a single configuration file. The provided pipelines are embedded in virtual environments that ensure isolation of required resources from the host operating system, rapid deployment, and reproducibility of analyses across different Unix-based platforms.

Conclusion: SnakeLines is a powerful framework for the automation of bioinformatics analyses, with emphasis on simple set-up, modification, extensibility, and reproducibility.

Keywords: Computational pipeline, framework, massively parallel sequencing, reproducibility, virtual environment
