Paper on reproducible bioinformatics pipelines with Guix
I’m happy to announce that the bioinformatics group at the Max Delbrück Center, with which I work, has released a preprint of a paper on reproducibility titled “Reproducible genomics analysis pipelines with GNU Guix”.
We built a collection of bioinformatics pipelines called “PiGx” (“Pipelines in Genomix”) and packaged them as first-class packages with GNU Guix. We then examined the degree to which the software achieves bit-reproducibility and analysed sources of non-determinism (e.g. time stamps). We also discussed experimental reproducibility at runtime (e.g. random number generators, the interface provided by the kernel and the GNU C library, etc.) and commented on the practice of using “containers” (or application bundles) instead.
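To make the time stamp issue concrete, here is a minimal sketch in Python (3.8 or later). It is not taken from the paper or from PiGx. The `build_artifact` function is a hypothetical stand-in for a build step, and gzip is used only because its header happens to embed a modification time. The sketch compares SHA-256 digests of two "builds" of the same input, first with differing embedded time stamps and then with a fixed one.

```python
import gzip
import hashlib


def build_artifact(payload: bytes, timestamp: int) -> bytes:
    """Hypothetical build step: compress the payload with gzip.

    The gzip header stores a 4-byte modification time, so the
    timestamp becomes part of the resulting artifact.
    """
    return gzip.compress(payload, mtime=timestamp)


def digest(artifact: bytes) -> str:
    """Return the SHA-256 digest used to compare two artifacts bit for bit."""
    return hashlib.sha256(artifact).hexdigest()


payload = b"ACGT" * 1000  # made-up input data

# Two builds of the same input, one minute apart: the embedded
# time stamps differ, so the artifacts are not bit-identical.
build_a = build_artifact(payload, timestamp=1_500_000_000)
build_b = build_artifact(payload, timestamp=1_500_000_060)
embedded_timestamp_reproducible = digest(build_a) == digest(build_b)

# The same two builds with the time stamp pinned to a fixed value.
fixed_a = build_artifact(payload, timestamp=0)
fixed_b = build_artifact(payload, timestamp=0)
fixed_timestamp_reproducible = digest(fixed_a) == digest(fixed_b)

print("embedded time stamps, bit-identical:", embedded_timestamp_reproducible)
print("fixed time stamp, bit-identical:", fixed_timestamp_reproducible)
```

The only bytes that differ between the first two artifacts are the four time stamp bytes in the header. That is enough to change the digest, even though the compressed data itself is the same.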
Reproducible builds are a crucial foundation for computational experiments. We hope that PiGx and the reproducibility analysis presented in the paper can serve as a useful case study demonstrating the importance of a principled approach to computational reproducibility and the effectiveness of Guix in the pursuit of reproducible software management.
Related topics: Bioinformatics, High-performance computing, Reproducibility, Reproducible builds, Research
Unless otherwise stated, blog posts on this site are copyrighted by their respective authors and published under the terms of the CC-BY-SA 4.0 license and those of the GNU Free Documentation License (version 1.3 or later, with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts).