Why and how can you make your developments and calculations more reproducible? Reproducibility of research is a theme that concerns all scientific disciplines and is fully in line with the Open Science plan and the law on scientific integrity. The benefits are numerous, and concern a whole range of people. Revisiting your work 5 years later, passing it on to colleagues in your team or to an incoming PhD student, or ensuring wider dissemination are all the easier if certain good practices have been followed during the production, use and dissemination phases.
For further information, please consult this document published by IT specialists from DEVLOG:
To make your work reproducible, you need to pay particular attention to the software environment in which you carried out your developments and executed your code. A description of this environment, and of how to recreate it elsewhere or later, must be distributed along with your software.
On GRICAD clusters, several tools are available for deploying the software environment you need: see the High-Performance Computing section. Some are intrinsically more reproducible than others. For example, Guix and Nix build an environment that is isolated from the computing machine. This is less true of conda / mamba / micromamba, which rely on software components already installed on the machine.
Another way to reproduce your software environment is to use containers. This makes it possible to provide a user with software and a complete, isolated software environment (libraries and dependencies) to run it. Popular containerization platforms include Docker and Apptainer, but there are many others.
In terms of reproducibility, Guix and Nix perform best, followed by containers; tools based on conda / mamba perform less well.
Apart from containers, which have a different operating mode, each tool has its own way of “capturing” the runtime environment:
guix package --export-manifest > manifest.scm
nix-env --switch-profile $NIX_USER_PROFILE_DIR/test_profile
conda create -n <env_name> <pkg-name1> <pkg-name2> <pkg-name3>
and restore it at a later date:
guix shell -m manifest.scm
nix-env --switch-profile $NIX_USER_PROFILE_DIR/test_profile
conda activate /path/to/env
To ensure reproducibility, the project’s git repository should contain this information. In this way, a user who has downloaded the software sources can identify and redeploy the software environment used for execution.
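As a minimal sketch of what storing this information in the repository can look like, the function below writes the environment description into a directory that can be committed with the sources. The `env/` directory, the file names and the `capture_env` function are assumptions made for illustration, and Nix is left out of the sketch.

```shell
# Sketch only: save the environment description next to the sources.
# The default directory "env" and the file names are hypothetical choices.
capture_env() (
    outdir="${1:-env}"
    mkdir -p "$outdir"

    if command -v guix >/dev/null 2>&1; then
        # Package list of the current profile.
        guix package --export-manifest > "$outdir/manifest.scm"
        # Exact Guix channels and commits in use, to pin the revision as well.
        guix describe --format=channels > "$outdir/channels.scm"
    fi

    if command -v conda >/dev/null 2>&1; then
        # Description of the currently active conda environment.
        conda env export > "$outdir/environment.yml"
    fi
)

# To recreate the environment later (run by hand from the repository root):
#   guix time-machine -C env/channels.scm -- shell -m env/manifest.scm
#   conda env create -f env/environment.yml
```

Calling `capture_env` from the repository root and committing the generated files is one way to keep the description and the code together. Saving `channels.scm` in addition to the manifest is a choice made here so that the Guix revision is recorded as well as the package list.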
Last but not least, a document describing the various execution stages, and possibly a test case, is also necessary: this will enable the user to clearly identify the parts to be executed and in what order.
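One possible way to make the execution order explicit is a small driver kept next to the written description. The sketch below is only an illustration: `run_steps` is a hypothetical helper, not a GRICAD tool.

```shell
# Sketch only: run the given steps in order and stop at the first failure.
# Each argument is the name of a shell function or of a command on the PATH.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}
```

For example, with three hypothetical steps named `prepare_data`, `run_simulation` and `check_test_case`, calling `run_steps prepare_data run_simulation check_test_case` runs them in that order and stops as soon as one of them fails.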
For a more exhaustive description, you can find information on this site.
If your developments are associated with a publication, the reader will want to have access to the version that corresponds exactly to the one described in the article. You will therefore need to indicate precisely the corresponding “commit”. The Software Heritage platform (https://www.softwareheritage.org/?lang=fr) will be a great help in this respect.
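To illustrate how a commit maps to a citable identifier, the sketch below builds a Software Heritage identifier (SWHID) from a git commit hash. It assumes a git repository with SHA-1 commit hashes; `swhid_for_commit` is a hypothetical helper name.

```shell
# Sketch only: build the SWHID of a git commit from its 40-character hash.
# In the SWHID scheme, a revision identifier has the form "swh:1:rev:<hash>".
swhid_for_commit() {
    case "${1:-}" in
        "" | *[!0-9a-f]*)
            echo "not a lowercase hexadecimal hash: ${1:-}" >&2
            return 1
            ;;
    esac
    if [ "${#1}" -ne 40 ]; then
        echo "expected 40 characters, got ${#1}" >&2
        return 1
    fi
    echo "swh:1:rev:$1"
}
```

Inside a clone of the project, `swhid_for_commit "$(git rev-parse HEAD)"` prints the identifier of the commit currently checked out. The identifier can only be resolved on the Software Heritage site once the repository has been archived there.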
To distribute your software, there are several preliminary steps to take:
The HAL reference normally contains all the information you need to download and reuse your software!
For further information, please consult this document published by the IT specialists of DEVLOG:
In short, you need to supply your software with the chosen license, the list of authors and contributors, its SWHID or HAL identifier, a description of the software environment and its documentation (code + execution).
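This checklist can be verified before distributing the software. The sketch below assumes each item is stored as a file in the repository; `check_release_files` and the file names in the usage note are hypothetical and should be adapted to your project.

```shell
# Sketch only: report which of the expected files are absent from a directory.
# Usage: check_release_files <directory> <file>...
# Exit status is 0 when every file is present, non-zero otherwise.
check_release_files() (
    dir="$1"
    shift
    missing=0
    for f in "$@"; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)
```

For instance, `check_release_files . LICENSE AUTHORS README.md env/manifest.scm` lists the items that still need to be added to the current directory.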
If you have any questions, do not hesitate to contact us by email: sos-gricad@univ-grenoble-alpes.fr
For more information on GRICAD, please visit our website.
If you’re interested in reproducibility, you’ll find more information on the National Reproducible Research Network website.