

Fundamentals of reproducible research and free software
MVA course
Miguel Colom
About this course
This is a course on reproducible research and free/open source software (FOSS) at the MVA master. It covers how to write and publish reproducible research, legal aspects of the source code, article, and data, and finally good practices for writing free software and performing reproducible research. Group discussions, debates, and individual dissertations are part of the course activities. The course plan covers the minimum knowledge that any master's student or PhD candidate in computational sciences needs in order to perform reliable scientific research.
Slides of the course
- Introduction. Software licenses. Patents. Economic models. Case studies.
- Towards reproducible research
Schedule
The course is on Wednesdays from 14:30h to 17:30h, at room 1M07 (bât. Ouest, 1st floor).
- 08/10/2025. MC1. Presentation of the course. Introduction. Free and open-source software. Licensing.
- 15/10/2025. MC2. Patents. Economic model of FOSS projects. Reproducible research. Introduction to reproducible research.
- 22/10/2025. Presentation of the IPOL journal. TP1 (economic model and licensing in a free/open source software project).
- 29/10/2025. MC3. Publishing reproducible research. The editorial process. Legal aspects.
- 05/11/2025. P. Intermediate presentation (1/2). Group discussion on the presentations, questions, feedback.
- 12/11/2025. 🎤 Invited talk by Simon Tournier on the Guix and Software Heritage projects.
- 19/11/2025. MC4. Writing a reproducible scientific article (1/2). TP2 (reproducibility).
- 26/11/2025. MC5. Writing a reproducible scientific article (2/2). Review of TP2. Start TP3 (scientific writing).
- 10/12/2025. Final presentations (2/2)
Invited talks
- Simon Tournier (talk on 12/11/2025, don't miss it!)
- Jaime Arias, Software Heritage, key infrastructure for Open Science and Software Science
- Enric Meinhardt, S2P: a reproducible satellite stereo pipeline
- Marina Gardella, Image forensic tools
- Charles Truong, Change point detection in Python
Reading
- Jeffrey Brainard. Open-access journal eLife will lose its 'impact factor' over controversial publishing model. Science, 13/11/2024. DOI: 10.1126/science.zycyo78.
- Sheeba Samuel, Daniel Mietchen. Computational reproducibility of Jupyter notebooks from biomedical publications. arXiv preprint. 11 Aug. 2023.
- Anil Oza. Reproducibility trial: 246 biologists get different results from same data sets. Nature news article. 12 Oct. 2023.
- Protzko et al. High replicability of newly discovered social-behavioural findings is achievable. Nature Human Behaviour. 9 Nov. 2023.
- Veritasium. The Problem With Science Communication. YouTube video. 1 Nov. 2023.
- Reuters. Moderna sues Pfizer/BioNTech for patent infringement over COVID vaccine (2022) → European Patent Office declares Moderna mRNA patent invalid (2023).
- Florian Prinz, Thomas Schlange and Khusru Asadullah. Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery.
- C. Glenn Begley and Lee M. Ellis. Raise standards for preclinical cancer research. Nature, vol. 483, p. 531, 29 March 2012.
- Christian Fuchs and Marisol Sandoval. The Diamond Model of Open Access Publishing: Why Policy Makers, Scholars, Universities, Libraries, Labour Unions and the Publishing World Need to Take Non-Commercial, Non-Profit Open Access Serious. tripleC 13(2): 428-443, 2013.
- Charles Piller. Blots on a Field? Science, Vol 377, Issue 6604. DOI: 10.1126/science.add9993.
- Alexandru Nedelcu. Akka is moving away from Open Source, September 7, 2022.
- Tom E. Hardwicke, Robert T. Thibault, Jessica E. Kosie, Loukia Tzavella, Theiss Bendixen, Sarah A. Handcock, Vivian E. Köneke and John P. A. Ioannidis. Post-publication critique at top-ranked journals across scientific disciplines: a cross-sectional assessment of policies and practice. R. Soc. Open Sci. 9: 220139. DOI: 10.1098/rsos.220139.
- Unified Patents. Defending Open Source: An 2022 Litigation Update, Jun. 9, 2022.
- swyx. How Open Source is eating AI, Oct. 9, 2022.
- Juan Pablo Alperin. Why I think ending article-processing charges will save open access, Nature World View, 12 Oct. 2022. DOI: 10.1038/d41586-022-03201-w.
- Holly Else. Dozens of papers co-authored by Nobel laureate raise concerns, Nature News, 21 Oct. 2022. DOI: 10.1038/d41586-022-03032-9.
- eLife. eLife’s New Model: Changing the way you share your research, eLife, 20 Oct. 2022.
Interesting reading provided by the students
We might discuss these topics together during the course.
- From Pedro Machado Santos Rohde (2023). An open source developer/lawyer wants to sue GitHub over Copilot, its autocomplete tool trained on public repos. It produces results that can clearly be traced back to the original code, but with no attribution or mention of licenses, e.g. [Twitter post]. I thought it was interesting to share, as it's very recent news and very much linked to our RR/FOSS class.
- From Solal Nathan (2023), about HuggingFace using DOIs: Introducing DOI: the Digital Object Identifier to Datasets and Models.
- From Solal Nathan (2023), The Turing Way handbook to reproducible, ethical and collaborative data science.
- From Théo Saulus (2023), the patent Attention-based sequence transduction neural networks. Another example of patenting software by claiming that it is a system made of hardware plus a computer program.
- From Solal Nathan (2024), French Court Issues Damages Award for Violation of GPL. Text of the decision: Cour d'appel de Paris, 14 February 2024, RG n° 22/18071.