Overview
Teaching: 25 min Exercises: TODO minQuestions
How is reproducibility critical to fixing software bugs?
What steps can you take toward sharing your studies reproducibly?
Objectives
Underestand best practices for submitting bug reports
In this final lesson, we’ll cover some miscellaneous best practices for reproducibility in basic day-to-day activities. Although individually some practices may seem insignificant, they add up.
“Reproducibility” is at the heart of what constitutes a good bug report.
References
Additional materials:
- How to Report Bugs Effectively by Simon Tatham (1999) (10 min) – a classic essay that remains apropos; provides an overview of “good” and “bad” bug reports and how they can be expanded to provide sufficient information
- Mozilla: Bug writing guidelines (15 min) – although geared toward Mozilla products, this provides generally applicable guidelines on how to structure bug reports
Overall summary:
If possible, check if the issue persists with a newer version of the software/dataset. If not, it was likely fixed and you’ll need to upgrade. If the software comes from a centralized distribution, you can report the bug there, or request to have the package updated.
Do not try to describe the problem in your own words if you are not 100% sure what the problem is. And even if you are sure – provide a concise and reproducible example that demonstrates the problem, such as a script or a list of steps for a GUI-driven package.
Make sure that your example is complete; i.e. that it’s not just a a code snippet without the necessary context (e.g., imports) to reproduce the issue.
Provide relevant information, such as:
the operating system and version
the version of the software in question and how it was installed (e.g., via a package manager, or manually from a source tarball, or from a Git repository)
It’s better to err on the side of being exhaustive.
Right before you are ready to submit, (re-)read your report yourself to see if you can now get an idea of what might have gone wrong, or to see if you’ve provided all relevant information.
“The devil is in the details” and “Talk is cheap, show me the code” (L. Torvalds, Linux project) are two common idioms pointing to the fact that a verbal description alone, as typically condensed into a paper’s Methods section, is rarely sufficient to reproduce an empirical result.
This is why it’s so important to share relevant data, code, computation environments, etc. However, if you delay preparing your materials for sharing, you might find it difficult, if not impossible, to share your work later on – as your project has progressed forward or may even be completed. Keeping the goal of sharing in mind right from the start will make sharing easier when you’re actually ready or asked to share.
References
Additional materials:
- Four aspects to make science open “by design” and not as an after-thought (15 min) – a brief communication on four considerations when starting a project to facilitate sharing later on.
Key Points
If you can’t reproduce it, it’s likely that nobody else can
Having open sharing in mind from the beginning can simplify future reproducibility