Document and / or correct repeatability #25

Open
chimaerase opened this issue Nov 20, 2021 · 3 comments

@chimaerase
Contributor

I'm hoping to use PTMCMCSampler in a production-quality scientific application, where repeatability is important for error diagnosis and / or for peer review. I see that repeatability hasn't been specifically documented yet in the GitHub repo, so very likely I've missed something in my attempts to make results repeatable across (seeded) runs.

My tests imply I've either missed something in my attempts to seed random number generators for PTMCMCSampler, or that multi-chain runs aren't currently repeatable. I've attached results from my tests, using a similar test environment to that used in #23. My new branch has instructions updated to avoid the errors discussed in that ticket, which shouldn't be a factor here.

Using my current code, tests based on PTMCMCSampler's "simple" notebook are repeatable using a single chain, but are not repeatable across runs using more than one chain. My suggestion is to at least document the current repeatability status to save others from the need for similar work, or ideally, to correct either A) my assumptions or B) any repeatability errors in the library.
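To make the single-chain versus multi-chain distinction concrete, here is a minimal sketch of one common way to keep multi-chain runs repeatable: give every chain its own generator, derived from a base seed and the chain index, so no chain draws from shared global state. This is a toy random-walk Metropolis sampler using only the Python standard library; it is not PTMCMCSampler code and omits temperature swaps. The names `run_chain` and `run_all` are hypothetical.

```python
import math
import random


def log_density(x):
    """Toy target: standard normal log-density, up to a constant."""
    return -0.5 * x * x


def run_chain(seed, chain_index, n_steps=200):
    """Run one random-walk Metropolis chain with its own private generator."""
    # Assumption: combining the base seed and the chain index into a string
    # seed gives each chain a distinct, deterministic stream.
    rng = random.Random(f"{seed}:{chain_index}")
    x = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, 1.0)
        log_accept = log_density(proposal) - log_density(x)
        # 1.0 - rng.random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < log_accept:
            x = proposal
        samples.append(x)
    return samples


def run_all(seed, n_chains):
    """Run every chain; the result depends only on seed and n_chains."""
    return [run_chain(seed, i) for i in range(n_chains)]


if __name__ == "__main__":
    first = run_all(seed=42, n_chains=3)
    second = run_all(seed=42, n_chains=3)
    print("repeatable:", first == second)
```

Because each chain's stream depends only on the base seed and its own index, the output does not change if the chains are executed in a different order or in separate processes.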

Thanks so much for your work and attention!

@jellis18
Collaborator

@chimaerase thank you for your work on this. It would be good to eventually add your whole example using Docker into the main repo.

To be honest I haven't really looked at this code in several years other than to diagnose problems every now and then. I'm tagging a few former colleagues who also use this code for research and publications so maybe they have more insight on reproducibility.

@Hazboun6, @paulthebaker, @svigeland have you all been able to get reproducible results using the parallel tempering functionality?

@chimaerase
Contributor Author

Thanks, @jellis18, for your replies on this and other recent issues! I realize this repo hasn't been very actively maintained of late, and I'm hoping you're still receptive to reviewing PRs that may benefit the scientific community and motivate additional citations. As I mentioned earlier in this thread, we hope to continue using PTMCMCSampler in a production-quality application that's being used to design synthetic biology experiments for laboratories funded by the United States Department of Energy. I see many recent PTMCMCSampler citations in the astronomy domain too, and I'm hoping enhancements may be helpful to both communities.

I've recently developed fixes that address the repeatability concerns mentioned here; PRs to follow. It's also worth noting that this work was funded by the United States Department of Energy, which requires me to propose some simple additions to your LICENSE file to acknowledge its contribution.

@chimaerase
Contributor Author

Testing last week (mostly manual) showed that this is fixed in #29. As a follow-up, it would be a good idea to implement automated tests to ensure this doesn't recur, and to document how to get repeatable results, which would also highlight the importance of repeatability for users who may not be aware of it.
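As a rough illustration of the kind of automated check suggested here, the sketch below runs a seeded function several times and fails if any result differs from the first. It is only a sketch under stated assumptions, not the tests from #29: `assert_repeatable`, `seeded_run`, and `unseeded_run` are hypothetical names, and `seeded_run` is a stand-in for whatever wrapper would seed and run the real sampler.

```python
import random


def assert_repeatable(run, seed, n_runs=3):
    """Call run(seed) n_runs times and fail if any result differs from the first."""
    reference = run(seed)
    for attempt in range(1, n_runs):
        result = run(seed)
        if result != reference:
            raise AssertionError(
                f"run {attempt} differs from run 0 for seed {seed}"
            )
    return reference


def seeded_run(seed):
    """Hypothetical stand-in for a seeded sampler run."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(50)]


def unseeded_run(seed):
    """Ignores the seed on purpose, to show the check catching a regression."""
    rng = random.Random()  # seeded from OS entropy or the clock
    return [rng.gauss(0.0, 1.0) for _ in range(50)]


if __name__ == "__main__":
    assert_repeatable(seeded_run, seed=7)
    try:
        assert_repeatable(unseeded_run, seed=7)
    except AssertionError as err:
        print("caught non-repeatable run:", err)
```

In a real test suite, the same helper could be called once with a single chain and once with several chains, since the multi-chain case is the one that failed in the runs reported above.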
