6. PyMsBayes Change History
6.1. Version 0.3.5
6.1.1. Changes
- Updating versions of bundled msbayes.pl and dpp-msbayes.pl PERL scripts
from dpp-msbayes package. These no longer use deprecated POSIX::tmpnam.
Instead, they use File::Temp::tempfile.
6.1.2. Bug Fixes
- Fixing a bug that occurred for two-stage analyses (i.e., Stage 1: generate
all prior samples, and Stage 2: then do rejection). When ‘–compress’
option was used in Stage 1 command, the prior sample file was being
compressed, which caused eureject (and ultimately dmc.py) to crash during
rejection (Stage 2). This bug is fixed in this release.
- Fixing issue where the reporting frequency could be greater than the total
number of prior samples.
6.2. Version 0.3.4
6.2.1. Changes
- Making ‘pretty’ summary output of dmc_posterior_probs.py script more
robust. The regex used by ListConditionEvaluator to put the names of
the taxon pairs into the expression was easily fooled by numbers included
in the names. This did not affect the results, but led confusing summary
output. This is now more robust, and the output messages should now help
rather than confuse.
6.3. Version 0.3.3
6.3.1. Changes
- If ‘nan’ observed summary statistics are produced, an error is raised to
crash the analysis early on. This can happen for some summary statistics
when there are populations with only a single sequence sampled.
6.4. Version 0.3.2
6.4.1. New Features
- New documentation.
- New tutorials.
- A new option timeInSubsPerSite is now supported for dpp-msbayes and
msbayes analyses.
- The coefficient of variation of divergence times is now reported.
- The dmc_dpp_summary.py CLI script now estimates and reports the
probabilities of the number of categoories.
- New dmc_sum_sims.py CLI script for summarizing and plotting the results
of simulation-based analyses.
- New “-m/–mu” option added to dmc_plot_results.py script for specifying
the mutation rate to rescale time to generations.
- New “–extension” option added to dmc_plot_results.py script to allow
the file format of the plots to be specified.
- Adding new worker classes to the API for simulating data via Joseph Heled’s
biopy package.
- Adding DppSimWorker and DppSimTeam to the API.
6.4.2. Changes
- More rigorous checks added to dmc.py to make sure all the sample tables
are the same within an analysis.
- The logging frequency is now in units of prior samples rather than batch
iterations.
- Help menus of CLI scripts have been updated.
- New dpp-msbayes executables were added that accommodate the
timeInSubsPerSite option and reporting the coefficient of variation of
divergence times.
- The package has been updated to handle these new options.
- For the dmc_plot_results.py CLI script, changing the the
--iteration-index option to --sample-index to reflect the new
logging behavior (units of prior samples rather than batch iterations).
6.5. Version 0.3.1
6.5.1. Changes
- Cleaning up intra-package imports and rearranging some of the code.
6.6. Version 0.3.0
6.6.1. Changes
- Updated the version of dpp-msprior binary (for both mac and linux) that
is bundled with the package. This version of dpp-msprior is from
version 0.2 of the dpp-msbayes package and was updated to create a
lower limit (0.000000000001) for values of theta. In rare cases, a theta
value of zero would cause the coalescent simulator msDQH to crash.
Any previous analyses that did not crash should not be affected by
this change.
- The newer version of dpp-msprior also changes the weirdness from the
original msBayes where there was a check for small (i.e., 0.0001)
divergence times. In such simulations, the div time was set to this
arbitrary lower bound, and the bottleneck time was set to 0.5 of this. I
am guessing this was to prevent unrealistic (and numerically unstable?)
changes in pop size. However, 0.0001 can be thousands of generations which
is not trivial. Also, rather than this weird hack of the bottleneck time,
it seems much better to simply have no bottleneck if the div time is
essentially zero. Accordingly, I lowered the threshold and simply “turn
off” the bottleneck if the time is below it (I no longer adjust the div
time or bottleneck time). This change should have very little affect on
most analyses.
6.6.2. Bug Fixes
- As mentioned in the change above, in rare cases, theta values of zero drawn
from the prior via dpp-msprior would cause msDQH to crash. The new
version of the dpp-msprior binaries should prevent this. Any previous
analyses that did not crash were not affected by this bug.
6.7. Version 0.2.2
6.7.1. Changes
- End-user CLI scripts log additional information.
- dmc.py will now set the number of standardizing samples equal to the
number of prior samples, if the former is larger than the latter.
6.7.2. New Features
- New management of paths to dpp-msbayes and msBayes executables. Previously,
the package only used the executables bundled with the package, which
required the package to be installed via python setup.py develop. Now,
the package will also look for executables on the system.
- Updating setup.py so that the bundled executables will be installed
along with end-user scripts via python setup.py install after checking
that all the executables work on the system.
- 32-bit Linux versions of the dpp-msbayes/msBayes executables are now
bundled with the package.
6.7.3. Bug Fixes
- Fixed a bug where MsBayesWorker instances with a sample size of zero
would hang indefinitely. A ValueError is now raised if an MsBayesWorder
is initiated with an integer less than one for the sample size.
6.8. Version 0.2.1
6.8.1. Changes
- Adding working version of documentation.
- Overhauled package-wide and CLI-script logging management.
6.8.2. Bug Fixes
- Fixing bug in dmc.py script, where the wrong value for the number of
standardizing samples was being written to the run summary file.
6.9. Version 0.2.0
6.9.1. New Features
- New options added to package and scripts for the sort index. These allow
summary statistics to be grouped by taxon, but without re-sorting.
6.9.2. Changes
- The new default option for the sort index for the entire package and for
end-user scripts is to not group or re-sort the summary statistics of each
alignment. The re-sorting that was done prior to this was NOT VALID,
because the summary statics calculated from the different alignments are
NOT exchangeable.