Significant shifts

In this vignette we will assess the support for that a branch has at least one rate shift, versus the null hypothesis of no shifts. We will test this using Bayes factors, i.e. the relative support for one or more versus no shifts.

Tree file

First, we load the necessary modules and read in the tree file.

using Pesto

phy = readtree(Pesto.path("primates.tre"))
sampling_fraction = 0.62
primates = SSEdata(phy, sampling_fraction)

Model setup

Let's use the standard Pesto analysis with empirical Bayes for the speciation and extinction rates, and maximum-likelihood shift rate.

model, rates = pesto(primates; n = 6)

Bayes factors

A Bayes factor is a summary of the evidence in favor of one hypothesis, as opposed to a different hypothesis (Kass & Raftery 1995). In Pesto, we use Bayes factors to assess the evidence for the hypothesis that there was at least one diversification rate shift, versus the hypothesis that there were no rate shifts, per branch. The standard equation for a Bayes factor is as follows (see also Shi & Rabosky 2015)

\[ \text{Bayes factor} = \frac{\frac{P_M(\geq \text{1 shifts})}{\pi_M(\geq \text{1 shifts})}}{\frac{P_M(0 \text{ shifts})}{\pi_M(0 \text{ shifts})}}\]

where $P_M(\geq \text{1 shifts})$ is the posterior probability of at least one shift, $P_M(\geq \text{0 shifts})$ is the posterior probability of no shifts, $\pi_M(\geq \text{1 shifts})$ is the prior probability of at least one shift, and $\pi_M(\geq \text{0 shifts})$ is the prior probability of no shifts.

Plotting Bayes factors

We can for example plot the Bayes factor directly on the tree. Since the Bayes factor can vary considerably, we instead plot the log-transformed Bayes factors, which are more concentrated around 0. A log Bayes factor with a value of 0 means that the prior and posterior support for the shift hypotheses are equal. A value much larger than 0 means that there is support for one or more shifts. A value of less than 0 means that there is more support for 0 rate shifts.

using Makie, CairoMakie

rates[!,"shift_bf_log"] = rates[!,:shift_bf]

min, max = extrema(rates[1:end-1,"shift_bf_log"])
cmap = Makie.cgrad([:gray, :black, :purple])
treeplot(primates, rates, "shift_bf_log"; cmap = cmap)
Example block output

Plotting supported branches

Alternatively, we can assess which branches had a strong support for there being at least one shift. We first set the significance level at 10, meaning strong support (Kass & Raftery 1995). Next, we can compute which branches has strong support for at least one diversification rate shift, vs the null hypothesis of zero shifts.

Bayes factorlog10 Bayes factorLevel of support
0 to 3.20 to 0.5Not worth mentioning
3.2 to 100.5 to 1Substantial
10 to 1001 to 2Strong
>100>2Decisive

If we inspect the data frame with the branch-specific outputs, we can see specifically which branches has strong support.

using DataFrames
cutoff = 3.2
filter(:shift_bf => x -> x > cutoff, rates)
3×13 DataFrame
Rowdelta_lambdadelta_mudelta_netdivdelta_relextmean_lambdamean_mumean_netdivmean_relextnodeedgenshiftshift_bfshift_bf_log
Float64Float64Float64Float64Float64Float64Float64Float64Int64Int64Float64Float64Float64
10.044266-0.030450.074716-0.3218680.1725480.09499490.07755280.5803362871050.8957577.305957.30595
20.102268-0.008992250.11126-0.2366650.2127240.0993860.1133380.5194773762830.6874015.782835.78283
30.174356-0.02288490.197241-0.4009580.220180.1003280.1198520.5481843863041.0520739.6539.65

The result is that there are two branches that had strong support for there being at least one shift on the branch. The branch that led to the Old World Monkeys has the highest support (Bayes factor of 6.9), while the gibbons clade (represented by Hylobates) have a slightly lower support (Bayes factor of 3.5).

In order to plot these, we can create a dummy variable and use it in the tree plotting function.

rates[!,:strong_support] = Float64.(rates[!,:shift_bf] .> cutoff)
cmap = Makie.cgrad([:black, :red])
treeplot(primates, rates, "strong_support"; cmap = cmap)
Example block output

Both the phylogeny plot, as well as the filtered data frame, show two branches that had substantial support (with a Bayes factor of >3.2) for one or more shifts. The number of estimated diversification rate shifts ($\hat{N}$) on this branch is also significantly larger than zero, at 1.05 and 0.65.

References

  • Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the american statistical association, 90(430), 773-795.
  • Shi, J. J., & Rabosky, D. L. (2015). Speciation dynamics during the global radiation of extant bats. Evolution, 69(6), 1528-1545.