Number of rate shifts
In this vignette we go a little more in-depth and explain how the number of rate shifts ($\hat{N}$) is estimated.
Tree file
First, we load the necessary modules and read in the tree file.
using Pesto
using CairoMakie
phy = readtree(Pesto.path("primates.tre"))
ρ = 0.635
primates = SSEdata(phy, ρ)
Model setup
In this vignette, we pick the rate values by hand, and we don't use so many, in order to illustrate how the calculations work.
tree_length = sum(primates.branch_lengths)
λ = [0.1, 0.2, 0.3, 0.4, 0.20]
μ = [0.05, 0.15, 0.05, 0.15, 0.25]
η = 1 / tree_length
model = SSEconstant(λ, μ, η)
Number of total rate shifts
If we want to see the total number of rate shifts per branch, we can use the function birth_death_shift
for the standard inference:
rates = birth_death_shift(model, primates)
rates[1:5,:]
Row | mean_lambda | mean_mu | mean_netdiv | mean_relext | delta_lambda | delta_mu | delta_netdiv | delta_relext | node | edge | nshift | shift_bf | shift_bf_log |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Int64 | Float64 | Float64 | Float64 | |
1 | 0.199988 | 0.149988 | 0.05 | 0.749971 | -6.41135e-6 | -6.37481e-6 | -3.65428e-8 | -1.59455e-5 | 235 | 1 | 8.45425e-5 | 0.0111373 | -1.95322 |
2 | 0.199963 | 0.149943 | 0.0500193 | 0.749827 | 4.05215e-6 | -0.000114722 | 0.000118774 | -0.000485606 | 236 | 2 | 0.0013657 | 0.0765185 | -1.11623 |
3 | 0.201087 | 0.149207 | 0.0518794 | 0.744795 | 0.0031402 | -0.00186278 | 0.00500298 | -0.0132595 | 237 | 3 | 0.025474 | 3.62425 | 0.559218 |
4 | 0.203266 | 0.14793 | 0.0553359 | 0.735661 | 0.000306701 | -0.000161744 | 0.000468445 | -0.00120087 | 238 | 4 | 0.00340496 | 1.22126 | 0.0868064 |
5 | 0.203468 | 0.147809 | 0.0556598 | 0.734817 | 7.32631e-5 | -6.80288e-5 | 0.000141292 | -0.000386008 | 1 | 5 | 0.00210086 | 1.00764 | 0.00330392 |
We can use Makie
to plot the number of accumulated diversification rate shifts on the branches of the phylogeny
using Makie, CairoMakie
cmap = Makie.cgrad([:black, :red])
treeplot(primates, rates, "nshift"; cmap = cmap)
Number of rate shifts
The total number of rate shifts, as shown above, is a reduction or simplification of the number of rate shifts that can be inferred by Pesto
. The number of rate shifts from state j
to state i
accumulated over the branch length (from old to young) is described by the following differential equation
\[\frac{d\hat{N}_{M,ij}}{dt} = S_{M,j}(t) \frac{-\eta}{K-1} \frac{D_{M,i}(t)}{D_{M,j}(t)} \text{ if } j \neq i\]
with initial condition $\hat{N}_{ij}(t_0) = 0$. In Pesto, we would compute this using
nshift = state_shifts(model, primates; ape_order = false)
The object returned nshift
is a three-dimensional array. The first dimension corresponds to the branch index (what was M
). The second dimension represents the arrival state (i
), and the third dimension represents the departure state (j
). If ape_order = true
, then the first dimension is reordered such that the indices correspond to the node indices in the tree.
If we sum over second and third dimension, we get the number of rate shifts per branch:
sum(nshift, dims = 2:3)[:,1,1]
464-element Vector{Float64}:
8.454252537065528e-5
0.0013656986995030987
0.025474003366199215
0.0034049591324445444
0.0021008609428136214
0.001462101827506005
0.0008341202945861498
0.0008341202945861498
0.0022831472374817813
0.0021950091408757546
⋮
0.0003272696126845015
0.0004382649456816603
0.0004382649456816603
0.0004833775197805262
0.0005587196660507015
0.0005587196660507015
0.0006336182297654021
0.0007282135953863478
0.0007282135953863478
If instead we sum over the first dimension, we get a breakdown over which rate transitions were more frequent:
Nmatrix = sum(nshift, dims = 1)[1,:,:]
5×5 Matrix{Float64}:
0.0 0.235923 0.00980883 0.0329926 0.000304083
0.00072929 0.0 0.00856945 0.0280923 0.000243659
0.00160255 0.552613 0.0 0.0388902 0.00073274
0.00218382 0.952677 0.0135707 0.0 0.00132565
0.000369549 0.106414 0.00695396 0.0223901 0.0
In this case, the most frequent rate shift was from state 2
to state 4
, with $\hat{N} = 0.95$ number of rate shifts. Going from state 2
to state 4
under this model means an increase of $0.4-0.2=0.2$ in speciation rate units. This can for example be visualized using a histogram:
mids, bins = makebins(Nmatrix, model, -0.35, 0.35; nbins = 7)
f = Makie.Figure(resolution = (500, 300))
ax = Makie.Axis(f[1,1],
ylabel = "Number of rate shifts",
xlabel = "Change in speciation rate (λi - λj)",
xticks = round.(mids; digits = 2),
xgridvisible = false,
ygridvisible = false)
Makie.barplot!(ax, mids, bins[:,1], color = :maroon)
f
Most of the rate shift events represent a shift from a smaller to a larger speciation rate (i.e. $\lambda_i - \lambda_j > 0$), however some rate shifts are in the other direction ($\lambda_i - \lambda_j < 0$). There are also a few rate shift events where the speciation rate does not change ($\lambda_i - \lambda_j = 0$). In these events, it is the extinction rate that changes, and not the speciation rate. If we are interested in the question, "what is the overall magnitude of shifts?", we can calculate the mean shift magnitude (weighted by their frequencies):
\[\frac{1}{\sum_{i,j}\hat{N}_{ij}} \sum_{i,j} \hat{N}_{ij} (\lambda_i - \lambda_j).\]
In Julia we can calculate it like so
shifts_weighted = Nmatrix .* (λ .- λ')
mean_magnitude = sum(shifts_weighted) / sum(Nmatrix)
0.0979760754192213
meaning that the overall shift magnitude for the primates tree under this model was an increase of 0.098 speciation rate units.