Extended rate analysis
In this vignette, we will do the same as in the simple analysis, but we explain how the model is set up in more detail.
Tree file
First, we load the necessary modules and read in the tree file.
using Pesto
phy = readtree(Pesto.path("primates.tre"))
ρ = 0.635
primates = SSEdata(phy, ρ)
Model choice
Next, we set up the SSE model, including its dimensionality and hyperparameters. For this model, we will draw the speciation rate (λ) and extinction rate (µ) from LogNormal distributions. We pick the median of the LogNormal distributions such that they correspond to the maximum-likelihood estimates of the constant-rate birth-death model. This is also called the "empirical Bayes" approach, where we use the data twice. We pick the log-sd as H = 0.587
, which corresponds to a LogNormal distribution whose 2.5%-97.5% quantile spans one order of magnitude.
λml, μml = estimate_constant_bdp(primates)
H = 0.587
n = 6
using Distributions
dλ = LogNormal(log(λml), H)
dμ = LogNormal(log(µml), H)
λquantiles = make_quantiles(dλ, n)
µquantiles = make_quantiles(dμ, n)
λ, μ = allpairwise(λquantiles, µquantiles)
The scatter plot of λ
on the x-axis, and µ
on the y-axis looks like the figure below (blue dots), with the quantiles of the LogNormal distributions on the margin.
Next, we estimate the rate shift parameter η under the SSE model, conditional on λ and µ.
η = optimize_eta(λ, µ, primates)
0.0033108738372094435
The units of $\eta$ are number of rate shift events per lineage per time. The product of the tree length (the sum of all branch lengths) times $\eta$ will give us the number of expected rate shifts under the prior:
sum(primates.branch_lengths) * η
5.788171474756159
This allows us to set up the SSE model object:
model = SSEconstant(λ, μ, η)
With the model and data objects we can for example calculate the log likelihood
logL_root(model, primates)
-687.92449438465
Branch rates and shifts
Or we can compute both the postorder and preorder pass, and get the expected speciation and extinction rates per branch. The result is a data frame object, and we print the first five rows:
rates = birth_death_shift(model, primates)
rates[1:5,:]
Row | mean_lambda | mean_mu | mean_netdiv | mean_relext | delta_lambda | delta_mu | delta_netdiv | delta_relext | node | edge | nshift | shift_bf | shift_bf_log |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Int64 | Float64 | Float64 | Float64 | |
1 | 0.219675 | 0.137233 | 0.0824416 | 0.614623 | -0.000269656 | -0.000376738 | 0.000107082 | -0.000749151 | 235 | 1 | 0.00656303 | 0.147156 | -0.832222 |
2 | 0.219822 | 0.137151 | 0.0826702 | 0.613974 | 0.000750585 | 9.25196e-5 | 0.000658066 | -0.00152837 | 236 | 2 | 0.0200711 | 0.187898 | -0.726078 |
3 | 0.221556 | 0.137215 | 0.0843406 | 0.610394 | 0.00324677 | 0.000146542 | 0.00310023 | -0.00601478 | 237 | 3 | 0.0298147 | 0.722026 | -0.141447 |
4 | 0.224178 | 0.137536 | 0.0866416 | 0.606666 | 0.00152271 | 0.000592671 | 0.000930041 | -8.00436e-5 | 238 | 4 | 0.0108046 | 0.66752 | -0.175535 |
5 | 0.225484 | 0.138266 | 0.087218 | 0.607953 | 0.000992493 | 0.000933741 | 5.87518e-5 | 0.003101 | 1 | 5 | 0.00982901 | 0.813542 | -0.0896202 |
Tree plots
As before, we can use Makie
to make some quick tree plots. Here we are plotting the average net-diversification rate per branch, with a two-color scheme going from black to green.
using Makie, CairoMakie
cmap = Makie.cgrad([:black, :green])
treeplot(primates, rates, "mean_netdiv"; cmap = cmap)