Extended rate analysis

In this vignette, we will do the same as in the simple analysis, but we explain how the model is set up in more detail.

Tree file

First, we load the necessary modules and read in the tree file.

using Pesto

phy = readtree(Pesto.path("primates.tre"))
ρ = 0.635
primates = SSEdata(phy, ρ)

Model choice

Next, we set up the SSE model, including its dimensionality and hyperparameters. For this model, we will draw the speciation rate (λ) and extinction rate (µ) from LogNormal distributions. We pick the median of the LogNormal distributions such that they correspond to the maximum-likelihood estimates of the constant-rate birth-death model. This is also called the "empirical Bayes" approach, where we use the data twice. We pick the log-sd as H = 0.587, which corresponds to a LogNormal distribution whose 2.5%-97.5% quantile spans one order of magnitude.

λml, μml = estimate_constant_bdp(primates)

H = 0.587
n = 6

using Distributions
dλ = LogNormal(log(λml), H)
dμ = LogNormal(log(µml), H)

λquantiles = make_quantiles(dλ, n)
µquantiles = make_quantiles(dμ, n)
λ, μ = allpairwise(λquantiles, µquantiles)

The scatter plot of λ on the x-axis, and µ on the y-axis looks like the figure below (blue dots), with the quantiles of the LogNormal distributions on the margin.

primatestree

Next, we estimate the rate shift parameter η under the SSE model, conditional on λ and µ.

η = optimize_eta(λ, µ, primates)
0.0033108738372094435

The units of $\eta$ are number of rate shift events per lineage per time. The product of the tree length (the sum of all branch lengths) times $\eta$ will give us the number of expected rate shifts under the prior:

sum(primates.branch_lengths) * η
5.788171474756159

This allows us to set up the SSE model object:

model = SSEconstant(λ, μ, η)

With the model and data objects we can for example calculate the log likelihood

logL_root(model, primates)
-687.92449438465

Branch rates and shifts

Or we can compute both the postorder and preorder pass, and get the expected speciation and extinction rates per branch. The result is a data frame object, and we print the first five rows:

rates = birth_death_shift(model, primates)
rates[1:5,:]
5×13 DataFrame
Rowmean_lambdamean_mumean_netdivmean_relextdelta_lambdadelta_mudelta_netdivdelta_relextnodeedgenshiftshift_bfshift_bf_log
Float64Float64Float64Float64Float64Float64Float64Float64Int64Int64Float64Float64Float64
10.2196750.1372330.08244160.614623-0.000269656-0.0003767380.000107082-0.00074915123510.006563030.147156-0.832222
20.2198220.1371510.08267020.6139740.0007505859.25196e-50.000658066-0.0015283723620.02007110.187898-0.726078
30.2215560.1372150.08434060.6103940.003246770.0001465420.00310023-0.0060147823730.02981470.722026-0.141447
40.2241780.1375360.08664160.6066660.001522710.0005926710.000930041-8.00436e-523840.01080460.66752-0.175535
50.2254840.1382660.0872180.6079530.0009924930.0009337415.87518e-50.003101150.009829010.813542-0.0896202

Tree plots

As before, we can use Makie to make some quick tree plots. Here we are plotting the average net-diversification rate per branch, with a two-color scheme going from black to green.

using Makie, CairoMakie

cmap = Makie.cgrad([:black, :green])
treeplot(primates, rates, "mean_netdiv"; cmap = cmap)
Example block output