Evolutionary history of New World monkeys revealed by molecular and fossil data

New World monkeys (parvorder Platyrrhini) are one of the most diverse groups of primates, occupying today a wide range of ecosystems in the American tropics and exhibiting large variations in ecology, morphology, and behavior. Although the relationships among the almost 200 living species are relatively well understood, we lack robust estimates of the timing of origin, the ancestral morphology, and the evolution of the distribution of the clade. Here we integrate paleontological and molecular evidence to investigate the evolutionary dynamics of extinct and extant platyrrhines. We develop an analytical framework to infer ancestral states, the evolution of body mass, and changes in latitudinal ranges through time. Our results show that extant platyrrhines originated some 5–10 million years earlier than previously assumed, likely dating back to the Middle Eocene (∼ 43 million years ago, Ma). The estimated ancestral platyrrhine was strikingly small – weighing ∼ 0.4 kg, as compared to the largest modern species over 10 kg – matching the size of their presumed Eocene North African ancestors. Small-sized callitrichines (marmosets and tamarins) retained a small body mass throughout their evolutionary history, thus challenging the hypothesis of phyletic dwarfism as an explanation to their adaptive traits. In contrast, a rapid change in body mass range took place as the three families diverged between the Late Oligocene and the Early Miocene. That period also marks a peak in diversity of fossil platyrrhines and is associated with their widest latitudinal range, expanding as far to the South as Patagonia. This geographic expansion is temporally coincident with a significant increase in platyrrhine population size inferred from genomic data, and with warm and humid climatic conditions linked to the Miocene Climatic Optimum and the lower elevation of the Andes. These results unveil the early evolution of an iconic group of monkeys and showcase the advantages of integrating fossil and molecular data for estimating evolutionary rates and trends.

Introduction particular, the geographic localities where platyrrhine fossils have been found indicate that past populations expanded into the Caribbean (Hispaniola, Cuba and Jamaica, where they no longer exist), and as far south as Patagonia, the southernmost area where non-human primates ever lived (12,15). Other important insights about platyrrhine evolution can be obtained from the body mass of extinct taxa, which is strongly linked (and therefore predictable) from the molar area (16,17). This allows a confident estimate of the body mass of extinct primates, even when the fossil record is extremely sparse. The fossil record of platyrrhines show that extinct taxa account for the largest (> 20 Kg) and some of the smallest (~ 0.4 kg) taxa in the clade (as compared to the current range, approximately 0.1 to 12 kg), but the evolutionary dynamics shaping this wide range of body size changes remain unclear.
As an important complement to the fossil record, phylogenetic comparative methods based on molecular data and current trait measurements can shed further light into evolutionary processes (18). However, there are serious limitations to estimating ancestral states based on phenotypes and distributions of extant taxa only (19)(20)(21)(22). In particular, for quantitative traits, ancestral states inferred from extant taxa cannot be estimated to be outside the observed range and tend be an average of the observed values, without the possibility to infer evolutionary trends (22)(23)(24). These issues should be particularly evident for NWMs, where the spectrum of body sizes and the geographic ranges are larger in extinct taxa than among living species.
Here we compile all available paleontological evidence and combine it with data from living species to infer the evolutionary history of NWMs. Based on a large molecular data set and available taxonomic information, we infer phylogenetic trees of extinct and extant lineages using the fossilized birth-death (FBD) method (25). We then analyze the history of two quantitative traits in the evolution of platyrrhines: body mass and the mean latitude of their geographic range. The abundance of teeth in the fossil record as compared to other skeletal parts, allows us to infer body mass for all described extinct taxa. The location of the fossil sites also provides valuable information on the geographical evolution of the clade. We develop a Bayesian framework to infer the evolutionary history of quantitative traits using phylogenies that incorporate both extant and extinct lineages, which we validate through extensive simulations. The method allows us to jointly estimate the ancestral states at each node in the tree and the rate (describing how fast a trait evolves) and trend (the tendency of a trait to evolve in an estimated direction, e.g., towards larger or smaller values). Both rate and trend parameters can change across lineages in the tree, and our algorithm jointly estimates the number and placements of shifts in parameter values.
By combining molecular and fossil evidence with our new method, we provide new insights into the origin and evolution of extinct and extant platyrrhines. Our analyses reveal how body mass evolution and changes in geographic distributions contributed to shaping the large diversity of life forms encountered today in New World monkeys.

Results
NWM phylogeny. -Our phylogenetic analysis of the platyrrhine clade encompassed 87 extant species and 34 extinct taxa, spanning from the Late Eocene to the Holocene (Tables S4-5). The phylogenetic relationships between extant species were highly supported (Figs. S6-7) and reflected the results of Springer,et al. (26), while the placement of extinct taxa was sampled by the FBD within the limit of taxonomic constraints (see Methods). The FBD analysis placed the origin of the clade in the Eocene and the divergence times of families and subfamilies between the late Oligocene and the mid-Miocene (Table 1). The speciation and extinction rates estimated as part of the FBD model indicate a high turnover rate (extinction / speciation rates = 0.75, 95% credible interval: 0.57-0.91) which is consistent with the relatively high diversity of extinct taxa (Table S6).
Methodological validation. -Extensive simulations show that our novel Bayesian method to infer trait evolution along a phylogeny with extinct and extant taxa provides accurate estimates of the rate and trends parameters and their variation across clades (Supplementary Information). Our algorithm correctly identified the number of shifts in rate or trend (if any) with a frequency of 91.5% (ranging from 79 and 98% across different simulation settings; Table S1). The mean absolute percentage error (MAPE, see Methods) ranged from 0.11 to 0.22 for the rate parameter and between 0.11 and 0.23 for the trend parameter, in data sets with 20 fossils included (Fig. S2a,b, S4d,e). The estimated ancestral states were very accurately estimated with average coefficients of determination (R 2 ) ranging from 0.92 to 0.99 across simulations (Fig. S2c, S4c). Decreasing the number of fossils included in the data had small effects on the accuracy of the estimated rate and trend parameters (Fig. S3a,b) and on the ancestral states (Fig. S3c). In the absence of fossil data, the method could still estimate accurately the rate parameter (while the trend parameter is unidentifiable), but the accuracy of the ancestral states decreased to R 2 = 0.82 for data sets simulated without trends, and to R 2 = 0.53 for traits evolving under a non-zero trend (Fig. S2). The performance of our algorithm in terms of efficiency of sampling the parameters from their posterior distribution and time to evaluate the likelihood significantly outperformed alternative implementations, reducing computation times by one order of magnitude (Fig. S5). Notably, whereas computation time of alternative implementations increases exponentially with tree size, it increases linearly with our method, thus allowing for efficient analysis of very large data sets (thousands of tips; Fig. S5).
Body mass. -Platyrrhine body mass was estimated to have evolved under a Brownian motion (BM) model with variable rates and no or little evidence for positive or negative trends. The estimated number of rate shifts varied among the 100 FBD phylogenies analyzed and ranged between 0 and 7, whereas the trend parameter was found to be mostly homogeneous across branches (Table S7). We found comparatively high rates of body mass evolution at the family level, whereas the rates were substantially lower within subfamilies, except for Cebinae, for which a high rate was inferred (Fig. S8). The trend parameters were overall very close to 0, indicating non-directional evolution of body mass. The ancestral body mass inferred at the root of the tree ranged between 260 and 890 g ( Fig. 1; Table S8). This estimate strongly differs from the estimate obtained from a phylogeny of extant NWMs, after discarding the fossil record (Fig. S10), where the estimated ancestral body mass at the root ranged between 949 and 2,710 g (Table S8).
Range occupancy. -Latitudinal ranges in platyrrhines have changed at variable rates across lineages and we estimated 1 to 4 rate shifts among the 100 FBD phylogenies analyzed (Table S7). The rates were highest in the subfamilies Alouattinae and Cebinae ( Figure S9). The trend parameter was inferred to be constant across clades and positive, although it did not differ significantly from 0 based on the 95% CI (Table S7, Figure S9). The inferred ancestral latitudes through time show an expansion to the South which culminates in the Early Miocene. The slightly positive trend likely captures the general trend of ancestral latitudes towards the present tropics following the disappearance of platyrrhines from the south of South America around the Middle Miocene and expansion into Central America during the Miocene and Pliocene (Fig. 2, S9).

Origin and body size evolution
Our phylogenetic analysis of the platyrrhine clade provided generally older estimates of divergence times compared to previous studies (12,(26)(27)(28). For instance, the crown age of all extant platyrrhines was pushed back from the Early Miocene of previous estimates to the Early Oligocene. This is likely the result of using a more substantial and complete amount of fossil information and a more realistic approach to calibrate the tree (25, 29). The estimated time of origin of the platyrrhine lineage (stem age) is 43 Ma, thus indicating that the morphologically similar extinct North African anthropoids are comparable in age (30) (Table S10). This finding suggests that platyrrhines (or pre-platyrrhines) evolved first in Africa, where they eventually went extinct, and dispersed sometime in the Eocene into South America where they diversified, in the absence of other primates in the continent (4). Previous estimations also suggested an Eocene age for the catarrhine-platyrrhine split (2,27). Based on these lines of evidence, it is probable that the NWMs origin (stem age) dates back to at least the Middle Eocene.
Ancestral state estimates indicate that NWMs derive from a remarkably small ancestor (around 400 g; Fig. 1), with an inferred body mass close to the lower boundary of the size range of extant platyrrhines. In addition to Perupithecus, the oldest fossil included in this study, fossil records from the same locality (Santa Rosa, Peru) include two broken upper molars and a lower molar of different unidentified taxa (4). Their damaged condition did not allow an accurate description of these taxa and the specimens were therefore not included in our analyses. However, their approximate body size was likely around 70% of the size of Perupithecus. Thus, the taxon was possibly the size of a living tamarin, such as Callimico or Saguinus, weighing about 280 g. Despite the uncertainties around these estimates, this fragmentary fossil evidence indicates that the body mass of the ancestral NWM might have been even smaller than 400 g and closer to the size of small marmosets Mico, Callithrix and Cebuella. The hypothesis of a small ancestral body mass at the origin of NWMs is also supported by the estimated size of Eocene anthropoids, especially parapithecoids from North Africa, which are mostly around 500 g (1,31), with Talahpithecus weighing less than 400 g (Table S10). We propose that a small body size was probably a key factor enabling the survival of the individuals that reached South America from Africa, since their resource requirements on a floating islet would have been substantially smaller (e.g., relying on a diet of invertebrates and being better capable of protecting themselves against dehydration) than for a heavier and larger organism.
NWMs reached larger body sizes above 2 kg between the Late Oligocene and Early Miocene. Then, the upper bound of the body mass range in NWMs continuously raised, eventually reaching now extinct giant forms in the Atelidae family like Cartelles or Caipora, from the Pleistocene of Brazil (> 20 Kg; Fig. 1; Table S5).
Despite the expansion in range of platyrrhine body mass through time, most clades maintained intermediate sizes around 1 -3 Kg (Fig. 1 Table S5). Body size evolution in platyrrhines was largely heterogeneous, as indicated by several estimated rate shifts (Table S7), probably as a response to differential evolutionary selection forces acting on species across the vast range of Neotropical habitats.
Asides from the Late Miocene cebid Acrecebus fraileyi (12 Kg), most large NWMs belong in the family Atelidae. The large body size of atelids is associated with particular locomotor adaptations, some of them displaying suspensory behavior, although the four living genera have prehensile tails (1). Living atelids are widely distributed from Central America through northern Argentina, and some taxa can cope with open environments and seasonal climates, such as Alouatta. Despite the wide range of ecologies and adaptations in modern atelids and their large geographic range, the fossil record of the family does not include any specimen from the Miocene sites in Patagonia suggesting that they did not expand as far south as cebids and pitheciids (but see 32).
The smallest NWMs, the callitrichines (marmosets and tamarins), are characterized by a body mass smaller than 1 kg, a distribution spanning tropical forests from the Amazonia to Central America, and very scarce fossil record. The small body size of callitrichine has been usually considered as the result of evolutionary dwarfism (Ford, 1980(Ford, , 1986, and this evolutionary trend is assumed to explain their atypical morphology, such as the reduction or loss of the third molars, loss of the hypocone, presence of claws instead of nails. However, in our analyses, we did not find evidence of a negative trend in body mass evolution within the subfamily (Fig. S8).
Furthermore, the estimated size of the common ancestor of living callitrichines is small (around 0.6 kg; Fig. 1, Table S8), suggesting that callitrichines may have maintained a small body size throughout their evolutionary history. These results challenge the hypothesized dwarfism in callitrichines, and suggest that their peculiar morphological features might be the result of ecological adaptations, which did not necessarily involve a reduction in body mass.
The process of body size evolution was heterogeneous across platyrrhine clades and did not follow a simple neutral evolution. We found evidence for several rate-shifts but only weak or negligible trends. Because we incorporated topological uncertainties in our analyses, which is quite substantial for extinct lineages, it is difficult to pinpoint the exact placement of rate changes. However, we do observe that the rate of body mass evolution is highest at the family level and lower within most subfamilies ( Fig. S8). This suggests a rapid change in body mass range as the three families diverged, between the Late Oligocene and the Early Miocene, followed by a slower pace in evolution within subfamilies (Fig. S8).

Diversity dynamics and geographical occupancy
Our analyses show that the Late Oligocene and Early Miocene also mark an important phase in the biogeography of NWMs, with a peak in fossil diversity and the widest latitudinal range in South America. The geographic expansion of platyrrhines started in the Late Oligocene and culminated between 20 and 15 Ma, when they reached their southernmost distribution. This geographic expansion is temporally coincident with a significant increase in NWMs population size as inferred in a recent analysis of genomic data (2). Thus, both fossil and molecular data identify this phase as a crucial event in platyrrhine diversification and evolution. Miocene coincides with a range contraction of the southern limit of platyrrhine distribution (Fig. 2). The absence of latitudinal barriers (e.g. mountain ranges, deserts) in the continent has likely contributed to making these range expansions and contractions happen (42). Although richer fossil occurrence data would be necessary to assess the roles of extinction and migration in this process, our estimates of high extinction rates in platyrrhines (Table S6)

Methodological advances
Recent studies have shown that the inclusion of fossils in phylogenetic analyses of trait evolution can improve dramatically our ability to infer ancestral states and the underlying evolutionary processes (45). Here we implemented a powerful Bayesian analytical framework to integrate fossil and phylogenetic data in comparative analysis. Using this method, we showed by simulations that the estimation of ancestral states in the presence of evolutionary trends is far from realistic unless at least some fossil information is provided (Fig. S3).
Previous analyses of trait evolution combining data from extinct and extant taxa were either based on node calibrations (21) or on phylogenetic hypotheses built from morphological data (46,47). Node calibrations are based on the assumption that a fossil can be confidently placed at a node in the phylogeny of the living descendants, i.e. it is not a stem or a side branch. The topological placement of fossils in a phylogeny (whether as nodal constraints or extinct tips) is difficult to determine in many clades especially when there is scarcity of morphological traits that can be scored from fossils, as in NWMs, or morphological traits bear little phylogenetic value (12, 48). In our study, we used the available taxonomic information to constrain the position of fossil lineages and, at the same time, relied on the FBD process to sample multiple phylogenetic hypotheses based on an explicit process of speciation, extinction and preservation. Thus, by running trait-evolution analyses on FBD trees we incorporated topological and temporal uncertainties in our estimates.
There remain, however, some limitations in inferring evolutionary history of traits based on the fossil record. The small number and fragmentary nature of available fossil occurrences make it difficult to appreciate the amount of intraspecific variability (e.g. differences in body size across populations or linked to sexual dimorphism) and the impact of measurement error, both of which may affect the reliability of the estimates (49, 50). In the case of biogeographic inferences, while more realistic and spatial-explicit models (e.g. 51) would be desirable to infer the biogeographic history of a clade, the lack of extensive occurrence records means that the analyses must rely on unavoidable simplifications and assumptions. For instance, most platyrrhine fossil taxa are known from single localities and we used these coordinates as representative of their latitudinal range. Despite the limitations associated with the use of fossils in comparative analyses, paleontological data are still crucial evidence of traits that no longer exist in the living descendants (21). Furthermore, simulations using our Bayesian framework to infer trait evolution show that even very few fossils can drastically improve the estimation of ancestral states, indicating that the method can be applied to a large number of empirical data sets ( Fig. S3). The importance of fossil information in the inference of evolutionary processes is evident in our analyses of body size and mean latitudinal range in platyrrhines, where the inclusion of fossil data significantly changes the estimated processes and ancestral states (Figs. S10-11, Tables S8-9).

Conclusions
Our integrated analysis of molecular and fossil data of NWMs revealed that the stem platyrrhine lineage originated earlier than previously thought, and almost certainly in tropical regions. NWMs diversified shortly after their arrival from Africa and expanded their latitudinal range into the entire South American continent. This Under the new evidences about the earliest ancestors of NWM, we infer that body mass evolution in these primates started with small forms, as shown by their Eocene North African relatives, but were later diversified in a wide range of large and middlesized taxa. In contrast, callitrichines kept small sizes throughout their evolutionary history, challenging the widely-accepted hypothesis of phyletic dwarfism in this clade.

Bayesian analysis of trait evolution
We implemented a Bayesian algorithm to estimate the evolutionary history of quantitative traits in a phylogenetic framework. The evolutionary models implemented here are based on Brownian motion (BM) in which the expected trait value v i+t at time t follows the normal distribution: , (1) where v i is the ancestral trait value at time i, μ 0 is the trend parameter describing the tendency of a trait to evolve in a direction, and σ 2 is the rate parameter describing the speed of phenotypic change. Note that most commonly neutral BM models are applied in the absence of fossil data by setting μ 0 = 0. In our implementation, we relax the assumption of a constant BM model by allowing both the rate and the trend parameters to vary across clades in the phylogeny. We use a Birth-Death Markov

Phylogenetic analysis
We used molecular dataset from Springer et al. (26) from which we kept only the 87 species of platyrrhines and discarded all markers with a high proportion of missing data for these taxa. The reduced alignment included 54 nuclear genes and 1 mitochondrial gene for a total length of 36,065 bp, with average coverage per gene of 60% of the living taxa (Tables S2-3). We used 34 fossil taxa with ages ranging from the Late Eocene to the Pleistocene (Tables S4-5 (25). We use taxonomic information following (7, 13) to constrain the placement of extinct taxa in the phylogeny when possible (e.g. to a family or subfamily; Table S4). We ran two MCMC analyses for 100 million generations, sampling every 10,000 generations. Both runs were examined in Tracer v1.5 to check for convergence. We combined the 2 runs after removing a burn-in of 25 million generations. To obtained a dated phylogeny with only extant species, we removed fossils taxa from the entire posterior distribution of trees and used them produce a maximum clade credibility tree.

Trait analyses
We compiled fossil data and body mass estimates for most extant and extinct taxa from (1). We additionally obtained data for Perupithecus from (4), for Canaanimico from (8), for Talahpithecus from (31), and for Panamacebus from (28).
Mean latitudes of extant species were computed from the minimum and maximum latitudes of their ranges as defined in the IUCN database (http://www.iucnredlist.org) and from the Pantheria database (57). Because most fossil taxa are known from single localities we used the latitude or their sampling locality as representative of their mean latitudinal range. We treated latitude as a quantitative trait to infer temporal changes in the ancestral distribution in platyrrhine evolution (e.g. 58).
We ran trait evolution analyses on 100 trees randomly selected from the BEAST posterior sample in order to incorporate topological and temporal uncertainties in our estimates. On each tree, we ran 5 million BDMCMC iterations sampling every 5,000.
Because the trees differed in branching times and in topology, we summarized the ancestral states for each family and subfamily (Table S9). We also calculated the range of trait values occupied through time as the minimum and maximum boundaries of the range of estimated ancestral states averaged over 100 analyses within 1 million- year time bins, following the procedure of (59). The number and placement of rate and trend shifts estimated through BDMCMC varied across trees. Thus, we summarized the parameters by families and subfamilies, by averaging the estimated rates and trends across the lineages within (sub)families. For comparison, we repeated the analysis of body mass and mean latitude evolution after dropping all extinct taxa from the platyrrhine phylogeny (Figs S10-11).

Data and software availability
All the data used in this study (including the nucleotide alignment, BEAST input file and trait data) are available here: https://github.com/dsilvestro/fossilBM. The repository also includes all the R and Python scripts developed to simulate data and analyze them.

Bayesian estimation of trait evolution
We developed a Bayesian method to jointly estimate the parameters of a Brownian motion model of trait evolution (BM), i.e. rates and trends, their variation across clades, and the ancestral states at all internal nodes. We used di↵erent Markov Chain Monte Carlo (MCMC) algorithms combined into a single MCMC to sample the di↵erent parameters: Birth-Death MCMC to estimate the number and placement of shifts in BM parameters, Metropolis-Hastings MCMC to sample rates, trends and the trait value at the root of the tree, and a Gibbs sampler to sample ancestral states at the other nodes. The di↵erent algorithms were chosen for their properties to improve the e ciency of the analysis (see also Section Performance tests). All three steps are run during a single analysis and our algorithm randomly jumps across them based on user-defined frequencies.

Birth-Death MCMC to estimate shifts in rates and trends
We implemented a Bayesian algorithm to jointly estimate the number and placement of shifts in the rate and trend parameters of the BM model. Shifts define monophyletic clades within which the rate or trend parameter is treated as independent of the parameters in the other branches of the tree. To infer the number and placement of shifts, we used Birth-Death MCMC (BDMCMC) (Stephens, 2000), an algorithm that has been previously used to estimate rates shifts in other stochastic processes in an evolutionary biology context (Silvestro et al., 2014). Unlike the reversible-jump MCMC (Green, 1995), the BDMCMC-moves across models are not based on an acceptance probability, but on varying rates of a stochastic birthdeath process. The birth rate, determines the probability of proposing a new shift in rates or trends and is fixed to 1 2 (Stephens, 2000), while individual death rates are calculated for each class of parameters defined by a shift. Death rates determine the probability of removing a rate or trend shift.
We use a Poisson distribution with shape parameter set to 1 as prior distribution on the number of rate and trend shifts.
To compute the death rate of a shift is obtained by calculating the likelihood of the tree under a BM with and without the shift. To compute the likelihood without a shift we set the rate (or trend) of the clade identified by the shift to the background rate (or trend), i.e. the current parameter value at its parent node. The death rate of a parameter class is computed as the ratio between the likelihood without the shift and the likelihood with the shift (Stephens, 2000;Silvestro et al., 2014). Thus, rate or trend shifts that improve the fit of the model have a very low extinction rate, and are unlikely to be removed during the BDMCMC. In contrast, rate shifts that do not improve the tree likelihood (or even decrease it) result in high extinction rates and will be removed very quickly by the BDMCMC algorithm.
The algorithm starts with the simplest BM model (i.e. with homogeneous rate and trend parameters) and randomly selects a clade for which a new rate or trend is sampled from their prior distribution. In this case we use an exponential distribution for rates and a normal distribution centered in 0 for trends. The introduction of a shift in the model represents a "birth" event. As soon as there is at least one shift the death rates for each clade identified by a shift are calculated and the following event of the birth-death process will be determined by the relative magnitude of the rates. Additional details about the BDMCMC algorithm are described by (Stephens, 2000;Silvestro et al., 2014).

Metropolis-Hastings MCMC to estimate BM parameters
For a given set of rate and trend parameters and a vector of ancestral states, the likelihood of a BM model can be calculated as a product of normal densities moving from the tips to the root (see main text). We sampled the rate and trend parameters and the ancestral state at the root of the tree using MCMC with acceptance probabilities defined by the posterior odds (Metropolis et al., 1953;Hastings, 1970). We used multiplier proposals for the rate parameters (while properly adjusting the Hastings ratio (Ronquist et al., 2007)) and sliding window proposals for trends and the root state. To sample the ancestral states from the posterior we therefore draw random values from the conjugate distribution: In our implementation, a Gibbs move implies updating all ancestral states iteratively, sampling from Eq. 1.

Evaluation of the method Simulated data sets
We assessed the accuracy and the e ciency of our Bayesian framework by analyzing simulated data sets and comparing the estimated rates, trends, and ancestral states with the true values. For each simulation, we generated a complete phylogenetic tree (extinct and extant taxa) under a constant rate of birth-death with 100 extant tips (sim.bd.taxa function with parameters = 0.4, and µ = 0.2, in the R package TreePar, (Stadler, 2011)).
The number of fossils simulated on the tree was defined by a Poisson distribution with expected number of occurrences (N ) equal to the total branch length times the preservation rate (q). We fixed the number of expected fossils N = 20 by setting q to the ratio between N and the sum of branch lengths. Then, each fossil was placed randomly on the tree and extinct tips without fossils were pruned out (note that this procedure is equicalent to that implemented in the FBD original paper (Heath et al., 2014)). Finally, the fossilized tree was re-scaled to an arbitrary root height of 1.
Phenotypic data were simulated on every tree with the following parameters: rate of evolution ( 2 ), phenotypic trend (µ 0 ), number of shifts in rate, number of shifts in trend, magnitude of the shifts. We simulated data sets under six evolutionary scenarios: 1. constant 2 drawn from a gamma distribution (2,5), and fixed µ 0 = 0 2. fixed 2 = 0.1, and constant µ 0 drawn from a normal distribution N (2, 0.5) 3. initial 2 r = 0.1, fixed µ 0 = 0, and one rate shift in a randomly selected clade so that 4. baseline 2 = 0.1, fixed µ 0 = 0, and two shifts in 2 drawn from an uniform distribution representing a change with magnitude between 8 and 16 fold 5. fixed 2 = 0.1, fixed initial µ 0 = 0, and one shift in µ 0 drawn from a normal distribution N (2r, 0.5), where r was randomly set to -1 or 1 (simulating negative or positive trends, respectively).
For the last two scenarios the location of shifts (in 2 or µ 0 ) was randomly selected by choosing clades that contained between 25 to 50 tips. The branches within the chosen clades were assigned with the new parameter value before performing the phenotypic data simulation using a mapped tree (make.era.map function in R package phytools (Revell, 2012)).
We simulated 100 data sets under each of the six scenarios. We additionally ran simulations for scenarios (1) and (2) with a lower number of fossils (N= 5, 1, and 0) adjusting the q parameter. For simulations with one fossil, we simulated fossils under a Poisson process with N = 20 and kept only the oldest occurrence in the trait analysis.

Analysis of simulated data
We analyzed each simulated dataset to estimate the rate and trend parameters of the BM model ( 2 and µ 0 ) and the ancestral states. Each dataset was run for 500,000 MCMC generations, sampling every 500 steps. We summarized the results in di↵erent ways. First, for each simulation we graphically inspected the results by plotting the phylogeny with the width of the branches proportional to the true and estimated rates. We plotted the true versus estimated 2 , and µ 0 for each branch on the tree (Fig. S4). The true and estimated ancestral states are compared in a phenogram plot (Revell, 2012).
Secondly, we numerically quantified the overall accuracy of the parameter estimates 6 across all simulations using di↵erent summary statistics for each set of parameters. The the BM rate parameters ( 2 ) we calculated the mean absolute percentage error (MAPE), defined as: where j is the simulation number,ˆ 2 i is the estimated rate at branch i, 2 i is the true rate at branch i, and N is the number of branches in the tree. Because the trend parameter can take both negative and positive values, we used the mean absolute error (MAE) to quantify the accuracy of its estimates: whereμ i 0 is the estimated trend at branch i and µ i 0 is the true trend at branch i. We quantified the accuracy of the ancestral state estimates in terms of coe cient of determination (R 2 ) between the true and the estimated values. These summary statistics were computed for each simulation scenario (across 100 replicates) and are provided in Figs. S2 and S3.
Finally, we assessed the ability of the BDMCMC algorithm to identify the correct BM model of evolution in terms of number of shifts in rate and trend parameters. We calculated the mean probability estimated for models with di↵erent number of rate shifts (K 2 ranging from 0 to 4) and shifts in trends (K µ0 ranging from 0 to 4). Note that K = 0 indicates a model with constant rate and/or trend parameter across branches. The estimated posterior probability of a given number of rate shifts was obtained from the frequency at which that model was sampled during the MCMC (Stephens, 2000). We averaged these probabilities across 100 simulations under each scenario. We additionally calculated the percentage of simulations in which each model was selected as the best model (i.e., it was sampled most frequently). These summary statistics are provided in the Table S1.

Performance tests
We compared the performance of our implementation using Gibbs sampling for ancestral states with that of an implementation using a Metropolis-Hasting algorithm to update the ancestral states, i.e. the method used in other software such as Geiger (Slater et al., 2012;Slater and Harmon, 2013;Pennell et al., 2014) and phytools (Revell, 2012).
We implemented the latter option in our code by updating the ancestral states individually using sliding window proposals and acceptance probability based on the posterior ratio.
We ran the two implementations on trees with 50, 100, 500, and 1000 tips for 100,000 MCMC iterations, sampling every 50 iterations and using the true parameter values as starting values in the MCMC to avoid burnin. We simulated the data using a constant Brownian rate ( 2 = 0.1) and no trend (µ 0 = 0) and ran the analyses setting the MCMC to run using the simplest Brownian model (i.e. no rate shifts and no trends). We then calculated for each simulation the run time and the e↵ective sample size (ESS) of the posterior and use them to estimate the run time necessary to reach an ESS = 1000. These performance tests (summarized in Fig. S5a), show that the Gibbs sampler achieved the target ESS of 1000 in a much shorter time than the alternative Metropolis-Hastings algorithm. Importantly, the time required to reach ESS = 1000 scaled linearly with tree size using the Gibbs sampler and exponentially using the Metropolis-Hastings. The joint estimation of model parameters (rates and trends) and ancestral states allow us to compute the likelihood as a product of normal densities (Fig. S1) while avoiding the use of a variance-covariance (VCV) matrix (commonly used in comparative methods (Felsenstein, 1988;O'Meara, 2012)), which can be very expensive especially for large trees. We ran simulations to assess the performance of the two approaches to 8 calculate the likelihood of the data (product of normal densities against standard calculation using the vcv matrix). We simulated trees of 50, 100, 500, 1000, and 5000 tips under a pure birth process (only extant taxa) and trait data under a simple constant Brownian model ( 2 = 0.1, µ 0 = 0). On each data set we ran 100,000 MCMC iterations and recorded the run time. The results of these simulations are summarized in Fig. S1b and show that the two approaches to calculate the model likelihood perform similarly for small tree ( 100 tips). However, the VCV likelihood calculation time scales exponentially with tree size, while our implementation with the product of normal densities scales linearly with tree size. Thus, for instance, our implementation was 1.7 times faster than the alternative with 500 tips and 16.4 times faster with 5000 tips. Tables   Table S1: Model of testing across simulations. We summarized the mean probability (Pr) estimated for models with di↵erent number of rate shifts (K 2 ranging from 0 to 4) and shifts in trends (K µ0 ranging from 0 to 4) across 100 simulations under each scenario (see Supplementary Methods). We additionally calculated the percentage of simulations in which each model was selected as the best model (%bm). Values in bold represent the settings used to simulated the data.         Table S7: Evolution of body mass and latitude in Platyrrhines: estimated number of shifts in rate and trend parameters averaged over 100 phylogenies of extinct and extant taxa. We summarized the mean probability (Pr) estimated for models with di↵erent number of rate shifts (K 2 ranging from 0 to 7) and shifts in trends (K µ0 ranging from 0 to 7). For both traits constant rate BM models received very little support and the number of rate shifts ranged between 1 and 4 depending on the tree (Figs. S8, S9). This heterogeneity of BM models estimated across di↵erent trees is likely to capture the uncertainties associated with the placement of fossil lineages and branching times. We found little evidence of shifts in trend parameters (S8, S9).

Supplementary
Body mass Mid latitude n. shifts Pr %bm n. shifts Pr bm    Figure S1: Ancestral states at the internal nodes of the tree are sampled directly from their posterior distribution, which combines four normal densities: two from the descendants, one from the parent node (all of which are based on the current trait states and parameters of the BM model), and one being a vague normal prior distribution on the node state (blue graph). The notation follows that of equation (1).