Ancient gene flow complicates phylogenetic analyses of leaf warblers

Understanding underlying processes helps to select the “correct” genomic loci.

Different genes tell different stories. This simple statement captures the essence of phylogenomic analyses. The evolutionary history of a particular gene (or genomic region) can be shaped by several processes, such as interspecific gene flow and natural selection. This insight raises the question of which genomic regions we should use to estimate the “true” species tree. Some authors have argued that regions of low recombination are most suitable for phylogenetic analyses. Genetic variants flowing in from another species that end up in these rarely recombining regions might become linked to deleterious alleles and will be quickly removed from the population. Hence, regions of low recombination are expected to be immune to introgression, potentially retaining the “true” evolutionary history of the species. However, phylogenomic analyses of Ficedula flycatchers revealed that low recombination regions can produce misleading results due to strong selection. Clearly, the debate on the most suitable genomic regions for phylogenetic analyses has not been settled. A recent study in the journal Systematic Biology provided another perspective on this issue by exploring the evolutionary history of several leaf warblers.

Three Hypotheses

Dezhi Zhang and his colleagues sequenced the whole genomes of 78 leaf warblers, representing 8 species. Previous analyses – using a limited number of genetic markers – could not confidently resolve the phylogenetic relationships among these species. Specifically, the position of Martens’s Warbler (Phylloscopus omeiensis) turned out to be problematic. The researchers proposed three hypotheses to explain the phylogenetic issues with this species:

  • Hypothesis 1: Martens’s Warbler is the sister species of Whistler’s Warbler (P. whistleri) and Bianchi’s Warbler (P. valentini), but gene flow from another species – Alström’s Warbler (P. soror) – results in Martens’s Warbler clustering with Alström’s Warbler.
  • Hypothesis 2: Martens’s Warbler is the sister species of Alström’s Warbler, but ancient gene flow from the ancestor of Whistler’s Warbler and Bianchi’s Warbler results in Martens’s Warbler clustering with these species.
  • Hypothesis 3: Martens’s Warbler is a hybrid species originating from hybridization between Alström’s Warbler and the ancestor of Whistler’s Warbler and Bianchi’s Warbler.

I can imagine that these three scenarios are difficult to follow. Luckily, the authors provided a clear overview of their hypotheses in the figure below.

An overview of the three hypotheses on the phylogenetic position of Martens’s Warbler (Phylloscopus omeiensis). From: Zhang et al. (2021).

Demographic Analyses

Using a suite of phylogenetic analyses, the researchers tried to figure out which hypothesis depicts the most likely scenario. I will not go into the technical details of all these analyses, but I will summarize the main results:

  • Comparing several demographic models with the coalescent simulator fastsimcoal2 revealed that scenarios with ancient gene flow between Martens’s Warbler and the ancestor of Whistler’s Warbler and Bianchi’s Warbler received the highest support.
  • D-statistic analyses suggested high levels of gene flow between Martens’s Warbler and Whistler’s Warbler and between Martens’s Warbler and Bianchi’s Warbler. These patterns were corroborated with demographic analyses using the software DADI.
  • Phylogenetic network analyses pointed to Martens’s Warbler and Alström’s Warbler as sister species, but with ancient gene flow between Martens’s Warbler and the ancestor of Whistler’s Warbler and Bianchi’s Warbler.

If you compare these findings with the three hypotheses outlined above, you will quickly see that hypothesis two comes out on top. It seems that Martens’s Warbler is the sister species of Alström’s Warbler, but ancient gene flow from the ancestor of Whistler’s Warbler and Bianchi’s Warbler has thrown a wrench in previous phylogenetic analyses.
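For readers who wonder what a D-statistic actually measures, here is a minimal sketch of the classic ABBA-BABA calculation. It is not the pipeline used in the study: the function name, the one-sequence-per-taxon input and the toy sequences are my own assumptions for illustration. The idea is that for a tree (((P1, P2), P3), Outgroup), the site patterns ABBA and BABA should be equally common when only incomplete lineage sorting is at play. An excess of ABBA sites (positive D) hints at gene flow between P2 and P3.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D for the tree (((P1, P2), P3), Outgroup).

    Each argument is an aligned sequence (one haploid sequence per taxon).
    This simplified count-based version is an illustrative assumption;
    real analyses typically work with allele frequencies across many individuals.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Keep only biallelic sites where P3 carries the derived allele
        # (the outgroup allele is treated as ancestral).
        if c == o or len({a, b, c, o}) != 2:
            continue
        if a == o and b == c:      # ABBA: P2 shares the derived allele with P3
            abba += 1
        elif b == o and a == c:    # BABA: P1 shares the derived allele with P3
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / (abba + baba)


# Toy alignment (made-up data): 5 ABBA sites, 1 BABA site, 2 uninformative sites.
p1 = "AAAAATTA"
p2 = "TTTTTATA"
p3 = "TTTTTTTA"
outgroup = "AAAAAAAA"

d = d_statistic(p1, p2, p3, outgroup)
print(round(d, 3))  # (5 - 1) / (5 + 1) = 0.667
```

As a purely hypothetical labelling, you could think of P1 as Alström’s Warbler, P2 as Martens’s Warbler and P3 as Whistler’s or Bianchi’s Warbler. In practice, the significance of D is usually assessed with a block jackknife across the genome, which this sketch leaves out.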

Phylogenetic network analyses consistently cluster Martens’s Warbler (P. omeiensis) and Alström’s Warbler (P. soror), but also indicate gene flow between Martens’s Warbler and the ancestor of Whistler’s Warbler (P. whistleri) and Bianchi’s Warbler (P. valentini). The three graphs show results for (a) one, (b) two, and (c) three reticulations. From: Zhang et al. (2021).

Gene Flow and Selection

So, we have managed to resolve the phylogenetic mystery of the leaf warblers. But what about the question posed at the beginning of this blog post: which genomic regions are most suitable for phylogenetic analyses? Taking a closer look at the genome of Martens’s Warbler, the researchers discovered that regions of low recombination were the ones most affected by ancient gene flow. They suspect that the introgressed variants from other species ended up in these low recombination regions and were retained in the population by strong positive selection. This observation suggests that “low recombination […] may not be a good indicator of genomic regions suitable for inferring the true phylogeny in the context of ancient gene flow.”

Does this mean that we can safely discard regions of low recombination in phylogenomic analyses and continue working with the rest of the genome? Not necessarily. Because the evolution of these leaf warblers has been shaped by high levels of ancient gene flow and strong positive selection on the introgressed regions, low recombination regions turn out to be unreliable for phylogenetics in this case. In other study systems, ancient gene flow might be less pervasive, or introgressed variants might have been quickly removed from the population. In those scenarios, low recombination regions might be good candidates to reconstruct the true species phylogeny. In other words, which genomic regions should be used in phylogenetic analyses will depend on the evolutionary history of the study system. Due to the contingent nature of evolution, there will probably be no silver bullet to reconstruct the “true” species tree.

And there is also the question of whether the “true” species tree actually exists. Perhaps evolution is just one big reticulated network that cannot be captured in a simple bifurcating tree. But that is a discussion for another blog post.


Zhang, D., Rheindt, F. E., She, H., Cheng, Y., Song, G., Jia, C., Qu, Y., Alström, P. & Lei, F. (2021). Most genomic loci misrepresent the phylogeny of an avian radiation because of ancient gene flow. Systematic Biology, 70(5), 961-975.

Featured image: Whistler’s Warbler (Phylloscopus whistleri) © Raju Kasambe | Wikimedia Commons

A transposable element associated with migratory behavior of Willow Warblers

But how does it contribute to migration?

What genes determine the migratory strategies of birds? Several studies have tried to answer this question in a variety of study systems, such as Vermivora warblers, European Blackcaps (Sylvia atricapilla) and Willow Warblers (Phylloscopus trochilus). Interestingly, different candidate genes have popped up in different studies, suggesting that the genetic basis for migration varies between species. In the Willow Warbler, for example, researchers took advantage of the divergent migratory strategies of two subspecies: trochilus migrates to the southwest, whereas acredula follows a southeastern route. These migratory differences were associated with three genomic regions, located on chromosomes 1, 3 and 5. However, a previously identified genetic variant – using the older AFLP-technique – could not be assigned to a particular genomic region. Given that most avian genome assemblies are far from complete, it could be that this variant – known as WW2 – resides in a difficult-to-assemble section of the genome, such as a region with repetitive sequences. That is why Violeta Caballero-López and her colleagues used an updated version of the Willow Warbler genome to determine the location and identity of the WW2-variant. Their findings recently appeared in the journal Molecular Ecology.

Endogenous Retrovirus

The newest Willow Warbler genome was sequenced using a long-read technique, which allows scientists to reconstruct highly repetitive sections of the genome. Within one of these sections, the researchers found the WW2-variant. Additional analyses indicated that it is a transposable element: a selfish genetic element that “jumps” around the genome using either a copy-and-paste or a cut-and-paste mechanism. This particular transposable element turned out to be an endogenous retrovirus (ERV) that inserted itself into the genome of an ancestral songbird a long time ago. A similar variant is also present in the genome of the Zebra Finch (Taeniopygia guttata), which diverged from the Willow Warbler at least 20 million years ago.

A detailed look at the WW2-variant revealed additional evolutionary changes. Apart from the ancestral version shared with the Zebra Finch, the researchers also uncovered a derived version. The latter version probably originated after a duplication and an inversion event. The resulting sequence then accumulated mutations, leading to divergence from the ancestral state. Interestingly, the derived version was much more abundant in the acredula-subspecies (7 to 45 copies) compared to the trochilus-subspecies (0 to 6 copies).

A schematic overview of the probable evolution of the WW2-variant. The ancestral version (small green arrow) duplicated and became inverted (figures c and d). The resulting sequence accumulated mutations and diverged into the derived version (small yellow arrow). From: Caballero‐López et al. (2021).

Smelly Migration?

The different number of copies of the derived version in the two subspecies suggests that the transposable element might be involved in their migratory behavior. The genomic region surrounding the WW2-variant contains several olfactory receptors. It is tempting to speculate that olfaction might help Willow Warblers during their migration (as shown in homing pigeons), but the researchers warn that more analyses are needed to test this hypothesis. Alternatively, the WW2-variants might interact with other genomic regions – perhaps the ones on chromosomes 1, 3 and 5 – to influence migratory behavior. A similar mechanism has been described in Carrion Crow (Corvus c. corone) and Hooded Crow (C. c. cornix), where a transposable element might be involved in the regulation of plumage coloration. Clearly, there are many new exciting questions to investigate in the Willow Warbler. Just as transposable elements “jump” around the genome and explore new territories, scientists keep delving into knowledge gaps to uncover surprising new insights. You never know where the next analysis will take you…


Caballero-López, V., Lundberg, M., Sokolovskis, K., & Bensch, S. (2022). Transposable elements mark a repeat-rich region associated with migratory phenotypes of willow warblers (Phylloscopus trochilus). Molecular Ecology, 31(4), 1128-1141.

Featured image: Willow Warblers (Phylloscopus trochilus) © Chris Romeiks/ | Wikimedia Commons