Phylogenetic tree of Nemacheilidae, including all 471 analysed nemacheilid specimens, reconstructed in MrBayes based on six loci. The topologies of BI and ML trees were congruent. The values at the basal nodes correspond to Bayesian posterior probabilities and ML bootstrap supports, respectively. Only the ingroup is shown.

Divergence time estimation resulting from Bayesian divergence time analysis of the concatenated dataset in BEAST 2: maximum clade credibility (MCC) tree of the whole dataset, with the Nemacheilidae ingroup collapsed into main clades. Red stars indicate calibration points, values at branches represent posterior probabilities (PPs lower than 0.90 not shown), and blue horizontal bars denote 95% HPD intervals. The time scale is in millions of years.

Geographic distribution of the six major clades of Nemacheilidae.

Topologies of main clades derived from various analyses. A) Topologies recovered by ML analyses of single gene trees, with values at branches indicating bootstrap supports. A topology congruent with both the concatenated-data tree and the species tree was recovered for Cyt b, MYH6 and IRBP2. Three slightly different topologies were derived from RH, EGR3 and RAG1. B) Topology obtained from analyses of the concatenated dataset; the values at the branches correspond to ML/MrBayes PP/BEAST PP supports, respectively. C) Topology inferred by ASTRAL-III using the unrooted ML single gene trees as input. The values at the branches correspond to support values/gCF/sCF.

Divergence time estimation: ingroup of the maximum clade credibility (MCC) tree resulting from Bayesian divergence time analysis of the concatenated dataset in BEAST 2. The red star indicates the internal calibration point, the values at the nodes represent posterior probabilities (PPs lower than 0.90 not shown), and the blue bars denote the corresponding 95% HPD intervals. The time scale is in millions of years. The inset in the top left corner shows the complete tree: Nemacheilidae is highlighted in yellow, while the outgroup is not highlighted; all calibration points are indicated by red stars.

Reconstruction of the biogeographical history of Nemacheilidae. The inset map in the top left corner shows the biogeographical regions of Eurasia recognised in the analysis; the colour key to the region codes used in the main tree is given below the map. Grey indicates a combination of two regions (FH) as the ancestral range; black indicates ancestral ranges identified with a likelihood < 10%. Main tree: ancestral range reconstruction using the DEC+J model, based on the phylogeny derived from BEAST 2. Pie charts at the nodes indicate the relative likelihood of ancestral ranges of the most recent common ancestors (MRCA), while the current distribution of taxa is indicated at the tips.

Time tree of the evolutionary history of Nemacheilidae, highlighting the dating of key geological and climatic paleo-events that might have significantly impacted their evolution. Orange bars represent exceptionally warm periods; blue bars denote cold periods. Simple boxes give the time span of specific geological or climatic events. For detailed explanations of these events and their impact on nemacheilid evolution, refer to the text.

Series of paleomaps visualising the geographical and chronological aspects of the main events during the evolutionary history of the freshwater fish family Nemacheilidae across Eurasia. A red star marks the point of origin of Nemacheilidae; arrows indicate dispersal events.

List of analysed samples, their identification, geographical origin, voucher number and GenBank accession numbers for their sequences.

Voucher numbers starting with ‘A’ refer to the collection of IAPG, Liběchov, Czech Republic; ‘CMK’ numbers refer to the collection of Maurice Kottelat; ‘GenBank’ refers to sequences from GenBank; ‘ZRC’ to samples housed in the Lee Kong Chian Natural History Museum, National University of Singapore, Singapore.

List of primers used in the present study for amplification and/or sequencing.

Alignment attributes and best-fit models. Lengths of alignments, numbers of variable (VP) and parsimony informative (PI) positions and models estimated for all partitions.

BEAST and MrBayes models were calculated in PartitionFinder 2 (PF2, Lanfear et al., 2016) implemented in PhyloSuite v1.2.2 (Zhang et al., 2020) under the AICc criterion, with the greedy algorithm (Lanfear et al., 2016) and branch lengths linked. For ML trees, the models and partitioning schemes were estimated under BIC by ModelFinder (Kalyaanamoorthy et al., 2017) implemented in IQ-TREE. The values and models were calculated for both the (A) full and (B) reduced datasets. Table (C) provides an overview of data attributes for the ingroup dataset only.

Comparison of biogeographic models in RASP.

The last column shows the p-values of the likelihood ratio test. Based on the AICc weight (AICc wt), DEC+J is the best-fitting model for our dataset; the low p-values indicate that adding the J parameter significantly improves model likelihood.
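
For readers unfamiliar with the quantities in this table, the following is a minimal sketch of how AICc weights and a likelihood ratio test p-value can be computed for a pair of nested models such as DEC and DEC+J. It is not the RASP implementation. The log-likelihoods, the sample size `n` and the function names are hypothetical placeholders chosen for illustration; only the parameter counts (two for DEC, three for DEC+J) follow from the models themselves.

```python
import math

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k free parameters and n observations."""
    return 2 * k - 2 * log_lik + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """Convert a dict of AICc scores into relative model weights that sum to 1."""
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

def lrt_p_value_df1(lnl_null, lnl_alt):
    """Likelihood ratio test p-value for nested models differing by one parameter.

    The chi-square survival function with 1 degree of freedom equals
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

# Hypothetical values for illustration only, not results from the study.
n = 100                      # assumed number of observations (e.g. tips)
lnl = {"DEC": -250.0, "DEC+J": -240.0}
n_params = {"DEC": 2, "DEC+J": 3}

scores = {m: aicc(lnl[m], n_params[m], n) for m in lnl}
weights = akaike_weights(scores)
p_value = lrt_p_value_df1(lnl["DEC"], lnl["DEC+J"])

print(weights, p_value)
```

With placeholder inputs like these, a higher AICc weight for DEC+J together with a small p-value corresponds to the pattern described in the table note.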