Illustrative summary of our methods.
(A) Given two cohorts of cfDNA samples processed by different sequencing pipelines, the model corrects the second cohort to match the distribution of the first. After correction, the cost matrix of the optimal transport (OT) problem is given by the pairwise Euclidean distances between samples of the two cohorts. (B) The solution of the OT problem, called the transport plan, assigns patients from Domain 2 to similar patients in Domain 1. The model parameters are found by minimising the Wasserstein distance defined by the cost matrix and the transport plan. (C) After inference, the two cohorts are merged and ready for downstream analysis. (D) The validation procedure used in this study.
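The quantities named in panels (A) and (B) can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the study: the two cohorts have the same size and uniform weights, so the exact OT solution is a one-to-one assignment; the pipeline effect is a hypothetical constant shift; and the data are synthetic. The function `transport_plan` and all variable names are illustrative, and no parameter fitting is performed; the sketch only evaluates the Wasserstein objective before and after a known correction.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def transport_plan(domain1, domain2_corrected):
    """Exact OT between two equally sized, uniformly weighted cohorts."""
    # Cost matrix of pairwise Euclidean distances
    # (rows: Domain 2 samples, columns: Domain 1 samples).
    cost = cdist(domain2_corrected, domain1, metric="euclidean")
    n = cost.shape[0]
    # With equal sizes and uniform weights, an optimal plan is a
    # permutation, so the assignment solver gives an exact solution.
    rows, cols = linear_sum_assignment(cost)
    plan = np.zeros_like(cost)
    plan[rows, cols] = 1.0 / n
    # Wasserstein distance: cost summed under the transport plan.
    distance = float(np.sum(plan * cost))
    return cost, plan, distance


# Synthetic stand-in data (assumption): Domain 2 holds the same patients
# as Domain 1, shuffled and offset by a hypothetical pipeline bias.
rng = np.random.default_rng(0)
domain1 = rng.normal(size=(5, 3))
perm = rng.permutation(5)
shift = np.array([2.0, -1.0, 0.5])
domain2 = domain1[perm] + shift

_, _, before = transport_plan(domain1, domain2)
_, plan, after = transport_plan(domain1, domain2 - shift)
print(f"Wasserstein distance before correction: {before:.4f}")
print(f"Wasserstein distance after correction:  {after:.4f}")
```

In this toy setting the distance drops to numerical zero once the shift is removed, and each row of `plan` points to the Domain 1 patient that the Domain 2 sample was derived from. Cohorts of unequal size or with non-uniform weights would need a general OT solver in place of the assignment step.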