Review Article

Systems genetics approaches for understanding complex traits with relevance for human disease

Departments of Population & Public Health Sciences, University of Southern California, United States
Biochemistry & Molecular Medicine, Keck School of Medicine, University of Southern California, United States
Center for Public Health Genomics, University of Virginia School of Medicine, United States
Departments of Biochemistry & Molecular Genetics, University of Virginia School of Medicine, United States
Public Health Sciences, University of Virginia School of Medicine, United States
Department of Biological Chemistry, University of California, Irvine, United States
Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg
School of Life and Environmental Sciences, University of Sydney, Australia
Faculty of Medicine and Health, University of Sydney, Australia
Charles Perkins Centre, University of Sydney, Australia
Departments of Human Genetics, University of California, Los Angeles, United States
Medicine, University of California, Los Angeles, United States
Microbiology, Immunology, & Molecular Genetics, David Geffen School of Medicine of UCLA, United States

Nov 14, 2023

Open access

Abstract

Quantitative traits are often complex because of the contribution of many loci, with further complexity added by environmental factors. In medical research, systems genetics is a powerful approach for the study of complex traits, as it integrates intermediate phenotypes, such as RNA, protein, and metabolite levels, to understand molecular and physiological phenotypes linking discrete DNA sequence variation to complex clinical and physiological traits. The primary purpose of this review is to describe some of the resources and tools of systems genetics in humans and rodent models, so that researchers in many areas of biology and medicine can make use of the data.

Introduction

Complex traits, including common diseases, result from the combined effects of multiple genetic variations together with environmental factors. Genome-wide association studies (GWAS) have revealed that most complex traits are highly heterogeneous and can involve contributions from hundreds of genetic variants, each explaining a tiny fraction of susceptibility. Although GWAS have been very successful in identifying thousands of trait-associated genetic loci, the underlying mechanisms and interactions remain, in most cases, poorly understood. The regions identified in GWAS usually contain blocks of credible SNPs, and one challenge is to pinpoint the causal gene in the locus of interest. A further challenge is that the majority of fine-mapped SNPs are found in non-coding regions of the genome with no obvious link to protein function. It is often assumed that such SNPs will regulate transcription of a proximal gene, but this can be difficult to prove and, in some cases, such as FTO (Claussnitzer et al., 2015), has proven not to be the case. A third challenge is to understand how the genetic variation perturbs the system to affect molecular, physiological, or clinical traits.

Systems genetics is an approach that seeks to understand the molecular and physiological phenotypes linking DNA sequence variation to complex clinical and physiological traits. It does this by using high-throughput ‘omics’ technologies to examine how intermediate molecular phenotypes, such as RNA, protein, and metabolite levels, are perturbed by natural genetic variation among individuals in the population and then attempting to relate these variations, using statistical methods, to physiological or clinical traits (Figure 1). Systems genetics approaches have proven very useful in identifying causal genes, pathways, and interactions underlying complex traits. We refer readers to a number of reviews of systems genetics strategies that cover historical developments, omics technologies, experimental designs, statistical analyses, and applications in detail (Ehrenreich et al., 2009; Civelek and Lusis, 2014; Baliga et al., 2017; Schughart and Williams, 2017; Seldin et al., 2019; Li and Auwerx, 2020; Molendijk and Parker, 2021).

Figure 1

Systems genetics strategy for integration of clinical (and other complex) traits with molecular traits.

In this cartoon, individuals in a cohort are examined for clinical or other complex traits of interest. Tissues from the same individuals are also examined using various omics technologies that quantitate molecular traits. Genetic and environmental variations among the individuals will perturb the clinical and molecular traits. The relationships among the traits can then be statistically modelled using genetic mapping, correlation structure, causal inference, and network modelling. Figure adapted from Civelek and Lusis, 2014.

The papers in this issue of eLife provide examples of the many applications of systems genetics across a broad spectrum of model systems. Vast amounts of systems genetics data (i.e., omics-level data collected in diverse populations of humans, mice, rats, flies, or yeast) are now compiled in publicly accessible databases or are available through collaborations. The primary purpose of this review is to describe some of the resources and tools of systems genetics, so that researchers in all areas of biology and medicine can make use of the data. Our focus is limited to studies in mammals, although we note that many of the most important studies utilized yeast and flies. We begin with a brief overview, followed by descriptions of human and rodent tools and resources, and end with thoughts about future directions.

Overview of systems genetics

Design of systems genetics studies

A typical systems genetics study involves the following steps: (1) Identification of an important question, or set of questions, that could be addressed with a systems genetics study. (2) Selection of an appropriate cohort exhibiting variation in the traits of interest and of sufficient statistical power. (3) Phenotyping of the population for physiological, clinical, and omics traits of interest. (4) Integration of the resulting data through genetic mapping and statistical methods (Seldin et al., 2019). (5) Validation of hypotheses using methods such as gain/loss-of-function experimental studies. (6) For studies of disease, validation in human cohorts if possible. Developing a study de novo represents a substantial effort and cost, whether the study involves human or rodent populations. Thus, if possible, the study should be designed such that it can address many different questions and provide a resource for other researchers.

Human studies versus animal models

Complex traits such as common diseases typically involve not only many genes and environmental factors, but also complications such as lifestyle factors, aging-related changes, ethnic differences, sex differences, and in some cases gut microbiome interactions. It is important to design the study with these factors in mind. Several large human biomedical data and research resources have recently emerged, most notably UK Biobank. While such databases will play an invaluable role in future biomedical discoveries, they currently have at least two noteworthy limitations. First, access to relevant tissues or organs may be limited, and this limits systems genetics analyses. Second, the function of many genes is dramatically influenced by the environment, but this is almost impossible to control and/or quantify in humans. For these and other reasons, studies of model systems such as genetically diverse cohorts of mice or rats have proven especially valuable for systems genetics. Notably, the ability to control the environment, age and sex of the animals, and other variables greatly reduces the number of individuals required for the study. As discussed below, a variety of useful mouse and rat population resources have been developed.

Flow of information

Systems genetics can be used to follow the flow of information from DNA variation to intermediate phenotypes for the disease of interest. Thus, it is in some cases useful to interrogate the relevant tissues with different omics platforms (multi-omics) such as transcriptomics, proteomics, and metabolomics. Connecting the dots in this way, from DNA variation to molecular traits to the clinical trait, has been invaluable in identifying causal genes and relevant biological pathways. Information can, of course, also flow in reverse, and statistical methods can be useful in determining the direction of information flow (Figure 1).

Statistical analyses

Since information flow from DNA is unidirectional, causal pathways can be modeled and ‘mediators’ identified. For example, if both a clinical trait and the levels of a transcript are correlated and map to the same locus, one can condition on the transcript levels and ask whether a significant association between the locus and the clinical trait remains. If the association is eliminated, the results suggest that the effect on the clinical trait is mediated by the transcript; if it persists, the locus is likely to affect the trait, at least in part, through another route. Various causal inference tests have been developed and are typically referred to as mediation analysis (Schadt et al., 2005; Zhu et al., 2016). Mendelian randomization (MR) (Zhu et al., 2016) is one form of mediation analysis that has been particularly informative in dissecting causal influences between intermediate physiologic traits and disease pathophysiology in human studies. Higher-order interactions in biological systems are also conveniently modeled as network graphs using systems genetics data (Figure 1; Huang et al., 2018).
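
The conditioning step described above can be illustrated with a minimal sketch in Python. It is not the procedure of any of the cited methods: it simply compares the marginal locus-trait correlation with the partial correlation after regressing out the transcript. The function names, the simulated data, and the effect sizes are all assumptions made for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """Residuals of y after simple linear regression of y on x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

def mediation_check(genotype, transcript, trait):
    """Return (marginal, partial) locus-trait correlations.

    The partial correlation conditions on the transcript by correlating
    the residuals of genotype and trait after each is regressed on it.
    """
    marginal = pearson(genotype, trait)
    partial = pearson(residuals(genotype, transcript),
                      residuals(trait, transcript))
    return marginal, partial

# Simulated (hypothetical) data in which the transcript fully mediates
# the effect of the locus: genotype -> transcript -> trait.
rng = random.Random(0)
genotype = [rng.choice([0, 1, 2]) for _ in range(2000)]
transcript = [g + rng.gauss(0, 1) for g in genotype]
trait = [t + rng.gauss(0, 1) for t in transcript]

marginal, partial = mediation_check(genotype, transcript, trait)
print(f"marginal r = {marginal:.2f}, partial r given transcript = {partial:.2f}")
```

In this simulation the marginal correlation is clearly non-zero while the partial correlation is close to zero, which is the pattern expected under mediation. Real analyses would use formal tests and account for confounding, measurement error, and reverse causation.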

Reductionist versus systems genetics approaches

Biological research is now dominated by reductionist approaches, such as gain- or loss-of-function studies in mice. These approaches are powerful in that they establish causality, but they have some important limitations that hinder a more complete understanding of the architecture of complex traits, as discussed below. In contrast, systems genetics studies must generally be combined with experimental studies to conclusively establish causality. One constraint with the purely reductionist approach is that it usually involves perturbation of a single gene in a single genetic background and thus is unlikely to detect genetic interactions, such as modifier genes (Riordan and Nadeau, 2017). In other words, a genetic variation acts in the context of the genetic background, and by examining the effect of a gain or loss of function in only a single genetic background, an incomplete view of the function of the gene will be obtained. For example, engineered mutations in mice often exhibit strikingly different phenotypes when examined in different strains (Sittig et al., 2016).

Another important feature of systems genetics approaches is that they are relatively unbiased. Reductionist scientists usually generate hypotheses based on results from previous studies, and thus some genes or pathways are explored in great depth, whereas others are ignored. A recent study (Stoeger et al., 2018) found that more than one-quarter of coding genes have never been the subject of a single paper, whereas approximately 2000 genes (<10% of the coding genome) have dominated the literature. Systems genetics hypotheses, in contrast, are driven by natural variation paired with global measures of omics data and are therefore relatively unbiased. The power of natural variation derives from the multitude of genetic perturbations that occur in all combinations in a population (Heinz et al., 2013; Battle et al., 2017).

Omics technologies

The different omics technologies used in systems genetics have been reviewed in detail (Hasin et al., 2017; Molendijk and Parker, 2021). In addition to genotyping, technologies have been developed that allow broad characterization of DNA modifications (epigenomics), gene expression (transcriptomics), protein expression (proteomics), small molecules (metabolomics), and gut bacteria (metagenomics and metatranscriptomics). Below, we briefly review each of these technologies and their integration into systems genetics analyses of complex traits.

Epigenomics

In addition to variation in genomic sequence, chemical modifications to the nucleotide bases or DNA-bound histones, such as methylation, acetylation, and phosphorylation, can also vary between individuals (and cell types). Epigenomics is the study of such modifications and how they regulate gene expression (Allis and Jenuwein, 2016). The most well characterized of these modifications is the methylation of cytosine residues, which tend to cluster at repetitive CG dinucleotides in the proximal promoters of genes in regions referred to as CpG islands. Compared to unmethylated cytosines, methylated cytosine residues are protected from sodium bisulfite-induced deamination to uracil. Thus, it is possible to accurately quantitate the degree of methylation at individual cytosines by comparing the same DNA sequence before and after sodium bisulfite treatment.
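
As a toy illustration of the bisulfite principle described above, the following Python sketch converts unmethylated cytosines to thymine (the base read after uracil is amplified) and then estimates the methylated fraction at a site from a set of reads. The sequence, the reads, and the function names are hypothetical, and real pipelines additionally handle alignment, strand, and incomplete conversion.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment of one DNA molecule.

    Unmethylated C is deaminated to U and read as T after amplification;
    methylated C (positions listed in methylated_positions) stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(sequence)
    )

def methylation_fraction(reads, position):
    """Fraction of reads retaining C at a reference cytosine position."""
    c_count = sum(1 for read in reads if read[position] == "C")
    t_count = sum(1 for read in reads if read[position] == "T")
    covered = c_count + t_count
    return c_count / covered if covered else float("nan")

# Hypothetical reference with cytosines at positions 1 and 4.
reference = "ACGTCGA"

# Four molecules: three methylated at position 1, none at position 4.
reads = [bisulfite_convert(reference, {1}) for _ in range(3)]
reads.append(bisulfite_convert(reference, set()))

site1 = methylation_fraction(reads, 1)  # 3 of 4 reads keep C -> 0.75
site4 = methylation_fraction(reads, 4)  # no read keeps C -> 0.0
print(site1, site4)
```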

Analogous to microarrays or SNP arrays, high-throughput methylation arrays have been developed that allow the simultaneous quantitative evaluation of >800,000 methylation sites throughout the genome. While such arrays do not cover all potential methylation sites, they have been designed to capture the proximal promoters of genes and other known important regulatory regions. Coupled with their relatively low cost, methylation arrays have allowed epigenome-wide association studies (EWAS) to be carried out on a large scale (Campagna et al., 2021). In this approach, the degree of methylation at cytosines is tested for association with clinical traits or outcomes on a genome-wide level. Because of the large numbers of samples that have been archived or the ease with which they can be collected, most EWAS have been carried out with methylation levels in blood cell-derived DNA. Importantly, analytical methods have been developed that use the methylation data themselves to deconvolute individual leukocyte populations (Houseman et al., 2012; Salas et al., 2022), which can be used as covariates in the analyses given the heterogeneity in blood cell number among individuals. Furthermore, databases that have curated EWAS datasets are publicly available (Li et al., 2019) and provide efficient access to published results for integration with other omics data, such as those described below.

An example of the application of systems genetics to understand epigenetic marks is the study by Orozco et al., 2015. The paper examined the contribution of DNA methylation to several complex traits relevant to heart disease, diabetes, and osteoporosis. DNA methylation in liver was examined in 90 mouse inbred strains from the Hybrid Mouse Diversity Panel (HMDP) (see below) using bisulfite sequencing. The DNA methylation levels in the liver were integrated with global transcript levels as well as proteomic and metabolomic data using correlation, mapping, and modeling. Many associations between epigenetic marks and clinical traits were identified, and causal inference tests were performed using the R statistical package CIT (Millstein et al., 2009). CIT performs a series of conditional probability tests to determine if the associations between a genetic locus and a trait are mediated by DNA methylation. About 25% of clinical trait associations were predicted to be causal. The study also revealed an example of how natural genetic variation can influence methylation levels. Mapping analysis of CpG methylation identified several hotspots in the genome in which a single locus was associated with variations in DNA methylation in many locations in the genome. One locus, on mouse chromosome 13, was associated with hundreds of CpGs. A candidate gene in the locus, methionine synthase reductase (Mtrr), was regulated in cis and its expression was highly correlated with methylation levels of CpGs mapping to the locus. A causal role of Mtrr in the methylation was demonstrated using genetically engineered mice (Orozco et al., 2015).

Transcriptomics

Most systems genetics studies performed to date have utilized transcriptomic data, largely because global gene expression can be measured at scale in a cost-effective manner using RNA sequencing (RNA-seq) (Stark et al., 2019). Additionally, most (>90%) disease associations identified by GWAS implicate non-coding variation, making the identification of trait-associated variation altering gene regulation (e.g., expression, splicing, etc.) a logical starting point to identify causal genes and mechanisms (Buniello et al., 2019). Although the original transcriptomic studies used gene expression arrays, the transcriptomic data used in contemporary systems genetics studies is almost exclusively generated by RNA-seq. There are different ‘flavors’ of RNA-seq analyses (e.g., total RNA-seq, mRNA-seq, small RNA-seq, etc.) but the general principle involves a multi-step process of RNA isolation from cells or tissues, conversion of RNA to cDNA, and next-generation sequencing. Gene expression is then quantified by aligning the resulting reads to a reference genome or transcriptome and counting the number of reads aligning to a particular gene, which equates to a digital readout of gene expression (Conesa et al., 2016).
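
The "digital readout" obtained by counting aligned reads is usually normalized for sequencing depth before samples are compared. A minimal sketch in Python, assuming a simple counts-per-million (CPM) scaling and hypothetical gene names and counts, is shown below; it is one common normalization among several and is not taken from the cited references.

```python
def counts_per_million(gene_counts):
    """Scale raw read counts so that each sample sums to one million.

    gene_counts maps gene name -> number of reads aligned to that gene.
    """
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("sample has no aligned reads")
    return {gene: count * 1e6 / total for gene, count in gene_counts.items()}

# Hypothetical read counts for one sample (10,000 aligned reads in total).
sample_counts = {"GeneA": 500, "GeneB": 1500, "GeneC": 8000}
cpm = counts_per_million(sample_counts)
print(cpm)  # GeneA: 50000.0, GeneB: 150000.0, GeneC: 800000.0
```

CPM corrects only for library size; comparisons across genes of different lengths typically require a length-aware measure such as TPM.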

One of the most useful systems genetics analyses involving transcriptomic data is the identification of expression quantitative trait loci (eQTLs) (Farber and Lusis, 2008). eQTLs are genetic loci that influence gene expression. There are two types of eQTL: local (also referred to as cis-eQTL) and distant (also referred to as trans-eQTL). Local eQTLs are located in close proximity to the gene they regulate, whereas distant eQTLs, as their name implies, are generally defined as being >~1 Mbp from the gene they regulate. Local eQTLs have been widely used to connect variants to genes in genetic studies. Another approach is transcriptome-wide association studies (TWASs) (Li and Ritchie, 2021). TWASs use eQTLs from a reference population to predict or impute gene expression in a much larger population. Imputed (genetically regulated) gene expression is then correlated with a complex trait to identify genes potentially driving disease.
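
The local versus distant distinction can be expressed as a simple rule. The Python sketch below assumes a 1 Mbp window around the gene, treats variants on other chromosomes as distant, and uses hypothetical coordinates and field names; studies differ in the exact cutoff and in whether distance is measured from the transcription start site or the gene body.

```python
LOCAL_WINDOW_BP = 1_000_000  # assumed cutoff, following the ~1 Mbp convention

def distance_to_gene(variant_pos, gene_start, gene_end):
    """Distance in bp from a variant to the gene body (0 if inside it)."""
    if gene_start <= variant_pos <= gene_end:
        return 0
    return min(abs(variant_pos - gene_start), abs(variant_pos - gene_end))

def classify_eqtl(variant, gene, window=LOCAL_WINDOW_BP):
    """Label a variant-gene pair as a 'local' or 'distant' eQTL."""
    if variant["chrom"] != gene["chrom"]:
        return "distant"
    distance = distance_to_gene(variant["pos"], gene["start"], gene["end"])
    return "local" if distance <= window else "distant"

# Hypothetical gene and variants.
gene = {"chrom": "chr1", "start": 5_000_000, "end": 5_050_000}
variants = {
    "near": {"chrom": "chr1", "pos": 5_200_000},
    "far": {"chrom": "chr1", "pos": 9_000_000},
    "other_chromosome": {"chrom": "chr7", "pos": 5_010_000},
}
labels = {name: classify_eqtl(v, gene) for name, v in variants.items()}
print(labels)
```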

RNA-seq is most commonly performed in heterogeneous tissues comprised of multiple cell types. However, genetic influences on gene regulation are often cell-type specific (Kim-Hellmuth et al., 2020). Recently, single-cell RNA-seq (scRNA-seq) has enabled the generation of complete transcriptomes from individual cells (Olsen and Baryawno, 2018). This provides the opportunity to study the genetics of gene regulation in individual cells or homogenous groups of cells. Generation of scRNA-seq data across a variety of organisms, tissues, and disease states has fueled an explosion of potential cell-specific contributions to complex outcomes. In a recent study, Chatterjee and colleagues created a Drosophila model which causes hyper-invasive multilayering of the follicular epithelia (Chatterjee et al., 2022). By comparing scRNA-seq profiles from ovaries of this cancerous strain to normal flies, the authors identified a link between Keap1 and Nrf2 as upstream components of tumorigenesis. However, scRNA-seq has technical limitations since it generally requires generating single-cell suspensions with fresh tissue. Therefore, scientists have also developed single-nuclei RNA-seq, which as the name implies involves sequencing RNA from nuclei of individual cells instead of whole-cell RNA. A variety of methods have been developed to analyze available scRNA-seq data to support or refute specific hypotheses. For example, a recent study developed Scallop, a statistical tool which enables quantification of transcriptional noise derived from scRNA-seq data (Ibañez-Solé et al., 2022). Here, the authors applied this method to normal or aged datasets in mice and humans to demonstrate that previously inferred transcriptional noise could be attributed to shifting cellular identities and thus not entirely stochastic. In another recent study, single-nuclei RNA-seq of brain tissue in 192 individuals identified 7607 genes with a local eQTL, 46% of which demonstrated cell-type-specific effects. 
The identified local eQTLs were found to colocalize with many loci identified in GWAS for Alzheimer’s disease, Parkinson’s disease, and schizophrenia, often in a cell-type-specific manner (Bryois et al., 2022). One caveat of scRNA-seq analysis is the very low number of transcripts per gene in an average cell. Some estimates indicate that there are somewhere between 1 and 10 copies of RNA per gene per cell, and this might be even lower in non-dividing cells (Eberwine et al., 2014; Mund et al., 2022). Hence, accurate quantification using this method is likely restricted to abundant transcripts.

In the context of RNA-seq, the vast majority of transcriptomics data generated to date are based on short-read sequencing (reads of 75–150 bp). This has proven to be a powerful approach for quantifying expression at the level of individual genes. However, alternative splicing is a major gene regulatory mechanism that underlies disease states, including variation in disease-associated complex traits (Castaldi et al., 2022). While excellent for quantifying gene expression, short-read RNA-seq is not ideal for quantifying gene expression at the level of individual transcripts generated via alternative splicing. Long-read RNA-seq (reads of 10,000 bp or more) can address this limitation by enabling the identification and quantification of full-length transcripts. Long-read RNA-seq using technologies such as PacBio (Rhoads and Au, 2015) and Oxford Nanopore (Steinbock and Radenovic, 2015) promises to improve transcript quantification and our understanding of alternative splicing (Abood et al., 2023). Additionally, long-read RNA-seq coupled with short-read data in the short term should improve the identification and interpretation of genetically regulated alterations in splicing (i.e., splicing QTL) involved in disease (Abood et al., 2023). Ultimately, it is likely that, as the cost of generating long-read RNA-seq data continues to decrease, it will become the primary approach for characterizing and quantifying transcriptomes for systems genetics studies.

Proteomics

Advances over the past decade in sample preparation, fractionation methods, and mass spectrometry have led to the large-scale implementation of proteomics methods. Many studies now report quantitative tissue or cellular proteome measurements for >5000 proteins using minimalist sample preparation methods. Further, by using much deeper fractionation approaches, researchers can measure even more complete proteomes of 10,000–13,000 proteins in a single sample (Wang et al., 2019). These latter methods are not yet feasible on a larger scale due to the long run times and cost, but they presage a near future when scientists can measure the proteome as comprehensively and affordably as nucleotide sequencing has made transcriptomics. Several systems genetics studies have already utilized tissue proteomics to great effect and this approach is becoming a valuable part of the systems genetics armamentarium (Mitok et al., 2018; Parker et al., 2019; Xiao et al., 2022; Yu et al., 2023).

Other technologies capable of broadly surveying protein levels include the SOMAscan aptamer-based multiplexed platform, which is based on single-stranded DNA aptamers that show high affinity for protein targets (Gold et al., 2010), and the proximity extension assay developed by Olink (Assarsson et al., 2014). However, these methods are most commonly used on human plasma samples and their application to tissue samples is yet to be realized. The latest platform can now measure over 5000 proteins in a sample and has been used to consistently quantify plasma proteins from tens of thousands of human subjects. The proximity extension assay quantifies proteins when two antibodies to which barcodes are attached bind to the same protein, thus bringing the barcodes into close proximity with each other and allowing nucleic acid-based amplification. There are other promising techniques under development as well, including single-cell proteomics (Molendijk and Parker, 2021).

Similar to genetic studies of transcript levels, genetic variations affecting proteins can be classified into those in which the locus maps near the coding gene or those that map elsewhere in the genome. The former are termed ‘local’ protein QTL (pQTL) and presumably act in cis, affecting only the expression of the gene on the contiguous DNA strand. The latter are termed trans pQTL. The correlation between transcript and protein levels in genetic studies tends to be relatively modest (Liu et al., 2016). Thus, the information derived from a proteomics analysis will be quite different from that of a transcriptomics systems genetics study. Likewise, genetic loci affecting protein levels are often distinct from those that regulate the corresponding mRNA, even for variants acting in cis (Williams et al., 2022), for several reasons. First, the turnover of individual proteins varies from minutes to many days in cells; for instance, mitochondrial proteins exhibit particularly long half-lives (Minard et al., 2016). Moreover, protein turnover is under intense control by various post-translational modifications such as ubiquitination, phosphorylation, or acetylation. This is likely to be a major contributor to the non-linear relationship between mRNA and protein expression in cells. Second, many proteins form multi-protein complexes and this often leads to complex co-regulation of all the proteins that exist in such complexes. The archetypal example of this is the T cell receptor, which comprises six subunits. In T cells, surface expression of this receptor requires efficient expression and assembly of all six subunits in stoichiometric amounts and failure to make one single subunit results in rapid degradation of the remaining subunits (Weissman, 1994). Third, protein levels and functions can be affected by missense and nonsense variations that affect translation or processing. 
Thus, the genetic control of cellular protein levels is highly complex and requires more detailed analyses. Proteomics analyses are also capable of examining certain protein modifications that may be of functional relevance. Protein phosphorylation, in particular, is of great importance in intracellular signaling (Needham et al., 2022). In addition, other protein modifications have recently been quantified in systems genetics reference panels, such as UFMylation, which was demonstrated to impact muscle physiology (Molendijk et al., 2022).

Metabolomics

Similar to proteomics, high-throughput technologies have been applied to the characterization of the metabolome, that is, the pool of metabolites in biological systems such as cells or tissues (Wishart, 2019). The two most commonly used methods are mass spectrometry (MS), coupled with either liquid chromatography (LC) or gas chromatography (GC), and nuclear magnetic resonance (NMR). When done in an unbiased fashion, LC/GC-MS can detect and quantify thousands of small molecules in a particular sample. However, this approach is limited by not being able to provide absolute quantification of the features detected unless known standards are spiked into the sample being analyzed. By comparison, NMR-based metabolomics approaches can provide absolute levels with minimal or no pre-processing of samples, thus increasing efficiency, reducing time, and decreasing costs. However, NMR is typically not as sensitive as MS methods. Both metabolomics approaches have been applied in thousands of subjects from numerous human cohorts and in relation to multiple common diseases (Wishart, 2019). By being distal to mRNA and proteins but proximal to disease outcomes, metabolomics also offers the advantage of providing a level of biological data situated between DNA sequence variation and clinical traits.

Aside from characterizing the repertoire of small molecules, the coupling of metabolomics with genetic data has also allowed large-scale GWAS to be carried out simultaneously for thousands of metabolites, analogous to eQTL and pQTL analyses (Kastenmüller et al., 2015). Such metabolomic QTL (mQTL) have been very informative by revealing genetic determinants of small molecules derived from cellular and physiological processes and have provided insight into the intermediate mechanisms that could underlie association of loci with disease traits. mQTL summary statistics are also publicly available for download similar to other omics data through searchable web-based resources, thus facilitating efficient systems genetics analyses.

Gut microbiome

Systems genetics can also be applied to the gut microbiota. The various microbes can be identified and quantitated using shotgun sequencing of microbial DNA or sequencing of informative amplicons of the bacterial 16S rRNA gene. Whole microbial DNA sequencing frequently permits reads to be specifically assigned to individual species or genera. In contrast, 16S sequencing is much less expensive, but it generates data which are often analyzed at a much broader scale, for example at the phylum level. Gut microbe compositions are determined by the diet, host genetics, and other characteristics of the host such as aging. Usually, in humans, bacterial DNA is isolated from stool samples, whereas in mice, cecal contents or fecal pellets can be used. Large-scale population studies of gut microbes, which include fungi as well as bacteria, have been performed in an effort to relate gut bacterial composition to clinical outcomes or other traits (Kurilshikov et al., 2021). Because there are thousands of species in a single individual’s gut microbiome, and because the microbes exist as communities with other species, it has been difficult to identify specific relationships, although broad categories, such as the ratio of Firmicutes to Bacteroidetes phyla, have been related to traits such as obesity. Such studies are simplified in animal models, such as the mouse. For example, in a systems genetics study of microbes in a diverse set of ~100 inbred strains of mice, the species Roseburia intestinalis was found to be associated with resistance to atherosclerosis. Further studies with germ-free mice confirmed this and identified butyrate, a metabolite produced by Roseburia, as the intermediate (Kasahara et al., 2018). In this regard, analysis of bacteria-derived metabolites has been particularly informative (Jie et al., 2017). 
For example, Hazen and colleagues identified the metabolite trimethylamine-N-oxide, derived from dietary choline and carnitine, as a very significant cause of atherosclerosis (Tang et al., 2019). Systems genetics approaches have also successfully identified some host genetic loci with large effects on microbe compositions. The most significant of these is the lactase locus, which regulates the ability to hydrolyze dietary lactose. If an individual is deficient in the intestinal enzyme, the availability of lactose in the gut will lead to an increased abundance of bacteria that can utilize this sugar as an energy source.
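
As a small illustration of the phylum-level summaries mentioned above, the following Python sketch collapses taxon-level read counts to phyla and computes the Firmicutes to Bacteroidetes ratio. The read counts are hypothetical, the function names are illustrative, and the genus-to-phylum table is a tiny hand-written stand-in for a real taxonomy database.

```python
def phylum_relative_abundance(taxon_counts, phylum_of):
    """Collapse taxon-level counts to phylum-level relative abundances."""
    totals = {}
    for taxon, count in taxon_counts.items():
        phylum = phylum_of.get(taxon, "Unclassified")
        totals[phylum] = totals.get(phylum, 0) + count
    grand_total = sum(totals.values())
    return {phylum: count / grand_total for phylum, count in totals.items()}

def firmicutes_bacteroidetes_ratio(abundance):
    """Ratio of Firmicutes to Bacteroidetes relative abundance."""
    return abundance["Firmicutes"] / abundance["Bacteroidetes"]

# Minimal hand-written genus -> phylum lookup (illustrative only).
phylum_of = {
    "Roseburia": "Firmicutes",
    "Lactobacillus": "Firmicutes",
    "Bacteroides": "Bacteroidetes",
    "Akkermansia": "Verrucomicrobia",
}

# Hypothetical 16S read counts for one sample.
taxon_counts = {
    "Roseburia": 300,
    "Lactobacillus": 300,
    "Bacteroides": 300,
    "Akkermansia": 100,
}

abundance = phylum_relative_abundance(taxon_counts, phylum_of)
fb_ratio = firmicutes_bacteroidetes_ratio(abundance)
print(abundance, fb_ratio)
```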

Population resources

Mouse and rat populations

The mouse and rat have played critical roles in systems genetics studies of complex traits. The popularity of rodents for systems genetics studies is due to many reasons, including the similarity between rodent and human physiology, the relatively lower cost compared to human studies, access to critical tissues and disease phenotypes, and the availability of populations designed for genetic mapping. In particular, rodent populations are critical for systems genetics studies investigating clinically relevant conditions (e.g., genetic × diet effects in obesity [Parks et al., 2013] or biomechanical bone strength [AlBarghouthi and Farber, 2019]) that are difficult or impossible to study in humans.

There are two types of rodent populations that are commonly used for systems genetics studies: genetic reference populations (GRPs) and outbred stocks. GRPs can be recombinant inbred lines derived from two or more inbred strains (Williams and Williams, 2017) or collections of inbred strains, such as the HMDP (Bennett et al., 2010) and the Hybrid Rat Diversity Panel (HRDP) (Tabakoff et al., 2019). Outbred stocks can begin from a variety of starting points, including initial crosses between inbred founder strains, with the resulting progeny forming a randomly mating population. The most commonly used genetic reference populations and outbred populations are listed in Table 1.

Table 1

Commonly used genetic reference populations (GRPs) and outbred populations in mice and rats.

GRPs	Species	Inbred or outbred	# of strains	Description	Data repository
BXD	Mouse	Inbred	140	Recombinant inbred lines generated from C57BL/6J and DBA/2J founders	https://genenetwork.org/
HMDP	Mouse	Inbred	~100	A set of ~100 classical laboratory inbred strains and multiple recombinant inbred line panels	http://systems.genetics.ucla.edu
CC	Mouse	Inbred	~50–75	A panel of ~75 recombinant inbred lines derived from eight genetically diverse inbred founders	http://csbio.unc.edu/CCstatus/index.py
HRDP	Rat	Inbred	99	A set of ~100 classical laboratory inbred strains and recombinant inbred line panels	http://phenogen.org https://genenetwork.org/
HXB/BXH	Rat	Inbred	30	Recombinant inbred lines generated from the spontaneously hypertensive rat (SHR/OlaIpcv) and Brown Norway (BN.Lx/Cub) founders	http://phenogen.org https://genenetwork.org/
DO	Mouse	Outbred		An outbred population derived from eight genetically diverse inbred founders	https://genenetwork.org/

The two different populations have unique advantages and disadvantages (Jurrjens et al., 2023). The main advantage of GRPs is that animals within each strain in the population are identical and replicable. This eliminates the need for costly genotyping for each new study and provides a stable set of genotypes on which phenotype data can be accumulated. GRPs are ideal for investigating gene × environment (GXE) interactions given that phenotypes can be measured in strains exposed to different environmental conditions. They are also useful for the genetic analysis of traits with low heritability since having replicates of genetically identical individuals from the same strain decreases noise and increases heritability (Bennett et al., 2010; Keele, 2023). Heritability estimates provide a valuable metric to quantify the genetic and/or environmental contributions to complex outcomes. However, these have been difficult to accurately assess in humans, and studies have suggested differences in heritability estimates depending on methodological approaches used (Mayhew and Meyre, 2017). Several improvements to human heritability estimates have been proposed such as gene network-based approaches which have the power to inform context-specific estimates of complex traits (Feng et al., 2022). However, one disadvantage for rodent studies is that most GRPs consist of a relatively small number of strains (~25–140), limiting statistical power to detect subtle genetic effects on clinical and molecular traits.
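The effect of replication on heritability described above can be illustrated with the standard quantitative genetics expression for the heritability of strain means, V_G / (V_G + V_E / n), where n is the number of animals measured per strain. The following is a minimal sketch only; the function name and the variance components are hypothetical and are not taken from any of the cited studies.

```python
def strain_mean_heritability(var_genetic, var_environment, n_replicates):
    """Heritability of strain means when each strain is measured n times.

    Averaging n genetically identical animals divides the environmental
    (non-genetic) variance of the strain mean by n.
    """
    if n_replicates < 1:
        raise ValueError("n_replicates must be at least 1")
    return var_genetic / (var_genetic + var_environment / n_replicates)


# Hypothetical trait with 20% heritability at the level of single animals
for n in (1, 4, 8):
    print(n, round(strain_mean_heritability(0.2, 0.8, n), 3))
```

Under these assumed values, measuring four animals per strain raises the heritability of the strain mean from 0.2 to 0.5, which is why replication within strains helps for traits with low heritability.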

By comparison, outbred stocks are not limited by sample size, enabling the detection of more subtle genetic effects and relationships among traits. This additional power can be important when exploring the polygenicity of complex traits. Additionally, outbred stocks accumulate recombinations at every generation, increasing mapping resolution over time, as compared to GRPs which have a fixed set of recombinations and, thus, fixed mapping precision. The main disadvantage of outbred stocks is the uniqueness of each animal, which requires de novo genotyping. This limits the number of disease phenotypes that can be captured on the same set of mice. Also, while GXE interactions can be studied in outbred stocks, it is much more straightforward with GRPs. Below, we discuss the most popular GRPs and outbred stocks used in both mice and rats. We note that, while not reviewed here, advanced intercross lines have also been extensively used to map complex traits. For example, a recent intercross between inbred QSi5 and 129T2/SvEms mouse lines identified 37 new QTLs which associated with cardiac interatrial septation (Moradi Marjaneh et al., 2023).

Widely used GRPs and outbred stocks

BXD

The BXD GRP is a set of recombinant inbred strains derived from crosses between the C57BL/6J and DBA/2J founders. The initial set of BXD strains was developed in the early 1970s by Dr. Benjamin Taylor at the Jackson Laboratory (Taylor et al., 1973). This set was subsequently expanded to approximately 140 strains (Peirce et al., 2004; Ashbrook et al., 2021). As stated above, one of the main advantages of inbred GRPs is the ability to accumulate data on the same genotypes over time. The BXD GRP has been used to map a wide range of complex traits (reviewed in Ashbrook et al., 2021). As an example, a recent study took advantage of the accumulation of de novo mutations over 50 years in the BXD to identify alleles in Mutyh, a known driver of colorectal cancer, that modulate specific germline mutagenesis signatures (Sasani et al., 2022). The BXDs are one of the most well-characterized GRPs in any species. To date, over 7500 phenotypes, in disease categories such as cardiovascular disease (CAD), behavior, addiction, and cancer, just to name a few, have been collected. Additionally, over 100 ‘-omics’ (transcriptome, metabolome, DNA methylation, and proteome) datasets on multiple tissues have been generated in the BXD GRP, making it a powerful systems genetics resource. A recent integrative systems genetics study of bile acid metabolism in the BXD used different diets, bile acid profiling, liver transcriptomics, and metabolic phenotyping to identify Ces1c as a master regulator of plasma tauroursodeoxycholic acid, a bile acid known to have health-promoting actions (Li et al., 2022). All of these data are stored on the GeneNetwork2 database (https://genenetwork.org/) and can be used for interactive systems genetics studies.

HMDP

The HMDP is a GRP comprised of a set of commercially available recombinant inbred strains (including the BXD strains discussed above) and classical laboratory strains (Bennett et al., 2010). Most experiments using the HMDP have included approximately 100 strains, though the number of strains that could be used is larger. The HMDP was designed to use available resources to address two issues: the limited statistical power of most strain panels and the lower mapping resolution of RILs. In the HMDP, recombinant inbred strains increase sample size and power, while the small haplotype blocks found in classical inbred strains increase mapping resolution. The HMDP has been used in a wide range of systems genetics studies of disease phenotypes such as plasma lipids (Bennett et al., 2010), obesity (Parks et al., 2013), bone mass (Farber et al., 2011), fatty liver disease (Hui et al., 2015; Hui et al., 2018), atherosclerosis (Bennett et al., 2015), heart failure (Rau et al., 2015), cellular phenotypes (Davis et al., 2013; Zhou et al., 2015), and various omics datasets (Ghazalpour et al., 2011; Ghazalpour et al., 2014; Hartiala et al., 2014; Orozco et al., 2015). Furthermore, the HMDP is ideal for GXE experiments and has been used to identify loci and genes responsible for differences in obesity and atherosclerosis between chow and high-fat diets and response to environmental exposures (Lavinsky et al., 2016; Maazi et al., 2019; Tuominen et al., 2021). Recently, the HMDP was also used to identify coagulation factor XI as an endocrine factor produced in the liver impacting cardiac fibrosis, inflammation, and heart failure (Cao et al., 2022). Extensive genetic, phenotypic, and omics (transcriptomic and proteomic) data collected in the HMDP can be found at https://systems.genetics.ucla.edu/HMDP/.

CC and DO mice

The collaborative cross (CC) and diversity outbred (DO) mice are multi-parental populations created by ‘combining’ eight genetically diverse inbred founders: A/J, C57BL/6J, 129S1/SvImJ, NZO/H1LtJ, NOD/LtJ, WSB/EiJ, PWK/PhJ, and CAST/EiJ (Churchill et al., 2004; Churchill et al., 2012; Svenson et al., 2012; Saul et al., 2019; Solberg Woods and Palmer, 2019). After the founders were intercrossed in a balanced design, the resulting progeny were used to create a set of recombinant inbred lines (CC) or randomly bred to create DO mice. The genomes of the CC/DO founders have been sequenced and from these data we know that at least ~45 million SNPs and INDELs/structural variants are segregating in the CC and DO mice (Keane et al., 2011). This level of variation is significantly higher than existing mouse GRPs (e.g., ~5 million variants are segregating in the BXD created from classical laboratory strains [Wang et al., 2016]) and this level of variation is more than the common variation seen across diverse human populations.

The original goal for the CC was the generation of 1000 recombinant inbred strains (Churchill et al., 2004); however, genetic incompatibilities among the founders caused most strains to be lost during development. Today there are roughly 50–70 strains that are accessible in the United States. The relatively small number of strains and the increased genetic diversity (which reduces statistical power to identify individual effects) have limited the use of the CC for genetic mapping of complex traits.

The DO has all of the advantages and disadvantages of outbred stocks that are described above. A consideration when designing systems genetics studies using the DO is the increased sample size required over crosses between classical strains to identify QTL in the DO due to the increased genetic variation (Svenson et al., 2012). Despite this, the DO provides a powerful platform to integrate polygenic traits and has been used to perform GWAS and systems genetics studies for stem cell biology (Skelly et al., 2020; Aydin et al., 2023), insulin secretion (Keller et al., 2019), and a wide variety of other disease-associated traits (Chick et al., 2016; AlBarghouthi et al., 2021; Xiao et al., 2022).

HXB/BXH rats

The HXB/BXH GRP is a set of 30 RILs generated by intercrossing the spontaneously hypertensive rat with the normotensive Brown Norway rat strain (Pravenec et al., 1989; Pravenec et al., 2014). The HXB/BXH GRP has been used for genetic and systems genetic studies of blood pressure (Pravenec et al., 1989), alcohol-related traits (Vanderlinden et al., 2014; Saba et al., 2015; Lusk et al., 2018), and metabolic and cardiovascular traits (Morrissey et al., 2011; Pravenec et al., 2018), among others. The HXB/BXH GRP was recently used to identify a pleiotropic QTL affecting hippocampal neurogenesis and serum glucose (Senko et al., 2022). Using a systems genetics approach integrating transcriptomics data, the Telo2-interacting protein 2 (Tti2) gene was identified as a strong candidate and subsequently validated using a rat Tti2 knockout.

HRDP

The HRDP is a rat GRP similar to the HMDP. Like the HMDP, the HRDP is a panel consisting of both RILs and classical inbred strains, with the goal of maximizing power and mapping resolution using existing resources. Specifically, the HRDP is comprised of 35 inbred strains, 30 strains from the HXB/BXH RIL, and 34 strains from the LEXF/FXLE RIL (Voigt et al., 2008; Tabakoff et al., 2019). Simulations with the HRDP have suggested that an experiment with the full panel has 80% power to detect QTL accounting for approximately 12% of the variance for complex traits (Tabakoff et al., 2019). While few studies have mapped QTL in the HRDP, mapping resolution should be quite good as the GRP has a median haplotype block size of approximately 150 kbp (Tabakoff et al., 2019).

Human populations

Large-scale GWAS, since their first implementation 20 years ago, have been instrumental in elucidating the genetic architecture of complex traits. More recently, this approach has been expanded to include meta-analyses using data from multiple datasets and measurement types (e.g., phenome-wide association studies). This has led the international scientific community to build on this success and develop multi-ancestry biobanks with tens and hundreds of thousands of subjects for systems genetics studies that leverage the vast amount of clinical data and archived blood and tissue samples available in healthcare systems around the world. Some of the datasets provide the advantage of including individuals of non-European ancestry and can be used to determine whether epidemiological and genetic findings are applicable across ethnicities. Importantly, this addresses one criticism of human genetic studies that subjects of European ancestry have been overrepresented in GWAS. Furthermore, a collaborative network comprised of 24 international biobanks has been formed, termed the Global Biobank Meta-analysis Initiative (GBMI), with the express goal of facilitating large-scale genetic analyses of selected common diseases (https://www.globalbiobankmeta.org/; Zhou et al., 2022). Thus far, over 2.2 million genotyped multi-ancestry subjects from both population-based and hospital/health center-based cohorts have been included in the GBMI, which will undoubtedly increase as more groups join this initiative. While a description of every available biobank is beyond the scope of this review, below we highlight some of those that have extensive resources available and have been used widely (Table 2).

Table 2

Commonly used human biobanks for epidemiological and genetics studies.

Cohort	Ancestry (N)	Disease traits	Biomarkers	Genomics	Transcriptomics	Proteomics	Metabolomics	Data repository
UK Biobank	European, Asian, African, Other (502,492)	✓	✓	✓		✓	✓	https://www.ukbiobank.ac.uk/
FinnGen	European (500,000*)	✓		✓				https://www.finngen.fi/en
Biobank Japan	Asian (260,000)	✓	✓	✓		✓	✓	https://biobankjp.org/en/
China Kadoorie Biobank	Asian (512,000)	✓	✓	✓		✓	✓	https://www.ckbiobank.org/
TOPMed	European, African, Hispanic, Asian (205,092)	✓	✓	✓	✓	✓	✓	https://topmed.nhlbi.nih.gov/
BioVU	European (300,000†)	✓	✓	✓				https://victr.vumc.org/biovu-description/
Million Veteran Program	European, African, Hispanic, Asian (950,000†)	✓	✓	✓				https://www.mvp.va.gov/pwa/
Geisinger MyCode	European (300,000†)	✓		✓				https://www.geisinger.org/precision-health/mycode

* Indicates goal of subject recruitment.
† Recruitment still ongoing. Citations for the studies are provided in the text.

The Genotype-Tissue Expression (GTEx) project (https://gtexportal.org/home/)

The GTEx project consists of >500 postmortem samples from ~50 tissues in humans. The project was initiated with the goal of providing a resource of organ-specific actions of eQTL variants (Consortium, 2013), where intersection with SNPs associated with common diseases could implicate or exclude a specific tissue of action. These data have been particularly useful in providing researchers with fairly comprehensive paired tissue RNA-seq and genome datasets from the same individuals. More recent efforts have expanded this resource to encompass measures of chromatin accessibility (Zhang et al., 2021a), long-read RNA-seq (Glinos et al., 2022), and single-nucleus RNA-seq (Jones et al., 2022a) in the same individuals.

The UK Biobank

The UK Biobank is a large, multi-site cohort that recruited a total of 503,325 individuals in the UK between 2006 and 2010 (Bycroft et al., 2018). Data and health-related information for ~6700 phenotypes are available on each participant, including biological measurements, lifestyle indicators, environmental exposures, imaging, blood and urinary biomarkers, metabolomics, proteomics, and genetic variants and exome and whole-genome sequencing. Since making the data available through an application-based approval process, hundreds of research groups have published thousands of studies using the UK Biobank cohort. Of all biobanks, access to primary level data in UK Biobank is least restrictive and can be obtained through a proposal and application review process.

FinnGen

The goal of the FinnGen biobank is to recruit ~500,000 subjects drawn from epidemiological and disease-based cohorts as well as hospital biobank samples in Finland (Tabassum et al., 2019). These data will be combined with registry data that record healthcare events over the entire lifespan. As of December 2022, over 356,000 participants have been recruited and genotyped on SNP arrays and the goal is to finish recruitment by the end of 2023. Results of GWAS analyses for over 2200 disease phenotypes are made publicly available twice a year and can be leveraged for meta-analyses and investigation of candidate loci/genes. However, plans to extend the scope of FinnGen to characterization of biomarkers and other omics data in participants have not been announced.

Biobank Japan

Biobank Japan is a national biobank of ~260,000 patients who were enrolled at 12 hospitals throughout Japan between 2003 and 2008 (Hirata et al., 2017; Nagai et al., 2017). The Biobank Japan study examines 51 common diseases in its participants and whole-genome SNP array data is already available in over 215,000 subjects, with plans to extend genetic analysis to exome and whole-genome sequencing. Other omics data, such as metabolomics and proteomics in serum, has also been collected on a subset of the cohort. Similar to UK Biobank and FinnGen, summary statistics are publicly available and can be leveraged for various systems genetics analyses.

China Kadoorie Biobank

The China Kadoorie Biobank has recruited over 500,000 adults from 10 geographically diverse (urban and rural) areas of China from 2004 to 2008 (Chen et al., 2011). Extensive data was collected at baseline but the China Kadoorie Biobank is also focused on prospective data collection with surveys of participants carried out every 4–5 years. In addition to genetic (whole-genome genotyping and sequencing) and disease outcome data, biological samples have been collected for biomarker, metabolomics, and proteomics analyses.

TOPMed

The Trans-Omics for Precision Medicine (TOPMed) program is an NIH-sponsored precision medicine initiative consisting of over 200,000 participants drawn from over 85 cohorts. The primary goal is to integrate clinical, genetic (whole-genome sequencing), and other omics data (e.g., epigenomics, transcriptomics, proteomics, and metabolomics) to gain a better understanding of heart, lung, blood, and sleep disorders (Brody et al., 2017). One notable advantage of TOPMed is the broad diversity in participants’ ancestries, including those of African, Hispanic/Latino, and Asian backgrounds. However, access to individual-level data and summary statistics is more restrictive than, for example, UK Biobank but can be gained through an application review process.

BioVU

BioVU is an electronic health record (EHR)-based biobank in which over 300,000 participants have been recruited at Vanderbilt University (Roden et al., 2008). It is one of the largest single-site biobanks and also leverages whole-genome genotyping, EHR data, and archived plasma samples to investigate the genetic basis of thousands of disease phenotypes. However, access to the data is also more restrictive than some other biobanks and requires being a Vanderbilt University faculty member. BioVU is also part of the large Electronic Medical Records and Genomics (eMERGE) national network of cohorts and biobanks (https://emerge-network.org/), which is organized and funded by the NIH to carry out large-scale, high-throughput genetic research with EHR-based data.

Million Veteran Program

The Million Veteran Program (MVP) is a national research program organized and sponsored by the Veterans Administration Office of Research and Development. The goals of the MVP are to better understand the interrelationships between health outcomes, genetic determinants, behavioral traits, and environmental factors, with a particular focus on improving the care of US Veterans (Gaziano et al., 2016). It is one of the largest biobanks and, as of mid 2023, has recruited over 950,000 subjects of different ancestries who are representative of the Veteran population. Thus, the MVP represents one of the largest research programs in the world studying genes and health outcomes.

Geisinger

The Geisinger MyCode Initiative is another EMR-based biobank that is enrolling participants from a rural, integrated health system serving central and northeastern Pennsylvania (Carey et al., 2016; Jones et al., 2022b). SNP genotyping and exome sequencing are being carried out on all participants with the same overall goal as other biobanks of discovering novel genetic associations with disease and therapeutic targets. However, one unique distinction between Geisinger MyCode and other biobanks is the decision to return clinically actionable genetic findings (e.g., BRCA1 mutation status) back to subjects. This is based on surveys of participants who generally indicated they are in favor of receiving such information (Schwartz et al., 2018).

Analytic approaches

Correlation structure and causal modeling

Correlation is a simple procedure for relating molecular and clinical/physiological traits. Of course, the traits can be correlated for multiple reasons. However, since information from DNA is unidirectional, causal pathways can be modeled. For example, if a clinical trait and the levels of a transcript are correlated and map to the same locus, researchers can condition on the transcript levels and ask whether a significant association between the locus and the clinical trait remains. If so, the results suggest that the effect on the clinical trait is not mediated by the transcript. Furthermore, SNPs are often associated with multiple clinical traits, which can be due to pleiotropy, direct or indirect genetic effects, or environmental/non-genetic factors. Mediation analysis can therefore be used to quantify the proportion of direct and indirect effects of the exposure on the outcome of interest via the intermediate biomarker as the mediator. One straightforward application of this approach would be to use causal mediation analysis to determine whether association of a variant with insulin resistance is mediated through an effect on BMI. MR (see below) is another form of mediation analysis that has especially strict criteria in that the mediator (such as the levels of a protein) is required to explain all of the association between an SNP and a complex trait. Various forms of causal inference tests have been developed for mediation analysis (Schadt et al., 2005; Zhu et al., 2016; Zeng et al., 2021; Crouse et al., 2022).
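The conditioning step described above can be sketched with ordinary least squares. The example below is a simplified illustration on simulated data, not one of the published causal inference tests cited above: it assumes a hypothetical locus that acts on a clinical trait entirely through a transcript, and all names and effect sizes are made up for the illustration.

```python
import random


def slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing out x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def conditional_slope(genotype, trait, mediator):
    """Effect of genotype on trait after conditioning on the mediator.

    Residualizes both genotype and trait on the mediator and regresses
    residual on residual, which gives the genotype coefficient of the
    multiple regression trait ~ genotype + mediator.
    """
    g_res = residuals(mediator, genotype)
    y_res = residuals(mediator, trait)
    return slope(g_res, y_res)


# Simulated data (assumption): locus -> transcript -> clinical trait
random.seed(1)
n = 2000
genotype = [random.choice((0, 1, 1, 2)) for _ in range(n)]  # allele dosage
transcript = [g + random.gauss(0, 1) for g in genotype]
clinical = [0.8 * t + random.gauss(0, 1) for t in transcript]

marginal = slope(genotype, clinical)
conditional = conditional_slope(genotype, clinical, transcript)
print(f"marginal effect: {marginal:.2f}, after conditioning: {conditional:.2f}")
```

In this simulated setting the association between locus and clinical trait largely disappears after conditioning, which is the pattern expected when the transcript mediates the effect. A residual association after conditioning would instead point to a path that does not run through the transcript.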

Mendelian randomization

The gold standard for determining causality with biomarkers has relied on randomized clinical interventions. However, MR has emerged as an alternative and highly efficient strategy for leveraging genetic data to make causal inferences (Smith and Ebrahim, 2003; Didelez and Sheehan, 2007; Lawlor et al., 2008; Burgess et al., 2017). In this approach, MR uses Mendel’s laws of inheritance to treat DNA variants as instrumental variables that mimic the randomization of individuals in clinical trials to two ‘treatment’ groups. Thus, genetic variants that have been associated with biomarkers or intermediate traits of interest (exposures) are then tested for association with disease outcomes (ideally in independent datasets) to infer a causal relationship. This approach has been successfully applied to numerous biomarkers, with the best results being related to classic cardiometabolic risk factors, such as elevated blood pressure, LDL, and triglycerides, that have been confirmed as causal drivers of atherosclerosis (Jansen et al., 2014). However, there are inherent limitations to MR that require careful consideration when evaluating causality of intermediate traits. First, MR assumes that selected genetic instruments are valid and only associated with the biomarker being tested but not other related pathways and traits or confounding factors. In other words, the variants do not have pleiotropic effects. Second, variants may be weak genetic instruments for the exposure, which can lead to imprecise estimates and/or require larger sample sizes to boost power for detecting associations. Various statistical modifications to MR analyses have also been developed to address these limitations, such as two-sample MR (Hemani et al., 2018), multivariable MR (Sanderson et al., 2019), and TwoStepCisMR (Woolf et al., 2022).
Thus, the large amount of publicly available summary statistics for numerous biomarkers and disease outcomes from published GWAS analyses renders MR an efficient and important analytical tool in systems genetic studies. One intriguing recent study developed an integrative MR framework to analyze gene-metabolite-phenotype associations from large-scale data (Auwerx et al., 2023). Application of this method uncovered new molecular links between omics datatypes, such as SLC6A12 correlating with serum creatinine through modulation of the levels of the renal osmolyte betaine.
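To make the use of summary statistics in MR concrete, the sketch below computes two standard estimators from per-variant effects on the exposure and on the outcome: the single-variant Wald ratio and the fixed-effect inverse-variance weighted (IVW) estimate. It is a minimal illustration that assumes independent, valid instruments; the function names and the numbers are hypothetical, and it does not implement any of the extended methods cited above.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-variant causal estimate with a first-order standard error."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted estimate across variants."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for three independent variants
bx = [0.10, 0.20, 0.15]     # variant effects on the exposure
by = [0.05, 0.10, 0.075]    # variant effects on the outcome
se_y = [0.02, 0.02, 0.03]   # standard errors of the outcome effects

estimate, se = ivw_estimate(bx, by, se_y)
print(f"IVW estimate: {estimate:.3f} (SE {se:.3f})")
```

Because every outcome effect in this toy input is exactly half the corresponding exposure effect, the IVW estimate is 0.5. With real data the per-variant ratios differ, and marked heterogeneity among them is one warning sign of the pleiotropy discussed above.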

Colocalization

Colocalization approaches have become instrumental in deciphering the genetic basis of GWAS associations by integrating eQTL, meQTL, and pQTL data (Li and Ritchie, 2021). The premise of colocalization analysis is to determine if two associations, one to a molecular phenotype (such as gene expression) and the other to a complex trait or disease (i.e., GWAS), are driven by the same causal variants. Thus, colocalization approaches can identify candidate causal genes and prioritize functional variants that may be driving the observed associations. By leveraging molecular data reflecting different layers of biological regulation (e.g., gene expression, DNA methylation, and protein abundance), colocalization studies provide valuable insights into the potential biological mechanisms underlying complex traits and diseases. These approaches aid in translating GWAS findings by generating testable hypotheses (e.g., variant x influences the expression of gene y which in turn influences a complex trait) for experimental validation. In an elegant example of colocalization analysis, Al-Barghouthi and colleagues performed colocalization of tissue-wide eQTLs from GTEx and GWA significant loci for bone mineral density (BMD) (AlBarghouthi et al., 2022). These analyses uncovered >500 new candidate genes mediating variance in human BMD and validated PPP6R3 as a causal driver by ablating the gene in mice. In another study, Aberra and colleagues applied colocalization to multi-omic data to identify key genes and pathways which would distinguish BMI, WHR, and T2D (Aberra et al., 2023).

Several colocalization methods have been developed to facilitate the integration of molecular QTL data and GWAS, each with their strengths and limitations. Bayesian colocalization methods, such as COLOC (Giambartolomei et al., 2014), estimate the posterior probabilities of distinct colocalization hypotheses based on the observed summary statistics. Additional methods such as eCAVIAR (Hormozdiari et al., 2014) accommodate multiple causal variants. Most of the available methods do not require individual-level data, making them suitable for large-scale integrative analyses with publicly available summary statistics. Other approaches, such as summary-data-based Mendelian randomization (SMR) (Zhu et al., 2016) and joint likelihood mapping (Chun et al., 2017), rely on distinct statistical frameworks to assess colocalization. These methods have also been successfully applied to a wide range of complex traits and diseases, revealing novel candidate genes and functional variants.
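The Bayesian calculation behind summary-statistic colocalization can be sketched as follows. This is a simplified illustration in the spirit of COLOC rather than the COLOC implementation itself: it assumes at most one causal variant per trait, a standardized trait, the same set of variants for both traits, and prior values (p1, p2, p12, and the prior standard deviation) that are treated here as assumptions. All z-scores are invented.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Wakefield approximate Bayes factor (log scale) for one variant."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)


def log_sum_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0 to H4 for one locus."""
    sum1 = log_sum_exp(labf1)
    sum2 = log_sum_exp(labf2)
    sum12 = log_sum_exp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                   # H0: no association
        math.log(p1) + sum1,                   # H1: trait 1 only
        math.log(p2) + sum2,                   # H2: trait 2 only
        math.log(p1) + math.log(p2)
        + log_diff_exp(sum1 + sum2, sum12),    # H3: two distinct variants
        math.log(p12) + sum12,                 # H4: one shared variant
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]


# Hypothetical z-scores for five variants at one locus
se = 0.05
eqtl_z = [0.5, 6.0, 0.3, -0.2, 0.1]
gwas_shared_z = [0.2, 5.5, 0.1, 0.4, -0.3]    # same lead variant as the eQTL
gwas_distinct_z = [0.2, 0.1, 0.1, 5.5, -0.3]  # different lead variant

eqtl = [log_abf(z * se, se) for z in eqtl_z]
shared = coloc_posteriors(eqtl, [log_abf(z * se, se) for z in gwas_shared_z])
distinct = coloc_posteriors(eqtl, [log_abf(z * se, se) for z in gwas_distinct_z])
print("PP.H4 when signals coincide:", round(shared[4], 3))
print("PP.H3 when signals differ:", round(distinct[3], 3))
```

When both traits peak at the same variant, most of the posterior mass falls on H4 (a shared causal variant); when they peak at different variants, it shifts to H3. Methods that allow multiple causal variants relax the single-variant assumption made in this sketch.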

Cross-tissue interactions

Systems genetics data from multiple tissues can be used to identify potential endocrine circuits mediating crosstalk. The method, first developed by Seldin et al., 2018, utilizes the correlation structure between the expression of individual genes in one tissue and all genes in a second tissue. If a given gene in the first tissue perturbs pathways in the second tissue, it may exhibit an overall correlation that rises above the noise of other gene correlations. The method works best in animal models where environmental effects can be minimized, although it can also be applied to human datasets that are sufficiently large (Koplev et al., 2022). Recently, the method was used in mice to identify coagulation factor XI as a novel mediator of liver-heart crosstalk with a significant impact on heart function and one form of heart failure (Cao et al., 2022).
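The idea of scoring one gene in an origin tissue by its overall correlation with all genes in a target tissue can be sketched as below. This is a simplified stand-in and not the published procedure: it uses Pearson correlation with a Fisher z approximation for the p-value and averages the negative log p-values, and the exact correlation measure and normalization of the original method are not reproduced. The data are simulated and all names are hypothetical.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation."""
    r = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return max(math.erfc(abs(z) / math.sqrt(2.0)), 1e-300)


def cross_tissue_score(origin_gene, target_genes):
    """Mean -ln(p) of correlations between one origin gene and all target genes."""
    n = len(origin_gene)
    total = 0.0
    for expr in target_genes:
        total += -math.log(correlation_p_value(pearson(origin_gene, expr), n))
    return total / len(target_genes)


# Simulated panel (assumption): 60 strains, 25 target-tissue genes that
# respond to one origin-tissue gene but not to another
random.seed(7)
n_strains = 60
secreted = [random.gauss(0, 1) for _ in range(n_strains)]
unrelated = [random.gauss(0, 1) for _ in range(n_strains)]
target = [[0.6 * s + random.gauss(0, 1) for s in secreted] for _ in range(25)]

score_secreted = cross_tissue_score(secreted, target)
score_unrelated = cross_tissue_score(unrelated, target)
print(f"secreted: {score_secreted:.1f}, unrelated: {score_unrelated:.1f}")
```

In practice such a score would be computed for every candidate gene in the origin tissue, and the genes at the top of the ranking would be taken forward for experimental follow-up.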

Network modeling

One of the major premises of systems genetics is that biological components (transcripts, proteins, metabolites, etc.) do not act in isolation, but rather participate in complex networks that operate at the cellular, tissue, and organismal levels (Nadeau and Dudley, 2011; Civelek and Lusis, 2014). While reductionist approaches focus on linear events (e.g., kinase phosphorylates protein), systems genetics seeks to understand how genetic variation alters network homeostasis and ultimately disease (AlBarghouthi and Farber, 2019). As a result, a key component of systems genetics studies is the reconstruction of cellular networks.

Co-expression networks

Co-expression networks are one of the most popular types of networks used in systems genetics studies (Figure 2). In a co-expression network, nodes are genes and edges between genes represent a measure of the strength of their co-expression. Weighted gene co-expression network analysis (WGCNA) is a widely used tool for constructing co-expression networks (Langfelder and Horvath, 2008). WGCNA provides a way to organize biology by using the variation in gene expression generated by a series of perturbations (e.g., genetic background, treatments, etc.) to group genes into modules based on similarity in expression patterns. Studies across a wide range of species have demonstrated that genes whose expression co-varies often share similar functions and are members of the same biological pathway or process (McCarroll et al., 2004; Singer et al., 2005; Ala et al., 2008).
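The first step of a weighted co-expression analysis, turning pairwise correlations into edge weights, can be sketched as follows. WGCNA itself is an R package; this Python fragment is only a conceptual illustration of an unsigned soft-threshold adjacency and per-gene connectivity, with the power of 6 treated as an assumption. Topological overlap and module detection by clustering are omitted, and the gene names and expression values are invented.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(gene_i, gene_j)| ** beta."""
    genes = list(expression)
    adjacency = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expression[gi], expression[gj])) ** beta
            adjacency[gi][gj] = a
            adjacency[gj][gi] = a
    return adjacency


def connectivity(adjacency):
    """Sum of edge weights per gene; high values indicate hub-like genes."""
    return {g: sum(neighbors.values()) for g, neighbors in adjacency.items()}


# Hypothetical expression of four genes across six samples
expression = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [2, 4, 5, 8, 10, 13],
    "geneC": [1.5, 1.9, 3.2, 3.9, 5.1, 5.8],
    "geneD": [3, 1, 4, 1, 5, 2],
}
adjacency = soft_threshold_adjacency(expression)
k = connectivity(adjacency)
for gene in sorted(k, key=k.get, reverse=True):
    print(gene, round(k[gene], 3))
```

Raising correlations to a power keeps strong relationships and pushes weak ones toward zero, so the three co-varying genes form a tightly connected group while the fourth gene is nearly disconnected.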

Figure 2


How networks can be used in systems genetics studies of disease.

An important aspect of co-expression networks is that, compared to other biological networks, such as protein-protein interaction networks, they retain tissue or cell-type identity. Additionally, because these networks are generated across a set of samples, it is possible to relate the behavior of modules to other characteristics of the sample (Figure 1). For example, one of the first studies utilizing WGCNA identified liver tissue modules that correlated with differences in body weight in a mouse cross (Ghazalpour et al., 2006). Once modules are linked to complex traits, one can evaluate the specific genes in the module for their involvement by integrating additional data (e.g., genetics data) or by examining their place in the sub-network defined by the module, such as looking at module hub genes (Figure 1). Co-expression networks have been used to provide insight into a wide range of complex traits and diseases in rodents and humans such as chronic fatigue syndrome (Presson et al., 2008), CAD (McDermott-Roe et al., 2011), and diabetes (Keller et al., 2008).

Another use of co-expression networks has been to inform GWAS (Figure 1). The rationale behind this work is that causal genes from GWAS are often functionally similar and co-expression networks group genes with similar functions into modules. Therefore, it follows that using co-expression networks can help pinpoint potentially causal genes. The way some studies have implemented this strategy is to generate modules from disease-relevant tissues and then scan these modules to identify ones that are enriched for genes implicated by GWAS (located within associated loci) or for disease heritability. Pan et al. recently constructed a co-expression network from human adipose tissue and identified a module associated with waist-to-hip ratio adjusted for BMI (WHRadjBMI) (Pan et al., 2021). Additionally, regions around the 347 module genes were enriched for WHRadjBMI heritability. A transcription factor in this module was implicated as important in the context of WHRadjBMI through orthogonal data and its knockdown in human preadipocytes altered a large fraction of module genes (Pan et al., 2021). Calabrese et al. used a co-expression network generated from mouse bone to identify the ‘osteoblast functional module’ and used it to predict that MARK3 regulated osteoblast activity and BMD (Calabrese et al., 2017), which was validated using an osteoblast-specific Mark3 knockout mouse (Zhang et al., 2021b). In another recent example, Li et al. surveyed intestinal transcriptomics and cytokine abundances in 54 BXD strains fed a chow or high-fat (HF) diet (Li et al., 2023). The authors performed WGCNA on gene expression and applied association mapping to the module eigengenes to identify potentially causal drivers of inflammatory bowel disease (IBD) phenotypes in mice. These module QTLs (modQTLs) were intersected with IBD associations in the UK Biobank, where MUC4 and EPHA6 were prioritized as key genes.
Co-expression network principles have also been widely repurposed to evaluate molecular interactions beyond gene expression. For example, a recent Caenorhabditis elegans screen for argonaut protein localization was used to segregate members of the protein family into discrete modules which tracked with stress response and age (Seroussi et al., 2023).
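The module-scanning strategy described above for GWAS-implicated genes can be illustrated with a one-sided hypergeometric test, which asks whether a module contains more GWAS-implicated genes than expected by chance given a background gene set. This is a simplified sketch: the function names and toy gene identifiers are hypothetical, and heritability enrichment (as opposed to gene-set overlap) requires different methods that are not shown here.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_gwas, n_module, n_overlap):
    """P(overlap >= n_overlap) when n_module genes are drawn without
    replacement from n_background genes, n_gwas of which are GWAS-implicated."""
    upper = min(n_gwas, n_module)
    total = comb(n_background, n_module)
    tail = sum(
        comb(n_gwas, k) * comb(n_background - n_gwas, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def scan_modules(modules, gwas_genes, background):
    """Return {module: (overlap size, enrichment p-value)} for each module."""
    gwas_in_background = gwas_genes & background
    results = {}
    for name, genes in modules.items():
        genes_in_background = genes & background
        overlap = genes_in_background & gwas_in_background
        p_value = hypergeom_enrichment_p(
            len(background), len(gwas_in_background),
            len(genes_in_background), len(overlap),
        )
        results[name] = (len(overlap), p_value)
    return results

# Hypothetical example: 20 expressed genes, 4 of them at GWAS loci.
toy_background = {f"g{i}" for i in range(20)}
toy_gwas = {"g0", "g1", "g2", "g3"}
toy_modules = {
    "M1": {"g0", "g1", "g2", "g10", "g11"},
    "M2": {"g12", "g13", "g14", "g15", "g16"},
}

for module, (n_hits, p) in scan_modules(toy_modules, toy_gwas, toy_background).items():
    print(module, n_hits, round(p, 4))
```

When many modules are scanned, the resulting p-values should be adjusted for multiple testing before any module is called enriched.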

Bayesian networks

A limitation of co-expression networks is that the direction of effect between genes is missing, and drawing definitive conclusions regarding causality is nearly impossible from a statistical perspective. Bayesian network reconstruction tools, on the other hand, have been developed to begin to address this limitation.

Bayesian networks are graphical models that represent probabilistic relationships among variables (Koller and Friedman, 2009). Bayesian networks have been widely used in various domains, such as artificial intelligence, machine learning, bioinformatics, and biology due to their ability to model complex relationships among variables and make inferences under uncertainty. They have also been widely used in the context of systems genetics studies. One of the first examples was the generation of Bayesian networks using transcriptomic data from segregating mouse populations to identify gene networks contributing to a wide range of metabolic phenotypes (Zhu et al., 2004; Mehrabian et al., 2005; Chen et al., 2008). These studies demonstrated the power of using networks to disentangle the molecular basis of complex traits.
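For readers unfamiliar with the formalism, the defining property of a Bayesian network can be stated compactly. This is the standard textbook factorization, not a formula specific to any study cited here: for variables X_1, ..., X_n (for example, transcript levels) arranged in a directed acyclic graph, the joint distribution is the product of each variable's distribution conditional on its parents Pa(X_i) in the graph.

```latex
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

The directed edges encoded in Pa(X_i) are what give these networks a direction of effect, which undirected co-expression networks lack.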

Bayesian networks have been used in other ways in systems genetics studies (Figure 2). For example, Zhao et al., 2016, used tissue-specific Bayesian networks integrated with the results of GWAS for CAD to identify key driver genes for CAD. In this example, key driver genes were defined as those implicated by GWAS that were connected within a Bayesian network to known disease genes (Zhao et al., 2016). Their systems genetics framework identified tissue-specific relations among GWAS candidate genes and prioritized many genes as likely causal drivers of CAD. A similar approach was recently used to identify likely causal genes for osteoporosis (AlBarghouthi et al., 2021). In this analysis, the authors identified key drivers in a Bayesian network generated from mouse cortical bone transcriptomics data and integrated with GWAS data on BMD. A total of 66 high-priority likely causal genes for BMD were identified including Glt8d2, which was experimentally demonstrated to be involved in the regulation of BMD.
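The key driver idea described above, GWAS-implicated genes that are connected within a directed network to known disease genes, can be sketched as a neighborhood search. This is a deliberately simplified illustration and not the procedure used by Zhao et al. or Al-Barghouthi et al.; the function names, the two-step neighborhood, and the toy network are assumptions made for the example.

```python
from collections import deque

def neighborhood(edges, start, max_steps):
    """Genes reachable from `start` within `max_steps` directed edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_steps:
            continue
        for child in edges.get(node, ()):
            if child not in seen:
                seen.add(child)
                frontier.append((child, dist + 1))
    seen.discard(start)
    return seen

def rank_candidates(edges, gwas_genes, known_disease_genes, max_steps=2):
    """Rank GWAS genes by how many known disease genes lie downstream."""
    ranked = []
    for gene in sorted(gwas_genes):
        hits = neighborhood(edges, gene, max_steps) & known_disease_genes
        if hits:
            ranked.append((gene, sorted(hits)))
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked

# Hypothetical directed network: parent gene -> list of child genes.
toy_edges = {
    "geneA": ["geneB", "geneC"],
    "geneB": ["geneD"],
    "geneE": ["geneF"],
    "geneF": ["geneG"],
    "geneG": ["geneD"],
}
toy_gwas_genes = {"geneA", "geneE", "geneH"}
toy_known_genes = {"geneC", "geneD"}

print(rank_candidates(toy_edges, toy_gwas_genes, toy_known_genes))
```

A real analysis would replace the simple count with a statistical test of enrichment in each gene's neighborhood and would use a network learned from tissue-specific expression data.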

One particularly powerful Bayesian network approach that incorporates biological information (such as eQTL) is Mergeomics (Arneson et al., 2016; Shu et al., 2016), which builds on the explicit hypothesis that multi-omics modalities are functionally related and together can provide information on interconnected biological processes. Mergeomics uses only summary-level multi-omics data, which can be derived from different studies or even species. Briefly, multi-layer disease association signals are mapped to pathways or networks comprising interacting molecules to reveal pathogenic processes perturbed by individual omics variants as well as those affected by multiple omics layers. Recent applications of Mergeomics have yielded substantial insights into the tissue-specific biological processes and regulatory genes involved in individual diseases and those shared between diseases (Shu et al., 2017; von Scheidt et al., 2017; Chella Krishnan et al., 2018).

Cross-species integration

While GWAS approaches have proven powerful in both humans and model organisms, identification of conserved association signals at syntenic loci has not been widely observed. However, several notable successful examples of cross-species integration have been recently published. In one study, 60 loci were identified for ex vivo secretagogue-induced insulin secretion from isolated pancreatic islets of several hundred DO mice. Genes at these loci were significantly enriched for genes that localized to loci identified in human GWAS for diabetes-related traits (Keller et al., 2019). In a second study, integration of human lipid GWAS data with mouse liver co-expression networks from DO mice and the HMDP led to the identification of SESN1 (Sestrin1) as a gene associated with cholesterol levels (Li et al., 2020). Prior to this study, Sestrin1 had not been known to be involved in lipid metabolism. More recent analyses have also demonstrated how incorporation of molecular networks can improve translation of GWAS results across species. For example, GWAS analyses in humans and rats identified hundreds of BMI-associated genes, of which 29 were found to overlap between the two species (Wright et al., 2023). However, by integrating molecular networks into the analyses, even greater convergence of GWAS signals was observed. Taken together, such studies highlight how cross-species systems genetics studies can identify novel candidate causal genes for complex traits and the need to develop additional computational tools to better carry out bi-directional translation of findings between humans and model organisms.

Future directions

There will undoubtedly continue to be technical and computational innovations in multi-omics endeavors, and these will speed progress in the systems genetics field. For example, above we discuss the value of incorporating long-read RNA-seq into systems genetics analyses (Glinos et al., 2022). A comprehensive view of how common genetic variation affects isoform-specific changes has the potential to refine the definition of causal gene impacts beyond what changes in expression alone reveal through eQTL mapping. Further, there has been rapid progress in single-cell modalities. For example, recent technological advances have enabled widespread application of spatially resolved single-cell quantification (Moses and Pachter, 2022). While testing the robustness of these data in the context of common genetic variation presents significant challenges (discussed above with scRNA-seq), these measures expand our view of the regulatory landscape of single cells and their interactions with surrounding tissue. In addition, methods such as proteomics and metabolomics continue to advance. Lastly, artificial intelligence is already beginning to be applied to systems genetics data. Databases will continue to increase in size and number, and in silico analyses will become increasingly powerful.

Over the past decade, researchers have significantly refined mass-spectrometry-based methods to detect utilization of specific substrates using stable isotope tracing (Jang et al., 2018). Knowledge of such substrate usage at a genetic scale is appealing for defining metabolism and associated disease mechanisms. In a recent study, Akingbesote and colleagues measured liver isotope flux, transcriptomics, and proteomics in five species (Akingbesote et al., 2023). They observed ~30,000-fold variation in the usage of substrates and suggested metabolic adaptations which differ between species, for example between mice and rats.

Systems genetics data are also uniquely suited to quantifying complex interactions such as gene-by-diet responses (Nelson et al., 2022). Without measuring outcomes across diverse genetic backgrounds and environmental conditions, these potentially relevant interactions would be missed entirely. In a recent example, Hodel and colleagues analyzed two longitudinal human patient cohorts for risk factors mediating coronary heart disease (Hodel et al., 2023). Here, the authors identified a unique interaction between polygenic risk of cardiovascular incidents and infection with Fusobacterium nucleatum, highlighting the complexities of gene-by-environment (GxE) interactions.
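To make the notion of a gene-by-diet interaction concrete, the sketch below compares the effect of a genotype on a trait under two diets. With a binary environment, the interaction coefficient of the linear model trait ~ genotype + diet + genotype x diet equals the difference between the genotype slopes fitted separately within each diet, which is what the code computes. The function names and toy values are hypothetical, and no significance test is included.

```python
def slope(x, y):
    """Ordinary least-squares slope of y on x (x must not be constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def gxe_interaction(genotype, diet, trait):
    """Return (slope on diet 0, slope on diet 1, interaction effect).

    genotype: allele dosage per individual (e.g. 0, 1, 2)
    diet:     0 or 1 per individual (e.g. chow vs. high-fat)
    trait:    measured phenotype per individual
    """
    groups = {0: ([], []), 1: ([], [])}
    for g, d, t in zip(genotype, diet, trait):
        groups[d][0].append(g)
        groups[d][1].append(t)
    slope_diet0 = slope(*groups[0])
    slope_diet1 = slope(*groups[1])
    return slope_diet0, slope_diet1, slope_diet1 - slope_diet0

# Hypothetical data: the allele has no effect on chow but raises the trait
# by 2 units per copy on a high-fat diet.
toy_genotype = [0, 1, 2, 0, 1, 2]
toy_diet = [0, 0, 0, 1, 1, 1]
toy_trait = [10.0, 10.0, 10.0, 10.0, 12.0, 14.0]

print(gxe_interaction(toy_genotype, toy_diet, toy_trait))
```

If only one diet had been studied, the genotype would appear either inert or uniformly active, which is the sense in which such interactions are missed without varying both genetics and environment.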

An important challenge will be to develop better methods for curating, storing, sharing, and integrating systems genetics data. We feel that a particularly useful goal would be to combine data from the various systems genetics groups into one searchable database that is accessible to biologists. One implementation of this type of database is available through the Alliance of Genome Resources (alliancegenome.org; Alliance of Genome Resources Consortium, 2022). This web-based API contains all data from seven model organism repositories (Saccharomyces Genome Database, WormBase, FlyBase, Mouse Genome Database, the Zebrafish Information Network, Rat Genome Database, and the Gene Ontology Resource for humans), which can be queried simultaneously at the level of genes, alleles, or disease models. Beyond centralizing these data and curating them for streamlined comparisons, significant efforts will have to be applied to give all researchers the ability to query and understand outputs from systems genetics analyses. This issue spans many areas of computational analysis of biological data. Several efforts to centralize data analysis toolkits, such as Docker, GitHub, and Jupyter, have proved successful in this area; however, they still require some knowledge of computational code. Undoubtedly, natural language processing will accelerate these improvements. For example, recent guidelines and considerations for utilization of ChatGPT for coding optimization and generation have been summarized (Piccolo et al., 2023; Shue et al., 2023).

Major hurdles in applying unified analyses to common datatypes are differences in data structure, available pre-processing steps, and associated metadata. Efforts have been proposed to reconcile these, such as the implementation of FAIR data principles (Wilkinson et al., 2016). These principles were first proposed in 2016 to suggest data standards that maximize Findability, Accessibility, Interoperability, and Reusability. Given the utility of these principles in data management, these guidelines have been adopted by organizations such as the Research Data Alliance and have been proposed as critical for the success of future data mining efforts such as machine learning (Scheffler et al., 2022). The approach should also be enhanced by the incorporation of additional data modalities, such as high-resolution clinical imaging data, single-cell gene expression data, and long non-coding RNA. In addition to quantitating the steady-state levels of macromolecules in populations, it may be possible to determine their rates of synthesis and degradation and how these are regulated.

One potentially fertile area of investigation is drug action using systems genetics data. An elegant example of this application was provided by Masson and colleagues, who surveyed muscle proteomics from a diverse outbred population of mice exhibiting substantial variation in glucose metabolism profiles (Masson et al., 2023). The authors generated a pQTL footprint of proteins that co-mapped with the Matsuda index and linked these signatures to select compounds, which were then subjected to two high-throughput drug screening platforms. Using these approaches, they identified thiostrepton as a compound that impacted glucose metabolism independent of the canonical insulin signaling pathway.

Systems genetics approaches are, of course, limited by the natural variation that is captured in the cohorts that are studied. In the case of human studies, certain ethnic groups have been disproportionately studied and, thus, some results may not apply to the less studied groups. Clear racial, ethnic, and gender disparities have been described for most common diseases, emphasizing the need to incorporate more diverse populations in current research gold standards.

References

(2023) Predicting mechanisms of action at genetic loci associated with discordant effects on type 2 diabetes and abdominal fat accumulation
eLife 12:e79834.

https://doi.org/10.7554/eLife.79834
Preprint
1. Abood A
2. Mesner LD
3. Jeffery ED
4. Murali M
5. Lehe M
6. Saquing J
7. Farber CR
8. Sheynkman GM
(2023) Long-Read Proteogenomics to Connect Disease-Associated sQTLs to the Protein Isoform Effectors of Disease
bioRxiv.

https://doi.org/10.1101/2023.03.17.531557
1. Akingbesote ND
2. Leitner BP
3. Jovin DG
4. Desrouleaux R
5. Owusu D
6. Zhu W
7. Li Z
8. Pollak MN
9. Perry RJ
(2023) Gene and protein expression and metabolic flux analysis reveals metabolic scaling in liver ex vivo and in vivo
eLife 12:e78335.

https://doi.org/10.7554/eLife.78335
1. Ala U
2. Piro RM
3. Grassi E
4. Damasco C
5. Silengo L
6. Oti M
7. Provero P
8. Di Cunto F
(2008) Prediction of human disease genes by human-mouse conserved coexpression analysis
PLOS Computational Biology 4:e1000043.

https://doi.org/10.1371/journal.pcbi.1000043
1. AlBarghouthi BM
2. Farber CR
(2019) Dissecting the genetics of osteoporosis using systems approaches
Trends in Genetics 35:55–67.

https://doi.org/10.1016/j.tig.2018.10.004
1. AlBarghouthi BM
2. Mesner LD
3. Calabrese GM
4. Brooks D
5. Tommasini SM
6. Bouxsein ML
7. Horowitz MC
8. Rosen CJ
9. Nguyen K
10. Haddox S
11. Farber EA
12. Onengut-Gumuscu S
13. Pomp D
14. Farber CR
(2021) Systems genetics in diversity outbred mice inform BMD GWAS and identify determinants of bone strength
Nature Communications 12:3408.

https://doi.org/10.1038/s41467-021-23649-0
1. AlBarghouthi BM
2. Rosenow WT
3. Du KP
4. Heo J
5. Maynard R
6. Mesner L
7. Calabrese G
8. Nakasone A
9. Senwar B
10. Gerstenfeld L
11. Larner J
12. Ferguson V
13. Ackert-Bicknell C
14. Morgan E
15. Brautigan D
16. Farber CR
(2022) Transcriptome-wide association study and eQTL colocalization identify potentially causal genes responsible for human bone mineral density GWAS associations
eLife 11:e77285.

https://doi.org/10.7554/eLife.77285
1. Alliance of Genome Resources Consortium
(2022) Harmonizing model organism data in the Alliance of Genome Resources
Genetics 220:iyac022.

https://doi.org/10.1093/genetics/iyac022
1. Allis CD
2. Jenuwein T
(2016) The molecular hallmarks of epigenetic control
Nature Reviews. Genetics 17:487–500.

https://doi.org/10.1038/nrg.2016.59
(2016) Mergeomics: a web server for identifying pathological pathways, networks, and key regulators via multidimensional data integration
BMC Genomics 17:722.

https://doi.org/10.1186/s12864-016-3057-8
1. Ashbrook DG
2. Arends D
3. Prins P
4. Mulligan MK
5. Roy S
6. Williams EG
7. Lutz CM
8. Valenzuela A
9. Bohl CJ
10. Ingels JF
11. McCarty MS
12. Centeno AG
13. Hager R
14. Auwerx J
15. Lu L
16. Williams RW
(2021) A platform for experimental precision medicine: The extended BXD mouse family
Cell Systems 12:235–247.

https://doi.org/10.1016/j.cels.2020.12.002
(2014) Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability
PLOS ONE 9:e95192.

https://doi.org/10.1371/journal.pone.0095192
1. Auwerx C
2. Sadler MC
3. Woh T
4. Reymond A
5. Kutalik Z
6. Porcu E
(2023) Exploiting the mediating role of the metabolome to unravel transcript-to-phenotype associations
eLife 12:e81097.

https://doi.org/10.7554/eLife.81097
1. Aydin S
2. Pham DT
3. Zhang T
4. Keele GR
5. Skelly DA
6. Paulo JA
7. Pankratz M
8. Choi T
9. Gygi SP
10. Reinholdt LG
11. Baker CL
12. Churchill GA
13. Munger SC
(2023) Genetic dissection of the pluripotent proteome through multi-omics data integration
Cell Genomics 3:100283.

https://doi.org/10.1016/j.xgen.2023.100283
1. Baliga NS
2. Björkegren JLM
3. Boeke JD
4. Boutros M
5. Crawford NPS
6. Dudley AM
7. Farber CR
8. Jones A
9. Levey AI
10. Lusis AJ
11. Mak HC
12. Nadeau JH
13. Noyes MB
14. Petretto E
15. Seyfried NT
16. Steinmetz LM
17. Vonesch SC
(2017) The state of systems genetics in 2017
Cell Systems 4:7–15.

https://doi.org/10.1016/j.cels.2017.01.005
(2017) Genetic effects on gene expression across human tissues
Nature 550:204–213.

https://doi.org/10.1038/nature24277
1. Bennett BJ
2. Farber CR
3. Orozco L
4. Kang HM
5. Ghazalpour A
6. Siemers N
7. Neubauer M
8. Neuhaus I
9. Yordanova R
10. Guan B
11. Truong A
12. Yang W
13. He A
14. Kayne P
15. Gargalovic P
16. Kirchgessner T
17. Pan C
18. Castellani LW
19. Kostem E
20. Furlotte N
21. Drake TA
22. Eskin E
23. Lusis AJ
(2010) A high-resolution association mapping panel for the dissection of complex traits in mice
Genome Research 20:281–290.

https://doi.org/10.1101/gr.099234.109
1. Bennett BJ
2. Davis RC
3. Civelek M
4. Orozco L
5. Wu J
6. Qi H
7. Pan C
8. Packard RRS
9. Eskin E
10. Yan M
11. Kirchgessner T
12. Wang Z
13. Li X
14. Gregory JC
15. Hazen SL
16. Gargalovic PS
17. Lusis AJ
(2015) Genetic architecture of atherosclerosis in mice: a systems genetics analysis of common inbred strains
PLOS Genetics 11:e1005711.

https://doi.org/10.1371/journal.pgen.1005711
(2017) Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology
Nature Genetics 49:1560–1563.

https://doi.org/10.1038/ng.3968
1. Bryois J
2. Calini D
3. Macnair W
4. Foo L
5. Urich E
6. Ortmann W
7. Iglesias VA
8. Selvaraj S
9. Nutma E
10. Marzin M
11. Amor S
12. Williams A
13. Castelo-Branco G
14. Menon V
15. De Jager P
16. Malhotra D
(2022) Cell-type-specific cis-eQTLs in eight human brain cell types identify novel risk genes for psychiatric and neurological disorders
Nature Neuroscience 25:1104–1112.

https://doi.org/10.1038/s41593-022-01128-z
1. Buniello A
2. MacArthur JAL
3. Cerezo M
4. Harris LW
5. Hayhurst J
6. Malangone C
7. McMahon A
8. Morales J
9. Mountjoy E
10. Sollis E
11. Suveges D
12. Vrousgou O
13. Whetzel PL
14. Amode R
15. Guillen JA
16. Riat HS
17. Trevanion SJ
18. Hall P
19. Junkins H
20. Flicek P
21. Burdett T
22. Hindorff LA
23. Cunningham F
24. Parkinson H
(2019) The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019
Nucleic Acids Research 47:D1005–D1012.

https://doi.org/10.1093/nar/gky1120
(2017) A review of instrumental variable estimators for Mendelian randomization
Statistical Methods in Medical Research 26:2333–2355.

https://doi.org/10.1177/0962280215597579
1. Bycroft C
2. Freeman C
3. Petkova D
4. Band G
5. Elliott LT
6. Sharp K
7. Motyer A
8. Vukcevic D
9. Delaneau O
10. O’Connell J
11. Cortes A
12. Welsh S
13. Young A
14. Effingham M
15. McVean G
16. Leslie S
17. Allen N
18. Donnelly P
19. Marchini J
(2018) The UK Biobank resource with deep phenotyping and genomic data
Nature 562:203–209.

https://doi.org/10.1038/s41586-018-0579-z
(2017) Integrating GWAS and Co-expression Network Data Identifies Bone Mineral Density Genes SPTBN1 and MARK3 and an Osteoblast Functional Module
Cell Systems 4:46–59.

https://doi.org/10.1016/j.cels.2016.10.014
(2021) Epigenome-wide association studies: current knowledge, strategies and recommendations
Clinical Epigenetics 13:214.

https://doi.org/10.1186/s13148-021-01200-8
1. Cao Y
2. Wang Y
3. Zhou Z
4. Pan C
5. Jiang L
6. Zhou Z
7. Meng Y
8. Charugundla S
9. Li T
10. Allayee H
11. Seldin MM
12. Lusis AJ
(2022) Liver-heart cross-talk mediated by coagulation factor XI protects against heart failure
Science 377:1399–1406.

https://doi.org/10.1126/science.abn0910
1. Carey DJ
2. Fetterolf SN
3. Davis FD
4. Faucett WA
5. Kirchner HL
6. Mirshahi U
7. Murray MF
8. Smelser DT
9. Gerhard GS
10. Ledbetter DH
(2016) The Geisinger MyCode community health initiative: an electronic health record-linked biobank for precision medicine research
Genetics in Medicine 18:906–913.

https://doi.org/10.1038/gim.2015.187
(2022) Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease
Human Molecular Genetics 31:R123–R136.

https://doi.org/10.1093/hmg/ddac196
1. Chatterjee D
2. Costa CAM
3. Wang XF
4. Jevitt A
5. Huang YC
6. Deng WM
(2022) Single-cell transcriptomics identifies Keap1-Nrf2 regulated collective invasion in a Drosophila tumor model
eLife 11:e80956.

https://doi.org/10.7554/eLife.80956
1. Chella Krishnan K
2. Kurt Z
3. Barrere-Cain R
4. Sabir S
5. Das A
6. Floyd R
7. Vergnes L
8. Zhao Y
9. Che N
10. Charugundla S
11. Qi H
12. Zhou Z
13. Meng Y
14. Pan C
15. Seldin MM
16. Norheim F
17. Hui S
18. Reue K
19. Lusis AJ
20. Yang X
(2018) Integration of multi-omics data from mouse diversity panel highlights mitochondrial dysfunction in non-alcoholic fatty liver disease
Cell Systems 6:103–115.

https://doi.org/10.1016/j.cels.2017.12.006
1. Chen Y
2. Zhu J
3. Lum PY
4. Yang X
5. Pinto S
6. MacNeil DJ
7. Zhang C
8. Lamb J
9. Edwards S
10. Sieberts SK
11. Leonardson A
12. Castellini LW
13. Wang S
14. Champy MF
15. Zhang B
16. Emilsson V
17. Doss S
18. Ghazalpour A
19. Horvath S
20. Drake TA
21. Lusis AJ
22. Schadt EE
(2008) Variations in DNA elucidate molecular networks that cause disease
Nature 452:429–435.

https://doi.org/10.1038/nature06757
(2011) China Kadoorie Biobank of 0.5 million people: survey methods, baseline characteristics and long-term follow-up
International Journal of Epidemiology 40:1652–1666.

https://doi.org/10.1093/ije/dyr120
1. Chick JM
2. Munger SC
3. Simecek P
4. Huttlin EL
5. Choi K
6. Gatti DM
7. Raghupathy N
8. Svenson KL
9. Churchill GA
10. Gygi SP
(2016) Defining the consequences of genetic variation on a proteome-wide scale
Nature 534:500–505.

https://doi.org/10.1038/nature18270
(2017) Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types
Nature Genetics 49:600–605.

https://doi.org/10.1038/ng.3795
1. Churchill GA
2. Airey DC
3. Allayee H
4. Angel JM
5. Attie AD
6. Beatty J
7. Beavis WD
8. Belknap JK
9. Bennett B
10. Berrettini W
11. Bleich A
12. Bogue M
13. Broman KW
14. Buck KJ
15. Buckler E
16. Burmeister M
17. Chesler EJ
18. Cheverud JM
19. Clapcote S
20. Cook MN
21. Cox RD
22. Crabbe JC
23. Crusio WE
24. Darvasi A
25. Deschepper CF
26. Doerge RW
27. Farber CR
28. Forejt J
29. Gaile D
30. Garlow SJ
31. Geiger H
32. Gershenfeld H
33. Gordon T
34. Gu J
35. Gu W
36. de Haan G
37. Hayes NL
38. Heller C
39. Himmelbauer H
40. Hitzemann R
41. Hunter K
42. Hsu H-C
43. Iraqi FA
44. Ivandic B
45. Jacob HJ
46. Jansen RC
47. Jepsen KJ
48. Johnson DK
49. Johnson TE
50. Kempermann G
51. Kendziorski C
52. Kotb M
53. Kooy RF
54. Llamas B
55. Lammert F
56. Lassalle J-M
57. Lowenstein PR
58. Lu L
59. Lusis A
60. Manly KF
61. Marcucio R
62. Matthews D
63. Medrano JF
64. Miller DR
65. Mittleman G
66. Mock BA
67. Mogil JS
68. Montagutelli X
69. Morahan G
70. Morris DG
71. Mott R
72. Nadeau JH
73. Nagase H
74. Nowakowski RS
75. O’Hara BF
76. Osadchuk AV
77. Page GP
78. Paigen B
79. Paigen K
80. Palmer AA
81. Pan H-J
82. Peltonen-Palotie L
83. Peirce J
84. Pomp D
85. Pravenec M
86. Prows DR
87. Qi Z
88. Reeves RH
89. Roder J
90. Rosen GD
91. Schadt EE
92. Schalkwyk LC
93. Seltzer Z
94. Shimomura K
95. Shou S
96. Sillanpää MJ
97. Siracusa LD
98. Snoeck H-W
99. Spearow JL
100. Svenson K
101. Tarantino LM
102. Threadgill D
103. Toth LA
104. Valdar W
105. de Villena FP-M
106. Warden C
107. Whatley S
108. Williams RW
109. Wiltshire T
110. Yi N
111. Zhang D
112. Zhang M
113. Zou F
114. Complex Trait Consortium
(2004) The Collaborative Cross, a community resource for the genetic analysis of complex traits
Nature Genetics 36:1133–1137.

https://doi.org/10.1038/ng1104-1133
(2012) The diversity outbred mouse population
Mammalian Genome 23:713–718.

https://doi.org/10.1007/s00335-012-9414-2
1. Civelek M
2. Lusis AJ
(2014) Systems genetics approaches to understand complex traits
Nature Reviews. Genetics 15:34–48.

https://doi.org/10.1038/nrg3575
1. Claussnitzer M
2. Dankel SN
3. Kim KH
4. Quon G
5. Meuleman W
6. Haugen C
7. Glunk V
8. Sousa IS
9. Beaudry JL
10. Puviindran V
11. Abdennur NA
12. Liu J
13. Svensson PA
14. Hsu YH
15. Drucker DJ
16. Mellgren G
17. Hui CC
18. Hauner H
19. Kellis M
(2015) FTO Obesity Variant Circuitry and Adipocyte Browning in Humans
The New England Journal of Medicine 373:895–907.

https://doi.org/10.1056/NEJMoa1502214
1. Conesa A
2. Madrigal P
3. Tarazona S
4. Gomez-Cabrero D
5. Cervera A
6. McPherson A
7. Szcześniak MW
8. Gaffney DJ
9. Elo LL
10. Zhang X
11. Mortazavi A
(2016) A survey of best practices for RNA-seq data analysis
Genome Biology 17:13.

https://doi.org/10.1186/s13059-016-0881-8
1. Consortium GT
(2013) The Genotype-Tissue Expression (GTEx) project
Nature Genetics 45:580–585.
(2022) A Bayesian model selection approach to mediation analysis
PLOS Genetics 18:e1010184.

https://doi.org/10.1371/journal.pgen.1010184
1. Davis RC
2. van Nas A
3. Bennett B
4. Orozco L
5. Pan C
6. Rau CD
7. Eskin E
8. Lusis AJ
(2013) Genome-wide association mapping of blood cell traits in mice
Mammalian Genome 24:105–118.

https://doi.org/10.1007/s00335-013-9448-0
1. Didelez V
2. Sheehan N
(2007) Mendelian randomization as an instrumental variable approach to causal inference
Statistical Methods in Medical Research 16:309–330.

https://doi.org/10.1177/0962280206077743
1. Eberwine J
2. Sul JY
3. Bartfai T
4. Kim J
(2014) The promise of single-cell sequencing
Nature Methods 11:25–27.

https://doi.org/10.1038/nmeth.2769
(2009) Genetic dissection of complex traits in yeast: insights from studies of gene expression and other phenotypes in the BYxRM cross
Cold Spring Harbor Symposia on Quantitative Biology 74:145–153.

https://doi.org/10.1101/sqb.2009.74.013
1. Farber CR
2. Lusis AJ
(2008) Integrating global gene expression analysis and genetics
Advances in Genetics 60:571–601.

https://doi.org/10.1016/S0065-2660(07)00420-8
1. Farber CR
2. Bennett BJ
3. Orozco L
4. Zou W
5. Lira A
6. Kostem E
7. Kang HM
8. Furlotte N
9. Berberyan A
10. Ghazalpour A
11. Suwanwela J
12. Drake TA
13. Eskin E
14. Wang QT
15. Teitelbaum SL
16. Lusis AJ
(2011) Mouse genome-wide association and systems genetics identify Asxl2 as a regulator of bone mineral density and osteoclastogenesis
PLOS Genetics 7:e1002038.

https://doi.org/10.1371/journal.pgen.1002038
1. Feng Z
2. Duren Z
3. Xin J
4. Yuan Q
5. He Y
6. Su B
7. Wong WH
8. Wang Y
(2022) Heritability enrichment in context-specific regulatory networks improves phenotype-relevant tissue identification
eLife 11:e82535.

https://doi.org/10.7554/eLife.82535
1. Gaziano JM
2. Concato J
3. Brophy M
4. Fiore L
5. Pyarajan S
6. Breeling J
7. Whitbourne S
8. Deen J
9. Shannon C
10. Humphries D
11. Guarino P
12. Aslan M
13. Anderson D
14. LaFleur R
15. Hammond T
16. Schaa K
17. Moser J
18. Huang G
19. Muralidhar S
20. Przygodzki R
21. O’Leary TJ
(2016) Million Veteran Program: A mega-biobank to study genetic influences on health and disease
Journal of Clinical Epidemiology 70:214–223.

https://doi.org/10.1016/j.jclinepi.2015.09.016
1. Ghazalpour A
2. Doss S
3. Zhang B
4. Wang S
5. Plaisier C
6. Castellanos R
7. Brozell A
8. Schadt EE
9. Drake TA
10. Lusis AJ
11. Horvath S
12. Gibson G
(2006) Integrating genetic and network analysis to characterize genes related to mouse weight
PLOS Genetics 2:e130.

https://doi.org/10.1371/journal.pgen.0020130
1. Ghazalpour A
2. Bennett B
3. Petyuk VA
4. Orozco L
5. Hagopian R
6. Mungrue IN
7. Farber CR
8. Sinsheimer J
9. Kang HM
10. Furlotte N
11. Park CC
12. Wen PZ
13. Brewer H
14. Weitz K
15. Camp DG
16. Pan C
17. Yordanova R
18. Neuhaus I
19. Tilford C
20. Siemers N
21. Gargalovic P
22. Eskin E
23. Kirchgessner T
24. Smith DJ
25. Smith RD
26. Lusis AJ
27. Snyder M
(2011) Comparative analysis of proteome and transcriptome variation in mouse
PLOS Genetics 7:e1001393.

https://doi.org/10.1371/journal.pgen.1001393
1. Ghazalpour A
2. Bennett BJ
3. Shih D
4. Che N
5. Orozco L
6. Pan C
7. Hagopian R
8. He A
9. Kayne P
10. Yang W
11. Kirchgessner T
12. Lusis AJ
(2014) Genetic regulation of mouse liver metabolite levels
Molecular Systems Biology 10:730.

https://doi.org/10.15252/msb.20135004
(2014) Bayesian test for colocalisation between pairs of genetic association studies using summary statistics
PLOS Genetics 10:e1004383.

https://doi.org/10.1371/journal.pgen.1004383
1. Glinos DA
2. Garborcauskas G
3. Hoffman P
4. Ehsan N
5. Jiang L
6. Gokden A
7. Dai X
8. Aguet F
9. Brown KL
10. Garimella K
11. Bowers T
12. Costello M
13. Ardlie K
14. Jian R
15. Tucker NR
16. Ellinor PT
17. Harrington ED
18. Tang H
19. Snyder M
20. Juul S
21. Mohammadi P
22. MacArthur DG
23. Lappalainen T
24. Cummings BB
(2022) Transcriptome variation in human tissues revealed by long-read sequencing
Nature 608:353–359.

https://doi.org/10.1038/s41586-022-05035-y
1. Gold L
2. Ayers D
3. Bertino J
4. Bock C
5. Bock A
6. Brody E
7. Carter J
8. Cunningham V
9. Dalby A
10. Eaton B
11. Fitzwater T
12. Flather D
13. Forbes A
14. Foreman T
15. Fowler C
16. Gawande B
17. Goss M
18. Gunn M
19. Gupta S
20. Halladay D
21. Heil J
22. Heilig J
23. Hicke B
24. Husar G
25. Janjic N
26. Jarvis T
27. Jennings S
28. Katilius E
29. Keeney T
30. Kim N
31. Kaske T
32. Koch T
33. Kraemer S
34. Kroiss L
35. Le N
36. Levine D
37. Lindsey W
38. Lollo B
39. Mayfield W
40. Mehan M
41. Mehler R
42. Nelson M
43. Nelson S
44. Nieuwlandt D
45. Nikrad M
46. Ochsner U
47. Ostroff R
48. Otis M
49. Parker T
50. Pietrasiewicz S
51. Resnicow D
52. Rohloff J
53. Sanders G
54. Sattin S
55. Schneider D
56. Singer B
57. Stanton M
58. Sterkel A
59. Stewart A
60. Stratford S
61. Vaught J
62. Vrkljan M
63. Walker J
64. Watrobka M
65. Waugh S
66. Weiss A
67. Wilcox S
68. Wolfson A
69. Wolk S
70. Zhang C
71. Zichi D
(2010) Aptamer-based multiplexed proteomic technology for biomarker discovery
Nature Precedings 5:e15004.

https://doi.org/10.1038/npre.2010.4538.1
1. Hartiala J
2. Bennett BJ
3. Tang WHW
4. Wang Z
5. Stewart AFR
6. Roberts R
7. McPherson R
8. Lusis AJ
9. Hazen SL
10. Allayee H
11. CARDIoGRAM Consortium
(2014) Comparative genome-wide association studies in mice and humans for trimethylamine N-oxide, a proatherogenic metabolite of choline and L-carnitine
Arteriosclerosis, Thrombosis, and Vascular Biology 34:1307–1313.

https://doi.org/10.1161/ATVBAHA.114.303252
1. Hasin Y
2. Seldin M
3. Lusis A
(2017) Multi-omics approaches to disease
Genome Biology 18:83.

https://doi.org/10.1186/s13059-017-1215-1
1. Heinz S
2. Romanoski CE
3. Benner C
4. Allison KA
5. Kaikkonen MU
6. Orozco LD
7. Glass CK
(2013) Effect of natural genetic variation on enhancer selection and function
Nature 503:487–492.

https://doi.org/10.1038/nature12615
1. Hemani G
2. Zheng J
3. Elsworth B
4. Wade KH
5. Haberland V
6. Baird D
7. Laurin C
8. Burgess S
9. Bowden J
10. Langdon R
11. Tan VY
12. Yarmolinsky J
13. Shihab HA
14. Timpson NJ
15. Evans DM
16. Relton C
17. Martin RM
18. Davey Smith G
19. Gaunt TR
20. Haycock PC
(2018) The MR-Base platform supports systematic causal inference across the human phenome
eLife 7:e34408.

https://doi.org/10.7554/eLife.34408
1. Hirata M
2. Kamatani Y
3. Nagai A
4. Kiyohara Y
5. Ninomiya T
6. Tamakoshi A
7. Yamagata Z
8. Kubo M
9. Muto K
10. Mushiroda T
11. Murakami Y
12. Yuji K
13. Furukawa Y
14. Zembutsu H
15. Tanaka T
16. Ohnishi Y
17. Nakamura Y
18. BioBank Japan Cooperative Hospital Group
19. Matsuda K
(2017) Cross-sectional analysis of BioBank Japan clinical data: A large cohort of 200,000 patients with 47 common diseases
Journal of Epidemiology 27:S9–S21.

https://doi.org/10.1016/j.je.2016.12.003
1. Hodel F
2. Xu ZM
3. Thorball CW
4. de La Harpe R
5. Letang-Mathieu P
6. Brenner N
7. Butt J
8. Bender N
9. Waterboer T
10. Marques-Vidal PM
11. Vollenweider P
12. Vaucher J
13. Fellay J
(2023) Associations of genetic and infectious risk factors with coronary heart disease
eLife 12:e79742.

https://doi.org/10.7554/eLife.79742
1. Hormozdiari F
2. Kostem E
3. Kang EY
4. Pasaniuc B
5. Eskin E
(2014) Identifying causal variants at loci with multiple signals of association
Genetics 198:497–508.

https://doi.org/10.1534/genetics.114.167908
(2012) DNA methylation arrays as surrogate measures of cell mixture distribution
BMC Bioinformatics 13:86.

https://doi.org/10.1186/1471-2105-13-86
1. Huang JK
2. Carlin DE
3. Yu MK
4. Zhang W
5. Kreisberg JF
6. Tamayo P
7. Ideker T
(2018) Systematic evaluation of molecular networks for discovery of disease genes
Cell Systems 6:484–495.

https://doi.org/10.1016/j.cels.2018.03.001
1. Hui ST
2. Parks BW
3. Org E
4. Norheim F
5. Che N
6. Pan C
7. Castellani LW
8. Charugundla S
9. Dirks DL
10. Psychogios N
11. Neuhaus I
12. Gerszten RE
13. Kirchgessner T
14. Gargalovic PS
15. Lusis AJ
(2015) The genetic architecture of NAFLD among inbred strains of mice
eLife 4:e05607.

https://doi.org/10.7554/eLife.05607
(2018) The genetic architecture of diet-induced hepatic fibrosis in mice
Hepatology 68:2182–2196.

https://doi.org/10.1002/hep.30113
(2022) Lack of evidence for increased transcriptional noise in aged tissues
eLife 11:e80380.

https://doi.org/10.7554/eLife.80380
(2018) Metabolomics and Isotope Tracing
Cell 173:822–837.

https://doi.org/10.1016/j.cell.2018.03.055
(2014) Mendelian randomization studies in coronary artery disease
European Heart Journal 35:1917–1924.

https://doi.org/10.1093/eurheartj/ehu208
1. Jie Z
2. Xia H
3. Zhong SL
4. Feng Q
5. Li S
6. Liang S
7. Zhong H
8. Liu Z
9. Gao Y
10. Zhao H
11. Zhang D
12. Su Z
13. Fang Z
14. Lan Z
15. Li J
16. Xiao L
17. Li J
18. Li R
19. Li X
20. Li F
21. Ren H
22. Huang Y
23. Peng Y
24. Li G
25. Wen B
26. Dong B
27. Chen JY
28. Geng QS
29. Zhang ZW
30. Yang H
31. Wang J
32. Wang J
33. Zhang X
34. Madsen L
35. Brix S
36. Ning G
37. Xu X
38. Liu X
39. Hou Y
40. Jia H
41. He K
42. Kristiansen K
(2017) The gut microbiome in atherosclerotic cardiovascular disease
Nature Communications 8:845.

https://doi.org/10.1038/s41467-017-00900-1
1. Jones RC
2. Karkanias J
3. Krasnow MA
4. Pisco AO
5. Quake SR
6. Salzman J
7. Yosef N
8. Bulthaup B
9. Brown P
10. Harper W
11. Hemenez M
12. Ponnusamy R
13. Salehi A
14. Sanagavarapu BA
15. Spallino E
16. Aaron KA
17. Concepcion W
18. Gardner JM
19. Kelly B
20. Neidlinger N
21. Wang Z
22. Crasta S
23. Kolluru S
24. Morri M
25. Pisco AO
26. Tan SY
27. Travaglini KJ
28. Xu C
29. Alcántara-Hernández M
30. Almanzar N
31. Antony J
32. Beyersdorf B
33. Burhan D
34. Calcuttawala K
35. Carter MM
36. Chan CKF
37. Chang CA
38. Chang S
39. Colville A
40. Crasta S
41. Culver RN
42. Cvijović I
43. D’Amato G
44. Ezran C
45. Galdos FX
46. Gillich A
47. Goodyer WR
48. Hang Y
49. Hayashi A
50. Houshdaran S
51. Huang X
52. Irwin JC
53. Jang S
54. Juanico JV
55. Kershner AM
56. Kim S
57. Kiss B
58. Kolluru S
59. Kong W
60. Kumar ME
61. Kuo AH
62. Leylek R
63. Li B
64. Loeb GB
65. Lu WJ
66. Mantri S
67. Markovic M
68. McAlpine PL
69. de Morree A
70. Morri M
71. Mrouj K
72. Mukherjee S
73. Muser T
74. Neuhöfer P
75. Nguyen TD
76. Perez K
77. Phansalkar R
78. Pisco AO
79. Puluca N
80. Qi Z
81. Rao P
82. Raquer-McKay H
83. Schaum N
84. Scott B
85. Seddighzadeh B
86. Segal J
87. Sen S
88. Sikandar S
89. Spencer SP
90. Steffes LC
91. Subramaniam VR
92. Swarup A
93. Swift M
94. Travaglini KJ
95. Van Treuren W
96. Trimm E
97. Veizades S
98. Vijayakumar S
99. Vo KC
100. Vorperian SK
101. Wang W
102. Weinstein HNW
103. Winkler J
104. Wu TTH
105. Xie J
106. Yung AR
107. Zhang Y
108. Detweiler AM
109. Mekonen H
110. Neff NF
111. Sit RV
112. Tan M
113. Yan J
114. Bean GR
115. Charu V
116. Forgó E
117. Martin BA
118. Ozawa MG
119. Silva O
120. Tan SY
121. Toland A
122. Vemuri VNP
123. Afik S
124. Awayan K
125. Botvinnik OB
126. Byrne A
127. Chen M
128. Dehghannasiri R
129. Detweiler AM
130. Gayoso A
131. Granados AA
132. Li Q
133. Mahmoudabadi G
134. McGeever A
135. de Morree A
136. Olivieri JE
137. Park M
138. Pisco AO
139. Ravikumar N
140. Salzman J
141. Stanley G
142. Swift M
143. Tan M
144. Tan W
145. Tarashansky AJ
146. Vanheusden R
147. Vorperian SK
148. Wang P
149. Wang S
150. Xing G
151. Xu C
152. Yosef N
153. Alcántara-Hernández M
154. Antony J
155. Chan CKF
156. Chang CA
157. Colville A
158. Crasta S
159. Culver R
160. Dethlefsen L
161. Ezran C
162. Gillich A
163. Hang Y
164. Ho PY
165. Irwin JC
166. Jang S
167. Kershner AM
168. Kong W
169. Kumar ME
170. Kuo AH
171. Leylek R
172. Liu S
173. Loeb GB
174. Lu WJ
175. Maltzman JS
176. Metzger RJ
177. de Morree A
178. Neuhöfer P
179. Perez K
180. Phansalkar R
181. Qi Z
182. Rao P
183. Raquer-McKay H
184. Sasagawa K
185. Scott B
186. Sinha R
187. Song H
188. Spencer SP
189. Swarup A
190. Swift M
191. Travaglini KJ
192. Trimm E
193. Veizades S
194. Vijayakumar S
195. Wang B
196. Wang W
197. Winkler J
198. Xie J
199. Yung AR
200. Artandi SE
201. Beachy PA
202. Clarke MF
203. Giudice LC
204. Huang FW
205. Huang KC
206. Idoyaga J
207. Kim SK
208. Krasnow M
209. Kuo CS
210. Nguyen P
211. Quake SR
212. Rando TA
213. Red-Horse K
214. Reiter J
215. Relman DA
216. Sonnenburg JL
217. Wang B
218. Wu A
219. Wu SM
220. Wyss-Coray T
(2022a) The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans
Science 376:eabl4896.

https://doi.org/10.1126/science.abl4896
1. Jones LK
2. Strande NT
3. Calvo EM
4. Chen J
5. Rodriguez G
6. McCormick CZ
7. Hallquist MLG
8. Savatt JM
9. Rocha H
10. Williams MS
11. Sturm AC
12. Buchanan AH
13. Glasgow RE
14. Martin CL
15. Rahm AK
(2022b) A RE-AIM Framework Analysis of DNA-Based Population Screening: Using Implementation Science to Translate Research Into Practice in a Healthcare System
Frontiers in Genetics 13:883073.

https://doi.org/10.3389/fgene.2022.883073
1. Jurrjens AW
2. Seldin MM
3. Giles C
4. Meikle PJ
5. Drew BG
6. Calkin AC
(2023) The potential of integrating human and mouse discovery platforms to advance our understanding of cardiometabolic diseases
eLife 12:e86139.

https://doi.org/10.7554/eLife.86139
1. Kasahara K
2. Krautkramer KA
3. Org E
4. Romano KA
5. Kerby RL
6. Vivas EI
7. Mehrabian M
8. Denu JM
9. Bäckhed F
10. Lusis AJ
11. Rey FE
(2018) Interactions between Roseburia intestinalis and diet modulate atherogenesis in a murine model
Nature Microbiology 3:1461–1471.

https://doi.org/10.1038/s41564-018-0272-x
(2015) Genetics of human metabolism: an update
Human Molecular Genetics 24:R93–R101.

https://doi.org/10.1093/hmg/ddv263
1. Keane TM
2. Goodstadt L
3. Danecek P
4. White MA
5. Wong K
6. Yalcin B
7. Heger A
8. Agam A
9. Slater G
10. Goodson M
11. Furlotte NA
12. Eskin E
13. Nellåker C
14. Whitley H
15. Cleak J
16. Janowitz D
17. Hernandez-Pliego P
18. Edwards A
19. Belgard TG
20. Oliver PL
21. McIntyre RE
22. Bhomra A
23. Nicod J
24. Gan X
25. Yuan W
26. van der Weyden L
27. Steward CA
28. Bala S
29. Stalker J
30. Mott R
31. Durbin R
32. Jackson IJ
33. Czechanski A
34. Guerra-Assunção JA
35. Donahue LR
36. Reinholdt LG
37. Payseur BA
38. Ponting CP
39. Birney E
40. Flint J
41. Adams DJ
(2011) Mouse genomic variation and its effect on phenotypes and gene regulation
Nature 477:289–294.

https://doi.org/10.1038/nature10413
1. Keele GR
(2023) Which mouse multiparental population is right for your study? The Collaborative Cross inbred strains, their F1 hybrids, or the Diversity Outbred population
G3 13:jkad027.

https://doi.org/10.1093/g3journal/jkad027
1. Keller MP
2. Choi Y
3. Wang P
4. Davis DB
5. Rabaglia ME
6. Oler AT
7. Stapleton DS
8. Argmann C
9. Schueler KL
10. Edwards S
11. Steinberg HA
12. Chaibub Neto E
13. Kleinhanz R
14. Turner S
15. Hellerstein MK
16. Schadt EE
17. Yandell BS
18. Kendziorski C
19. Attie AD
(2008) A gene expression network model of type 2 diabetes links cell cycle regulation in islets with diabetes susceptibility
Genome Research 18:706–716.

https://doi.org/10.1101/gr.074914.107
1. Keller MP
2. Rabaglia ME
3. Schueler KL
4. Stapleton DS
5. Gatti DM
6. Vincent M
7. Mitok KA
8. Wang Z
9. Ishimura T
10. Simonett SP
11. Emfinger CH
12. Das R
13. Beck T
14. Kendziorski C
15. Broman KW
16. Yandell BS
17. Churchill GA
18. Attie AD
(2019) Gene loci associated with insulin secretion in islets from non-diabetic mice
The Journal of Clinical Investigation 129:4419–4432.

https://doi.org/10.1172/JCI129143
1. Kim-Hellmuth S
2. Aguet F
3. Oliva M
4. Muñoz-Aguirre M
5. Kasela S
6. Wucher V
7. Castel SE
8. Hamel AR
9. Viñuela A
10. Roberts AL
11. Mangul S
12. Wen X
13. Wang G
14. Barbeira AN
15. Garrido-Martín D
16. Nadel BB
17. Zou Y
18. Bonazzola R
19. Quan J
20. Brown A
21. Martinez-Perez A
22. Soria JM
23. GTEx Consortium
24. Getz G
25. Dermitzakis ET
26. Small KS
27. Stephens M
28. Xi HS
29. Im HK
30. Guigó R
31. Segrè AV
32. Stranger BE
33. Ardlie KG
34. Lappalainen T
(2020) Cell type-specific genetic regulation of gene expression across human tissues
Science 369:eaaz8528.

https://doi.org/10.1126/science.aaz8528
Book
1. Koller D
2. Friedman N
(2009)
Probabilistic Graphical Models: Principles and Techniques

Cambridge, Massachusetts, London, England: MIT Press.
1. Koplev S
2. Seldin M
3. Sukhavasi K
4. Ermel R
5. Pang S
6. Zeng L
7. Bankier S
8. Di Narzo A
9. Cheng H
10. Meda V
11. Ma A
12. Talukdar H
13. Cohain A
14. Amadori L
15. Argmann C
16. Houten SM
17. Franzén O
18. Mocci G
19. Meelu OA
20. Ishikawa K
21. Whatling C
22. Jain A
23. Jain RK
24. Gan LM
25. Giannarelli C
26. Roussos P
27. Hao K
28. Schunkert H
29. Michoel T
30. Ruusalepp A
31. Schadt EE
32. Kovacic JC
33. Lusis AJ
34. Björkegren JLM
(2022) A mechanistic framework for cardiometabolic and coronary artery diseases
Nature Cardiovascular Research 1:85–100.

https://doi.org/10.1038/s44161-021-00009-1
1. Kurilshikov A
2. Medina-Gomez C
3. Bacigalupe R
4. Radjabzadeh D
5. Wang J
6. Demirkan A
7. Le Roy CI
8. Raygoza Garay JA
9. Finnicum CT
10. Liu X
11. Zhernakova DV
12. Bonder MJ
13. Hansen TH
14. Frost F
15. Rühlemann MC
16. Turpin W
17. Moon J-Y
18. Kim H-N
19. Lüll K
20. Barkan E
21. Shah SA
22. Fornage M
23. Szopinska-Tokov J
24. Wallen ZD
25. Borisevich D
26. Agreus L
27. Andreasson A
28. Bang C
29. Bedrani L
30. Bell JT
31. Bisgaard H
32. Boehnke M
33. Boomsma DI
34. Burk RD
35. Claringbould A
36. Croitoru K
37. Davies GE
38. van Duijn CM
39. Duijts L
40. Falony G
41. Fu J
42. van der Graaf A
43. Hansen T
44. Homuth G
45. Hughes DA
46. Ijzerman RG
47. Jackson MA
48. Jaddoe VWV
49. Joossens M
50. Jørgensen T
51. Keszthelyi D
52. Knight R
53. Laakso M
54. Laudes M
55. Launer LJ
56. Lieb W
57. Lusis AJ
58. Masclee AAM
59. Moll HA
60. Mujagic Z
61. Qibin Q
62. Rothschild D
63. Shin H
64. Sørensen SJ
65. Steves CJ
66. Thorsen J
67. Timpson NJ
68. Tito RY
69. Vieira-Silva S
70. Völker U
71. Völzke H
72. Võsa U
73. Wade KH
74. Walter S
75. Watanabe K
76. Weiss S
77. Weiss FU
78. Weissbrod O
79. Westra H-J
80. Willemsen G
81. Payami H
82. Jonkers DMAE
83. Arias Vasquez A
84. de Geus EJC
85. Meyer KA
86. Stokholm J
87. Segal E
88. Org E
89. Wijmenga C
90. Kim H-L
91. Kaplan RC
92. Spector TD
93. Uitterlinden AG
94. Rivadeneira F
95. Franke A
96. Lerch MM
97. Franke L
98. Sanna S
99. D’Amato M
100. Pedersen O
101. Paterson AD
102. Kraaij R
103. Raes J
104. Zhernakova A
(2021) Large-scale association analyses identify host factors influencing human gut microbiome composition
Nature Genetics 53:156–165.

https://doi.org/10.1038/s41588-020-00763-1
1. Langfelder P
2. Horvath S
(2008) WGCNA: an R package for weighted correlation network analysis
BMC Bioinformatics 9:559.

https://doi.org/10.1186/1471-2105-9-559
1. Lavinsky J
2. Ge M
3. Crow AL
4. Pan C
5. Wang J
6. Salehi P
7. Myint A
8. Eskin E
9. Allayee H
10. Lusis AJ
11. Friedman RA
(2016) The genetic architecture of noise-induced hearing loss: evidence for a gene-by-environment interaction
G3: Genes, Genomes, Genetics 6:3219–3228.

https://doi.org/10.1534/g3.116.032516
(2008) Mendelian randomization: using genes as instruments for making causal inferences in epidemiology
Statistics in Medicine 27:1133–1163.

https://doi.org/10.1002/sim.3034
1. Li M
2. Zou D
3. Li Z
4. Gao R
5. Sang J
6. Zhang Y
7. Li R
8. Xia L
9. Zhang T
10. Niu G
11. Bao Y
12. Zhang Z
(2019) EWAS Atlas: a curated knowledgebase of epigenome-wide association studies
Nucleic Acids Research 47:D983–D988.

https://doi.org/10.1093/nar/gky1027
1. Li H
2. Auwerx J
(2020) Mouse systems genetics as a prelude to precision medicine
Trends in Genetics 36:259–272.

https://doi.org/10.1016/j.tig.2020.01.004
1. Li Z
2. Votava JA
3. Zajac GJM
4. Nguyen JN
5. Leyva Jaimes FB
6. Ly SM
7. Brinkman JA
8. De Giorgi M
9. Kaul S
10. Green CL
11. St Clair SL
12. Belisle SL
13. Rios JM
14. Nelson DW
15. Sorci-Thomas MG
16. Lagor WR
17. Lamming DW
18. Eric Yen C-L
19. Parks BW
(2020) Integrating Mouse and Human Genetic Data to Move beyond GWAS and Identify Causal Genes in Cholesterol Metabolism
Cell Metabolism 31:741–754.

https://doi.org/10.1016/j.cmet.2020.02.015
1. Li B
2. Ritchie MD
(2021) From GWAS to Gene: Transcriptome-Wide Association Studies and Other Methods to Functionally Understand GWAS Discoveries
Frontiers in Genetics 12:713230.

https://doi.org/10.3389/fgene.2021.713230
(2022) Integrative systems analysis identifies genetic and dietary modulators of bile acid homeostasis
Cell Metabolism 34:1594–1610.

https://doi.org/10.1016/j.cmet.2022.08.015
1. Li X
2. Morel JD
3. Benegiamo G
4. Poisson J
5. Bachmann A
6. Rapin A
7. Williams E
8. Perino A
9. Schoonjans K
10. Sleiman MB
11. Auwerx J
(2023) Genetic and Dietary Modulators of the Inflammatory Response in the Gastro-Intestinal Tract of the BXD Mouse Genetic Reference Population
eLife 12:RP87569.

https://doi.org/10.7554/eLife.87569.1
(2016) On the Dependency of Cellular Protein Levels on mRNA Abundance
Cell 165:535–550.

https://doi.org/10.1016/j.cell.2016.03.014
1. Lusk R
2. Saba LM
3. Vanderlinden LA
4. Zidek V
5. Silhavy J
6. Pravenec M
7. Hoffman PL
8. Tabakoff B
(2018) Unsupervised, statistically based systems biology approach for unraveling the genetics of complex traits: a demonstration with ethanol metabolism
Alcoholism, Clinical and Experimental Research 42:1177–1191.

https://doi.org/10.1111/acer.13763
1. Maazi H
2. Hartiala JA
3. Suzuki Y
4. Crow AL
5. Shafiei Jahani P
6. Lam J
7. Patel N
8. Rigas D
9. Han Y
10. Huang P
11. Eskin E
12. Lusis AJ
13. Gilliland FD
14. Akbari O
15. Allayee H
(2019) A GWAS approach identifies Dapp1 as a determinant of air pollution-induced airway hyperreactivity
PLOS Genetics 15:e1008528.

https://doi.org/10.1371/journal.pgen.1008528
1. Masson SWC
2. Madsen S
3. Cooke KC
4. Potter M
5. Vegas AD
6. Carroll L
7. Thillainadesan S
8. Cutler HB
9. Walder KR
10. Cooney GJ
11. Morahan G
12. Stöckli J
13. James DE
(2023) Leveraging genetic diversity to identify small molecules that reverse mouse skeletal muscle insulin resistance
eLife 12:RP86961.

https://doi.org/10.7554/eLife.86961
1. Mayhew AJ
2. Meyre D
(2017) Assessing the heritability of complex traits in humans: methodological challenges and opportunities
Current Genomics 18:332–340.

https://doi.org/10.2174/1389202918666170307161450
1. McCarroll SA
2. Murphy CT
3. Zou S
4. Pletcher SD
5. Chin CS
6. Jan YN
7. Kenyon C
8. Bargmann CI
9. Li H
(2004) Comparing genomic expression patterns across species identifies shared transcriptional profile in aging
Nature Genetics 36:197–204.

https://doi.org/10.1038/ng1291
1. McDermott-Roe C
2. Ye J
3. Ahmed R
4. Sun X-M
5. Serafín A
6. Ware J
7. Bottolo L
8. Muckett P
9. Cañas X
10. Zhang J
11. Rowe GC
12. Buchan R
13. Lu H
14. Braithwaite A
15. Mancini M
16. Hauton D
17. Martí R
18. García-Arumí E
19. Hubner N
20. Jacob H
21. Serikawa T
22. Zidek V
23. Papousek F
24. Kolar F
25. Cardona M
26. Ruiz-Meana M
27. García-Dorado D
28. Comella JX
29. Felkin LE
30. Barton PJR
31. Arany Z
32. Pravenec M
33. Petretto E
34. Sanchis D
35. Cook SA
(2011) Endonuclease G is a novel determinant of cardiac hypertrophy and mitochondrial function
Nature 478:114–118.

https://doi.org/10.1038/nature10490
1. Mehrabian M
2. Allayee H
3. Stockton J
4. Lum PY
5. Drake TA
6. Castellani LW
7. Suh M
8. Armour C
9. Edwards S
10. Lamb J
11. Lusis AJ
12. Schadt EE
(2005) Integrating genotypic and expression data in a segregating mouse population to identify 5-lipoxygenase as a susceptibility gene for obesity and bone traits
Nature Genetics 37:1224–1233.

https://doi.org/10.1038/ng1619
1. Millstein J
2. Zhang B
3. Zhu J
4. Schadt EE
(2009) Disentangling molecular relationships with a causal inference test
BMC Genetics 10:23.

https://doi.org/10.1186/1471-2156-10-23
1. Minard AY
2. Wong MKL
3. Chaudhuri R
4. Tan S-X
5. Humphrey SJ
6. Parker BL
7. Yang JY
8. Laybutt DR
9. Cooney GJ
10. Coster ACF
11. Stöckli J
12. James DE
(2016) Hyperactivation of the insulin signaling pathway improves intracellular proteostasis by coordinately up-regulating the proteostatic machinery in adipocytes
The Journal of Biological Chemistry 291:25629–25640.

https://doi.org/10.1074/jbc.M116.741140
1. Mitok KA
2. Freiberger EC
3. Schueler KL
4. Rabaglia ME
5. Stapleton DS
6. Kwiecien NW
7. Malec PA
8. Hebert AS
9. Broman AT
10. Kennedy RT
11. Keller MP
12. Coon JJ
13. Attie AD
(2018) Islet proteomics reveals genetic variation in dopamine production resulting in altered insulin secretion
The Journal of Biological Chemistry 293:5860–5877.

https://doi.org/10.1074/jbc.RA117.001102
1. Molendijk J
2. Parker BL
(2021) Proteome-wide systems genetics to identify functional regulators of complex traits
Cell Systems 12:5–22.

https://doi.org/10.1016/j.cels.2020.10.005
1. Molendijk J
2. Blazev R
3. Mills RJ
4. Ng YK
5. Watt KI
6. Chau D
7. Gregorevic P
8. Crouch PJ
9. Hilton JBW
10. Lisowski L
11. Zhang P
12. Reue K
13. Lusis AJ
14. Hudson JE
15. James DE
16. Seldin MM
17. Parker BL
(2022) Proteome-wide systems genetics identifies UFMylation as a regulator of skeletal muscle function
eLife 11:e82951.

https://doi.org/10.7554/eLife.82951
(2023) Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line
eLife 12:e83606.

https://doi.org/10.7554/eLife.83606
1. Morrissey C
2. Grieve IC
3. Heinig M
4. Atanur S
5. Petretto E
6. Pravenec M
7. Hubner N
8. Aitman TJ
(2011) Integrated genomic approaches to identification of candidate genes underlying metabolic and cardiovascular phenotypes in the spontaneously hypertensive rat
Physiological Genomics 43:1207–1218.

https://doi.org/10.1152/physiolgenomics.00210.2010
1. Moses L
2. Pachter L
(2022) Museum of spatial transcriptomics
Nature Methods 19:534–546.

https://doi.org/10.1038/s41592-022-01409-2
1. Mund A
2. Brunner AD
3. Mann M
(2022) Unbiased spatial proteomics with single-cell resolution in tissues
Molecular Cell 82:2335–2349.

https://doi.org/10.1016/j.molcel.2022.05.022
1. Nadeau JH
2. Dudley AM
(2011) Genetics: systems genetics
Science 331:1015–1016.

https://doi.org/10.1126/science.1203869
1. Nagai A
2. Hirata M
3. Kamatani Y
4. Muto K
5. Matsuda K
6. Kiyohara Y
7. Ninomiya T
8. Tamakoshi A
9. Yamagata Z
10. Mushiroda T
11. Murakami Y
12. Yuji K
13. Furukawa Y
14. Zembutsu H
15. Tanaka T
16. Ohnishi Y
17. Nakamura Y
18. Kubo M
19. BioBank Japan Cooperative Hospital Group
(2017) Overview of the BioBank Japan project: study design and profile
Journal of Epidemiology 27:S2–S8.

https://doi.org/10.1016/j.je.2016.12.005
1. Needham EJ
2. Hingst JR
3. Parker BL
4. Morrison KR
5. Yang G
6. Onslev J
7. Kristensen JM
8. Højlund K
9. Ling NXY
10. Oakhill JS
11. Richter EA
12. Kiens B
13. Petersen J
14. Pehmøller C
15. James DE
16. Wojtaszewski JFP
17. Humphrey SJ
(2022) Personalized phosphoproteomics identifies functional signaling
Nature Biotechnology 40:576–584.

https://doi.org/10.1038/s41587-021-01099-9
1. Nelson ME
2. Madsen S
3. Cooke KC
4. Fritzen AM
5. Thorius IH
6. Masson SWC
7. Carroll L
8. Weiss FC
9. Seldin MM
10. Potter M
11. Hocking SL
12. Fazakerley DJ
13. Brandon AE
14. Thillainadesan S
15. Senior AM
16. Cooney GJ
17. Stöckli J
18. James DE
(2022) Systems-level analysis of insulin action in mouse strains provides insight into tissue- and pathway-specific interactions that drive insulin resistance
Cell Metabolism 34:227–239.

https://doi.org/10.1016/j.cmet.2021.12.013
1. Olsen TK
2. Baryawno N
(2018) Introduction to Single-Cell RNA Sequencing
Current Protocols in Molecular Biology 122:e57.

https://doi.org/10.1002/cpmb.57
1. Orozco LD
2. Morselli M
3. Rubbi L
4. Guo W
5. Go J
6. Shi H
7. Lopez D
8. Furlotte NA
9. Bennett BJ
10. Farber CR
11. Ghazalpour A
12. Zhang MQ
13. Bahous R
14. Rozen R
15. Lusis AJ
16. Pellegrini M
(2015) Epigenome-wide association of liver methylation patterns and complex metabolic traits in mice
Cell Metabolism 21:905–917.

https://doi.org/10.1016/j.cmet.2015.04.025
1. Pan DZ
2. Miao Z
3. Comenho C
4. Rajkumar S
5. Koka A
6. Lee SHT
7. Alvarez M
8. Kaminska D
9. Ko A
10. Sinsheimer JS
11. Mohlke KL
12. Mancuso N
13. Muñoz-Hernandez LL
14. Herrera-Hernandez M
15. Tusié-Luna MT
16. Aguilar-Salinas C
17. Pietiläinen KH
18. Pihlajamäki J
19. Laakso M
20. Garske KM
21. Pajukanta P
(2021) Correction to: Identification of TBX15 as an adipose master trans regulator of abdominal obesity genes
Genome Medicine 13:139.

https://doi.org/10.1186/s13073-021-00954-3
1. Parker BL
2. Calkin AC
3. Seldin MM
4. Keating MF
5. Tarling EJ
6. Yang P
7. Moody SC
8. Liu Y
9. Zerenturk EJ
10. Needham EJ
11. Miller ML
12. Clifford BL
13. Morand P
14. Watt MJ
15. Meex RCR
16. Peng K-Y
17. Lee R
18. Jayawardana K
19. Pan C
20. Mellett NA
21. Weir JM
22. Lazarus R
23. Lusis AJ
24. Meikle PJ
25. James DE
26. de Aguiar Vallim TQ
27. Drew BG
(2019) An integrative systems genetic analysis of mammalian lipid metabolism
Nature 567:187–193.

https://doi.org/10.1038/s41586-019-0984-y
1. Parks BW
2. Nam E
3. Org E
4. Kostem E
5. Norheim F
6. Hui ST
7. Pan C
8. Civelek M
9. Rau CD
10. Bennett BJ
11. Mehrabian M
12. Ursell LK
13. He A
14. Castellani LW
15. Zinker B
16. Kirby M
17. Drake TA
18. Drevon CA
19. Knight R
20. Gargalovic P
21. Kirchgessner T
22. Eskin E
23. Lusis AJ
(2013) Genetic control of obesity and gut microbiota composition in response to high-fat, high-sucrose diet in mice
Cell Metabolism 17:141–152.

https://doi.org/10.1016/j.cmet.2012.12.007
1. Peirce JL
2. Lu L
3. Gu J
4. Silver LM
5. Williams RW
(2004) A new set of BXD recombinant inbred lines from advanced intercross populations in mice
BMC Genetics 5:7.

https://doi.org/10.1186/1471-2156-5-7
Preprint
(2023) Many Bioinformatics Programming Tasks Can Be Automated with ChatGPT
arXiv.

https://arxiv.org/abs/2303.13528
1. Pravenec M
2. Klír P
3. Kren V
4. Zicha J
5. Kunes J
(1989)
An analysis of spontaneous hypertension in spontaneously hypertensive rats by means of new recombinant inbred strains

Journal of Hypertension 7:217–221.
1. Pravenec M
2. Křen V
3. Landa V
4. Mlejnek P
5. Musilová A
6. Šilhavý J
7. Šimáková M
8. Zídek V
(2014) Recent progress in the genetics of spontaneously hypertensive rats
Physiological Research 63:S1–S8.

https://doi.org/10.33549/physiolres.932622
1. Pravenec M
2. Saba LM
3. Zídek V
4. Landa V
5. Mlejnek P
6. Šilhavý J
7. Šimáková M
8. Strnad H
9. Trnovská J
10. Škop V
11. Hüttl M
12. Marková I
13. Oliyarnyk O
14. Malínská H
15. Kazdová L
16. Smith H
17. Tabakoff B
(2018) Systems genetic analysis of brown adipose tissue function
Physiological Genomics 50:52–66.

https://doi.org/10.1152/physiolgenomics.00091.2017
1. Presson AP
2. Sobel EM
3. Papp JC
4. Suarez CJ
5. Whistler T
6. Rajeevan MS
7. Vernon SD
8. Horvath S
(2008) Integrated weighted gene co-expression network analysis with an application to chronic fatigue syndrome
BMC Systems Biology 2:95.

https://doi.org/10.1186/1752-0509-2-95
1. Rau CD
2. Wang J
3. Avetisyan R
4. Romay MC
5. Martin L
6. Ren S
7. Wang Y
8. Lusis AJ
(2015) Mapping genetic contributions to cardiac pathology induced by Beta-adrenergic stimulation in mice
Circulation. Cardiovascular Genetics 8:40–49.

https://doi.org/10.1161/CIRCGENETICS.113.000732
1. Rhoads A
2. Au KF
(2015) PacBio sequencing and its applications
Genomics, Proteomics & Bioinformatics 13:278–289.

https://doi.org/10.1016/j.gpb.2015.08.002
1. Riordan JD
2. Nadeau JH
(2017) From peas to disease: modifier genes, network resilience, and the genetics of health
American Journal of Human Genetics 101:177–191.

https://doi.org/10.1016/j.ajhg.2017.06.004
1. Roden DM
2. Pulley JM
3. Basford MA
4. Bernard GR
5. Clayton EW
6. Balser JR
7. Masys DR
(2008) Development of a large-scale de-identified DNA biobank to enable personalized medicine
Clinical Pharmacology and Therapeutics 84:362–369.

https://doi.org/10.1038/clpt.2008.89
1. Saba LM
2. Flink SC
3. Vanderlinden LA
4. Israel Y
5. Tampier L
6. Colombo G
7. Kiianmaa K
8. Bell RL
9. Printz MP
10. Flodman P
11. Koob G
12. Richardson HN
13. Lombardo J
14. Hoffman PL
15. Tabakoff B
(2015) The sequenced rat brain transcriptome--its use in identifying networks predisposing alcohol consumption
The FEBS Journal 282:3556–3578.

https://doi.org/10.1111/febs.13358
1. Salas LA
2. Zhang Z
3. Koestler DC
4. Butler RA
5. Hansen HM
6. Molinaro AM
7. Wiencke JK
8. Kelsey KT
9. Christensen BC
(2022) Enhanced cell deconvolution of peripheral blood using DNA methylation for high-resolution immune profiling
Nature Communications 13:761.

https://doi.org/10.1038/s41467-021-27864-7
(2019) An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings
International Journal of Epidemiology 48:713–727.

https://doi.org/10.1093/ije/dyy262
1. Sasani TA
2. Ashbrook DG
3. Beichman AC
4. Lu L
5. Palmer AA
6. Williams RW
7. Pritchard JK
8. Harris K
(2022) A natural mutator allele shapes mutation spectrum variation in mice
Nature 605:497–502.

https://doi.org/10.1038/s41586-022-04701-5
(2019) High-diversity mouse populations for complex traits
Trends in Genetics 35:501–514.

https://doi.org/10.1016/j.tig.2019.04.003
1. Schadt EE
2. Lamb J
3. Yang X
4. Zhu J
5. Edwards S
6. Guhathakurta D
7. Sieberts SK
8. Monks S
9. Reitman M
10. Zhang C
11. Lum PY
12. Leonardson A
13. Thieringer R
14. Metzger JM
15. Yang L
16. Castle J
17. Zhu H
18. Kash SF
19. Drake TA
20. Sachs A
21. Lusis AJ
(2005) An integrative genomics approach to infer causal associations between gene expression and disease
Nature Genetics 37:710–717.

https://doi.org/10.1038/ng1589
1. Scheffler M
2. Aeschlimann M
3. Albrecht M
4. Bereau T
5. Bungartz H-J
6. Felser C
7. Greiner M
8. Groß A
9. Koch CT
10. Kremer K
11. Nagel WE
12. Scheidgen M
13. Wöll C
14. Draxl C
(2022) FAIR data enabling new horizons for materials research
Nature 604:635–642.

https://doi.org/10.1038/s41586-022-04501-x
1. Schughart K
2. Williams RW
(2017)
Systems Genetics: Methods and Protocols

New York, NY: Springer.

https://doi.org/10.1007/978-1-4939-6427-7
1. Schwartz MLB
2. McCormick CZ
3. Lazzeri AL
4. Lindbuchler DM
5. Hallquist MLG
6. Manickam K
7. Buchanan AH
8. Rahm AK
9. Giovanni MA
10. Frisbie L
11. Flansburg CN
12. Davis FD
13. Sturm AC
14. Nicastro C
15. Lebo MS
16. Mason-Suares H
17. Mahanta LM
18. Carey DJ
19. Williams JL
20. Williams MS
21. Ledbetter DH
22. Faucett WA
23. Murray MF
(2018) A model for genome-first care: returning secondary genomic findings to participants and their healthcare providers in a large research cohort
American Journal of Human Genetics 103:328–337.

https://doi.org/10.1016/j.ajhg.2018.07.009
1. Seldin MM
2. Koplev S
3. Rajbhandari P
4. Vergnes L
5. Rosenberg GM
6. Meng Y
7. Pan C
8. Phuong TMN
9. Gharakhanian R
10. Che N
11. Mäkinen S
12. Shih DM
13. Civelek M
14. Parks BW
15. Kim ED
16. Norheim F
17. Chella Krishnan K
18. Hasin-Brumshtein Y
19. Mehrabian M
20. Laakso M
21. Drevon CA
22. Koistinen HA
23. Tontonoz P
24. Reue K
25. Cantor RM
26. Björkegren JLM
27. Lusis AJ
(2018) A strategy for discovery of endocrine interactions with application to whole-body metabolism
Cell Metabolism 27:1138–1155.

https://doi.org/10.1016/j.cmet.2018.03.015
1. Seldin M
2. Yang X
3. Lusis AJ
(2019) Systems genetics applications in metabolism research
Nature Metabolism 1:1038–1050.

https://doi.org/10.1038/s42255-019-0132-x
1. Senko AN
2. Overall RW
3. Silhavy J
4. Mlejnek P
5. Malínská H
6. Hüttl M
7. Marková I
8. Fabel KS
9. Lu L
10. Stuchlik A
11. Williams RW
12. Pravenec M
13. Kempermann G
(2022) Systems genetics in the rat HXB/BXH family identifies Tti2 as a pleiotropic quantitative trait gene for adult hippocampal neurogenesis and serum glucose
PLOS Genetics 18:e1009638.

https://doi.org/10.1371/journal.pgen.1009638
1. Seroussi U
2. Lugowski A
3. Wadi L
4. Lao RX
5. Willis AR
6. Zhao W
7. Sundby AE
8. Charlesworth AG
9. Reinke AW
10. Claycomb JM
(2023) A comprehensive survey of C. elegans argonaute proteins reveals organism-wide gene regulatory networks and functions
eLife 12:e83853.

https://doi.org/10.7554/eLife.83853
1. Shu L
2. Zhao Y
3. Kurt Z
4. Byars SG
5. Tukiainen T
6. Kettunen J
7. Orozco LD
8. Pellegrini M
9. Lusis AJ
10. Ripatti S
11. Zhang B
12. Inouye M
13. Mäkinen V-P
14. Yang X
(2016) Mergeomics: multidimensional data integration to identify pathogenic perturbations to biological systems
BMC Genomics 17:874.

https://doi.org/10.1186/s12864-016-3198-9
1. Shu L
2. Chan KHK
3. Zhang G
4. Huan T
5. Kurt Z
6. Zhao Y
7. Codoni V
8. Trégouët D-A
9. Yang J
10. Wilson JG
11. Luo X
12. Levy D
13. Lusis AJ
14. Liu S
15. Yang X
16. Cardiogenics Consortium
(2017) Shared genetic regulatory networks for cardiovascular disease and type 2 diabetes in multiple populations of diverse ethnicities in the United States
PLOS Genetics 13:e1007040.

https://doi.org/10.1371/journal.pgen.1007040
- PubMed
- Google Scholar
1. Shue E
2. Liu L
3. Li B
4. Feng Z
5. Li X
6. Hu G
(2023) Empowering beginners in bioinformatics with ChatGPT
Quantitative Biology 11:105–108.

https://doi.org/10.15302/j-qb-023-0327
- PubMed
- Google Scholar
(2005) Clusters of co-expressed genes in mammalian genomes are conserved by natural selection
Molecular Biology and Evolution 22:767–775.

https://doi.org/10.1093/molbev/msi062
- PubMed
- Google Scholar
(2016) Genetic background limits generalizability of genotype-phenotype relationships
Neuron 91:1253–1259.

https://doi.org/10.1016/j.neuron.2016.08.013
- PubMed
- Google Scholar
1. Skelly DA
2. Czechanski A
3. Byers C
4. Aydin S
5. Spruce C
6. Olivier C
7. Choi K
8. Gatti DM
9. Raghupathy N
10. Keele GR
11. Stanton A
12. Vincent M
13. Dion S
14. Greenstein I
15. Pankratz M
16. Porter DK
17. Martin W
18. O’Connor C
19. Qin W
20. Harrill AH
21. Choi T
22. Churchill GA
23. Munger SC
24. Baker CL
25. Reinholdt LG
(2020) Mapping the effects of genetic variation on chromatin state and gene expression reveals loci that control ground state pluripotency
Cell Stem Cell 27:459–469.

https://doi.org/10.1016/j.stem.2020.07.005
- PubMed
- Google Scholar
1. Smith GD
2. Ebrahim S
(2003) “Mendelian randomization”: can genetic epidemiology contribute to understanding environmental determinants of disease?
International Journal of Epidemiology 32:1–22.

https://doi.org/10.1093/ije/dyg070
- PubMed
- Google Scholar
1. Solberg Woods LC
2. Palmer AA
(2019) Using heterogeneous stocks for fine-mapping genetically complex traits
Methods in Molecular Biology 2018:233–247.

https://doi.org/10.1007/978-1-4939-9581-3_11
- PubMed
- Google Scholar
(2019) RNA sequencing: the teenage years
Nature Reviews. Genetics 20:631–656.

https://doi.org/10.1038/s41576-019-0150-2
- PubMed
- Google Scholar
1. Steinbock LJ
2. Radenovic A
(2015) The emergence of nanopores in next-generation sequencing
Nanotechnology 26:074003.

https://doi.org/10.1088/0957-4484/26/7/074003
- PubMed
- Google Scholar
(2018) Large-scale investigation of the reasons why potentially important genes are ignored
PLOS Biology 16:e2006643.

https://doi.org/10.1371/journal.pbio.2006643
- PubMed
- Google Scholar
1. Svenson KL
2. Gatti DM
3. Valdar W
4. Welsh CE
5. Cheng R
6. Chesler EJ
7. Palmer AA
8. McMillan L
9. Churchill GA
(2012) High-resolution genetic mapping using the Mouse Diversity outbred population
Genetics 190:437–447.

https://doi.org/10.1534/genetics.111.132597
- PubMed
- Google Scholar
(2019) Networking in biology: the hybrid rat diversity panel
Methods in Molecular Biology 2018:213–231.

https://doi.org/10.1007/978-1-4939-9581-3_10
- PubMed
- Google Scholar
1. Tabassum R
2. Rämö JT
3. Ripatti P
4. Koskela JT
5. Kurki M
6. Karjalainen J
7. Palta P
8. Hassan S
9. Nunez-Fontarnau J
10. Kiiskinen TTJ
11. Söderlund S
12. Matikainen N
13. Gerl MJ
14. Surma MA
15. Klose C
16. Stitziel NO
17. Laivuori H
18. Havulinna AS
19. Service SK
20. Salomaa V
21. Pirinen M
22. FinnGen Project
23. Jauhiainen M
24. Daly MJ
25. Freimer NB
26. Palotie A
27. Taskinen M-R
28. Simons K
29. Ripatti S
(2019) Genetic architecture of human plasma lipidome and its link to cardiovascular disease
Nature Communications 10:4329.

https://doi.org/10.1038/s41467-019-11954-8
- PubMed
- Google Scholar
(2019) Intestinal Microbiota in Cardiovascular Health and Disease: JACC State-of-the-Art Review
Journal of the American College of Cardiology 73:2089–2105.

https://doi.org/10.1016/j.jacc.2019.03.024
- PubMed
- Google Scholar
(1973) Genetic analysis of resistance to cadmium-induced testicular damage in mice
Experimental Biology and Medicine 143:629–633.

https://doi.org/10.3181/00379727-143-37380
- Google Scholar
1. Tuominen I
2. Fuqua BK
3. Pan C
4. Renaud N
5. Wroblewski K
6. Civelek M
7. Clerkin K
8. Asaryan A
9. Haroutunian SG
10. Loureiro J
11. Borawski J
12. Roma G
13. Knehr J
14. Carbone W
15. French S
16. Parks BW
17. Hui ST
18. Mehrabian M
19. Magyar C
20. Cantor RM
21. Ukomadu C
22. Lusis AJ
23. Beaven SW
(2021) The genetic architecture of carbon tetrachloride-induced liver fibrosis in mice
Cellular and Molecular Gastroenterology and Hepatology 11:199–220.

https://doi.org/10.1016/j.jcmgh.2020.08.010
- PubMed
- Google Scholar
1. Vanderlinden LA
2. Saba LM
3. Printz MP
4. Flodman P
5. Koob G
6. Richardson HN
7. Hoffman PL
8. Tabakoff B
(2014) Is the alcohol deprivation effect genetically mediated? Studies with HXB/BXH recombinant inbred rat strains
Alcoholism, Clinical and Experimental Research 38:2148–2157.

https://doi.org/10.1111/acer.12471
- PubMed
- Google Scholar
1. Voigt B
2. Kuramoto T
3. Mashimo T
4. Tsurumi T
5. Sasaki Y
6. Hokao R
7. Serikawa T
(2008) Evaluation of LEXF/FXLE rat recombinant inbred strains for genetic dissection of complex traits
Physiological Genomics 32:335–342.

https://doi.org/10.1152/physiolgenomics.00158.2007
- PubMed
- Google Scholar
1. von Scheidt M
2. Zhao Y
3. Kurt Z
4. Pan C
5. Zeng L
6. Yang X
7. Schunkert H
8. Lusis AJ
(2017) Applications and limitations of mouse models for understanding human atherosclerosis
Cell Metabolism 25:248–261.

https://doi.org/10.1016/j.cmet.2016.11.001
- PubMed
- Google Scholar
1. Wang X
2. Pandey AK
3. Mulligan MK
4. Williams EG
5. Mozhui K
6. Li Z
7. Jovaisaite V
8. Quarles LD
9. Xiao Z
10. Huang J
11. Capra JA
12. Chen Z
13. Taylor WL
14. Bastarache L
15. Niu X
16. Pollard KS
17. Ciobanu DC
18. Reznik AO
19. Tishkov AV
20. Zhulin IB
21. Peng J
22. Nelson SF
23. Denny JC
24. Auwerx J
25. Lu L
26. Williams RW
(2016) Joint mouse-human phenome-wide association to test gene function and disease risk
Nature Communications 7:10464.

https://doi.org/10.1038/ncomms10464
- PubMed
- Google Scholar
1. Wang D
2. Eraslan B
3. Wieland T
4. Hallström B
5. Hopf T
6. Zolg DP
7. Zecha J
8. Asplund A
9. Li L-H
10. Meng C
11. Frejno M
12. Schmidt T
13. Schnatbaum K
14. Wilhelm M
15. Ponten F
16. Uhlen M
17. Gagneur J
18. Hahne H
19. Kuster B
(2019) A deep proteome and transcriptome abundance atlas of 29 healthy human tissues
Molecular Systems Biology 15:e8503.

https://doi.org/10.15252/msb.20188503
- PubMed
- Google Scholar
1. Weissman AM
(1994)
The T-cell antigen receptor: a multisubunit signaling complex

Chemical Immunology 59:1–18.
- PubMed
- Google Scholar
1. Wilkinson MD
2. Dumontier M
3. Aalbersberg IJJ
4. Appleton G
5. Axton M
6. Baak A
7. Blomberg N
8. Boiten J-W
9. da Silva Santos LB
10. Bourne PE
11. Bouwman J
12. Brookes AJ
13. Clark T
14. Crosas M
15. Dillo I
16. Dumon O
17. Edmunds S
18. Evelo CT
19. Finkers R
20. Gonzalez-Beltran A
21. Gray AJG
22. Groth P
23. Goble C
24. Grethe JS
25. Heringa J
26. ’t Hoen PAC
27. Hooft R
28. Kuhn T
29. Kok R
30. Kok J
31. Lusher SJ
32. Martone ME
33. Mons A
34. Packer AL
35. Persson B
36. Rocca-Serra P
37. Roos M
38. van Schaik R
39. Sansone S-A
40. Schultes E
41. Sengstag T
42. Slater T
43. Strawn G
44. Swertz MA
45. Thompson M
46. van der Lei J
47. van Mulligen E
48. Velterop J
49. Waagmeester A
50. Wittenburg P
51. Wolstencroft K
52. Zhao J
53. Mons B
(2016) The FAIR Guiding Principles for scientific data management and stewardship
Scientific Data 3:160018.

https://doi.org/10.1038/sdata.2016.18
- PubMed
- Google Scholar
1. Williams RW
2. Williams EG
(2017) Resources for Systems Genetics
Methods in Molecular Biology 1488:3–29.

https://doi.org/10.1007/978-1-4939-6427-7_1
- PubMed
- Google Scholar
1. Williams EG
2. Pfister N
3. Roy S
4. Statzer C
5. Haverty J
6. Ingels J
7. Bohl C
8. Hasan M
9. Čuklina J
10. Bühlmann P
11. Zamboni N
12. Lu L
13. Ewald CY
14. Williams RW
15. Aebersold R
(2022) Multiomic profiling of the liver across diets and age in a diverse mouse population
Cell Systems 13:43–57.

https://doi.org/10.1016/j.cels.2021.09.005
- PubMed
- Google Scholar
1. Wishart DS
(2019) Metabolomics for investigating physiological and pathophysiological processes
Physiological Reviews 99:1819–1875.

https://doi.org/10.1152/physrev.00035.2018
- PubMed
- Google Scholar
1. Woolf B
2. Zagkos L
3. Gill D
(2022) TwoStepCisMR: A Novel Method and R Package for Attenuating Bias in cis-Mendelian Randomization Analyses
Genes 13:1541.

https://doi.org/10.3390/genes13091541
- PubMed
- Google Scholar
1. Wright SN
2. Leger BS
3. Rosenthal SB
4. Liu SN
5. Jia T
6. Chitre AS
7. Polesskaya O
8. Holl K
9. Gao J
10. Cheng R
11. Garcia Martinez A
12. George A
13. Gileta AF
14. Han W
15. Netzley AH
16. King CP
17. Lamparelli A
18. Martin C
19. St Pierre CL
20. Wang T
21. Bimschleger H
22. Richards J
23. Ishiwari K
24. Chen H
25. Flagel SB
26. Meyer P
27. Robinson TE
28. Solberg Woods LC
29. Kreisberg JF
30. Ideker T
31. Palmer AA
(2023) Genome-wide association studies of human and rat BMI converge on synapse, epigenome, and hormone signaling networks
Cell Reports 42:112873.

https://doi.org/10.1016/j.celrep.2023.112873
- PubMed
- Google Scholar
1. Xiao H
2. Bozi LHM
3. Sun Y
4. Riley CL
5. Philip VM
6. Chen M
7. Li J
8. Zhang T
9. Mills EL
10. Emont MP
11. Sun W
12. Reddy A
13. Garrity R
14. Long J
15. Becher T
16. Vitas LP
17. Laznik-Bogoslavski D
18. Ordonez M
19. Liu X
20. Chen X
21. Wang Y
22. Liu W
23. Tran N
24. Liu Y
25. Zhang Y
26. Cypess AM
27. White AP
28. He Y
29. Deng R
30. Schöder H
31. Paulo JA
32. Jedrychowski MP
33. Banks AS
34. Tseng Y-H
35. Cohen P
36. Tsai LT
37. Rosen ED
38. Klein S
39. Chondronikola M
40. McAllister FE
41. Van Bruggen N
42. Huttlin EL
43. Spiegelman BM
44. Churchill GA
45. Gygi SP
46. Chouchani ET
(2022) Architecture of the outbred brown fat proteome defines regulators of metabolic physiology
Cell 185:4654–4673.

https://doi.org/10.1016/j.cell.2022.10.003
- PubMed
- Google Scholar
1. Yu Q
2. Liu X
3. Keller MP
4. Navarrete-Perea J
5. Zhang T
6. Fu S
7. Vaites LP
8. Shuken SR
9. Schmid E
10. Keele GR
11. Li J
12. Huttlin EL
13. Rashan EH
14. Simcox J
15. Churchill GA
16. Schweppe DK
17. Attie AD
18. Paulo JA
19. Gygi SP
(2023) Sample multiplexing-based targeted pathway proteomics with real-time analytics reveals the impact of genetic variation on protein expression
Nature Communications 14:555.

https://doi.org/10.1038/s41467-023-36269-7
- PubMed
- Google Scholar
1. Zeng P
2. Shao Z
3. Zhou X
(2021) Statistical methods for mediation analysis in the era of high-throughput genomics: Current successes and future challenges
Computational and Structural Biotechnology Journal 19:3209–3224.

https://doi.org/10.1016/j.csbj.2021.05.042
- PubMed
- Google Scholar
1. Zhang Q
2. Mesner LD
3. Calabrese GM
4. Dirckx N
5. Li Z
6. Verardo A
7. Yang Q
8. Tower RJ
9. Faugere MC
10. Farber CR
11. Clemens TL
(2021a) Genomic variants within chromosome 14q32.32 regulate bone mass through MARK3 signaling in osteoblasts
The Journal of Clinical Investigation 131:e142580.

https://doi.org/10.1172/JCI142580
- PubMed
- Google Scholar
1. Zhang K
2. Hocker JD
3. Miller M
4. Hou X
5. Chiou J
6. Poirion OB
7. Qiu Y
8. Li YE
9. Gaulton KJ
10. Wang A
11. Preissl S
12. Ren B
(2021b) A single-cell atlas of chromatin accessibility in the human genome
Cell 184:5985–6001.

https://doi.org/10.1016/j.cell.2021.10.024
- PubMed
- Google Scholar
1. Zhao Y
2. Chen J
3. Freudenberg JM
4. Meng Q
5. Rajpal DK
6. Yang X
(2016) Network-based identification and prioritization of key regulators of coronary artery disease loci
Arteriosclerosis, Thrombosis, and Vascular Biology 36:928–941.

https://doi.org/10.1161/ATVBAHA.115.306725
- PubMed
- Google Scholar
1. Zhou X
2. Crow AL
3. Hartiala J
4. Spindler TJ
5. Ghazalpour A
6. Barsky LW
7. Bennett BJ
8. Parks BW
9. Eskin E
10. Jain R
11. Epstein JA
12. Lusis AJ
13. Adams GB
14. Allayee H
(2015) The genetic landscape of hematopoietic stem cell frequency in mice
Stem Cell Reports 5:125–138.

https://doi.org/10.1016/j.stemcr.2015.05.008
- PubMed
- Google Scholar
1. Zhou W
2. Kanai M
3. Wu K-HH
4. Rasheed H
5. Tsuo K
6. Hirbo JB
7. Wang Y
8. Bhattacharya A
9. Zhao H
10. Namba S
11. Surakka I
12. Wolford BN
13. Lo Faro V
14. Lopera-Maya EA
15. Läll K
16. Favé M-J
17. Partanen JJ
18. Chapman SB
19. Karjalainen J
20. Kurki M
21. Maasha M
22. Brumpton BM
23. Chavan S
24. Chen T-T
25. Daya M
26. Ding Y
27. Feng Y-CA
28. Guare LA
29. Gignoux CR
30. Graham SE
31. Hornsby WE
32. Ingold N
33. Ismail SI
34. Johnson R
35. Laisk T
36. Lin K
37. Lv J
38. Millwood IY
39. Moreno-Grau S
40. Nam K
41. Palta P
42. Pandit A
43. Preuss MH
44. Saad C
45. Setia-Verma S
46. Thorsteinsdottir U
47. Uzunovic J
48. Verma A
49. Zawistowski M
50. Zhong X
51. Afifi N
52. Al-Dabhani KM
53. Al Thani A
54. Bradford Y
55. Campbell A
56. Crooks K
57. de Bock GH
58. Damrauer SM
59. Douville NJ
60. Finer S
61. Fritsche LG
62. Fthenou E
63. Gonzalez-Arroyo G
64. Griffiths CJ
65. Guo Y
66. Hunt KA
67. Ioannidis A
68. Jansonius NM
69. Konuma T
70. Lee MTM
71. Lopez-Pineda A
72. Matsuda Y
73. Marioni RE
74. Moatamed B
75. Nava-Aguilar MA
76. Numakura K
77. Patil S
78. Rafaels N
79. Richmond A
80. Rojas-Muñoz A
81. Shortt JA
82. Straub P
83. Tao R
84. Vanderwerff B
85. Vernekar M
86. Veturi Y
87. Barnes KC
88. Boezen M
89. Chen Z
90. Chen C-Y
91. Cho J
92. Smith GD
93. Finucane HK
94. Franke L
95. Gamazon ER
96. Ganna A
97. Gaunt TR
98. Ge T
99. Huang H
100. Huffman J
101. Katsanis N
102. Koskela JT
103. Lajonchere C
104. Law MH
105. Li L
106. Lindgren CM
107. Loos RJF
108. MacGregor S
109. Matsuda K
110. Olsen CM
111. Porteous DJ
112. Shavit JA
113. Snieder H
114. Takano T
115. Trembath RC
116. Vonk JM
117. Whiteman DC
118. Wicks SJ
119. Wijmenga C
120. Wright J
121. Zheng J
122. Zhou X
123. Awadalla P
124. Boehnke M
125. Bustamante CD
126. Cox NJ
127. Fatumo S
128. Geschwind DH
129. Hayward C
130. Hveem K
131. Kenny EE
132. Lee S
133. Lin Y-F
134. Mbarek H
135. Mägi R
136. Martin HC
137. Medland SE
138. Okada Y
139. Palotie AV
140. Pasaniuc B
141. Rader DJ
142. Ritchie MD
143. Sanna S
144. Smoller JW
145. Stefansson K
146. van Heel DA
147. Walters RG
148. Zöllner S
149. Martin AR
150. Willer CJ
151. Daly MJ
152. Neale BM
153. Biobank of the Americas
154. Biobank Japan Project
155. BioMe
156. BioVU
157. CanPath - Ontario Health Study
158. China Kadoorie Biobank Collaborative Group
159. Colorado Center for Personalized Medicine
160. deCODE Genetics
161. Estonian Biobank
162. FinnGen
163. Generation Scotland
164. Genes & Health Research Team
165. LifeLines
166. Mass General Brigham Biobank
167. Michigan Genomics Initiative
168. National Biobank of Korea
169. Penn Medicine BioBank
170. Qatar Biobank
171. QSkin Sun and Health Study
172. Taiwan Biobank
173. HUNT Study
174. UCLA ATLAS Community Health Initiative
175. Uganda Genome Resource
176. UK Biobank
(2022) Global biobank meta-analysis initiative: powering genetic discovery across human disease
Cell Genomics 2:100192.

https://doi.org/10.1016/j.xgen.2022.100192
- PubMed
- Google Scholar
1. Zhu J
2. Lum PY
3. Lamb J
4. GuhaThakurta D
5. Edwards SW
6. Thieringer R
7. Berger JP
8. Wu MS
9. Thompson J
10. Sachs AB
11. Schadt EE
(2004) An integrative genomics approach to the reconstruction of gene networks in segregating populations
Cytogenetic and Genome Research 105:363–374.

https://doi.org/10.1159/000078209
- PubMed
- Google Scholar
1. Zhu Z
2. Zhang F
3. Hu H
4. Bakshi A
5. Robinson MR
6. Powell JE
7. Montgomery GW
8. Goddard ME
9. Wray NR
10. Visscher PM
11. Yang J
(2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets
Nature Genetics 48:481–487.

https://doi.org/10.1038/ng.3538
- PubMed
- Google Scholar

Article and author information

Author details

Hooman Allayee
1. Departments of Population & Public Health Sciences, University of Southern California, Los Angeles, United States
2. Biochemistry & Molecular Medicine, Keck School of Medicine, University of Southern California, Los Angeles, United States
Contribution
Conceptualization, Writing – original draft, Writing – review and editing

Contributed equally with
Charles R Farber

Competing interests
No competing interests declared

ORCID iD: 0000-0002-2384-5239

Charles R Farber
1. Center for Public Health Genomics, University of Virginia School of Medicine, Charlottesville, United States
2. Departments of Biochemistry & Molecular Genetics, University of Virginia School of Medicine, Charlottesville, United States
3. Public Health Sciences, University of Virginia School of Medicine, Charlottesville, United States
Contribution
Conceptualization, Writing – original draft, Writing – review and editing

Contributed equally with
Hooman Allayee

Competing interests
No competing interests declared

ORCID iD: 0000-0002-6748-4711

Marcus M Seldin

Department of Biological Chemistry, University of California, Irvine, Irvine, United States

Contribution
Conceptualization, Writing – original draft, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-8026-4759

Evan Graehl Williams

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg, Luxembourg

Contribution
Conceptualization, Writing – original draft, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-9746-376X

David E James
1. School of Life and Environmental Sciences, University of Sydney, Camperdown, Australia
2. Faculty of Medicine and Health, University of Sydney, Camperdown, Australia
3. Charles Perkins Centre, University of Sydney, Camperdown, Australia
Contribution
Conceptualization, Writing – original draft, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0001-5946-5257

Aldons J Lusis
1. Departments of Human Genetics, University of California, Los Angeles, Los Angeles, United States
2. Medicine, University of California, Los Angeles, Los Angeles, United States
3. Microbiology, Immunology, & Molecular Genetics, David Geffen School of Medicine of UCLA, Los Angeles, United States
Contribution
Conceptualization, Writing – original draft, Writing – review and editing

For correspondence
JLusis@mednet.ucla.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0001-9013-0228

Funding

National Institutes of Health (R01HL133169)

Hooman Allayee

National Institutes of Health (R01HL148110)

Hooman Allayee

National Institutes of Health (R01HL168493)

Hooman Allayee

National Institutes of Health (R01AR079839)

Charles R Farber

National Institutes of Health (R01AR077992)

Charles R Farber

National Institutes of Health (R01AR071657)

Charles R Farber

National Institutes of Health (R01AR079179)

Charles R Farber

National Institutes of Health (R01HL148110)

Aldons J Lusis

National Institutes of Health (R01HL144651)

Aldons J Lusis

National Institutes of Health (R01DK117850)

Aldons J Lusis

Australian Research Council (Laureate Fellow)

David E James

National Institutes of Health (DP1DK130640)

Marcus M Seldin

National Institutes of Health (R21AA030358)

Marcus M Seldin

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.