Tools and Resources

Developmental Biology

Building the vertebrate codex using the gene breaking protein trap library

Department of Biochemistry and Molecular Biology, Mayo Clinic, United States
Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, United States
Functional & Chemical Genomics Program, Oklahoma Medical Research Foundation, United States
Department of Genetics, Development and Cell Biology, Iowa State University, United States
Zebrafish Centre for Advanced Drug Discovery & Keenan Research Centre for Biomedical Science, Li Ka Shing Knowledge Institute, St. Michael's Hospital, Unity Health Toronto & University of Toronto, Canada
Department of Cardiovascular Medicine, Mayo Clinic, United States
Department of Embryology, Carnegie Institution for Science, United States
Department of Clinical Genomics, Mayo Clinic, United States
Department of Otorhinolaryngology, Mayo Clinic, United States
Genomics and Molecular Medicine Unit, CSIR–Institute of Genomics and Integrative Biology, India
Department of Biology, Temple University, United States
Institute of Zoology, Developmental Biology Unit, University of Cologne, Germany

Aug 11, 2020

Open access

Abstract

One key bottleneck in understanding the human genome is the relative under-characterization of 90% of protein coding regions. We report a collection of 1200 transgenic zebrafish strains made with the gene-break transposon (GBT) protein trap to simultaneously report and reversibly knockdown the tagged genes. Protein trap-associated mRFP expression shows previously undocumented expression of 35% and 90% of cloned genes at 2 and 4 days post-fertilization, respectively. Further, investigated alleles regularly show 99% gene-specific mRNA knockdown. Homozygous GBT animals in ryr1b, fras1, tnnt2a, edar and hmcn1 phenocopied established mutants. 204 cloned lines trapped diverse proteins, including 64 orthologs of human disease-associated genes with 40 as potential new disease models. Severely reduced skeletal muscle Ca²⁺ transients in GBT ryr1b homozygous animals validated the ability to explore molecular mechanisms of genetic diseases. This GBT system facilitates novel functional genome annotation towards understanding cellular and molecular underpinnings of vertebrate biology and human disease.

eLife digest

The human genome counts over 20,000 genes, which can be turned on and off to create the proteins required for most of life's processes. Once produced, proteins need to move to specific locations in the cell, where they are able to perform their jobs. Despite striking scientific advances, 90% of human genes are still under-studied; where the proteins they code for go, and what they do, remains unknown.

Zebrafish share many genes with humans, but they are much easier to manipulate genetically. Here, Ichino et al. used various methods in zebrafish to create a detailed ‘catalogue’ of previously poorly understood genes, focusing on where the proteins they coded for ended up and the biological processes they were involved with.

First, a genetic tool called gene-breaking transposons (GBTs) was used to create over 1,200 strains of genetically altered fish in which a specific protein was both tagged with a luminescent marker and unable to perform its role. Further analysis of 204 of these strains revealed new insight into the role of each protein, with many having unexpected roles and localisations. For example, in one zebrafish strain, the affected gene was similar to a human gene which, when inactivated, causes severe muscle weakness. These fish swam abnormally slowly and also had muscle problems, suggesting that the GBT fish strains could ‘model’ the human disease.

This work sheds new light on the role of many previously poorly understood genes. In the future, similar collections of GBT fish strains could help researchers to study both normal human biology and disease. They could especially be useful in cases where the genes responsible for certain conditions are still difficult to identify.

Introduction

Analyses of genomic sequences from over 100 vertebrate species (Meadows and Lindblad-Toh, 2017) have revealed that we need more than nucleic acid sequence alone to comprehend the vertebrate genome. A more complete understanding of any genetic locus requires knowledge of its expression pattern and its function(s) in subcellular, cellular, and organismal contexts—the compendium of information that can be described as a gene ‘codex’. Despite their importance, the expression patterns and functions of most protein coding genes remain surprisingly uncharacterized. The number of these genes linked to human disease without functional insights into their gene-disease relationships highlights the significance of this knowledge gap (Kettleborough et al., 2013). In recent estimates, 80% of rare, undiagnosed diseases are thought to have genetic underpinnings (Robe, 2005; Varga et al., 2018; Wangler et al., 2017). Tools are therefore needed to identify and annotate the expression and function(s) of these poorly characterized gene products in both biological and pathological processes.

Zebrafish (Danio rerio) has emerged as an outstanding model to bridge the gap between sequence and function in the vertebrate genome. Investigations of gene function in zebrafish, from organismal to subcellular, are amenable to both forward and reverse genetic approaches (Stoeger et al., 2018). Additionally, the natural transparency of developing zebrafish enables live, non-invasive collection of gene expression data at a subcellular resolution on an organismal scale. Therefore, the zebrafish facilitates parallel discovery of gene expression and function towards a comprehensive codex of the vertebrate genome. To begin constructing this vertebrate codex, we previously developed a unique, revertible mutagenesis tool called the gene-break transposon (GBT) with elements that cooperate to report gene sequence, expression pattern, and function (Clark et al., 2011a). Specifically, when integrated in the sense orientation of a transcriptional unit, the GBT protein trap overrides endogenous splicing and creates a fusion between upstream exons and its start-codon deficient monomeric RFP (mRFP) reporter. Then, the strong internal polyadenylation site and putative border element following the mRFP truncate the gene product. Finally, the GBT construct is flanked by loxP sites on either side to enable excision and subsequent rescue with Cre-recombinase (Clark et al., 2011a).

Visualization of the start-codon deficient mRFP reporter requires an in-frame integration. In the original GBT protein trap construct, RP2.1 (Figure 1A), this in-frame requirement restricts mRFP expression to a single reading frame and leaves the potential to truncate genes without reporting their expression with mRFP (Clark et al., 2011a). We therefore developed a new series of GBT protein trap constructs, including versions to trap expression in each of the three potential reading frames (Figure 1A). Alongside the original, we employed these new vectors in zebrafish to generate and catalog over 800 additional GBT protein trap lines with visible mRFP expression at 2 days post-fertilization (dpf) (end of embryogenesis) or four dpf (larval stage). 147 of these additional GBT lines were cloned, and candidate genes were identified for another 144 GBT lines. mRFP expression in cloned GBT lines showcased novel expression patterns for a population of genes encoding diverse proteins in function and localization, including 64 implicated in human disease. Further, animals homozygous for the GBT allele in ryr1b displayed severely dampened skeletal muscle Ca²⁺ transients, demonstrating the ability to elucidate molecular mechanisms of genetic disorders. Since detailed investigations of mutant phenotypes are vital to functional annotation of the vertebrate genome, the mutagenic reporters in our GBT system provide the basis for this functional annotation to better understand normal biology and human disease.

Figure 1

Schematic of the RP2 and RP8 gene-break transposon (GBT) system with all three reading frames of AUG-less mRFP reporter.

(**A–C**) Schematic of the GBT system. RP2 and RP8 incorporate a protein-trap cassette fused with three reading frames of AUG-less mRFP reporter and a 3’ exon trap cassette with GFP or tagBFP reporters, respectively. (A) RP2 series (RP2.1, RP2.2 and RP2.3). Underline: previously published vector construct. (**B–C**) RP8 series (RP8.1, RP8.2 and RP8.3), with a schematic RP8 insertion event showing expected transcription from the locus below (C). ITR: inverted terminal repeat, SA: splice acceptor, lox: Cre recombinase recognition sequence, *mRFP: AUG-less mRFP sequence, poly (A)+: polyadenylation signal, red octagon: extra transcriptional terminator and putative border element, *β-act*: carp beta-actin enhancer, *γ-cry*: gamma-crystallin promoter, SD: splice donor, E: enhancer, P: promoter, and WT: wild-type.

Results

GBT vector series RP2 and RP8 illuminate all three vertebrate proteomic reading frames

We previously reported the intron-based gene-break transposon (GBT) as an effective and revertible loss-of-function tool for zebrafish (Clark et al., 2011a). The original GBT construct, called RP2/RP2.1, contains the following key features (Figure 1A): 1) flanking miniTol2 sequences for transposase-mediated random integration (Balciunas et al., 2006; Kawakami et al., 2004; Urasaki et al., 2006), 2) a 5’ protein trap containing a strong splice-acceptor (SA) and a start codon-free mRFP reporter to detect 5’ sequence and visualize in vivo expression of the trapped locus with the endogenous promoter (Clark et al., 2011a; Ding et al., 2013; Ding et al., 2017; Liao et al., 2012; Petzold et al., 2009; Westcot et al., 2015; Xu et al., 2012), 3) a mutagenic transcriptional terminator containing both a polyadenylation signal (pA) and a putative border element to truncate the trapped locus in conjunction with the protein trap (Sivasubbu et al., 2006), 4) a 3’ exon trap with a β-actin promoter driving expression of GFP to report 3’ sequence and detect lines with weak (or absent) mRFP expression or with the integration in other frames of the mRFP reporter (Clark et al., 2011a; Petzold et al., 2009; Sivasubbu et al., 2006), 5) a second mini-intron within the GFP expression cassette that can further contribute to loss of wild-type transcripts, and 6) flanking loxP sites for Cre-mediated excision and restoration of trapped locus function using both germline (Petzold et al., 2009) and somatic approaches (Clark et al., 2011a; Ding et al., 2013; Westcot et al., 2015).

Initial experiments with the RP2.1 construct, however, revealed some limitations. First, effective transcript trapping does not always generate mRFP reporter expression because the RP2.1 plasmid is designed for a single reading frame. Molecular cloning of GFP⁺/mRFP^- lines demonstrated the requirement to capture an appropriate reading frame to visualize the mRFP reporter. Even though RP2.1 is designed to use one main reading frame, some lines with mRFP expression used an alternate ‘CAG’ five nucleotides downstream of the main splice acceptor, which offered a second chance at creating a functional mRFP reporter (Clark et al., 2011a). Therefore, to maximize genome coverage of our mutagenesis vectors in this study, we created a series of RP2 constructs to encode functional mRFP in each of the three reading frames (Figure 1A). Second, the 3’ exon trap in the RP2 series uses the nearly ubiquitous β-actin promoter to drive expression of GFP (detectable around the seven- to eight-somite stage, similar to ubiquitous GFP expression driven under the EF1α enhancer/promoter [Davidson et al., 2003]), which could interfere with other GFP-based reporter systems in future studies (Clark et al., 2011a). To overcome this limitation, we engineered a novel, next-generation GBT series called RP8. RP8 constructs possess a new 3’ exon trap cassette that uses the γ-crystallin promoter to drive lens-specific expression of tagBFP instead of the ubiquitously expressed GFP of the RP2 series vectors (Figure 1B–C). Additionally, all RP8 series constructs, which report mRFP in each of the three reading frames, are built on a smaller vector backbone and include new restriction enzyme sites that render these vectors modular for subsequent genetic engineering.
Using all five of these new GBT constructs in zebrafish, we conducted an initial screen for expression of protein trap mRFP and observed that all RP2 and RP8 series constructs readily produced mRFP fusion proteins expressed from their endogenous promoters (Figure 1—figure supplement 1).
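
The reading-frame requirement described above can be illustrated with a minimal sketch. It assumes that the only thing deciding reporter visibility is the number of coding nucleotides spliced upstream of the trap, taken modulo three. The frame labels 0, 1 and 2 and the example lengths are hypothetical and do not correspond to the RP2.1/RP2.2/RP2.3 numbering; the alternate ‘CAG’ acceptor is not modeled.

```python
# Illustrative sketch only: frame labels 0, 1, 2 are arbitrary and are NOT
# the RP2.1 / RP2.2 / RP2.3 vector numbering used in the paper.

def reporter_frame_needed(upstream_cds_length: int) -> int:
    """Return the phase (0, 1 or 2) left open by the trapped upstream exons.

    upstream_cds_length is the number of coding nucleotides, counted from the
    start codon, that are spliced onto the trap's splice acceptor.
    """
    if upstream_cds_length < 0:
        raise ValueError("coding length must be non-negative")
    return upstream_cds_length % 3


def reporter_visible(upstream_cds_length: int, vector_frame: int) -> bool:
    """True when a vector built for vector_frame keeps the AUG-less reporter in frame."""
    return reporter_frame_needed(upstream_cds_length) == vector_frame


# Hypothetical upstream coding lengths for four insertion events.
lengths = [300, 301, 302, 455]

# A single-frame vector reports only the insertions that land in its phase.
single = [reporter_visible(n, vector_frame=0) for n in lengths]

# A three-vector series covers every phase.
series = [any(reporter_visible(n, f) for f in (0, 1, 2)) for n in lengths]
```

With these hypothetical lengths, only the first insertion is visible with the single-frame vector, whereas each insertion is visible with at least one member of a three-frame series.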

Creation of a GBT-line collection enables illumination of the vertebrate genome

We then deployed all of these GBT vectors to generate over 800 additional zebrafish GBT lines. A key feature of the protein trap in these lines is the ability to non-invasively image the spatial and temporal expression patterns of the trapped loci. Our initial mRFP⁺ lines demonstrated that, using standard methods, this imaging was going to be a major bottleneck (Figure 2A). Consequently, to enable high-throughput fluorescence imaging, we used either SCORE imaging (a capillary tube placed in a refractive index-matched medium for efficient sample rotation) on an ApoTome (Petzold et al., 2010, and see Materials and methods) or a Zeiss Lightsheet Z.1 SPIM microscope (see Materials and methods). To date, we have cataloged over 1,200 GBT lines with robust mRFP expression in heterozygous F2 animals at two dpf and/or four dpf according to our screening pipeline and made all imaging data freely accessible on zfishbook (www.zfishbook.org) (Clark et al., 2012; Figure 2A). We have cryopreserved these 1,200 GBT lines and have retained them at the Mayo Clinic Zebrafish Facility (MCZF) with a copy also sent to the Zebrafish International Resource Center (ZIRC) (Figure 2A).

Figure 2

GBT screening pipeline.

(A) Overview of GBT screening pipeline. Wild-type embryos at the one-cell stage were co-injected with RP plasmid and Tol2 transposase mRNA to create F0 founders. These F0 larvae were screened for non-mosaic RP expression, raised, and outcrossed for two generations. Then, mRFP⁺ F2 heterozygous larvae were 3-dimensionally imaged at 2 and 4 dpf, and these imaging data were uploaded to zfishbook (http://www.zfishbook.org/). Sperm from four F2 males in over 1200 robust mRFP-expressing lines was cryopreserved using the Zebrafish International Resource Center (ZIRC) standard protocol and stored at both ZIRC and Mayo Clinic Zebrafish Core Facility (MCZF). DNA and RNA isolated from these four F2 males with cryopreserved sperm were used to perform next-generation sequencing and to confirm RFP linkage of candidate lines by manual PCRs (iPCR, TAIL-PCR, 5’ RACE and 3’ RACE). The Venn diagram illustrates the current library of over 1,200 GBT lines with 204 GBT-confirmed lines out of 348 molecularly analyzed GBT-candidate lines. (B) Next-generation sequencing-based validation of GBT integration loci. Fin biopsies from four F2 males were used as the DNA source for identifying GBT integration loci. Extracted genomic DNA was fragmented, pooled in a 96-well plate, and ligated with a barcode linker to identify each individual male with cryopreserved sperm. Linker-mediated (LM) PCR with the primers R-ITR P1 and LP1 and nested PCR with the primers R-ITR P2 and LP2 were conducted, and the final PCR products were sequenced on an Illumina platform. The integration events of each individual sperm-cryopreserved male were mapped onto the zebrafish reference genome sequence by bioinformatic analysis. This figure was created with BioRender.com. The area-proportional Venn diagram was produced using BioVenn (http://www.biovenn.nl/).

Molecular cloning of a subset of GBT lines highlights the genetic diversity of this protein trap collection

Traditional molecular methods, such as inverse PCR (iPCR), thermal asymmetric interlaced (TAIL) PCR, and 5’ and 3’ rapid amplification of cDNA ends (RACE), are labor-intensive and represent a functional bottleneck in identifying randomly integrated loci in GBT lines. To overcome this, we employed a rapid cloning process based on methods used to isolate retroviral integrations that leverage the massively parallel sequencing technology of the Illumina MiSeq and a custom bioinformatics pipeline that involves both mapping and annotation (Figure 2B; Varshney et al., 2013a; Varshney et al., 2013b). First, fin-clips from four male animals per GBT line were obtained during sperm cryopreservation and used as a source of DNA for cloning the integrated locus. Next, high-throughput sequencing produced reads whose barcodes linked them to the individual sperm-cryopreserved male that supplied the DNA. Mapping reads to the genome indicated potential GBT integration loci in each individual. A shared integration locus in multiple individuals from a single GBT line was considered a candidate integrated locus, and we termed a GBT line with at least one candidate integrated locus a ‘GBT-candidate line’. After a candidate was determined using the sequencing pipeline or other manual molecular approaches, such as 5’ RACE, 3’ RACE, iPCR, or TAIL PCR, we used standard PCR to test whether the candidate integration locus segregates with mRFP expression.
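
The rule for calling a candidate integrated locus can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: "multiple individuals" is taken to mean at least two males, loci are matched by exact coordinate, and the line names, male identifiers and coordinates are hypothetical.

```python
from collections import defaultdict


def candidate_loci(hits, min_individuals=2):
    """Group mapped integrations by GBT line and keep loci shared between males.

    hits: iterable of (line_id, male_id, locus) tuples, one per mapped
          integration, where locus is a (chromosome, position) tuple.
    min_individuals: assumed threshold for "multiple individuals".
    Returns {line_id: sorted list of shared loci}; lines without any shared
    locus are omitted, i.e. they are not GBT-candidate lines.
    """
    males_per_locus = defaultdict(lambda: defaultdict(set))
    for line_id, male_id, locus in hits:
        males_per_locus[line_id][locus].add(male_id)

    result = {}
    for line_id, loci in males_per_locus.items():
        shared = sorted(
            locus for locus, males in loci.items() if len(males) >= min_individuals
        )
        if shared:
            result[line_id] = shared
    return result


# Hypothetical barcoded hits: three males from line GBT_A, one from GBT_B.
hits = [
    ("GBT_A", "male1", ("chr18", 120500)),
    ("GBT_A", "male2", ("chr18", 120500)),
    ("GBT_A", "male3", ("chr3", 98000)),
    ("GBT_B", "male1", ("chr7", 4400)),
]
candidates = candidate_loci(hits)
```

Here only the chr18 integration in the hypothetical line GBT_A is shared by two males, so GBT_A is the only line that would be termed a GBT-candidate line. Linkage to mRFP expression would still need to be tested by PCR, as described above.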

At the end of this pipeline, 204 GBT-candidate lines met the highest stringency of confirmed expression linkage and were classified as ‘GBT-confirmed lines’ (Figure 2A). While 57 of these GBT-confirmed lines have been previously published (Clark et al., 2011a; Ding et al., 2013; Ding et al., 2017; El-Rass et al., 2017; Ma et al., 2020; Westcot et al., 2015), 147 of these GBT-confirmed lines are newly characterized in this manuscript and were selected for confirmation based upon their expression pattern and/or homozygous phenotype (Supplementary file 1). A small subset of these GBT-confirmed lines mapped to areas in the genome without annotated transcripts. Publicly available RNA-sequencing data (White et al., 2017) revealed reads flanking a majority of these mapped integrations. Some of these reads contained evidence of splicing in the sense orientation of the mRFP reporter (Supplementary file 1). Finally, another 144 GBT-candidate lines from this pipeline have yet to be confirmed (Figure 2A). Integration locus annotation of both GBT-candidate and GBT-confirmed lines is available on zfishbook (www.zfishbook.org) (Clark et al., 2012).

RP2.1 induces high knockdown efficiency of endogenous transcripts in GBT-confirmed lines

We wanted to determine the knockdown efficacy of the GBT system as a quantitative assessment of mutagenicity. We therefore compiled qRT-PCR data to compare wild-type and truncated, mRFP-fused transcript levels for all 26 RP2.1-derived GBT-confirmed lines that we and others have tested (Clark et al., 2011a; Ding et al., 2013; Ding et al., 2017, GBT0235—this manuscript). This compilation determined at minimum 97% knockdown in animals homozygous for the RP2.1 alleles. We next directly compared the transcriptional effects of RP2.1 with those of other published transposon-based protein trap systems (Figure 3). The FlipTrap system produced a range of 70–96% knockdown in six tested fish alleles (Trinh et al., 2011), similar to our initial R-series protein trap vectors (R14-R15) that contained a single splice acceptor and a simple transcriptional terminator (Liao et al., 2012; Petzold et al., 2009). The pFT1, which contains a single splice acceptor, but a tandem array of five simple polyadenylation sites, appears to be an improvement over these systems with 89–94% knockdown from four tested fish alleles (Ni et al., 2012). The 97% minimum knockdown observed with RP2.1 is quantitatively higher than these other systems and could be deployed with other insertional genome engineering approaches. We expect the RP8 series vectors to have similar transcriptional knockdown to RP2.1, but to date we have not quantified the effect to measure their mutagenicity.
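
For readers unfamiliar with how a percent knockdown can be derived from qRT-PCR quantification cycles (Cq), the sketch below uses the common 2^-ΔΔCq relative quantification. It assumes perfect amplification efficiency and a single reference gene; the exact calculation applied to each line is given in the source data and the cited studies, and the Cq values here are hypothetical.

```python
def percent_knockdown(cq_target_wt, cq_ref_wt, cq_target_hom, cq_ref_hom):
    """Percent loss of wild-type transcript in homozygotes relative to wild type.

    Uses the 2^-ddCq approximation (assumes 100% primer efficiency).
    All four arguments are mean Cq values; ref is a reference gene.
    """
    d_cq_wt = cq_target_wt - cq_ref_wt      # normalize wild-type sample
    d_cq_hom = cq_target_hom - cq_ref_hom   # normalize homozygous sample
    dd_cq = d_cq_hom - d_cq_wt
    remaining_fraction = 2.0 ** (-dd_cq)    # transcript left relative to wild type
    return (1.0 - remaining_fraction) * 100.0


# Hypothetical Cq values: the target amplifies six cycles later in homozygotes.
kd = percent_knockdown(
    cq_target_wt=24.0, cq_ref_wt=18.0,
    cq_target_hom=30.0, cq_ref_hom=18.0,
)
```

A six-cycle shift corresponds to 1/64 of the wild-type transcript remaining, which is about 98.4% knockdown; under the same assumptions, a knockdown of 97% corresponds to a shift of roughly five cycles.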

Figure 3

Knockdown efficiency of RP2.1 compared with previous gene-trap systems.

Violin plots comparing percent knockdown efficiency in the analyzed individual lines generated by four protein trap systems. All plots show median. The data of previous protein trap systems were converted from the data in the original articles. R14-R15, our initial R-series protein trap vectors (n = 6; Clark et al., 2011a); FlipTrap, FlipTrap vectors (n = 6; Trinh et al., 2011); FT1, FT1 vector (n = 4; Ni et al., 2012); RP2.1, RP2.1 vector (n = 26; Clark et al., 2011a; Ding et al., 2013; Ding et al., 2017; El-Rass et al., 2017; Westcot et al., 2015 and unpublished data) (Figure 3—source data 1). The graph was made in JMP14 (SAS, Cary, NC).

Figure 3—source data 1 Numeric data analyzing knockdown efficiency in lrpprc^mn0235Gt/mn0235Gt. Source data analyzing relative expression of lrpprc mRNA in six dpf-larvae with RFP expression and dark liver phenotype crossed with heterozygous lrpprc^+/mn0235Gt adults. no RT: no reverse transcriptase, RT: with reverse transcriptase, DL: dark liver phenotype, Cq: quantification cycle, d: delta, KD: knockdown, N/A: not applicable. https://cdn.elifesciences.org/articles/54572/elife-54572-fig3-data1-v2.xlsx

Phenotype appearance rate in GBT lines is comparable to other mutagenic technologies for forward genetic screening

In parallel to knockdown efficacy, we wanted to know how effectively the GBT construct mutagenizes its integrated locus. To assess this mutagenic efficiency, we conducted an initial phenotypic screen on F2 embryos and early larvae. In 179 mRFP⁺ GBT lines, we identified 12 recessive phenotypes visible during the first five days of development that, among others, included lethal, cardiac, muscular, and integumentary defects as reported in previous studies (Supplementary file 2).

Molecular analyses revealed that a subset of these phenotypes stem from GBT integrations in genes with established loss of function mutant phenotypes including ryr1b, tnnt2a, fras1, hmcn1, and edar (Clark et al., 2011a; Westcot et al., 2015; Hatzold J et al., unpublished). In accord with their mRNA knockdown potency, these five GBT-confirmed lines (ryr1b, tnnt2a, fras1, hmcn1, and edar) phenocopied their respective homozygous mutants generated with other strategies and thereby validated the mutagenicity of the GBT constructs. To date, 17 of our GBT-confirmed lines have been published with homozygous phenotypes ranging from embryonic lethal, to reduced adult viability, to differences in pharmacological susceptibility (Supplementary file 2). Additional GBT-confirmed lines with homozygous phenotypes will continue to be identified and characterized in future studies.

ryr1b confirmed line is a pioneer model of human disease that validates GBT ability to functionally annotate a genetic locus

We next sought to validate the ability of GBT constructs to functionally annotate genes. During the initial GBT library creation, we identified a GBT line with skeletal muscle-specific mRFP expression and a slow swimming phenotype (Clark et al., 2011a). We confirmed that this line possesses an RP2.1 integration between exon 81 and exon 82 of ryr1b and designated it ryr1b^mn0348Gt. Animals homozygous for this ryr1b^mn0348Gt allele show 97% knockdown of wild-type ryr1b mRNA levels (Clark et al., 2011a). Due to the well-characterized nature of ryr1b from the relatively relaxed mutant (Hirata et al., 2007), this line was ideal for a proof-of-concept experiment in this study to validate functional genome annotation with GBT constructs. ryr1b orthologs are known across species to encode calcium-activated calcium channels that release sarcoplasmic reticulum Ca²⁺ stores to facilitate excitation-contraction coupling in skeletal muscles (Hernández-Ochoa et al., 2015; Hirata et al., 2007). Therefore, we set out to test whether loss of ryr1b in ryr1b^mn0348Gt/mn0348Gt animals dampens skeletal muscle Ca²⁺ transients and may explain their previously reported slow swimming phenotype (Clark et al., 2011a).

To address this, we injected the skeletal muscle-targeted construct p-mylpfa:GCaMP3 (Baxendale et al., 2012) into both ryr1b^+/+ and ryr1b^mn0348Gt/mn0348Gt animals, treated these animals at 2 dpf with 20 mM pentylenetetrazole (PTZ) to maximize the probability of recording muscle activity, and assayed individual skeletal muscle Ca²⁺ transients associated with PTZ-induced convulsions (Figure 4A). p-mylpfa:GCaMP3 injection at the single-cell stage resulted in mosaic-labeled, GCaMP3⁺ myocytes in both ryr1b^+/+ and ryr1b^mn0348Gt/mn0348Gt animals (Figure 4B,F). Likewise, PTZ-treated ryr1b^+/+ and ryr1b^mn0348Gt/mn0348Gt animals showed spontaneous, convulsion-associated Ca²⁺ transients in their myocytes at two dpf (Figures 4C–E,G–I). However, PTZ-induced Ca²⁺ transients in myocytes of ryr1b^+/+ animals had higher peak amplitude when averaged within fish (Figure 4J–K) or within myocytes (Figure 4—figure supplement 1A) and shorter rise time (Figure 4J,M) than those in ryr1b^mn0348Gt/mn0348Gt animals. PTZ-induced Ca²⁺ transient peak-width (Figure 4J,L) and decay time (Figure 4J,N) were not significantly different between ryr1b^+/+ and ryr1b^mn0348Gt/mn0348Gt animals. In contrast, myocytes in ryr1b^+/+ animals had more Ca²⁺ transients during the imaging period than myocytes in ryr1b^mn0348Gt/mn0348Gt animals (Figure 4—figure supplement 1B). We were thus able to use a GBT-confirmed line to demonstrate that a smaller peak amplitude (consistent with relatively relaxed [Hirata et al., 2007]), slower upstroke, and lower frequency of Ca²⁺ transients in skeletal muscle likely provide the basis for the slow swimming phenotype in ryr1b^mn0348Gt/mn0348Gt animals. The consistency of our findings in ryr1b^mn0348Gt/mn0348Gt with those in relatively relaxed mutants (Hirata et al., 2007) validates the functional genome annotation available with the GBT mutagenic system.
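
The transient parameters compared above (peak ∆F/F₀ and 10–90% rise time) can be made concrete with a minimal sketch. It is not the analysis code used in the study: the baseline F₀ is assumed to be the mean of the first few frames, crossings are resolved only to the nearest frame without interpolation, and the trace and frame interval are hypothetical.

```python
def delta_f_over_f0(trace, n_baseline):
    """Convert raw fluorescence to dF/F0, with F0 = mean of the first n_baseline frames."""
    f0 = sum(trace[:n_baseline]) / n_baseline
    return [(f - f0) / f0 for f in trace]


def rise_time_10_90(dff, dt):
    """Time between the first crossings of 10% and 90% of peak dF/F0 before the peak.

    dt is the frame interval in seconds; resolution is limited to whole frames.
    """
    peak = max(dff)
    if peak <= 0:
        raise ValueError("trace contains no positive transient")
    peak_idx = dff.index(peak)

    def first_crossing(level):
        # Search only up to the peak; the peak itself always satisfies the test.
        for i in range(peak_idx + 1):
            if dff[i] >= level:
                return i

    return (first_crossing(0.9 * peak) - first_crossing(0.1 * peak)) * dt


# Hypothetical raw GCaMP3 fluorescence for one myocyte, 0.1 s per frame.
raw = [100, 100, 100, 100, 120, 160, 200, 180, 150, 120, 105, 100]
dff = delta_f_over_f0(raw, n_baseline=4)
peak_amplitude = max(dff)
rise = rise_time_10_90(dff, dt=0.1)
```

For this hypothetical trace the peak ∆F/F₀ is 1.0 and the 10–90% rise spans two frames (0.2 s). A smaller peak and a longer rise time, as reported for the homozygous animals, would appear as a lower `peak_amplitude` and a larger `rise`.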

Figure 4

GBT demonstrates that neural disinhibition mediated Ca²⁺ transients in *mylpfa⁺* myocytes require the ryanodine receptor *ryr1b* in vivo.

(A) Cartoon showing approach to assay Ca²⁺ transients in zebrafish myocytes through (1) injection of *p-mylpfa:GCaMP3* (Baxendale et al., 2012) at the single cell stage, (2) embedding in 1% low melt agar/20 mM pentylenetetrazole (PTZ)/5 µM (S)-(-)-blebbistatin, (3) imaging for 3 min to record transient-associated changes in myocyte GCaMP3 fluorescence at 2 days post-fertilization, and (4) Ca²⁺ transient analysis. (B–I) Static images of GCaMP3 expressing myocytes (**B, F**) and representative GCaMP3 time-series images showing baseline (**C, G**), transient peak (**D, H**), and recovery (**E, I**) in *ryr1b*^+/+ (C–E) and *ryr1b*^mn0348Gt/mn0348Gt (G–I) animals, respectively. Scale bar = 20 µm. (J) Representative ∆F/F₀ traces of Ca²⁺ transients from *ryr1b*^+/+ (black) and *ryr1b*^mn0348Gt/mn0348Gt (gray) myocytes. (K–N) Violin plots comparing transient peak ∆F/F₀ (averaged within fish) (K), Ca²⁺ transient peak-width (L), Ca²⁺ transient rise (M) and decay (N) time between *ryr1b*^+/+ and *ryr1b*^mn0348Gt/mn0348Gt animals. All plots show median with interquartile range. For (K) n(*ryr1b*^+/+) = 19 animals, n(*ryr1b*^mn0348Gt/mn0348Gt) = 16 animals. For (**L–M**) n(*ryr1b*^+/+) = 32 cells, n(*ryr1b*^mn0348Gt/mn0348Gt) = 16 cells. For (N) n(*ryr1b*^+/+) = 32 cells, n(*ryr1b*^mn0348Gt/mn0348Gt) = 15 cells. Data are compiled from four independent experiments containing at least two animals in each group. p-values determined using the Mann-Whitney U test. Effect size (Cohen’s d) = 1.829 (K) and 0.866 (M). Source data can be found in Figure 4—source data 1 (**K, L, M, N**) and Figure 4—source data 2 (J).

Figure 4—source data 1 Summary data analyzing the parameters of Ca²⁺ transients in individual tested animals. wt = ryr1b^+/+, gbt348hom = ryr1b^mn0348Gt/mn0348Gt, peak = peak ∆F/F₀, num = number of transients/responses, totcell = number of cells, width = peak width at half max, rise = 10–90% rise time, and decay = 90–50% decay time. https://cdn.elifesciences.org/articles/54572/elife-54572-fig4-data1-v2.xlsx
Figure 4—source data 2 Individual ∆F/F₀ traces of GCaMP3-fluorescence in both ryr1b^+/+ and ryr1b^mn0348Gt/mn0348Gt myocytes. https://cdn.elifesciences.org/articles/54572/elife-54572-fig4-data2-v2.xlsx

GBT protein trapping generates a variety of potential models of human disease

This functional genome annotation available with the GBT system in zebrafish is powerful for understanding the genetic causes of human disease. For instance, mutations in RYR1, the human ortholog of ryr1b, are associated with a rare genetic neuromuscular disorder called central core disease. Central core disease commonly presents with mild to severe muscle weakness (Jungbluth et al., 2018), which is analogous to the slow swimming phenotype we saw in our ryr1b^mn0348Gt/mn0348Gt animals (Clark et al., 2011a) and likely arises from similar disruptions to skeletal muscle Ca²⁺ transients (Figure 4). A subset of this GBT collection consequently represents a potential library of human disease models. Intriguingly, 82% of OMIM listed human disease-associated genes (2601 genes) can be related to at least one zebrafish ortholog (Howe et al., 2013).

We therefore took a new angle and investigated whether any GBT-confirmed lines represent potential human disease models. Within the set of GBT-confirmed lines that match a human ortholog (n = 177), 64 (36%) are integrated in genes associated with human diseases, including those of the nervous, circulatory, endocrine, metabolic, digestive, musculoskeletal, immune, and integumentary systems (select genes listed in Figure 5A, all 64 genes with disease-associated human orthologs listed in Supplementary file 3). 40 of these human disease-associated GBT-confirmed lines represent potential novel genetic disease models as we failed to find a description for any established disease models in mice or zebrafish for orthologs of these genes (Figure 5B and Supplementary file 3).

Figure 5

Disease-associated human orthologs of the GBT trapped genes are implicated in human genetic disorders of multiple organ systems.

(A) Representative human orthologs of the GBT-tagged genes are associated with genetic disorders in multiple organ systems. Image provided by Mayo Clinic Media Services. Underline: disease-causative genes with a documented, established disease model in mouse or zebrafish. (B) Area-proportional Venn diagram of the 64 tagged human orthologs that are associated with human genetic disorders. 40 human orthologs of GBT-tagged genes are associated with human genetic disorders without an established disease model in zebrafish or mouse. The area-proportional Venn diagram was produced using BioVenn (http://www.biovenn.nl/).

GBT protein trapping creates loss of function products for a diverse population of proteins

Functional genome annotation with GBT constructs is equally powerful in detecting roles for genes in basic cellular processes. We and others have previously used imaging to investigate effective protein trapping in GBT-confirmed lines. This imaging has revealed diverse cellular and subcellular protein expression patterns (Clark et al., 2011a; Ding et al., 2013; Ding et al., 2017; El-Rass et al., 2017; Liao et al., 2012; Petzold et al., 2009; Westcot et al., 2015; Xu et al., 2012). In this study, we imaged GBT lines using a Lightsheet microscope for the first time. Multi-area tiling with a 20× objective enabled rapid acquisition of 3-dimensional mRFP fusion protein localization across the entire organism (Figure 6A). Confocal imaging in areas of interest revealed even more detail with subcellular resolution (Figure 6A–B). Further confocal imaging demonstrated diverse subcellular localizations in GBT-confirmed lines (Figure 6B–C), GBT-candidate lines, and GBT lines (Figure 6—figure supplement 1). These subcellular protein localization data from GBT lines can provide crucial information in piecing together gene function.

Figure 6

GBT-confirmed lines illuminate and disrupt genes encoding proteins with diverse functions and subcellular localizations.

Figure 6—source data 1. PANTHER protein classes of human orthologs tagged in GBT-confirmed lines: https://cdn.elifesciences.org/articles/54572/elife-54572-fig6-data1-v2.xlsx

We therefore wanted to assess the subcellular diversity of all gene products trapped in our current collection of GBT-confirmed lines. To complement our imaging assessments, we used computational approaches to explore subcellular protein diversity in these lines. 177 of the GBT-trapped genes were annotated to their human orthologs in at least one public database (ZFIN, Ensembl, Homologene, and InParanoid version 8) (Supplementary file 1). Several of these genes were provisionally annotated using BLASTP or a synteny analysis tool, SynFind, in the Comparative Genomics (CoGe) database (https://genomevolution.org/CoGe/SynFind.pl) (Lyons and Freeling, 2008). We assessed the functional diversity of the human orthologs of these 177 GBT-confirmed loci with data from the PANTHER version 14.1 database on protein class ontology (http://www.pantherdb.org/) (Mi et al., 2019), the Human Protein Atlas on genome-wide experimental proteomics (www.proteinatlas.org, August 27, 2019) (Uhlén et al., 2015), and the UniProtKB on knowledge-based proteomics (UniProtKB, https://www.uniprot.org/, UniProt Consortium, 2018). PANTHER protein classification revealed that 105 human orthologs (60%, n = 176: see Materials and methods) are classified to at least one of 21 protein classes, and 19 human orthologs (11%) are transcription factors (Figure 6D and Figure 6—source data 1). Human Protein Atlas and UniProtKB subcellular localization data likewise showed diverse localizations, with a large group of nuclear-localized human orthologs among these 177 GBT-confirmed loci (Figure 6E and Supplementary file 4).
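The class tallies reported above can be reproduced with a few lines of bookkeeping. The following is a minimal sketch, assuming the annotations are available as a mapping from ortholog symbol to a list of PANTHER class names; the function names and the toy entries `GENE_X` and `GENE_Y` are hypothetical placeholders, while the RYR1 and LRPPRC entries follow the annotations described in the text.

```python
from collections import Counter


def summarize_classes(annotations):
    """Count classified orthologs and tally protein classes.

    annotations maps an ortholog symbol to a list of protein class
    names; an empty list means no class was assigned.
    """
    total = len(annotations)
    classified = sum(1 for classes in annotations.values() if classes)
    per_class = Counter(c for classes in annotations.values() for c in classes)
    return classified, total, per_class


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Toy input: GENE_X and GENE_Y are hypothetical placeholders.
toy = {
    "RYR1": ["transporter"],
    "LRPPRC": [],
    "GENE_X": ["transcription factor"],
    "GENE_Y": ["transcription factor", "hydrolase"],
}
classified, total, per_class = summarize_classes(toy)
print(classified, total, per_class["transcription factor"])

# Counts reported in the text (n = 176).
print(percent(105, 176), percent(19, 176))
```

An ortholog may belong to more than one class, so the per-class counts can sum to more than the number of classified orthologs.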

We then asked whether our computational analysis corresponded to the patterns seen in our imaging data. We found that LRPPRC (human ortholog of lrpprc (GBT0235); Figure 6A–B) was not annotated to a protein class in PANTHER but mapped to mitochondria in Human Protein Atlas, consistent with its punctate expression pattern (Figure 6A–B). RYR1 (human ortholog of ryr1b (GBT0348); Figure 6C) was annotated as a transporter in PANTHER and was mapped to the cytosol, Golgi apparatus, and vesicles in Human Protein Atlas, consistent with its more uniform expression pattern (Figure 6C). Overall, protein class ontology and known subcellular localizations of cloned GBT genes suggest that the GBT system traps and enables functional annotation for a rich diversity of proteins. Additionally, the GBT-confirmed lines in orthologs of human genes without a known subcellular localization potentiate the discovery of their subcellular expression pattern in the context of a living animal.

mRFP expression profiling in GBT-confirmed lines reveals substantial new expression data at both 2 dpf and 4 dpf

We next asked if the mRFP expression patterns in our GBT-confirmed lines unveiled novel cellular expression data. To address this, we focused on the GBT-confirmed lines that were non-redundant and mapped to a known protein coding gene. Importantly, these GBT-confirmed lines exhibited expression patterns that are tissue specific and include assorted brain regions, heart, skin, muscle, vasculature, and blood (Figure 7A–R). We analyzed publicly available expression data of these 193 tagged genes in wild-type fish (downloaded from ZFIN on August 28, 2019). Our GBT-confirmed lines revealed expression patterns (available on www.zfishbook.org) for 67 genes at 2 dpf and 174 genes at 4 dpf without publicly available expression data in ZFIN (Figure 7S–T).
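The comparison described above amounts to a set difference between the trapped genes and the genes with public expression data at a given stage. The sketch below illustrates that logic under the assumption that both lists are available as gene symbols; the function names and the placeholder symbols (`gene_a` and so on) are hypothetical and do not come from the ZFIN download.

```python
def genes_without_public_data(trapped_genes, public_expression_genes):
    """Trapped genes that have no public expression record at a stage."""
    return sorted(set(trapped_genes) - set(public_expression_genes))


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Hypothetical placeholder symbols, for illustration only.
trapped = ["gene_a", "gene_b", "gene_c", "gene_d"]
public_2dpf = ["gene_a", "gene_c", "gene_z"]
print(genes_without_public_data(trapped, public_2dpf))

# Counts reported in the text for the 193 trapped genes.
print(percent(67, 193), percent(174, 193))
```

Running the same difference once per developmental stage yields the two counts summarized in the Venn diagrams.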

Figure 7

GBT protein trap elucidates novel gene expression patterns in embryonic and larval zebrafish.

(**A–C**) Dorsal views of 2 days post-fertilization (dpf) embryos with GBT protein trap mRFP expression patterns ranging from *bcl11ba* in the forebrain and hindbrain (A), to *col7a1* in the skin (B), and *plpp2a* in the otoliths (C). (**D–F**) Lateral views of 2 dpf embryos with GBT protein trap mRFP expression patterns ranging from *cyth3a* in blood cells (D), to *dph1* in somites (E), and *ino80c* around the yolk (F). (**G–L**) Dorsal views of GBT protein trap mRFP expression patterns in 4 dpf larvae including *nusap1* in the forebrain and midbrain (G), *gpm6ba* in the brain, spinal cord, and pineal gland (H), *unkl* in the olfactory pits (I), *foxl2a* in the forebrain and midbrain (J), *zgc:194659* in the brain and spinal cord (K), and *marcksl1a* in the lens, skin, and notochord (L). (**M–R**) Lateral views of GBT protein trap mRFP expression patterns in 4 dpf larvae including *nfatc3a* in heart and muscle (M), *dele1* in muscle (N), *pard3bb* in the gut and pronephros (O), *LOC100537272* in vessels (P), *mgat5* in neuromasts (Q), and *ahnak* in skin (R). Scale bars = 200 µm. (**S–T**) Area-proportional Venn diagrams of 193 genes trapped in GBT-confirmed lines comparing the ZFIN-assembled database with mRFP expression in GBT lines available through zfishbook at 2 dpf (S) and 4 dpf (T). 67 (35%) and 174 (90%) of 193 genes trapped in GBT-confirmed lines have no description of wild-type expression at 2 dpf and 4 dpf, respectively.

Discussion

Gene-break transposon system as a next generation mutagenesis system

GBT technology represents the first method for revertible allele generation in vertebrates outside of the mouse model (Clark et al., 2011a). In this manuscript we broadened GBT genomic coverage through the development of an RP2 construct series to encode functional mRFP in each of the potential reading frames (Figure 1A). While each individual construct still only integrates in-frame in a subset of introns, the RP2 series potentiates in-frame mRFP for any intron with Tol2-mediated integration. We also desired to increase GBT utility for subsequent genomic engineering applications. As RP2 series constructs were not modularly designed, we iteratively developed a next-generation RP8 GBT series. All RP8 constructs include new restriction enzyme sites that render them modular for custom engineering. Additionally, RP8 series constructs use a smaller backbone designed to enhance transgenic efficiency. RP8 vectors most notably possess a new 3’ exon trap cassette that uses the γ-crystallin promoter to drive expression of lens-specific tagBFP instead of the ubiquitous expression of GFP delivered from RP2 vectors (Figure 1B). We found this lens-specific tagBFP to be equally useful in screening founders with GBT integrations. While we did not explicitly validate that the RP8 series provides transcriptional effects equivalent to the RP2 series, the major functional change in RP8 lies in its 3’ exon trap. We therefore expect the knockdown to be similar between RP8 and RP2.
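The reading-frame coverage argument above can be made concrete with modular arithmetic. The sketch below assumes a simplified model in which an integration is in-frame when the number of coding nucleotides upstream of the trapped intron, plus a construct-specific offset of 0, 1, or 2 nucleotides before the first complete mRFP codon, is a multiple of three. The function names and the `reporter_offset` parameter are illustrative assumptions; the text does not state which RP2 or RP8 construct corresponds to which offset.

```python
def intron_phase(upstream_coding_nt: int) -> int:
    """Phase of an intron: coding nucleotides upstream of it, modulo 3."""
    if upstream_coding_nt < 0:
        raise ValueError("upstream_coding_nt must be non-negative")
    return upstream_coding_nt % 3


def is_in_frame(upstream_coding_nt: int, reporter_offset: int) -> bool:
    """True if the reporter codons continue the upstream reading frame.

    reporter_offset is a hypothetical per-construct value (0, 1, or 2).
    """
    return (upstream_coding_nt + reporter_offset) % 3 == 0


def matching_offsets(upstream_coding_nt: int) -> list[int]:
    """Offsets, out of the three possible, that give an in-frame fusion."""
    return [k for k in (0, 1, 2) if is_in_frame(upstream_coding_nt, k)]


# Each intron phase is matched by exactly one of the three offsets.
for upstream in (99, 100, 101):
    print(upstream, intron_phase(upstream), matching_offsets(upstream))
```

Under this model a single construct covers one intron phase, and a series of three covers all of them, which is the sense in which a three-frame series potentiates in-frame mRFP for any intron.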

Two additional zebrafish transposon-based protein trap vectors have been established. Dr. Fraser’s group developed the FlipTrap system (Trinh et al., 2011) that is mutagenic in the presence of Cre-recombinase. However, this FlipTrap system is primarily focused on imaging fusion proteins in vivo and addressing cellular dynamics. The Chen lab developed a complementary flipping system called the FT1 system that uses either Cre or Flp recombinase to regulate its alleles depending on the original insertion orientation (Ni et al., 2012). Our GBT system is highly complementary and non-redundant with these alternative transposon-based protein trap methods. The 5’ protein trap in the GBT system is terminated through an enhanced polyadenylation signal in conjunction with a putative border element, and a second splice acceptor in the 3’ exon trap helps eliminate any pass-through. Together these elements achieve higher gene-specific mRNA knockdown than the single splice acceptor and the basic polyadenylation signal in FlipTrap and FT1 (Figure 3). In addition to a 5’ protein trap, our GBT system also possesses a 3’ exon trap that serves as a means for screening integrations, reports 3’ sequence, and possesses the ability to trap (without mRFP expression) non-coding RNAs that undergo splicing (Figure 1).

Consistent with its mRNA knockdown abilities, the mutagenic efficacy of the GBT system is functionally similar to other genome-wide forward genetic approaches at identifying critical early developmental loci. The GBT system achieved 7% recovery of visible early (through five dpf) developmental phenotypes during an initial forward genetic screen. This recovery is comparable to the 5% recovered visible phenotypes from the Sanger TILLING consortium analysis of truncated zebrafish genes (Kettleborough et al., 2013). Our 7% phenotype recovery is also comparable to prior retroviral (Amsterdam and Hopkins, 2004) and ENU (Haffter et al., 1996) zebrafish mutagenesis works that estimated between 1400 and 2400 genes (~5–9% of the genome) would result in a visible embryonic phenotype when mutated.

Gene-break transposon system enables functional genome annotation and generates novel potential human disease models

The connections between gene, expression pattern, function, and phenotype (or human disease) can be elucidated using our GBT system. During the initial GBT library creation, we identified a GBT-confirmed line with an RP2 integration between exon 81 and exon 82 of ryr1b (ENSDART00000036015.9, Ensembl Release 100, April 2020), a zebrafish ortholog of human RYR1. Mutations in RYR1 are well-linked to a rare genetic neuromuscular disorder known as central core disease that presents with mild to severe muscle weakness (Jungbluth et al., 2018). Indeed, homozygous animals in this ryr1b GBT-confirmed line possess skeletal muscle-specific mRFP expression and a slow swimming phenotype (Clark et al., 2011a). Previously, a spontaneous mutant called relatively relaxed (ryr1b^mi340/mi340) was shown to have a slow swimming phenotype, a truncated ryr1b protein resulting from a premature stop codon introduced by an insertion, and defective skeletal muscle Ca²⁺ transients (Hirata et al., 2007). Our ryr1b GBT-confirmed line was therefore ideal to validate the functional genome annotation abilities of GBT constructs.

Similar to the relatively relaxed mutants, which carry an insertion that introduces a premature stop codon between exons 48 and 49 of ryr1b (Hirata et al., 2007), we noted severely dampened skeletal muscle Ca²⁺ transients in animals homozygous for the ryr1b GBT allele (Figure 4). In addition to previously reported decreases in peak amplitude (Hirata et al., 2007), we found that ryr1b^mn0348Gt/mn0348Gt animals also displayed a slower upstroke and lower frequency of skeletal muscle Ca²⁺ transients than wild-type animals, functionally annotating these roles for the C-terminal region of the ryr1b gene in vivo. Including this ryr1b line, we generated GBT-confirmed lines with integrations in 64 zebrafish orthologs of human disease-associated genes (Figure 5, Supplementary file 1, Supplementary file 3) in this study. GBT-confirmed lines with integrations in 40 zebrafish orthologs of human disease-associated genes may represent novel potential disease models, as we failed to find any description of existing zebrafish or mouse models for these genes or diseases. Our GBT system importantly possesses a built-in cure due to its revertible nature. Therefore, these GBT potential disease models will allow direct comparison of tissue-specific gene restoration with any therapeutic approach.

GBT protein trapping provides the basis for annotation of functionally diverse proteins and novel transcripts mapped on poorly assembled genomic regions

To achieve genomic representation, unbiased protein trapping is an important consideration. Tol2 transposase-mediated systems are known to facilitate near-random integration, but we wanted to explore this in the context of the GBT protein trapping constructs. We utilized PANTHER, Human Protein Atlas, and UniProtKB to explore the protein class and subcellular localization of the human orthologs of the genes trapped in GBT-confirmed lines (Figure 6). While nuclear localized proteins, such as transcription factors, represented the largest class of GBT-trapped genes, we identified a diverse set of proteins in our GBT-confirmed lines that localize all the way from the nucleoli to the extracellular space. The reason for the enrichment of nuclear genes is unknown. The rich diversity of proteins observed in our GBT-confirmed lines still supports that the entire collection has high diversity and is consistent with the random nature of Tol2-mediated genome integration events (Clark et al., 2011a). Completion of the zebrafish reference genomes also has enabled many new discoveries to be made with regards to the position of hundreds of genes that affect embryogenesis, behavior, and physiology. However, poorly assembled regions remain in both the zebrafish and the human genome (Howe et al., 2013). We indeed found that 10 GBT integrations in the confirmed lines (with mRFP expression) failed to map to any predicted genes. However, RNA sequencing reads in public datasets identified potential unannotated coding sequences aligned with these GBT integration loci (Supplementary file 1). While 5’ and 3’ RACE are necessary to confirm the mRNA fusion products, these unannotated coding sequences represent the possibility to annotate novel, protein-coding transcripts in these GBT lines. Therefore, GBT protein trapping can find, illuminate expression, and elucidate in vivo functions of novel genes and/or gene variants in poorly annotated regions of reference genomes.

GBT protein trapping annotates novel endogenous gene expression

GBT-based mRFP fusion proteins represent a notable advance over traditional techniques for probing endogenous gene expression (e.g., immunohistochemistry, in situ hybridization) that have yielded very little gene expression data at later developmental stages. The truncated mRFP fusion proteins in both RP2 and RP8 series constructs exhibited distinct cellular localizations in our GBT lines throughout development, including two dpf and beyond. Approximately 90% of GBT-confirmed lines showcased novel expression patterns of their annotated genes at four dpf (Figure 7).

These GBT-based mRFP fusion proteins allow investigation of subcellular localization of these tagged-gene products (Figure 6, Figure 6—figure supplement 1), with the exception of cases where the protein localization signal is contained in the C-terminal domain (Clark et al., 2011a; Trinh and Fraser, 2013). We indeed observed mRFP accumulation in the kidney tubules, white blood cells or developing bones in some GBT lines, likely based upon the remaining signal sequences at the N-terminus of the endogenous protein. Still, visualizing protein expression dynamics of GBT-trapped proteins in most lines should facilitate important initial annotations regarding subcellular localization of uncharacterized proteins to investigate molecular functions in vivo. With ever-improving fluorescence-based imaging tools (Liu et al., 2018), our GBT lines have the potential to annotate both cellular and subcellular gene expression at diverse stages on an organismal scale.

Gene-break protein trap library is a rich resource for the community

Taken together, GBT-based mRFP-reporters demonstrate how much we still have left to understand about the expression patterns of the overall proteome and, ultimately, the complex codex that is our genome. Even at the relatively well-studied 2-dpf stage, approximately 35% of GBT-confirmed lines elucidated novel gene expression data (Figure 7). Cataloging these expression patterns enables investigators to make collections of lines with expression in their cell/tissue of interest and/or a phenotype. The remaining 144 GBT-candidate lines and over 800 GBT lines represent a rich resource for genomic discoveries. For any GBT-candidate or GBT lines of interest, a similar cloning pipeline (Figure 2) can be employed to identify the GBT integration locus. In addition, the refinement of the zebrafish genome will enhance our ability to complete the annotation from GBT-line to GBT-confirmed line for any given line with a desired expression profile and/or phenotype. Together, this 1,200+ GBT-line collection is a new contribution for using zebrafish to annotate the vertebrate genome.

Future genomic insights using the GBT system

Although our GBT lines were made with random integration, new targeted integration tools, such as GeneWeld (Wierson et al., 2020), that employ gene editing techniques will empower labs to build custom GBT lines for their gene of interest (El Khoury et al., 2018). The three reading frames and modularity of the RP8 series are especially well suited to targeted integration approaches. Further, a combination of targeted and random integration may best facilitate discovery. For instance, a targeted approach could integrate a GBT cassette into a well-characterized, process-associated gene. Then, random integration could be used to probe for genes that potentiate or abrogate the disruption in the original process-associated gene. With this approach, GBTs are a powerful tool to investigate multigenic processes, including human disease.

GBT system offers advantages over frame-shift mutations

Finally, targeted mutagenic technology, such as CRISPR and TALEN systems, has become the gold standard reverse genetic approach. However, mutant animals engineered using approaches that generate targeted indel mutations frequently fail to display overt phenotypes, often explained by genetic compensation (Balciunas, 2018; El-Brolosy and Stainier, 2017). One mechanism includes cellular increases in transcripts of genes in the same family that can functionally substitute when activated in a mutant background (Balciunas, 2018). Recently, a number of studies using reverse genetics tools have revealed phenotype differences between knockouts (indel mutants) and knockdowns (antisense-treated animals) in multiple model systems including Arabidopsis, Drosophila, zebrafish, mouse, and human cell lines. This discrepancy is attributed to transcriptomic changes in mutant but not in knockdown animals (reviewed in El-Brolosy and Stainier, 2017). For example, knockdown of egfl7, an endothelial extracellular matrix (ECM) gene, induces severe vascular defects, whereas most egfl7 mutants exhibit no obvious defect, resulting from upregulation of other ECM proteins, especially emilins, in egfl7 mutants but not in egfl7 morphants (Ronzitti, 2019). As another mechanism of genetic compensation, mRNA processing (including nonsense-associated exon skipping and the use of alternative start or splice sites to escape nonsense-mediated decay) has been recently demonstrated to hinder loss-of-function approaches in zebrafish (Anderson et al., 2017; Prykhozhij et al., 2017), in human cell lines (Lalonde et al., 2017; Winter et al., 2019) and in the human population (Jagannathan and Bradley, 2016). In contrast, the molecular mechanism of GBT mutagenesis can normally avoid this genetic compensation effect seen with small indel mutations because the strong poly(A)-trapping element in the 5’ exon trap domain of RP cassettes can reduce mRNA to below 1% of the complete wild-type transcript level. This reduction eliminates many sources of transcriptional adaptations triggered by a loss of function mutation, such as alternative transcriptional start sites, splicing, or alternative translation initiation. The GBT can therefore act as a useful validation tool when targeted mutations with other technologies fail to display any phenotype.
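To illustrate what a transcript level below 1% of wild type looks like in quantitative PCR terms, the sketch below applies the widely used comparative Ct calculation (2 to the power of minus ΔΔCt). It is an assumption that this calculation, with perfect amplification efficiency, is an appropriate stand-in here; this section does not describe how knockdown was quantified. The function name and the Ct values are hypothetical and chosen only for illustration.

```python
def relative_expression(ct_target_mut, ct_ref_mut, ct_target_wt, ct_ref_wt):
    """Relative transcript level in mutant versus wild type.

    Uses the comparative Ct calculation, assuming 100% amplification
    efficiency for both the target and the reference gene.
    """
    delta_mut = ct_target_mut - ct_ref_mut   # normalize mutant to reference
    delta_wt = ct_target_wt - ct_ref_wt      # normalize wild type to reference
    delta_delta = delta_mut - delta_wt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold 7 cycles later
# in the mutant while the reference gene is unchanged.
level = relative_expression(
    ct_target_mut=29.0, ct_ref_mut=18.0,
    ct_target_wt=22.0, ct_ref_wt=18.0,
)
print(f"{level:.4f} of wild type, {100 * (1 - level):.1f}% knockdown")
```

In this toy case a shift of seven cycles corresponds to 1/128 of the wild-type level, so a knockdown of 99% or more requires a shift of roughly seven cycles under the stated efficiency assumption.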

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Recombinant DNA reagent	pGBT-RP2.1	Clark et al., 2011a	RRID:Addgene_31828, Genbank: HQ335170	Figure 1A
Recombinant DNA reagent	pGBT-RP2.2	This paper	Genbank: MT815588	Figure 1A
Recombinant DNA reagent	pGBT-RP2.3	This paper	Genbank: MT815589	Figure 1A
Recombinant DNA reagent	pGBT-RP8.1	This paper	Genbank: MT815590	Figure 1B
Recombinant DNA reagent	pGBT-RP8.2	This paper	Genbank: MT815591	Figure 1B
Recombinant DNA reagent	pGBT-RP8.3	This paper	Genbank: MT815592	Figure 1B
Recombinant DNA reagent	pGBT-RP7.1	This paper		An intermediate construct to create pGBT-RP8.1
Recombinant DNA reagent	pGBT-RP6.1	This paper		An intermediate construct to create pGBT-RP8.1
Recombinant DNA reagent	pGBT-RP5.1	This paper		An intermediate construct of pGBT-RP8.1
Recombinant DNA reagent	pre(−1)GBT-RP5.1	This paper		An intermediate construct of pGBT-RP5.1
Recombinant DNA reagent	pre(−2)GBT-RP5.1	This paper		An intermediate construct of pGBT-RP5.1
Recombinant DNA reagent	pre(−3)GBT-RP5.1	This paper		An intermediate construct of pGBT-RP5.1
Recombinant DNA reagent	pKTol2-SE	Clark et al., 2011b
Recombinant DNA reagent	pUC57-I-SceI_LoxP_Splice	This paper		DNA source to create pre(−3)GBT-RP5.1
Recombinant DNA reagent	pUC57	Genscript	SD1176
Recombinant DNA reagent	pKTol2gC-nlsTagBFP	This paper		DNA source to create pre(−2)GBT-RP5.1
Recombinant DNA reagent	pGBT-R15	Clark et al., 2011a	RRID:Addgene_31826, Genbank ID: HQ335168
Recombinant DNA reagent	pGBT-PX	Sivasubbu et al., 2006	RRID:Addgene_31824, Genbank ID: HQ335166
Recombinant DNA reagent	pCR4-bactmIntron	This paper		DNA source to create pGBT-RP8.1
Recombinant DNA reagent	pCR4-bact_I1	This paper		DNA source of the carp beta-actin intron amplified from pGBT-RP2.1
Recombinant DNA reagent	pCR4-TOPO	Invitrogen	450030
Recombinant DNA reagent	pEXPR-mylpfa:GCaMP3	Baxendale et al., 2012
Chemical compound, drug	phenylthiocarbamide	Sigma-Aldrich	P7629
Chemical compound, drug	tricaine	Sigma-Aldrich	A5040
Chemical compound, drug	low melt agarose	Fisher Scientific	BP1360
Chemical compound, drug	pentylenetetrazole	Sigma-Aldrich	P6500
Chemical compound, drug	(S)-(-)-blebbistatin	Tocris	1852
Chemical compound, drug	β-mercaptoethanol	Sigma-Aldrich	M6250
Chemical compound, drug	proteinase K	Roche	3115879001
Commercial assay or kit	T4 DNA ligase	New England Biolabs	M0202S
Commercial assay or kit	RNeasy Micro Kit	QIAGEN	74004
Commercial assay or kit	Stainless steel beads	Next Advance	SSB05
Commercial assay or kit	MaXtract High Density tubes	QIAGEN	129056
Commercial assay or kit	SuperScript II Reverse Transcriptase	Thermo Fisher Scientific	18064014
Commercial assay or kit	SensiFAST SYBR Lo-ROX kit	Bioline	BIO-94005
Commercial assay or kit	QIAquick Gel Extraction Kit	QIAGEN	28704
Software, algorithm	GraphPad Prism 8	GraphPad	RRID:SCR_002798
Software, algorithm	R	www.R-project.org	RRID:SCR_001905
Software, algorithm	R-Studio	www.rstudio.com/
Software, algorithm	pwr package	https://CRAN.R-project.org/package=pwr
Software, algorithm	wilcox.test function	www.R-project.org
Software, algorithm	coin package	Hothorn et al., 2006, Hothorn et al., 2008
Software, algorithm	outliers package	https://CRAN.R-project.org/package=outliers
Software, algorithm	effsize package	https://CRAN.R-project.org/package=effsize
Software, algorithm	JMP version 14	http://www.jmp.com/en_us/software/jmp.html	RRID:SCR_014242
Software, algorithm	SynFind	https://genomevolution.org/CoGe/SynFind.pl
Software, algorithm	BLASTP	http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome	RRID:SCR_001010
Software, algorithm	BioMart, Ensembl tool	http://useast.ensembl.org/biomart/martview/	RRID:SCR_002344
Software, algorithm	PANTHER version 14.1	http://www.pantherdb.org/	RRID:SCR_004869
Software, algorithm	FIJI	https://fiji.sc/	RRID:SCR_002285
Software, algorithm	MetaMorph Microscopy Automation and Image Analysis Software	Molecular Devices	RRID:SCR_002368
Software, algorithm	Digidata 1440A	Molecular Devices
Software, algorithm	Clampex 10.3	Molecular Devices
Software, algorithm	Integrative Genomics Viewer (version 2.4.19)	Thorvaldsdóttir et al., 2013	RRID:SCR_011793
Software, algorithm	Galaxy	https://usegalaxy.org/	RRID:SCR_006281
Software, algorithm	BAMtools	Barnett et al., 2011	RRID:SCR_015987
Software, algorithm	TopHat	Kim et al., 2013	RRID:SCR_013035
Software, algorithm	Zebrafish Information Network (ZFIN)	https://zfin.org/	RRID:SCR_002560
Software, algorithm	Ensembl	https://useast.ensembl.org/index.html	RRID:SCR_002344
Software, algorithm	InParanoid version 8	http://inparanoid.sbc.su.se/cgi-bin/index.cgi	RRID:SCR_006801
Software, algorithm	The Human Protein Atlas	www.proteinatlas.org	RRID:SCR_006710
Software, algorithm	UniProtKB	https://www.uniprot.org/	RRID:SCR_004426
Software, algorithm	Online Mendelian Inheritance in Man (OMIM)	https://omim.org/	RRID:SCR_006437
Software, algorithm	Mouse Genome Informatics (MGI)	http://www.informatics.jax.org	RRID:SCR_006460
Software, algorithm	zfishbook	https://zfishbook.org/	RRID:SCR_006896
Other	RNA-seq dataset	White et al., 2017	GRCz10.WTSI.36hpf.1.bam	ftp://ftp.ensembl.org/pub/data_files/danio_rerio/GRCz10/rnaseq/
Other	RNA-seq dataset	White et al., 2017	GRCz10.WTSI.48hpf.1.bam	ftp://ftp.ensembl.org/pub/data_files/danio_rerio/GRCz10/rnaseq/
Other	RNA-seq dataset	White et al., 2017	GRCz10.WTSI.4dpf.1.bam	ftp://ftp.ensembl.org/pub/data_files/danio_rerio/GRCz10/rnaseq/

Author details

Noriko Ichino
MaKayla R Serres
Rhianna M Urban
Mark D Urban
Anthony J Treichel
Kyle J Schaefbauer
Lauren E Tallant
Gaurav K Varshney
Kimberly J Skuster
Melissa S McNulty
Camden L Daby
Ying Wang
Hsin-kai Liao
Suzan El-Rass
Yonghe Ding
Weibin Liu
Jennifer L Anderson
Mark D Wishman
Ankit Sabharwal
Lisa A Schimmenti
Sridhar Sivasubbu
Darius Balciunas
