Introduction

The ability to manipulate small subsets of neurons is critical to many of the experimental approaches used to study neuronal circuits. In Drosophila, researchers have generated genetic lines that express an exogenous transcription factor, primarily GAL4, in a subset of neurons (Griffith 2012; Venken et al., 2011). The GAL4 protein then drives expression of indicator or effector genes carried in a separate UAS transgenic construct (Fischer et al., 1988; Brand & Perrimon, 1993). This modular approach has proven to be very powerful but depends on generating collections of lines with reproducible GAL4 expression limited to different, specific subsets of cells. In the 1990s, so-called enhancer trap lines were the method of choice (Bellen et al., 1989). In this method, a GAL4 gene that lacks its own upstream control elements is inserted as part of a transposable element into different genome locations where its expression might come under the control of nearby endogenous regulatory elements. However, the resultant patterns were generally broad, with expression in hundreds of cells, limiting their use for manipulating specific neuronal cell types (Manseau et al., 1997; Yoshihara & Ito, 2000; Ito et al., 2003).

The FlyLight Project Team (https://www.janelia.org/project-team/flylight) was started to address this limitation with the overall goal of generating a large collection of GAL4 lines that each drove expression in a distinct small subset of neurons—ideally individual cell types. Many labs studying the upstream regulatory elements of individual genes in the 1980s and 1990s had observed that short segments of DNA located upstream of protein coding regions or in introns, when assayed for enhancer function, frequently drove expression in small, reproducible subsets of cells (see Levine and Tjian 2003). Pfeiffer et al. (2008) developed an efficient strategy for scaling up such assays of individual DNA fragments and showed that a high percentage of 2-3 kb genomic fragments, when cloned upstream of a core promoter driving GAL4 expression, produced distinct patterns of expression. Importantly, these patterns were much sparser than those observed with enhancer traps (Pfeiffer et al., 2008). The approach also took advantage of newly developed methods for site specific integration of transgenes into the genome (Groth et al. 2004), which facilitated the ability to compare constructs by placing them in the same genomic context. FlyLight was established in 2009 to scale up this approach. In 2012, the project reported the expression patterns produced by 6,650 different genomic segments in the adult brain and ventral nerve cord (Jenett et al., 2012) and later in the larval central nervous system (CNS; Li et al., 2014). In the adult central brain, we estimated that this ‘Generation 1’ (Gen1) collection contained 3,850 lines in which the number of labeled central-brain neurons was in the range of 20 to 5,000. These GAL4 driver lines and a collection of similarly constructed LexA lines have been widely used by hundreds of laboratories. However, we concluded that less than one percent of our lines had expression in only a single cell type, highlighting the need for a better approach to generating cell-type-specific driver lines.

To gain more specific expression the project turned to intersectional methods. These methods require two different enhancers to be active in a cell to observe expression of a functional GAL4 transcriptional activator in that cell. We adopted the split-GAL4 approach that was developed by Luan et al. (2008) and subsequently optimized by Pfeiffer et al. (2010) in which an enhancer drives either the activation domain (AD) or the DNA binding domain (DBD) of GAL4 (or optimized alternatives) in separate proteins. When present in the same cell the proteins carrying the AD and DBD domains, each inactive in isolation, dimerize to form a functional GAL4 transcription factor.

The work of Jenett et al. (2012) described the expression patterns of thousands of enhancers. Using these data, anatomical experts could identify Gen1 enhancers that express in the cell type of interest and cross flies that express the AD or DBD half of GAL4 under the control of two of the same enhancers and observe the resultant intersected expression pattern. Large collections of such AD or DBD genetic drivers, which we refer to as hemidriver lines, were generated at Janelia (Dionne et al., 2017) and at the IMP in Vienna (Tirian et al., 2017). In ~5-10% of such crosses, the cell type of interest was still observed, but now as part of a much sparser expression pattern than displayed by either of the initial enhancers. In about 1-2% of these genetic intersections, expression appeared to be limited to a single cell type.

At Janelia, early efforts were directed at targeting neuronal populations in the optic lobes (Tuthill et al., 2013; Nern et al., 2015; Wu et al., 2016) and mushroom bodies (Aso et al., 2014). We were encouraged by the fact that we were able to generate lines specific for the majority of cell types in these populations. With the involvement of additional collaborating groups, the project was extended to several other CNS regions (Table 1). In our initial studies we relied on expert human annotators performing extensive visual surveys of expression data to identify candidate enhancers to intersect. More recently, two advances have greatly facilitated this process. First, we have developed computational approaches (Otsuna et al., 2018, Hirsh et al., 2020, Mais et al., 2021, Meissner et al., 2023) to search databases of neuronal morphologies generated by stochastic labeling (Nern et al., 2015) of several thousand of the Jenett et al. (2012) and Tirian et al. (2017) GAL4 lines. Second, electron microscopy (EM) datasets (Scheffer et al., 2020; Cheong et al., 2023; Marin et al., 2023; Takemura et al., 2023; Nern et al., 2024; Schlegel et al., 2024) have provided comprehensive cell type inventories for many brain regions.

Publications reporting split-GAL4 lines from the adult cell-type-specific collection.

Publications are listed by year. The number of lines from the collection in each publication is listed, along with the CNS regions and/or cell types most commonly labeled. Many of these publications describe additional lines that were not included in the collection described here.

In this report, we summarize the results obtained over the past decade and present a collection of the best cell-type-specific split-GAL4 lines that were identified.

Results

Cell-type-specific split-GAL4 line collection

We describe here a collection of 3060 split-GAL4 lines targeting cell types across the adult Drosophila CNS. All lines were created in collaborations with the FlyLight Project Team at Janelia Research Campus from 2013 to 2023, based on examination of over 77,000 split combinations. 1644 lines were published previously and are drawn together here as part of the larger collection. Table 1 summarizes prior publications. The remaining 1416 lines in the collection are described in other in preparation manuscripts or newly reported here as shown in Figure 1 Supplemental File 1, which specifies proper citations and other line-specific information.

To confirm and consistently document the expression patterns of included lines, all were rescreened by crossing to a UAS-CsChrimson-mVenus reporter inserted in at a specific genomic location (attP18) (Figure 1; Klapoetke et al., 2014). At least one male and one female CNS were dissected, antibody labeled, and imaged per line. Expression patterns were validated by manual qualitative comparison to prior data, where available, and scored for specificity and consistency (see below and Methods). Male/female differences were observed in 109 of these 3060 lines, confirming previously reported sexual dimorphisms or suggesting potential areas for further study (Figure 1F; Figure 1 Supplemental File 1).

Example cell-type-specific lines.

(A) Split-GAL4 line SS52577 is expressed in P-FNv neurons arborizing in the protocerebral bridge, fan-shaped body, and nodulus (Wolff & Rubin, 2018).

(B) Split-GAL4 line MB043C is expressed primarily in PAM-α1 dopaminergic neurons that mediate reinforcement signals of nutritional value to induce stable olfactory memory for driving wind-directed locomotion and higher-order learning (Aso et al., 2014; Ichinose et al., 2015; Aso & Rubin, 2016; Aso et al., 2023; Yamada et al., 2023).

(C) Split-GAL4 line SS40265 is expressed in members of the 8B(t1) cluster of cholinergic neurons that connect the lower tectulum neuropil of the prothorax with the gnathal neuropil and the ventral most border of the vest neuropil of the brain ventral complex.

(D) Split-GAL4 line SS60203 is expressed in ascending neurons likely innervating the wing neuropil.

(E) Split-GAL4 line SS47938 is expressed in LBL40, mediating backwards walking (Feng et al., 2020; same sample used in Fig 5b, CC-BY license).

(F) Split-GAL4 line SS36564 is expressed in female-specific aIPg neurons (F1) and not observed in males (F2; Schretter et al., 2020).

(G) MCFO of split-GAL4 line SS72207 with specific expression in DNg34, a cell type described in Namiki et al., 2018.

Scale bars, 50 µm. See Figure 1 Supplemental file 1 for more line information and Figure 1 Supplemental file 2 for images of all cell-type-specific lines.

Previously published and newer lines were evaluated as a whole, taking advantage of new screening data to highlight existing lines or identify new lines that best label cell types. The lines were selected for inclusion based on several factors:

  1. Diversity: Where cell type information was available, especially from comparison to EM volumes (Scheffer et al., 2020; Cheong et al., 2023; Marin et al., 2023; Takemura et al., 2023; Nern et al., 2024), we generally limited each cell type to the two best split-GAL4 lines, so as to cover a wider range of cell types.

  2. Specificity: 1767 lines were scored as highest quality, well suited to activation-based behavioral studies, with strong and consistent labeling of a single identified cell type and minimal detected off-target expression (Figure 2A; Figure 1 Supplemental File 1). 1258 lines showed spatially segregated off-target expression that doesn’t interfere with neuron visualization for anatomy or physiology (Figure 2B-C; Figure 1 Supplemental File 1). A control line with minimal detected expression was also included (SS01062; Namiki et al., 2018).

  3. Consistency: 34 lines were very specific but showed weaker or less consistent expression (Figure 2D; Figure 2 — figure supplement 1; Figure 1 Supplemental File 1). These lines reveal anatomy but may be challenging to use for manipulations without examination of expression in each individual fly.

  4. Regions of interest: Lines were generated in collaboration with Janelia labs and collaborators studying particular CNS regions or classes of neurons. While this collection includes many high-quality lines across the CNS, most efforts were directed to regions of interest within the CNS (see below and Table 1).

Examples of line quality levels.

The 3060 cell type lines were scored for expression.

(A) Quality level 1 (1767 lines): Split-GAL4 line OL0015B (Wu, Nern, et al., 2016) is specifically and strongly expressed in a single cell type. Occasional weak expression may be seen in other cells.

(B) Quality level 2 (1232 lines): Split-GAL4 line SS59643 (Wolff, et al., 2024) has expression in two cell types. Occasional weak expression may be seen in other cells.

(C) Quality level 3 (26 lines): Split-GAL4 line SS59643 (Wolff, et al., 2024) has expression in three or more cell types.

(D) Quality level 4 (34 lines): Split-GAL4 line SS61022 has specific expression but weak or variable labeling efficiency. See Figure 2 Supplement 1 for examples of variable expression.

(E) Quality level 5: IS36417 is an Initial Split combination not selected for stabilization. Groups of neurons are visible, but the cell type of interest was not labeled with sufficient specificity for further work. Such lines were only included in the raw image collection.

Scale bars, 50 µm.

Example expression variability in Quality level 4 lines. Five samples from split-GAL4 line SS61022 show variable expression. (A-C) Male flies. (D-E) Female flies. Scale bars, 50 µm.

We examined the distribution of the cell type collection across the male and female CNS. To visualize the distribution, we aligned each image to a unisex template and segmented neuron patterns from rescreening images into a heat map (Figure 3). The cell type lines show neuronal expression across most of the CNS, with 96% of voxels having expression from at least 5 lines, 90% with 10 or more lines, and 66% with 20 or more lines (Figure 3 Supplemental File 1). Prominently labeled areas include the fan-shaped body, lobula, and superior medial protocerebrum in the brain, and the T1 medial ventral association center and intermediate tectulum in the ventral nerve cord (VNC). The antennal lobe is very rarely labeled. Other relatively rarely labeled areas include the anterior ventrolateral protocerebrum and prow in the brain and the ovoid (accessory mesothoracic neuropil) in the VNC.

Spatial distribution of cell type lines.

(A-D) Images of one male and one female sample from 3029 cell type rescreening lines were aligned to JRC2018 Unisex (Bogovic et al., 2020), segmented from background (see Methods), binarized, overlaid, and maximum intensity projected, such that brightness indicates the number of lines with expression. All images were scaled uniformly to a maximum brightness equal to 206 lines on Fiji’s “royal” LUT (scale inset in D). This saturated a small portion of the male antennal mechanosensory and motor center (AMMC) that reached a peak value of 260 lines per voxel, for the purpose of better visualizing the rest of the CNS.

(A) Female CNS.

(B) Male CNS.

(C) Female image stack minus male, then maximum intensity projected.

(D) Male image stack minus female, then maximum intensity projected.

(E-F) Split-GAL4 line SS56987 (Schretter, et al., 2020) is sex-specifically expressed in pC1d neurons that largely lie within the region of the female central brain highlighted in (C). Male and female images have different brightness scales.

(G-H) Split-GAL4 line SS35230 (Shuai, et al., 2024) is sex-specifically expressed in abdominal ganglion neurons that largely lie within the region of the female VNC highlighted in (C). Male and female images have different brightness scales. All scale bars, 50 µm.

We compared the line distribution between female and male images (Figure 3C-D). We observed more female split-GAL4 expression in the epaulette and abdominal ganglion, and more male expression in the antennal mechanosensory and motor center and several regions of the mushroom body (Figure 3 Supplemental File 1). Two example sexually dimorphic split-GAL4 lines from the cell type collection illustrate regions of increased split-GAL4 expression in females (Figures 3E-H). The lines were identified based on NeuronBridge searches of the ‘female minus male’ image (Clements et al., 2021; Clements et al., 2024). Examination of individual male images suggests that (in contrast to the female) the subtraction approach highlighted male artifacts of the reporter system: either background incompletely segmented away from neurons (e.g. in the metathoracic ganglion) or repetitive neuron signal unrelated to other line-specific expression (‘sparse-T’ neuron and projections into the AMMC; Meissner, et al., 2023).

The collection of lines described above has been deposited with Bloomington Drosophila Stock Center (Bloomington, IN) for availability until at least August 31, 2026. Images and line metadata are available at https://splitgal4.janelia.org. Anatomical searching for comparison to other light microscopy (LM) and EM data is available at https://neuronbridge.janelia.org.

Raw image data collection and analysis workflow

In addition to the line collection described above, we describe a split-GAL4 image data resource. It consists of all good quality FlyLight split-GAL4 images and associated metadata generated in the project’s first nine years of split-GAL4 characterization (Figure 1 Supplemental File 4). It includes additional data for the lines described above, together with data for many additional split-Gal4 combinations. Many of these represent lower quality lines that label multiple nearby cell types (Figure 2D and see below), but the image collection also includes additional high quality lines that were not chosen for stabilization or are currently not maintained as stable lines (for example, because of similarity to other lines in the collection).

FlyLight’s analysis workflow consisted of multiple related image-generating pipelines (Figure 4A-B). Typically, we screened an “Initial Split” (IS) cross temporarily combining the two candidate split hemidrivers and reporter before building (“stabilizing”) a genetically stable “Stable Split” (SS) line. IS and SS data with the same 5-6 digit code reflect the same combination of split hemidrivers. Some split-GAL4 lines use region-specific nomenclatures, with names prefixed by “MB” (mushroom body), “OL” (optic lobe), or “LH” (lateral horn), and in most cases label neurons within or nearby those regions. We examined about 77,000 IS crosses and after quick visual evaluation chose to image a fly CNS sample from about half of them. 36,033 such IS samples (individual flies) are included in this image collection (Table 2). About 10% of IS crosses were made into SS lines and imaged again, with 8,273 such lines included.

Split-GAL4 workflow and FlyLight data release statistics

(A) Typical workflow of predicting and characterizing split-GAL4 combinations.

(B) Example with Split-GAL4 line SS23880 (Garner et al., 2023).

Summary of FlyLight adult fly line and image releases

Includes descriptions of Jenett et al., 2012; Dionne et al., 2018; Tirian & Dickson, 2017; publications in Table 1; and stock distribution by Bloomington Drosophila Stock Center and Janelia Fly Facility. Section with italic text is specific to publications from Table 1 and this publication. Image counts are unique between categories, whereas line counts overlap extensively between categories. “Image tiles” are considered as unique 3D regions, with each MCFO tile captured using two LSM image stacks. Stock shipments count each shipment of each stock separately. Raw and processed images are available at https://www.janelia.org/gal4-gen1, https://gen1mcfo.janelia.org, https://splitgal4.janelia.org, and many can be searched based on anatomy at https://neuronbridge.janelia.org.

Further documentation of the full SS pattern at higher quality with varying levels of pre-synaptic labeling was generated by the Polarity pipeline (Aso et al., 2014; Sterne et al., 2021), followed by full 20x imaging and selected regions of interest imaging at 63x, with 37,409 such samples from 7,039 lines included. Single neuron stochastic labeling by MultiColor FlpOut (MCFO; Nern, et al., 2015) revealed single neuron morphology and any diversity latent within the full SS pattern, with 54,807 such samples from 7,679 lines included in the image collection.

In total the raw image collection consists of 46,653 IS/SS combinations, 129,665 samples (flies), 612,124 3D image stacks, and 4,371,364 secondary (processed) image outputs, together 192 TB in size. The IS data may be particularly valuable for work on understudied CNS regions, as it contains biological intersectional results (as opposed to computational predictions) that may have specific expression outside regions focused on in prior publications (Table 1).

Due to the size of the raw image collection, it has not undergone the same level of validation as our other image collections, and caution is recommended in interpreting the data therein. We believe it is nonetheless of overall good quality. The images are available at https://flylight-raw.janelia.org and are gradually being made searchable via NeuronBridge due to the size of the dataset. At the time of this version of record, the 36,000 IS images have been added, with the remaining data planned to be added by the end of 2024. For a further guide to interpreting the image data, see Figure 4 Supplemental file 1.

Larval split-GAL4 line collection

We include 982 split-GAL4 lines, 31 Generation 1 GAL4 lines, and 353 Generation 1 LexA lines selected for specific expression in the third-instar larval CNS (see Figure 1 Supplemental file 1). Seven more split-GAL4 lines were selected based on both larval and adult expression.

We selected larval lines based on the following criteria: 1) The projection pattern had to be sparse enough for individual neurons in a line to be identifiable; 2) The line had to be useful for imaging of neural activity or patch-clamp recording; and 3) the line had to be useful for establishing necessity or sufficiency of those neurons in behavioral paradigms. We thus selected lines with no more than 2 neurons per brain hemisphere or per subesophageal zone (SEZ) or VNC hemisegment. When there are more neurons in an expression pattern, overlap in projections makes it hard to uniquely identify these neurons. Also, interpreting neural manipulation results in behavioral experiments becomes much more challenging.

If lines had sparse expression in the brain or SEZ but also had expression in the VNC (sparse or not), we nevertheless considered them useful for studying the brain or SEZ neurons because 1) the brain neurons could be identified; 2) the line could be used for imaging or patch-clamp recording in the brain; and 3) the split-GAL4 line could be cleaned up for behavioral studies, using teashirt-killer-zipper to eliminate VNC expression (Dolan et al., 2017). Similarly, lines with specific VNC expression but with undesired additional expression in the brain or SEZ were still useful for imaging and behavioral studies in the VNC, because a loss of phenotype following teashirt-killer-zipper could establish causal involvement of VNC neurons. For LexA lines, we included a few with 3 or more neurons per hemisphere or per hemisegment because there are fewer LexA lines overall and they could still be used for functional connectivity studies.

Images of all of the selected lines are available at Virtual Fly Brain (VFB; https://raw.larval.flylight.virtualflybrain.org/) and will be further integrated into the main VFB website (https://virtualflybrain.org/). We anticipate these lines will be useful tools in conjunction with electron microscopy volumes of the larval nervous system.

Discussion

This collection of cell-type-specific split-GAL4 lines and associated image resource is a toolkit for studies of many Drosophila neurons. The adult cell-type-specific line collection covers many identified cell types, and the images are being made instantly searchable for LM and EM comparisons at NeuronBridge, enabling selection of existing or new combinations of genetic tools for specific targeting based on anatomy. In total, the Janelia FlyLight Project Team has contributed 540,000 3D images of 230,000 GAL4, LexA, and split-GAL4 fly CNS samples. As of this publication, Janelia and Bloomington Drosophila Stock Center have distributed FlyLight stocks 300,000 times to thousands of groups in over 50 countries. We have also released standardized protocols for fly dissection and immunolabeling at the FlyLight protocols website https://www.janelia.org/project-team/flylight/protocols.

Several caveats should be kept in mind when using these tools and other transgenic systems. Many split-GAL4 lines still have some off-target expression (see Figure 2). It is always good practice to validate results using multiple lines labeling the same neurons. In most cases, the off-target expression will not be present in multiple lines, supporting assigning a phenotype to the shared neurons. The less precise a set of lines, the more lines and other supporting evidence should be used. Different effectors and genomic insertion sites also vary in strength and specificity of expression and should be validated for each application (Pfeiffer et al., 2010; Aso et al., 2014). For example, MCFO images can have brighter single neurons than in full split-GAL4 patterns. UAS-CsChrimson-mVenus allows for a direct correlation between labeling and manipulation that isn’t available for every effector. See Figure 4 Supplemental File 1 for other examples of differences between effectors used in this study. If multiple transgenes are inserted into the same genomic locus, they can interact directly in unanticipated ways (transvection; Mellert & Truman, 2012). Fly age should also be controlled, as expression can vary over fly development (Harris et al., 2015), and effectors may take time to mature after expression (Baird et al., 2000).

Cell type consensus naturally evolves as new observational modalities are developed (e.g. connectomics and RNA sequencing) and comparisons are made across existing data, such as between EM volumes. Not all lines in this collection have cell type information, and some types will likely change. Different cell types also vary widely in their number of constituent neurons, and within a type there is often some variation between hemispheres and individuals. Thus, for any cell types with more than one neuron we cannot claim to have fully labeled every neuron of a type.

Despite extensive efforts, we have not developed specific lines for every cell type in the fly CNS. Moreover, the peripheral nervous system and the rest of the body were beyond the scope of this effort. Our hemidrivers likely do not effectively label some CNS cell types (discounting extremely broad expression), as they are based on Generation 1 collections that together may only label about half of all biological enhancers. As a rule of thumb, we estimate being able to use these tools to create precise split-GAL4 lines for one third of cell types, imprecise lines for another third, and no usable line for the remaining third of cell types. This hit rate nonetheless is often enough to characterize key circuit components. Continued iteration with the existing toolset is expected to lead to diminishing returns on improved CNS coverage.

An ideal toolkit with complete coverage of cell types with specific lines would require additional development. Capturing a full range of enhancers requires alternatives to our model of short genetic fragments inserted into a small number of genomic locations. Predicting enhancer combinations based on transcriptomic data and capturing them at their native location holds promise (e.g., Pavlou et al., 2016; Chen et al., 2023), though high-quality transcriptomics data with cell type resolution are not yet available for much of the CNS. Insertions throughout the genome bring their own challenges of unpredictable side-effects, however, and specific targeting of every cell type remains a challenge. Further restriction by killer zipper, GAL80, Flp, or other intersectional techniques will likely remain an area of development (Pavlou et al., 2016; Dolan et al., 2017; Ewen-Campen et al., 2023).

The Janelia FlyLight Project Team has achieved its goal of developing tools to study the Drosophila nervous system. The team and collaborators have delivered GAL4/LexA lines labeling many neurons, single-neuron images enabling correlation with EM and the prediction of split-GAL4 lines, this collection of cell-type-specific split-GAL4 lines, an image resource of many more split combinations, and a series of focused studies of the nervous system.

Materials and methods

Key Resources Table

Fly driver stocks

Included split-GAL4 driver stocks and references are listed in Figure 1 Supplemental File 1. See Table 2 for counts of driver stocks available from Bloomington Drosophila Stock Center.

Fly reporter stocks

Rescreening of the cell type collection made use of a 20XUAS-CsChrimson-mVenus in attP18 reporter (Klapoetke et al, 2014). Additional reporters are listed in the Key Resources Table and Figure 1 Supplemental File 4.

Fly crosses and dissection

Flies were raised on standard cornmeal molasses food typically between 21 and 25 C and 50% humidity. Crosses with hs-Flp MCFO were at 21-22 C, except for a 10-60 minute 37 C heat shock. Other crosses were typically at 25 C for most of development and 22 C in preparation for dissection. Flies were typically dissected at 1-5 days old for the cell type collection rescreening, 1-8 days old for other non-MCFO crosses and 3-8 days old for MCFO images reported in this study. Dissection and fixation were carried out as previously described (Aso et al., 2014; Nern et al., 2015). Protocols are available at https://www.janelia.org/project-team/flylight/protocols.

Immunohistochemistry, imaging, and image processing

Immunohistochemistry, imaging, and image processing were carried out as previously described (Aso et al., 2014; Nern et al., 2015; Sterne et al., 2021; Meissner et al., 2023). See Key Resources table for antibodies. Additional details of image processing, including source code for processing scripts, are available at https://data.janelia.org/pipeline. See Figure 4 Supplemental Figure 1 for a user guide to interpreting the image data.

Cell type collection rescreening

If inconsistencies were observed in expression between rescreened images and older image data for a line, the cross was repeated. If still inconsistent, the other copy of the stock (all are kept in duplicate) was examined. If only one of the two stock copies gave the expected expression pattern, it was used to replace the other copy. Repeated problems with expression pattern, except for stochasticity within a cell type, were grounds for removal from the collection. If sex differences were observed, crosses were also repeated. Lines reported with sex differences showed the same difference at least twice. We could not eliminate the possibility of stock issues appearing as sex differences by coincidentally appearing different between sexes and consistent within them.

Line quality levels

Cell type lines were qualitatively scored first by visual examination of averaged color depth Maximum Intensity Projection (MIP) CNS images (a version of Figure 1 Supplemental File 2 with less image compression; Otsuna et al., 2018), and then by a second observer examining individual images. Scores were applied as described in Figure 2. As images were taken from a variety of reporters, an effort was made to discount background specific to the reporters, especially 5XUAS-IVS-myr::smFLAG in VK00005.

Line distribution analysis

3,029 cell type split-GAL4 lines were analyzed, each including images of one male and one female fly sample from rescreening. 6,058 brains and 6,058 VNCs were included. Images were aligned to the JRC2018 Unisex template (Bogovic et al., 2020) and segmented using Direction-Selective Local Thresholding (DSLT; Kawase et al., 2015). DSLT weight value, neuron volume percentage, MIP ratio against the aligned template, and MIP-weight ratio were adjusted to refine segmentation. DSLT connected fragmented neurons with up to a 5 µm gap, enhancing neuronal continuity. Noise was eliminated and objects were separated into individual neurons using a connecting component algorithm. The algorithm distinguishes individual neurons by recognizing connected voxel clusters as distinct entities based on DSLT 3D segmentation, followed by removal of voxels below 45/4095 (12 bit) brightness and objects smaller than 30 voxels.

The segmentation produced 19,313 non-overlapping brain objects and 18,860 VNC objects, consisting of neurons, trachea, debris, and antibody background. Most non-neuronal objects and some very dim neurons were eliminated using color depth MIP-based 2D shape filters and machine learning filters, followed by visual inspection, yielding 10,778 brain and 9,310 VNC objects. We observed 74 objects containing tissue background with neurons. The background was manually 3D removed using VVDViewer. Neuron masks were combined into a single volume heat map with a newly developed Fiji plugin “Seg_volume_combine_to_heatmap.jar” (https://github.com/JaneliaSciComp/3D-fiber-auto-segmentation/tree/main). Voxel brightness indicates the number of objects (at most one per line) within each voxel.

Acknowledgements

This work is part of the FlyLight Project Team at Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA. Author order includes the following alphabetical groups: FlyLight Project Team, Janelia Fly Facility, Janelia Scientific Computing Shared Resource, and contributing laboratories. We thank Kristin Scott for visitor project mentorship and the generation of lines labeling the subesophageal ganglion. We thank Arnim Jenett and Teri Ngo for early contributions to split-GAL4 screening. We thank Yichun Shuai for annotations of lines generated with the Aso lab. We thank Project Technical Resources for management coordination and staff support. We thank Melanie Radcliff and Crystal Sullivan Di Pietro for administrative support. We thank Mark Bolstad, Tom Dolafi, Leslie L Foster, Sean Murphy, Donald J Olbris, Todd Safford, Eric Trautman, and Yang Yu for their work on software infrastructure. We thank Olivia DeThomasis, Raj Nair, Hina Naz, Jessalyn Pugh, Brian Rebhorn, Rose Rogall, and Rodney Simmons for preparation of fly food. We thank David Shepherd and Sam Whitehead for their help with analyzing VNC interneuron types. We thank Stephan Saalfeld for imaging discussions. Stocks obtained from the Bloomington Drosophila Stock Center (NIH P40OD018537) were used in this study. We thank them, especially Annette Parks, Cale Whitworth, and Sam Zheng, for the maintenance and distribution of stocks from the Janelia collection. We thank Rob Court, Alex McLachlan, and David Osumi-Sutherland for their work on Virtual Fly Brain and coordination on image distribution. We thank Brian Crooks and Zbigniew Iwinski of Zeiss for microscopy assistance. This work was supported in part by the Janelia Visiting Scientist Program. This article is subject to HHMI’s Open Access to Publications policy. HHMI lab heads and project team leads have previously granted a nonexclusive CC BY 4.0 license to the public and a sublicensable license to HHMI in their research articles. Pursuant to those licenses, the author-accepted manuscript of this article can be made freely available under a CC BY 4.0 license immediately upon publication.

Author contributions

Geoffrey W Meissner, Methodology, Validation, Formal analysis, Investigation, Data curation, Writing—original draft, Writing—review & editing, Visualization, Supervision, Project administration; Allison Vannan, Validation, Investigation, Data curation, Project administration; Jennifer Jeter, Validation, Investigation, Visualization, Project administration; Kari Close, Investigation; Gina M DePasquale, Investigation; Zachary Dorman, Validation, Investigation, Project administration; Kaitlyn Forster, Investigation; Jaye Anne Beringer, Investigation; Theresa V Gibney, Investigation; Joanna H Hausenfluck, Investigation; Yisheng He, Investigation; Kristin Henderson, Investigation, Writing—review & editing; Lauren Johnson, Investigation; Rebecca M Johnston, Methodology, Validation, Investigation, Project administration; Gudrun Ihrke, Writing—review & editing, Supervision, Project administration; Nirmala Iyer, Investigation; Rachel Lazarus, Validation, Investigation, Project administration; Kelley Lee, Validation, Investigation, Project administration; Hsing-Hsi Li, Validation, Investigation, Project administration; Hua-Peng Liaw, Investigation; Brian Melton, Validation, Investigation; Scott Miller, Investigation; Reeham Motaher, Investigation; Alexandra Novak, Investigation; Omotara Ogundeyi, Investigation; Alyson Petruncio, Investigation; Jacquelyn Price, Validation, Investigation, Project administration; Sophia Protopapas, Investigation; Susana Tae, Validation, Investigation, Project administration; Jennifer Taylor, Investigation; Rebecca Vorimo, Validation, Investigation, Project administration; Brianna Yarbrough, Investigation; Kevin Xiankun Zeng, Validation, Investigation; Christopher T Zugates, Investigation, Supervision, Project administration; Heather Dionne, Investigation; Claire Angstadt, Investigation; Kelly Ashley, Investigation; Amanda Cavallaro, Validation, Investigation, Supervision, Project administration; Tam Dang, Investigation; Guillermo A Gonzalez III, Investigation; Karen L Hibbard, Investigation; Cuizhen Huang, Investigation; Jui-Chun Kao, Validation, Investigation; Todd Laverty, Supervision, Project administration; Monti Mercer, Investigation; Brenda Perez, Investigation; Scarlett Pitts, Validation, Investigation; Danielle Ruiz, Investigation; Viruthika Vallanadu, Investigation; Grace Zhiyu Zheng, Investigation; Cristian Goina, Software; Hideo Otsuna, Methodology, Software, Formal analysis, Data curation, Visualization; Konrad Rokicki, Software, Writing—review & editing, Visualization, Supervision; Robert R Svirskas, Software, Visualization; Han SJ Cheong, Investigation, Data curation; Michael-John Dolan, Investigation, Data curation; Erica Ehrhardt, Investigation, Data curation; Kai Feng, Investigation, Data curation; Basel El Galfi, Investigation, Data curation; Jens Goldammer, Investigation, Data curation, Writing—review & editing; Stephen J Huston, Investigation, Data curation; Nan Hu, Investigation, Data curation; Masayoshi Ito, Investigation, Data curation; Claire McKellar, Investigation, Data curation; Ryo Minegishi, Investigation, Data curation; Shigehiro Namiki, Investigation, Data curation; Aljoscha Nern, Conceptualization, Methodology, Investigation, Data curation, Writing—review & editing; Catherine E Schretter, Investigation, Data curation; Gabriella R Sterne, Investigation, Data curation, Writing—review & editing; Lalanti Venkatasubramanian, Investigation, Data curation; Kaiyu Wang, Investigation, Data curation; Tanya Wolff, Investigation, Data curation, Writing— review & editing; Ming Wu, Investigation, Data curation; Reed George, Supervision; Oz Malkesman, Supervision, Project administration; Yoshinori Aso, Conceptualization, Methodology, Investigation, Data curation, Writing—review & editing, Supervision; Gwyneth M Card, Conceptualization, Supervision; Barry J Dickson, Conceptualization, Supervision; Wyatt Korff, Conceptualization, Supervision, Funding acquisition; Kei Ito, Conceptualization, Writing—review & editing, Supervision; James W Truman, Conceptualization, Supervision; Marta Zlatic, Conceptualization, Writing—review & editing, Supervision; Gerald M Rubin, Conceptualization, Methodology, Investigation, Writing—original draft, Writing—review & editing, Supervision, Project administration, Funding acquisition; FlyLight Project Team, Investigation, Validation

Data availability

The footprint of this image resource (~200 TB) exceeds our known current practical limits on standard public data repositories. Thus, we have made all the primary adult data (and a variety of processed outputs) used in this study freely available under a CC BY 4.0 license at https://doi.org/10.25378/janelia.21266625.v1 and through the publicly accessible websites https://splitgal4.janelia.org and https://flylight-raw.janelia.org. Many images are being made searchable with the same permissions on the user-friendly NeuronBridge website https://neuronbridge.janelia.org. NeuronBridge code is available at Clements et al., 2021 and the application and implementation are discussed further in Clements et al., 2024. Larval images are available at https://raw.larval.flylight.virtualflybrain.org/. All other data generated or analyzed during this study are included in the manuscript and supporting files.