From EM neurons to peripheral targets via GAL4 lines

Goal

Generate hypotheses linking VNC efferent neurons to specific peripheral tissues by chaining three resources:

1. MANC (Male Adult Nerve Cord) connectome via the malevnc R package, to pick up abdominal efferents:
   - MNad* abdominal motor neurons;
   - EN* efferent / endocrine neurons with soma in an abdominal neuromere (A1–A8). Note: the prefix is EN, not ENN.
2. NeuronBridge colour-depth MIP search (data release v3_9_0, the first to ship MANC MIPs): find GAL4 / split-GAL4 lines that label each EM cell.
3. KDRC KGutProject (Lim et al. 2021): for any matched line, look up which adult-gut region it stains. A shared line is indirect evidence that the EM cell innervates that region.

Very few VNC efferents have a confirmed peripheral target, so the output is a ranked list of candidates for follow-up experiments.

Template space. This vignette uses NeuronBridge’s pre-computed VNC MIP libraries; it does not render new MIPs from MANC meshes. If you want to render MIPs to compare directly, the VNC space is JRC2018VNCU_HR (dims 573 × 1119 × 219 voxels, voxdims 0.461 × 0.461 × 0.7 µm), the VNC analogue of the JRC2018U_HR space used by the vignettes "Render a MIP from an EM neuron" and "Render a MIP from registered LM data". Pass target_space = "VNC" to nrrd_to_mip().
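
As a quick check on those numbers, the physical extent of the template is dims × voxdims. A minimal sketch using only the values quoted above (the variable names are illustrative, not part of any package):

```r
# Physical extent of the VNC template, from the dims and voxel sizes quoted above.
dims    <- c(x = 573,   y = 1119,  z = 219)   # voxels
voxdims <- c(x = 0.461, y = 0.461, z = 0.7)   # microns per voxel

extent_um <- dims * voxdims                   # element-wise product, in microns
round(extent_um, 1)
```

That works out to roughly 264 × 516 × 153 µm, which is a useful sanity check when comparing against your own registered stacks.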

Prerequisites

A neuPrint authentication token (see ?neuprintr::neuprint_login). Put it in .Renviron as neuprint_token=....
Companion natverse packages:

if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("natverse/neuronbridger")
remotes::install_github("natverse/malevnc")
remotes::install_github("natverse/neuprintr")

library(neuronbridger)
library(malevnc)
library(neuprintr)
library(dplyr)

NB_VERSION <- "v3_9_0"   # first NB release that includes MANC MIPs
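
Before connecting, it can help to confirm that the token is actually visible to the R session. A minimal sketch, assuming the variable name neuprint_token given above; the message text is illustrative:

```r
# Check that the neuPrint token from .Renviron has been picked up.
# Sys.getenv() returns "" when the variable is unset.
has_token <- nzchar(Sys.getenv("neuprint_token"))

if (!has_token) {
  message("neuprint_token is not set; add it to .Renviron and restart R.")
}
```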

Step 1: Pick target neurons from MANC

manc_neuprint() opens a connection against the public MANC release (manc:v1.2.3 at time of writing). manc_neuprint_meta() returns one row per body, with columns including bodyid, type, class, somaSide, somaNeuromere (values T1–T3, A1–A10), entryNerve, exitNerve and predictedNt.

conn <- manc_neuprint()

# Motor neurons whose cell type starts "MNad" — these exit through abdominal
# nerves (AbN1–AbN4) to innervate abdominal muscles.
mn.ad <- manc_neuprint_meta("/type:MNad.*", conn = conn)

# Efferent neurons (EN*) with somata in an abdominal segment. These include
# neurosecretory / endocrine cells that project out of the CNS through
# peripheral nerves.
en.ab <- manc_neuprint_meta("/type:EN.*", conn = conn) |>
  filter(somaNeuromere %in% paste0("A", 1:8))

targets <- bind_rows(
  mn.ad |> mutate(group = "MNad motor"),
  en.ab |> mutate(group = "abdominal EN")
) |>
  mutate(bodyid = as.character(bodyid)) |>
  select(bodyid, type, group, somaNeuromere, somaSide, exitNerve, predictedNt)

targets
# MANC has 229 MNad* motor neurons and 44 abdominal EN* efferents
# at the v1.2.3 release — 273 targets in total.
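
The selection logic can be tried offline on a toy table. This is only a sketch: the rows and type names below are made-up placeholders rather than real MANC bodies, and grepl() stands in for the regex query sent to neuPrint.

```r
library(dplyr)

# Hypothetical metadata rows, for illustration only.
meta.toy <- data.frame(
  bodyid        = c(1, 2, 3, 4),
  type          = c("MNad_x", "EN_y", "EN_z", "IN_w"),
  somaNeuromere = c("A3", "A5", "T2", "A1")
)

# Same two filters as above: type prefix, plus an abdominal soma for EN*.
mn.toy <- meta.toy |> filter(grepl("^MNad", type))
en.toy <- meta.toy |>
  filter(grepl("^EN", type), somaNeuromere %in% paste0("A", 1:8))

targets.toy <- bind_rows(
  mn.toy |> mutate(group = "MNad motor"),
  en.toy |> mutate(group = "abdominal EN")
) |>
  mutate(bodyid = as.character(bodyid))

targets.toy
```

The thoracic EN_z row and the non-efferent IN_w row are dropped, leaving one motor and one abdominal EN target.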

As a sanity check, plot a handful:

library(nat)
nopen3d()
neurons <- manc_read_neurons(targets$bodyid[1:10], conn = conn)
plot3d(neurons, soma = 1000, lwd = 2)

Step 2: Find NeuronBridge matches

2a — Resolve each body’s MANC MIP

neuronbridge_info() queries the per-body JSON and returns one row per MIP that NeuronBridge holds for that body. A single MANC neuron typically shows up in three libraries (MANC, VNC, male-CNS); we want the MANC one, because that is the MIP whose hits were precomputed against the FlyLight VNC libraries.

mip.map <- list()
for (bid in targets$bodyid) {
  info <- try(neuronbridge_info(bid, dataset = "by_body",
                                version = NB_VERSION), silent = TRUE)
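  # Skip bodies with no NeuronBridge entry; guard against NULL as well as
  # errors and empty tables, since nrow(NULL) is not a valid condition.
  if (inherits(info, "try-error") || is.null(info) || !nrow(info)) next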
  manc <- info[grepl("MANC", info$libraryName, ignore.case = TRUE), ]
  if (!nrow(manc)) next
  manc$bodyid <- bid
  mip.map[[bid]] <- manc[1, c("bodyid", "nb.id", "libraryName")]
}
mip.map <- do.call(rbind, mip.map)
nrow(mip.map)        # ~ all 273 targets resolve

2b — Pull precomputed colour-depth hits

hits <- list()
for (i in seq_len(nrow(mip.map))) {
  h <- try(neuronbridge_hits(mip.map$nb.id[i], version = NB_VERSION),
           silent = TRUE)
  if (inherits(h, "try-error") || is.null(h) || !nrow(h)) next
  h$bodyid <- mip.map$bodyid[i]
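  # Scores may arrive as character; convert before sorting so the order is
  # numeric rather than lexicographic.
  h$normalizedScore <- as.numeric(h$normalizedScore)
  # Best MIP per line for this one neuron.
  h <- h[order(h$normalizedScore, decreasing = TRUE), , drop = FALSE]
  h <- h[!duplicated(h$publishedName), , drop = FALSE]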
  hits[[i]] <- h
}
hits <- do.call(plyr::rbind.fill, hits)
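
The "best MIP per line" step depends on the scores being sorted numerically. The sketch below uses made-up line names and assumes scores might be held as strings (the later as.numeric() call suggests they can be); it shows why the type matters and how order() plus !duplicated() keeps one row per line.

```r
# Hypothetical hits for one neuron, with scores stored as strings.
h.toy <- data.frame(
  publishedName   = c("LineA", "LineB", "LineA"),
  normalizedScore = c("9000", "50000", "12000"),
  stringsAsFactors = FALSE
)

# Sorting strings is lexicographic, so "9000" ranks above "50000".
wrong_top <- h.toy$publishedName[order(h.toy$normalizedScore,
                                       decreasing = TRUE)][1]

# Convert first, then sort and keep the first (best) row per line.
h.toy$normalizedScore <- as.numeric(h.toy$normalizedScore)
h.toy <- h.toy[order(h.toy$normalizedScore, decreasing = TRUE), , drop = FALSE]
best  <- h.toy[!duplicated(h.toy$publishedName), , drop = FALSE]
best
```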

Restrict to FlyLight VNC LM libraries (brain libraries are not informative for a motor or endocrine neuron):

hits.vnc <- hits |>
  filter(grepl("FlyLight|Split", libraryName, ignore.case = TRUE),
         !grepl("Brain",        libraryName, ignore.case = TRUE)) |>
  mutate(normalizedScore = as.numeric(normalizedScore))

# Annotate back with the target-neuron side of the join.
hits.vnc <- merge(hits.vnc,
                  targets[, c("bodyid", "type", "group",
                              "somaNeuromere", "exitNerve")],
                  by = "bodyid", all.x = TRUE)

2c — Per-neuron top-5 with an elbow cutoff

normalizedScore is clipped at 50,000, and well-targeted neurons often have several top lines within ~10% of each other, so a fixed score threshold is the wrong tool. Instead we keep the top line, then accept line k only if its score is ≥ 75% of line k−1's score and ≥ 20% of the top score, up to a cap of 5 lines. The 75% test is cumulative: the first line to fail it ends the run, so nothing below that drop is kept.

top_with_elbow <- function(df, cap = 5,
                           drop_frac  = 0.75,
                           floor_frac = 0.20) {
  df <- df[order(df$normalizedScore, decreasing = TRUE), , drop = FALSE]
  if (!nrow(df)) return(df)
  keep <- TRUE
  if (nrow(df) > 1) {
    ratios <- df$normalizedScore[-1] / df$normalizedScore[-nrow(df)]
    keep <- c(TRUE,
              cumprod(ratios >= drop_frac) == 1 &
              df$normalizedScore[-1] / df$normalizedScore[1] >= floor_frac)
  }
  head(df[keep, , drop = FALSE], cap)
}

hits.top <- hits.vnc |>
  group_by(bodyid) |>
  group_modify(~ top_with_elbow(.x)) |>
  ungroup()

# Lines kept per neuron:
table(table(hits.top$bodyid))
# In practice, most neurons hit the cap of 5 (top-5 scores are close together)
# and only a handful collapse to 1 line.
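
To see the elbow rule in action, here is a small worked example on made-up scores. The function is repeated from above so the block runs on its own; the line names and scores are hypothetical.

```r
top_with_elbow <- function(df, cap = 5, drop_frac = 0.75, floor_frac = 0.20) {
  df <- df[order(df$normalizedScore, decreasing = TRUE), , drop = FALSE]
  if (!nrow(df)) return(df)
  keep <- TRUE
  if (nrow(df) > 1) {
    s      <- df$normalizedScore
    ratios <- s[-1] / s[-length(s)]          # each score relative to the one above
    keep   <- c(TRUE,
                cumprod(ratios >= drop_frac) == 1 &   # no big drop so far
                s[-1] / s[1] >= floor_frac)           # still >= 20% of the top
  }
  head(df[keep, , drop = FALSE], cap)
}

# A clear elbow after the second line: 48000 -> 30000 is a ratio of 0.625.
elbow <- data.frame(publishedName   = paste0("Line", 1:5),
                    normalizedScore = c(50000, 48000, 30000, 29000, 9000))
kept.elbow <- top_with_elbow(elbow)

# A flat plateau of seven near-ceiling lines: only the cap limits the result.
flat <- data.frame(publishedName   = paste0("Line", 1:7),
                   normalizedScore = c(50000, 49900, 49800, 49500,
                                       49000, 48800, 48000))
kept.flat <- top_with_elbow(flat)

kept.elbow$publishedName
nrow(kept.flat)
```

The first case keeps only Line1 and Line2 (Line4 is within 75% of Line3, but the run has already ended); the second keeps five lines because of the cap.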

Quick eyeball of a few top hits:

open3d()
scan_mip(mips = head(hits.top, 20), no.hits = 20)

If you want to be selective — for instance, dropping lines that also label nearby motor neurons you do not want — neuronbridge_avoid() is useful: pass the one bodyid you want in search and a vector of IDs you want to stay clear of in avoid.

favourite <- targets$bodyid[1]
avoid     <- setdiff(targets$bodyid, favourite)

clean <- neuronbridge_avoid(search = favourite, avoid = avoid,
                            threshold = 23000)

Collect the unique candidate line set for the KDRC cross-reference:

candidate.lines <- sort(unique(hits.top$publishedName))
length(candidate.lines)

Step 3: Cross-reference against KDRC KGutProject

The KGutProject is a database of 353 GAL4 (and a handful of split-GAL4) driver lines screened by Lim et al. 2021 for expression across ten regions of the adult gut: Crop, PV (proventriculus), R1–R5 (midgut), MHJ (midgut–hindgut junction), Hindgut/ileum and Rectum. If one of our NeuronBridge hits is listed there, that is a clue that the line, and by extension the abdominal neuron it labels, may project to that gut region.

The neuronbridger package ships a small set of helpers (kdrc_start_session(), kdrc_search_line(), kdrc_line_regions(), kdrc_lookup_lines(), kdrc_close_session()) that drive a headless Chrome session via chromote to run the site’s own search and detail look-ups. KDRC has no public REST API — its AUIquery.do endpoint is gated by proprietary client-side signing — but reusing the page’s own JavaScript works cleanly.

Look up every candidate line

if (!requireNamespace("chromote", quietly = TRUE))
  install.packages("chromote")

# One-shot batch lookup. This drives a headless Chrome; expect roughly
# 8 seconds per line (search) plus ~3 seconds per hit (detail page), so a
# 500-line run is about an hour.
session <- kdrc_start_session()
kdrc    <- kdrc_lookup_lines(candidate.lines, session = session)
kdrc_close_session(session)

# Each hit row has Y/N flags for the ten KDRC regions plus BDSC ID,
# associated gene, and the stable `p_seq` you can paste into
# https://flyinfo.kr/FlexForm_KGut_Detail_Load.do?tmpl=template/E&p_seq=<p_seq>
# to see the expression images.
head(subset(kdrc, kgut_hit == "Y"))
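
The run-time figure in the comment above is simple arithmetic. A rough sketch, taking the approximate per-step timings quoted there as defaults; the helper name and the hit count in the second call are hypothetical:

```r
# Rough wall-clock estimate for a KDRC batch lookup, in minutes.
# search_s and detail_s default to the approximate timings quoted above.
estimate_kdrc_minutes <- function(n_lines, n_hits = 0,
                                  search_s = 8, detail_s = 3) {
  (n_lines * search_s + n_hits * detail_s) / 60
}

est_search_only <- estimate_kdrc_minutes(500)               # searches only
est_with_hits   <- estimate_kdrc_minutes(500, n_hits = 10)  # plus 10 detail pages
round(c(est_search_only, est_with_hits), 1)
```

Because the search step dominates, the estimate barely moves with the number of hits; plan around the number of lines queried.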

Most candidates in our set are Janelia split-GAL4 (SS*, IS*), which KDRC did not screen — those come back as kgut_hit = "N". In the live run for 510 candidate lines we picked up only 6 direct hits (all Gen1 GMR lines), which on its own is thin. Two of those — R15D08 (rut) and R71D08 (dally) — broadly label the midgut + hindgut, so where they overlap our top-5 MANC hits they are still strong leads, but the abdominal EN* efferents get no direct hits at all because their top-5 matches are nearly all split-GAL4 lines.

Expand coverage through split-GAL4 hemidrivers

A split-GAL4 line is an intersection of two enhancer-driven transgenes: an AD (activation domain) half and a DBD (DNA-binding domain) half. Each half is driven by a Gen1 GMR (R*) or VT (VT*) enhancer fragment, so even when a split is absent from KDRC, the GAL4 line built on one of its enhancers may be in there, and any gut expression seen for that enhancer is a hint about what the split is labelling in the abdomen.

split_halves() resolves every SS* / IS* / MB* line in the candidate list to its AD and DBD enhancer codes from the FlyLight Split-GAL4 image-release metadata on S3:

halves <- split_halves(candidate.lines)
head(halves[halves$is_split & !is.na(halves$ad), ])

# Gather every distinct hemidriver enhancer we haven't already queried,
# then re-run the KDRC look-up against them.
extra.lines <- setdiff(
  unique(c(halves$ad, halves$dbd)),
  c(candidate.lines, NA_character_)
)
session  <- kdrc_start_session()
kdrc_ext <- kdrc_lookup_lines(extra.lines, session = session)
kdrc_close_session(session)
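
The setdiff() step is easiest to follow on a toy table. The sketch below assumes the column layout used above (line, is_split, ad, dbd); every line name in it is a made-up placeholder.

```r
# Hypothetical candidate list and hemidriver table, for illustration only.
candidate.toy <- c("SS_1", "SS_2", "GMR_B")
halves.toy <- data.frame(
  line     = c("SS_1",  "SS_2",  "GMR_B"),
  is_split = c(TRUE,    TRUE,    FALSE),
  ad       = c("GMR_A", "GMR_B", NA),
  dbd      = c("VT_C",  "GMR_A", NA),
  stringsAsFactors = FALSE
)

# Distinct hemidriver enhancers, minus anything already queried and minus NA.
extra.toy <- setdiff(
  unique(c(halves.toy$ad, halves.toy$dbd)),
  c(candidate.toy, NA_character_)
)
extra.toy
```

GMR_B is dropped because it was already in the candidate list, and GMR_A appears once even though two splits share it.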

In our live run, split_halves() resolved the AD/DBD codes for 396 of 452 splits (~88%) — the remaining 56 are mostly IS* samples whose metadata predates FlyLight publishing components in its S3 JSON. Of the 561 new hemidriver enhancers we queried, 124 showed up as KDRC hits — vastly more than the six direct matches, and enough to bring the abdominal EN* group into the heatmap.
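
Putting the counts quoted in this section side by side makes the gain explicit. The snippet only restates those numbers as percentages:

```r
# Counts as reported in the text above.
rates <- c(
  splits_resolved = 396 / 452,   # splits with AD/DBD codes found
  direct_hits     = 6   / 510,   # candidate lines found directly in KDRC
  hemidriver_hits = 124 / 561    # hemidriver enhancers found in KDRC
)
pct <- round(100 * rates, 1)
pct
```

So about 1% of candidate lines matched directly, against about 22% of the hemidriver enhancers.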

Join back onto the neuron table

Now we build a long-form KDRC table covering both direct hits and hemidriver hits, then expand every (neuron, line) pair into three possible match routes — the line itself (direct), its AD half, or its DBD half — and join against the KDRC regions:

library(tidyr)   # pivot_longer() is used in to_long() below

REGIONS <- c("Crop","PV","R1","R2","R3","R4","R5","MHJ","Hindgut","Rectum")

to_long <- function(tab) {
  tab |>
    filter(kgut_hit == "Y") |>
    select(publishedName = query_line, all_of(REGIONS)) |>
    pivot_longer(all_of(REGIONS), names_to = "region", values_to = "flag") |>
    filter(flag == "Y") |>
    select(publishedName, region)
}
kdrc_all <- dplyr::bind_rows(to_long(kdrc), to_long(kdrc_ext)) |>
  distinct()

# For each NeuronBridge top-5 pair, emit one row per possible match route.
direct <- hits.top |> mutate(match_key = publishedName, match_kind = "direct")
ad  <- hits.top |>
  left_join(halves |> select(publishedName = line, ad), by = "publishedName") |>
  filter(!is.na(ad)) |>
  mutate(match_key = ad, match_kind = "AD") |>
  select(-ad)
dbd <- hits.top |>
  left_join(halves |> select(publishedName = line, dbd), by = "publishedName") |>
  filter(!is.na(dbd)) |>
  mutate(match_key = dbd, match_kind = "DBD") |>
  select(-dbd)

expanded <- dplyr::bind_rows(direct, ad, dbd)

peripheral <- expanded |>
  inner_join(kdrc_all, by = c("match_key" = "publishedName")) |>
  select(bodyid, type, group, publishedName, match_key, match_kind,
         normalizedScore, region)

peripheral

Each row is a (MANC neuron, GAL4 line, match route, gut region) tuple: the GAL4 line is among the neuron’s top-5 colour-depth MIP matches, and either that line itself or one of its hemidrivers is documented in KDRC as expressed in that region. Each row is therefore a testable hypothesis that this MNad* motor or EN* endocrine cell projects to that part of the gut. Hemidriver matches are indirect evidence, so always inspect the relevant expression images before committing to an experiment.
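
The three-route expansion can be checked on a few rows of made-up data. This is a simplified sketch of the same idea, not the code above verbatim: all names are placeholders, and the AD/DBD routes use inner_join() with a rename rather than left_join() plus filter().

```r
library(dplyr)

# Hypothetical inputs: two (neuron, line) pairs, one split with known halves,
# and a long-form KDRC table of (line, region) flags.
hits.toy   <- data.frame(bodyid = c("101", "102"),
                         publishedName = c("SS_1", "GMR_A"),
                         normalizedScore = c(40000, 35000))
halves.toy <- data.frame(line = "SS_1", ad = "GMR_B", dbd = "VT_C")
kdrc.toy   <- data.frame(publishedName = c("GMR_A", "GMR_B", "GMR_B"),
                         region = c("R4", "Hindgut", "Rectum"))

# One row per possible match route.
direct <- hits.toy |>
  mutate(match_key = publishedName, match_kind = "direct")
ad <- hits.toy |>
  inner_join(halves.toy |> select(publishedName = line, match_key = ad),
             by = "publishedName") |>
  mutate(match_kind = "AD")
dbd <- hits.toy |>
  inner_join(halves.toy |> select(publishedName = line, match_key = dbd),
             by = "publishedName") |>
  mutate(match_kind = "DBD")

# Keep only routes whose key is flagged for at least one gut region.
peripheral.toy <- bind_rows(direct, ad, dbd) |>
  inner_join(kdrc.toy, by = c("match_key" = "publishedName"))
peripheral.toy
```

Neuron 102 reaches R4 through a direct match, while neuron 101 reaches Hindgut and Rectum only through the AD half of its split; the DBD route drops out because VT_C has no KDRC entry.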

Summary heatmap and line-hit CSV

Two useful summary artefacts:

- a cell-type × gut-region heatmap showing, per MANC cell type, how many rows of peripheral (one per neuron, line and match route) land on each KDRC region;
- a flat CSV suitable for sharing: one row per (cell type, candidate line, match route) with its best colour-depth score and the KDRC regions that line is flagged for.

library(tidyr)

# Per cell-type × region count — using the `peripheral` table we built
# above (direct + AD + DBD match routes combined).
mat.df <- peripheral |>
  count(type, region, name = "n") |>
  complete(type   = sort(unique(hits.top$type)),
           region = REGIONS, fill = list(n = 0))

mat <- mat.df |>
  pivot_wider(names_from = region, values_from = n) |>
  as.data.frame()
rownames(mat) <- mat$type
mat <- as.matrix(mat[, REGIONS, drop = FALSE])

# Order rows by group, then by type name, and drop all-zero rows so the
# heatmap only shows cell types that have at least one KDRC hit. image()
# draws row 1 at the bottom, so EN* rows sit below the MNad* rows.
row_group <- ifelse(startsWith(rownames(mat), "MNad"), "MNad", "EN")
ord       <- order(row_group, rownames(mat))  # compute once, before reordering
mat       <- mat[ord, , drop = FALSE]
row_group <- row_group[ord]
keep      <- rowSums(mat) > 0
mat       <- mat[keep, , drop = FALSE]
row_group <- row_group[keep]

# Render: base-R image() plus text() so the counts are drawn inside cells.
max.n <- max(mat, 1)
pal   <- colorRampPalette(c("#f7fbff", "#6baed6", "#08306b"))(max.n + 1)
par(mar = c(4, 7, 4, 6))
image(seq_len(ncol(mat)), seq_len(nrow(mat)), t(mat),
      col = pal, axes = FALSE, xlab = "", ylab = "",
      main = "Abdominal MANC types × KDRC gut regions\n(direct + AD/DBD hemidriver matches)")
axis(1, at = seq_len(ncol(mat)), labels = colnames(mat), las = 1)
axis(2, at = seq_len(nrow(mat)), labels = rownames(mat), las = 1,
     cex.axis = 0.7)
for (i in seq_along(row_group))
  rect(ncol(mat) + 0.6, i - 0.5, ncol(mat) + 0.9, i + 0.5,
       col = ifelse(row_group[i] == "MNad", "steelblue", "orange"),
       border = NA, xpd = TRUE)
legend("bottomright", inset = c(-0.18, 0), xpd = TRUE,
       legend = c("MNad motor", "abdominal EN"),
       fill   = c("steelblue", "orange"), cex = 0.75, bty = "n",
       title  = "group")
for (i in seq_len(nrow(mat))) for (j in seq_len(ncol(mat))) {
  v <- mat[i, j]
  text(j, i, v,
       col = if (v > max.n / 2) "white" else "black",
       cex = 0.75)
}
box()
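
If you only need the count matrix and not the tidy intermediate, base R's table() on factors with explicit levels gives a comparable zero-filled matrix. A sketch on made-up rows (the type names are placeholders):

```r
REGIONS <- c("Crop","PV","R1","R2","R3","R4","R5","MHJ","Hindgut","Rectum")

# Hypothetical (type, region) rows standing in for the peripheral table.
toy <- data.frame(
  type   = c("MNad_x", "MNad_x", "EN_y"),
  region = c("R4", "R4", "Hindgut")
)

# Fixing the factor levels keeps every region as a column, even with zero counts.
mat.toy <- table(
  factor(toy$type,   levels = sort(unique(toy$type))),
  factor(toy$region, levels = REGIONS)
)
mat.toy <- unclass(mat.toy)   # plain integer matrix with dimnames
mat.toy
```

To also keep cell types that have no KDRC hit at all, pass the full set of types as the first factor's levels.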

The CSV: one row per (cell type, candidate line, match route), with the best MIP score across neurons of that type, which enhancer mediated the match, and the comma-joined list of gut regions the line is flagged for. Written into ~/Downloads/ so it’s easy to share.

csv_out <- peripheral |>
  mutate(normalizedScore = as.numeric(normalizedScore)) |>
  group_by(type, publishedName, match_key, match_kind) |>
  summarise(
    group            = dplyr::first(group),
    best_score       = max(normalizedScore, na.rm = TRUE),
    n_bodies_of_type = dplyr::n_distinct(bodyid),
    kdrc_regions     = paste(sort(unique(region)), collapse = ", "),
    .groups = "drop"
  ) |>
  arrange(group, type, desc(best_score))

write.csv(csv_out,
          file.path(path.expand("~/Downloads"),
                    "abdominal_peripheral_line_hits.csv"),
          row.names = FALSE)

Caveats

NeuronBridge data version. The package’s default v2_1_1 is hemibrain-only; MANC MIPs are in v3_9_0 onwards. Always pass version = "v3_9_0" for a VNC workflow.
normalizedScore ceiling. Scores are clipped at 50000. For well-matched MANC neurons the top handful of VNC lines often sit at or near the ceiling and are essentially tied — inspect the MIPs visually rather than trusting rank order blindly.
Match quality. Always eyeball the MIPs via scan_mip() and, ideally, the 3-D stack for the GAL4 line before committing to an experiment.
Line contamination. A split-GAL4 that matches a single motor neuron often also labels dozens of others. neuronbridge_avoid() and neuronbridge_line_contents() help you find cleaner drivers.
KGutProject scope. KGutProject is a gut screen, so absence from the database means “not tested for gut expression”, not “does not label gut”. Other peripheral targets (reproductive tract, heart, fat body, epidermis) are not covered here and would need a different resource.
Hemidriver matches are indirect. An AD or DBD hemidriver that labels part of the gut on its own (as an enhancer-GAL4) tells you the enhancer fragment is active in that tissue. The split-GAL4 built from that half will only label the intersection with the other half, so the split may not actually mark the gut even when one of its halves does. Treat hemidriver-mediated KDRC hits as hypotheses to confirm in the split itself, not as direct evidence.
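
For the score-ceiling caveat, one way to flag neurons whose top hits are effectively tied is to count hits close to 50000. This is a sketch on made-up data; the 2% tolerance is an arbitrary assumption, not a NeuronBridge convention.

```r
ceiling_score <- 50000   # clipping value stated above
tol           <- 0.02    # assumption: "near the ceiling" means within 2%

# Hypothetical per-neuron hit scores.
toy <- data.frame(
  bodyid          = c("101", "101", "101", "102", "102"),
  normalizedScore = c(50000, 49500, 30000, 42000, 20000)
)

# Count near-ceiling hits per neuron; two or more means rank order among
# them carries little information and the MIPs need visual inspection.
near   <- toy$normalizedScore >= ceiling_score * (1 - tol)
n_near <- tapply(near, toy$bodyid, sum)
tied   <- names(n_near)[n_near >= 2]
tied
```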

Citations

MANC connectome / malevnc: Marin, E.C., Morris, B.J., Stürner, T., Champion, A.S., Krzeminski, D., Badalamente, G., Gkantia, M., Dunne, C.R., Eichler, K., et al. (2023). Systematic annotation of a complete adult male Drosophila nerve cord connectome reveals principles of functional organisation. eLife.
MANC motor neurons (MNad etc.): Cheong, H.S.J., Eichler, K., Stürner, T., Asinof, S.K., Champion, A.S., Marin, E.C., Oram, T.B., Sumathipala, M., et al. (2023). Transforming descending input into behavior: The organization of premotor circuits in the Drosophila Male Adult Nerve Cord connectome. eLife.
NeuronBridge: Clements, J., Goina, C., Kazimiers, A., Otsuna, H., Svirskas, R., Rokicki, K. (2020). NeuronBridge Codebase.
KGutProject / KDRC: Lim, S.Y., You, H., Lee, J., Lee, T.H., Jang, Y., Lee, S., Nicolas, P., Park, B.S., Namkoong, S., Yoon, J., Jung, H., Ahn, C., Shim, Y.-H., Park, D., Kwon, J.Y., Lee, W.-J., Kim, Y.-J., Suh, G.S.B. (2021). Identification and characterization of GAL4 drivers that mark distinct cell types and regions in the Drosophila adult gut. J. Neurogenet. 35(1):33–44.