vignettes/abdominal_peripheral_targets.Rmd
This vignette chains three resources to go from connectome cell types to candidate genetic reagents and, finally, to a peripheral tissue where those reagents might label something useful:
The MANC (Male Adult Nerve Cord) connectome, accessed via the malevnc R
package. We pick up two classes of neuron that leave the central nervous
system through abdominal nerves:
Motor neurons whose type starts with MNad (motor neuron, abdominal).
Neurons whose type starts with EN (efferent neuron) and whose soma
sits in an abdominal neuromere (A1–A8). Note
the prefix is EN, not ENN.
NeuronBridge, accessed via neuronbridger, to find GAL4 and split-GAL4
lines that match those EM neurons on colour-depth MIP search. We use
NeuronBridge data release v3_9_0, which is
the first release to include MANC MIPs. Earlier releases (including the
package default, v2_1_1) only cover the hemibrain brain
dataset.
The KDRC KGutProject database of GAL4-line expression in the adult Drosophila gut (Lim et al. 2021). For any NeuronBridge hit that also appears in KGutProject, we learn which gut region the line marks — a strong clue that the matched abdominal neuron projects to that tissue.
Very few motor or endocrine neurons in the nerve cord have been pinned to a specific peripheral organ, so this pipeline is meant as a way to generate hypotheses you can follow up experimentally.
A neuPrint API token (see ?neuprintr::neuprint_login). Put it in
.Renviron as neuprint_token=....
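Before running anything that talks to neuPrint, it can help to confirm that R actually sees the token. The sketch below assumes only the environment-variable name neuprint_token given above; has_neuprint_token() is a hypothetical helper written for this illustration, not part of any package.

```r
# Hypothetical helper: TRUE if a single non-empty token string is available.
has_neuprint_token <- function(token = Sys.getenv("neuprint_token")) {
  is.character(token) && length(token) == 1L && nzchar(token)
}

if (!has_neuprint_token()) {
  message("neuprint_token is not set; add it to .Renviron and restart R.")
}
```

Sys.getenv() returns an empty string for an unset variable, which is why the check uses nzchar() rather than is.null().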
if (!require("remotes")) install.packages("remotes")
remotes::install_github("natverse/neuronbridger")
remotes::install_github("natverse/malevnc")
remotes::install_github("natverse/neuprintr")
manc_neuprint() opens a connection against the public
MANC release (manc:v1.2.3 at time of writing).
manc_neuprint_meta() returns one row per body, with columns
including bodyid, type, class,
somaSide, somaNeuromere (values
T1–T3, A1–A10),
entryNerve, exitNerve and
predictedNt.
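The selection used in this vignette boils down to a prefix match on type plus a neuromere filter. A minimal offline sketch on a made-up table, in base R: the body IDs and type names are invented for illustration, and only the column names come from the description above.

```r
# Toy stand-in for manc_neuprint_meta() output; all values are invented.
meta <- data.frame(
  bodyid        = c(1, 2, 3, 4, 5),
  type          = c("MNad01", "EN00B001", "EN00B002", "ENXX", "IN01A001"),
  somaNeuromere = c("A3", "A2", "T1", "A9", "A1"),
  stringsAsFactors = FALSE
)

abdominal <- paste0("A", 1:8)  # "A1" ... "A8"; A9 and A10 are left out

# Prefix match on type, mirroring the "MNad.*" and "EN.*" queries.
mn.ad <- meta[grepl("^MNad", meta$type), ]
en.ab <- meta[grepl("^EN", meta$type) & meta$somaNeuromere %in% abdominal, ]

mn.ad$bodyid  # 1
en.ab$bodyid  # 2
```

Body 3 is dropped because its soma is thoracic, and body 4 because A9 lies outside the A1–A8 range used here.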
library(malevnc)
library(dplyr)

conn <- manc_neuprint()
# Motor neurons whose cell type starts "MNad" — these exit through abdominal
# nerves (AbN1–AbN4) to innervate abdominal muscles.
mn.ad <- manc_neuprint_meta("/type:MNad.*", conn = conn)
# Efferent neurons (EN*) with somata in an abdominal segment. These include
# neurosecretory / endocrine cells that project out of the CNS through
# peripheral nerves.
en.ab <- manc_neuprint_meta("/type:EN.*", conn = conn) |>
  filter(somaNeuromere %in% paste0("A", 1:8))
targets <- bind_rows(
  mn.ad |> mutate(group = "MNad motor"),
  en.ab |> mutate(group = "abdominal EN")
) |>
  mutate(bodyid = as.character(bodyid)) |>
  select(bodyid, type, group, somaNeuromere, somaSide, exitNerve, predictedNt)
targets
# MANC has 229 MNad* motor neurons and 44 abdominal EN* efferents
# at the v1.2.3 release, 273 targets in total.
As a sanity check, plot a handful:
library(nat)
nopen3d()
neurons <- manc_read_neurons(targets$bodyid[1:10], conn = conn)
plot3d(neurons, soma = 1000, lwd = 2)
neuronbridge_info() queries the per-body JSON and
returns one row per MIP that NeuronBridge holds for that body. A single
MANC neuron typically shows up in three libraries (MANC, VNC, male-CNS);
we want the MANC one, because that is the MIP whose hits were
precomputed against the FlyLight VNC libraries.
library(neuronbridger)
NB_VERSION <- "v3_9_0"  # first NeuronBridge release with MANC MIPs
mip.map <- list()
for (bid in targets$bodyid) {
  info <- try(neuronbridge_info(bid, dataset = "by_body",
                                version = NB_VERSION), silent = TRUE)
  # Skip bodies whose lookup failed or returned nothing.
  if (inherits(info, "try-error") || is.null(info) || !nrow(info)) next
  manc <- info[grepl("MANC", info$libraryName, ignore.case = TRUE), ]
  if (!nrow(manc)) next
  manc$bodyid <- bid
  # Keep the first MANC-library MIP for this body.
  mip.map[[bid]] <- manc[1, c("bodyid", "nb.id", "libraryName")]
}
mip.map <- do.call(rbind, mip.map)
nrow(mip.map) # ~ all 273 targets resolve
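The loop above follows a common pattern: wrap each remote call in try(), skip failures and empty results, and bind whatever survives. The sketch below reproduces that pattern offline. fake_info() is a hypothetical stand-in for neuronbridge_info(), and the IDs and library names it returns are invented.

```r
# Hypothetical stand-in for a per-body lookup: errors for one ID,
# returns an empty table for another, and two rows otherwise.
fake_info <- function(bid) {
  if (bid == "bad") stop("lookup failed")
  if (bid == "empty") {
    return(data.frame(nb.id = character(0), libraryName = character(0)))
  }
  data.frame(nb.id = paste0(bid, c("_a", "_b")),
             libraryName = c("toy_MANC_library", "toy_other_library"),
             stringsAsFactors = FALSE)
}

ids <- c("101", "bad", "empty", "202")
out <- list()
for (bid in ids) {
  info <- try(fake_info(bid), silent = TRUE)
  if (inherits(info, "try-error") || is.null(info) || !nrow(info)) next
  keep <- info[grepl("MANC", info$libraryName, ignore.case = TRUE), ]
  if (!nrow(keep)) next
  keep$bodyid <- bid
  out[[bid]] <- keep[1, c("bodyid", "nb.id", "libraryName")]
}
out <- do.call(rbind, out)
out$bodyid  # "101" "202"
```

Indexing the result list by ID rather than by position means skipped bodies leave no gaps to clean up before rbind().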
hits <- list()
for (i in seq_len(nrow(mip.map))) {
  h <- try(neuronbridge_hits(mip.map$nb.id[i], version = NB_VERSION),
           silent = TRUE)
  if (inherits(h, "try-error") || is.null(h) || !nrow(h)) next
  h$bodyid <- mip.map$bodyid[i]
  # Best MIP per line for this one neuron. Coerce to numeric first so the
  # sort is numeric even if the scores arrive as character strings.
  h <- h[order(as.numeric(h$normalizedScore), decreasing = TRUE), , drop = FALSE]
  h <- h[!duplicated(h$publishedName), , drop = FALSE]
  hits[[i]] <- h
}
hits <- do.call(plyr::rbind.fill, hits)
Restrict to FlyLight VNC LM libraries (brain libraries are not informative for a motor or endocrine neuron):
hits.vnc <- hits |>
  filter(grepl("FlyLight|Split", libraryName, ignore.case = TRUE),
         !grepl("Brain", libraryName, ignore.case = TRUE)) |>
  mutate(normalizedScore = as.numeric(normalizedScore))
# Annotate back with the target-neuron side of the join.
hits.vnc <- merge(hits.vnc,
                  targets[, c("bodyid", "type", "group",
                              "somaNeuromere", "exitNerve")],
                  by = "bodyid", all.x = TRUE)
normalizedScore in NeuronBridge is clipped at
50000, and for a well-targeted neuron the top few lines
often cluster within ~10% of each other, so a fixed absolute score
threshold (23000, as in the split-design vignette) is the
wrong tool. Instead, for each target neuron we:
Rank its lines by normalizedScore (best MIP per
line, from step 2b).
Keep the top line, then walk down the ranking while each score is at least 75% of the one before it and at least 20% of the top score.
Cap the result at five lines.
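A worked numeric example may make the elbow rule concrete. The thresholds below (a step ratio of at least 0.75, at least 0.20 of the top score, and a cap of 5) are the defaults of top_with_elbow() in this vignette; the score vector itself is invented.

```r
scores <- c(50000, 48000, 30000, 29000, 9000)  # already sorted, best first
drop_frac  <- 0.75
floor_frac <- 0.20
cap        <- 5

step_ratio <- scores[-1] / scores[-length(scores)]  # 0.96, 0.625, 0.967, 0.310
top_ratio  <- scores[-1] / scores[1]                # 0.96, 0.60, 0.58, 0.18

# A line survives only if every step so far passed the drop test
# and the line itself is above the floor.
keep <- c(TRUE, cumprod(step_ratio >= drop_frac) == 1 & top_ratio >= floor_frac)
kept <- head(scores[keep], cap)
kept  # 50000 48000
```

The third score falls to 62.5% of the second, so the walk stops there even though 30000 is well above the 20% floor. The cumprod() is what makes the first failed step terminal.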
top_with_elbow <- function(df, cap = 5,
                           drop_frac = 0.75,
                           floor_frac = 0.20) {
  df <- df[order(df$normalizedScore, decreasing = TRUE), , drop = FALSE]
  if (!nrow(df)) return(df)
  keep <- TRUE
  if (nrow(df) > 1) {
    ratios <- df$normalizedScore[-1] / df$normalizedScore[-nrow(df)]
    keep <- c(TRUE,
              cumprod(ratios >= drop_frac) == 1 &
                df$normalizedScore[-1] / df$normalizedScore[1] >= floor_frac)
  }
  head(df[keep, , drop = FALSE], cap)
}
hits.top <- hits.vnc |>
  group_by(bodyid) |>
  group_modify(~ top_with_elbow(.x)) |>
  ungroup()
# Lines kept per neuron:
table(table(hits.top$bodyid))
# In practice, most neurons hit the cap of 5 (top-5 scores are close together)
# and only a handful collapse to 1 line.
Quick eyeball of a few top hits:
If you want to be selective — for instance, dropping lines that also
label nearby motor neurons you do not want —
neuronbridge_avoid() is useful: pass the one
bodyid you want in search and a vector of IDs
you want to stay clear of in avoid.
favourite <- targets$bodyid[1]
avoid <- setdiff(targets$bodyid, favourite)
clean <- neuronbridge_avoid(search = favourite, avoid = avoid,
                            threshold = 23000)
Collect the unique candidate line set for the KDRC cross-reference:
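The chunk that builds candidate.lines is not shown in this rendering. A minimal sketch, assuming it is simply the distinct publishedName values in hits.top; the toy table below stands in for the real one, and its line names are invented.

```r
# Toy stand-in for hits.top; line names are invented.
hits.top <- data.frame(
  bodyid        = c("101", "101", "202", "202"),
  publishedName = c("SS00001", "R10A01", "SS00001", NA),
  stringsAsFactors = FALSE
)

# Assumed definition: unique, non-missing line names, sorted for stable output.
nm <- hits.top$publishedName
candidate.lines <- sort(unique(nm[!is.na(nm)]))
candidate.lines  # "R10A01" "SS00001"
```

De-duplicating here matters because the KDRC look-up is slow, and a line that matches several neurons should only be queried once.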
The KGutProject is a database of 353 GAL4 (and a handful of split-GAL4) driver lines screened by Lim et al. 2021 for expression across ten regions of the adult gut: Crop, PV (proventriculus), R1–R5 (midgut), MHJ (midgut–hindgut junction), Hindgut/ileum and Rectum. If one of our NeuronBridge hits is listed there, that is a direct clue that the line — and by extension the abdominal neuron it labels — may project to that gut region.
The neuronbridger package ships a small set of helpers
(kdrc_start_session(), kdrc_search_line(),
kdrc_line_regions(), kdrc_lookup_lines(),
kdrc_close_session()) that drive a headless Chrome session
via chromote to
run the site’s own search and detail look-ups. KDRC has no public REST
API — its AUIquery.do endpoint is gated by proprietary
client-side signing — but reusing the page’s own JavaScript works
cleanly.
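Because every look-up goes through a real browser, batch runs are slow, and a back-of-envelope estimate is worth doing first. The default timings below are the approximate figures quoted in this vignette's look-up chunk (about 8 s per search, about 3 s per detail page); estimate_kdrc_minutes() is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: rough wall-clock minutes for a batch look-up.
estimate_kdrc_minutes <- function(n_lines, n_hits = 0,
                                  sec_per_search = 8, sec_per_detail = 3) {
  (n_lines * sec_per_search + n_hits * sec_per_detail) / 60
}

round(estimate_kdrc_minutes(500), 1)              # 66.7
round(estimate_kdrc_minutes(500, n_hits = 50), 1) # 69.2
```

The search step dominates, so trimming the candidate list saves far more time than anything done on the detail pages.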
if (!requireNamespace("chromote", quietly = TRUE))
  install.packages("chromote")
# One-shot batch lookup. This drives a headless Chrome; expect roughly
# 8 seconds per line (search) plus ~3 seconds per hit (detail page), so a
# 500-line run is about an hour.
session <- kdrc_start_session()
kdrc <- kdrc_lookup_lines(candidate.lines, session = session)
kdrc_close_session(session)
# Each hit row has Y/N flags for the ten KDRC regions plus BDSC ID,
# associated gene, and the stable `p_seq` you can paste into
# https://flyinfo.kr/FlexForm_KGut_Detail_Load.do?tmpl=template/E&p_seq=<p_seq>
# to see the expression images.
head(subset(kdrc, kgut_hit == "Y"))
Most candidates in our set are Janelia split-GAL4 (SS*,
IS*), which KDRC did not screen — those come back as
kgut_hit = "N". In the live run for 510 candidate lines we
picked up only 6 direct hits (all Gen1 GMR lines),
which on its own is thin. Two of those — R15D08 (rut) and
R71D08 (dally) — broadly label the midgut + hindgut, so
where they overlap our top-5 MANC hits they are still strong leads, but
the abdominal EN* efferents get no direct hits at all
because their top-5 matches are nearly all split-GAL4 lines.
A split-GAL4 line is an intersection of two enhancer-driven
transgenes: an AD (activation domain) half and a
DBD (DNA-binding domain) half. Each half is itself a
Gen1 GMR (R*) or VT (VT*) enhancer, so even
when a split is absent from KDRC, one of its halves may be in there —
and any gut expression seen in that half is a hint about what the split
is labelling in the abdomen.
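To make the idea concrete, here is a toy table in the shape this vignette's code expects from split_halves() (columns line, is_split, ad and dbd, as used in the chunks that follow). The line and enhancer names are invented. It shows how the hemidriver enhancers not already in the candidate set can be collected.

```r
candidate.lines <- c("SS00001", "SS00002", "R10A01")

# Toy stand-in for split_halves() output; NA marks an unresolved half.
halves <- data.frame(
  line     = candidate.lines,
  is_split = c(TRUE, TRUE, FALSE),
  ad       = c("R10A01", "VT000001", NA),
  dbd      = c("R20B02", NA, NA),
  stringsAsFactors = FALSE
)

# Enhancers seen as a half but not yet queried as a line in their own right.
extra.lines <- setdiff(unique(c(halves$ad, halves$dbd)),
                       c(candidate.lines, NA_character_))
extra.lines  # "VT000001" "R20B02"
```

R10A01 is dropped because it is already a candidate line, and including NA_character_ in the second argument removes the unresolved halves in the same step.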
split_halves() resolves every
SS* / IS* / MB* line in the candidate list to its AD and
DBD enhancer codes from the FlyLight Split-GAL4 image-release metadata
on S3:
halves <- split_halves(candidate.lines)
head(halves[halves$is_split & !is.na(halves$ad), ])
# Gather every distinct hemidriver enhancer we haven't already queried,
# then re-run the KDRC look-up against them.
extra.lines <- setdiff(
  unique(c(halves$ad, halves$dbd)),
  c(candidate.lines, NA_character_)
)
session <- kdrc_start_session()
kdrc_ext <- kdrc_lookup_lines(extra.lines, session = session)
kdrc_close_session(session)
In our live run, split_halves() resolved the AD/DBD
codes for 396 of 452 splits (~88%) — the remaining 56 are mostly
IS* samples whose metadata predates FlyLight publishing
components in its S3 JSON. Of the 561 new hemidriver enhancers we
queried, 124 showed up as KDRC hits — vastly more than
the six direct matches, and enough to bring the abdominal
EN* group into the heatmap.
Now we build a long-form KDRC table covering both direct hits and
hemidriver hits, then expand every (neuron, line) pair into three
possible match routes — the line itself (direct), its AD
half, or its DBD half — and join against the KDRC regions:
library(tidyr)  # pivot_longer()
REGIONS <- c("Crop","PV","R1","R2","R3","R4","R5","MHJ","Hindgut","Rectum")
# Wide Y/N flags -> one row per (line, region) where the flag is "Y".
to_long <- function(tab) {
  tab |>
    filter(kgut_hit == "Y") |>
    select(publishedName = query_line, all_of(REGIONS)) |>
    pivot_longer(all_of(REGIONS), names_to = "region", values_to = "flag") |>
    filter(flag == "Y") |>
    select(publishedName, region)
}
kdrc_all <- dplyr::bind_rows(to_long(kdrc), to_long(kdrc_ext)) |>
  distinct()
# For each NeuronBridge top-5 pair, emit one row per possible match route.
direct <- hits.top |> mutate(match_key = publishedName, match_kind = "direct")
ad <- hits.top |>
  left_join(halves |> select(publishedName = line, ad), by = "publishedName") |>
  filter(!is.na(ad)) |>
  mutate(match_key = ad, match_kind = "AD") |>
  select(-ad)
dbd <- hits.top |>
  left_join(halves |> select(publishedName = line, dbd), by = "publishedName") |>
  filter(!is.na(dbd)) |>
  mutate(match_key = dbd, match_kind = "DBD") |>
  select(-dbd)
expanded <- dplyr::bind_rows(direct, ad, dbd)
peripheral <- expanded |>
  inner_join(kdrc_all, by = c("match_key" = "publishedName")) |>
  select(bodyid, type, group, publishedName, match_key, match_kind,
         normalizedScore, region)
peripheral
Each row is a (MANC neuron, GAL4 line, match route, gut region)
tuple: the line is among the neuron’s top-5 matches on colour-depth MIP
search, and either that line itself or one of its hemidrivers is
documented in KDRC as expressed in that region. Each row is therefore a
testable hypothesis that this MNad* motor or EN* efferent cell projects to
that part of the gut. Hemidriver matches are indirect evidence, so
always inspect the relevant expression images before committing to an
experiment.
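The three match routes are easier to follow on a toy example. The sketch below redoes the expansion and join in base R on invented data (every line, enhancer and body ID is made up); it illustrates the logic and is not a replacement for the dplyr pipeline in this vignette.

```r
# Toy inputs; every name below is invented.
hits.top <- data.frame(bodyid = c("101", "202"),
                       publishedName = c("SS00001", "R10A01"),
                       stringsAsFactors = FALSE)
halves <- data.frame(line = "SS00001", ad = "VT000001", dbd = "R20B02",
                     stringsAsFactors = FALSE)
kdrc_all <- data.frame(publishedName = c("R10A01", "R20B02", "R20B02"),
                       region = c("R4", "Hindgut", "Rectum"),
                       stringsAsFactors = FALSE)

# One row per (hit, route): the line itself, its AD half, its DBD half.
route <- function(kind, key) {
  out <- hits.top
  out$match_kind <- kind
  out$match_key  <- key
  out[!is.na(key), ]   # drop hits that have no enhancer for this route
}
idx <- match(hits.top$publishedName, halves$line)  # NA for non-split lines
expanded <- rbind(route("direct", hits.top$publishedName),
                  route("AD",     halves$ad[idx]),
                  route("DBD",    halves$dbd[idx]))

# Inner join on the enhancer that mediates the match.
peripheral <- merge(expanded, kdrc_all,
                    by.x = "match_key", by.y = "publishedName")
peripheral[, c("bodyid", "publishedName", "match_kind", "region")]
```

Here body 101 reaches Hindgut and Rectum only through the DBD half of its split line, while body 202 reaches R4 through a direct match. Keeping match_kind in the output lets you weigh the two kinds of evidence differently.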
Two useful summary artefacts:
library(tidyr)
# Per cell-type × region count — using the `peripheral` table we built
# above (direct + AD + DBD match routes combined).
mat.df <- peripheral |>
  count(type, region, name = "n") |>
  complete(type = sort(unique(hits.top$type)),
           region = REGIONS, fill = list(n = 0))
mat <- mat.df |>
  pivot_wider(names_from = region, values_from = n) |>
  as.data.frame()
rownames(mat) <- mat$type
mat <- as.matrix(mat[, REGIONS, drop = FALSE])
# Order rows by group, then type. "EN" sorts before "MNad", and image()
# draws row 1 at the bottom, so MNad* ends up at the top of the plot.
# Drop all-zero rows so the heatmap only shows cell types that have at
# least one KDRC hit.
row_group <- ifelse(startsWith(rownames(mat), "MNad"), "MNad", "EN")
# Compute the permutation once, before mat is reordered, and apply it to both.
ord <- order(row_group, rownames(mat))
mat <- mat[ord, , drop = FALSE]
row_group <- row_group[ord]
keep <- rowSums(mat) > 0
mat <- mat[keep, , drop = FALSE]
row_group <- row_group[keep]
# Render: base-R image() plus text() so the counts are drawn inside cells.
max.n <- max(mat, 1)
pal <- colorRampPalette(c("#f7fbff", "#6baed6", "#08306b"))(max.n + 1)
par(mar = c(4, 7, 4, 6))
image(seq_len(ncol(mat)), seq_len(nrow(mat)), t(mat),
      col = pal, axes = FALSE, xlab = "", ylab = "",
      main = "Abdominal MANC types × KDRC gut regions\n(direct + AD/DBD hemidriver matches)")
axis(1, at = seq_len(ncol(mat)), labels = colnames(mat), las = 1)
axis(2, at = seq_len(nrow(mat)), labels = rownames(mat), las = 1,
     cex.axis = 0.7)
for (i in seq_along(row_group))
  rect(ncol(mat) + 0.6, i - 0.5, ncol(mat) + 0.9, i + 0.5,
       col = ifelse(row_group[i] == "MNad", "steelblue", "orange"),
       border = NA, xpd = TRUE)
legend("bottomright", inset = c(-0.18, 0), xpd = TRUE,
       legend = c("MNad motor", "abdominal EN"),
       fill = c("steelblue", "orange"), cex = 0.75, bty = "n",
       title = "group")
for (i in seq_len(nrow(mat))) for (j in seq_len(ncol(mat))) {
  v <- mat[i, j]
  text(j, i, v,
       col = if (v > max.n / 2) "white" else "black",
       cex = 0.75)
}
box()
The CSV: one row per (cell type, candidate line, match route), with
the best MIP score across neurons of that type, which enhancer mediated
the match, and the comma-joined list of gut regions the line is flagged
for. Written into ~/Downloads/ so it’s easy to share.
csv_out <- peripheral |>
  mutate(normalizedScore = as.numeric(normalizedScore)) |>
  group_by(type, publishedName, match_key, match_kind) |>
  summarise(
    group = dplyr::first(group),
    best_score = max(normalizedScore, na.rm = TRUE),
    n_bodies_of_type = dplyr::n_distinct(bodyid),
    kdrc_regions = paste(sort(unique(region)), collapse = ", "),
    .groups = "drop"
  ) |>
  arrange(group, type, desc(best_score))
write.csv(csv_out,
          file.path(path.expand("~/Downloads"),
                    "abdominal_peripheral_line_hits.csv"),
          row.names = FALSE)
v2_1_1 is hemibrain-only; MANC MIPs are in
v3_9_0 onwards. Always pass version = "v3_9_0"
for a VNC workflow.
normalizedScore ceiling. Scores are
clipped at 50000. For well-matched MANC neurons the top handful of VNC
lines often sit at or near the ceiling and are essentially tied, so
inspect the MIPs visually rather than trusting rank order blindly.
Check candidate matches with scan_mip() and, ideally, the 3-D stack for the GAL4 line
before committing to an experiment.
neuronbridge_avoid() and
neuronbridge_line_contents() help you find cleaner
drivers.
malevnc: Marin,
E.C., Morris, B.J., Stürner, T., Champion, A.S., Krzeminski, D.,
Badalamente, G., Gkantia, M., Dunne, C.R., Eichler, K., et al. (2023).
Systematic annotation of a complete adult male Drosophila nerve cord
connectome reveals principles of functional organisation.
eLife.