Match a published GAL4 line to its EM cell (AstA-SEZ)

Goal

Find the FlyWire-FAFB-v783 cell matched by a published, weakly-labelling split-GAL4 — the kind of cell where the line says “this is the neuron” but the LM stack alone is not specific enough to be sure.

The example here is SS32423 (Sterne et al. 2021, aMulet), reported to label an Allatostatin-A (AstA) neuron of the subesophageal zone (SEZ). Both of its hemidrivers — AD VT019900 and DBD R65D06 — lie in the AstA locus on 3R. The canonical “AstA-GAL4” tile R65D05 (Hergarden et al. 2012) is the chromosomal neighbour of R65D06, and together with R65D07 and a few flanking tiles forms a natural panel of AstA-locus drivers. The FAFB-v783 cell that matches every panel member is the strong candidate.

The pipeline chains four filters to produce a short ranked list:

NeuronBridge colour-depth match against SS32423 + the sibling AstA-locus drivers (release v3_9_0);
soma in an SEZ neuropil (GNG, SAD, AMMC, or FLA; Step 3 widens this to the broad periesophageal set);
top ~10% of soma_dcv_density (at or above the 90th percentile), the dense-core-vesicle proxy for peptidergic identity, consistent with AstA secretion;
visual agreement with the SS32423 MIP.

Outputs: a ranked CSV (in inst/extdata/asta_sez/) and a side-by-side figure of the SS32423 MIP and the surviving candidate(s). A reproducer sits at inst/scripts/run_asta_sez.R.

Prerequisites

if (!require("remotes")) install.packages("remotes")
remotes::install_github("natverse/neuronbridger")
remotes::install_github("natverse/fafbseg")
remotes::install_github("natverse/nat.flybrains")
remotes::install_github("natverse/nat.templatebrains")
remotes::install_github("natverse/nat.jrcbrains")
install.packages(c("arrow", "dplyr", "tidyr", "ggplot2", "ggrepel"))

library(neuronbridger)
library(arrow); library(dplyr); library(tidyr); library(ggplot2)

NB_VERSION <- "v3_9_0"   # current release; ships FlyWire-FAFB-v783 MIPs

The compiled FAFB-v783 metadata and soma-DCV-detection feather tables used here come from the Wei-Chung Allen Lee lab (Harvard; DCV manuscript in preparation). Replace DCV_DIR with the directory holding the equivalent feathers on your own machine; the rest of the pipeline is path-agnostic.

DCV_DIR <- "<path/to/sjcabs/fafb>"   # lee-lab DCV feathers (in prep.)
META_FEATHER     <- file.path(DCV_DIR, "fafb_783_meta.feather")
SOMA_DCV_FEATHER <- file.path(DCV_DIR, "fafb_783_soma_dcv_detection.feather")

Step 1: Build the AstA-locus driver panel

The panel is built by hand from the literature: SS32423 plus the GMR and VT enhancer fragments that tile the AstA locus and have FlyLight imaging in NeuronBridge. R65D04 appears in the literature but is absent from NB v3_9_0; the loop below confirms this and drops it on the fly.

driver_panel <- tibble::tribble(
  ~line,        ~role,
  "SS32423",    "primary lead — aMulet split (Sterne et al. 2021); both halves AstA-locus",
  "R65D05",     "Pfeiffer 'AstA-GAL4' (Hergarden 2012; Pool 2014; Landayan 2021)",
  "R65D06",     "DBD half of SS32423 — Pfeiffer tile in AstA locus",
  "R65D04",     "neighbouring AstA-locus tile — absent from NB v3_9_0",
  "R65D07",     "neighbouring AstA-locus tile",
  "R65E01",     "neighbouring AstA-locus tile",
  "VT019900",   "AD half of SS32423 — Vienna Tile in AstA locus"
)

Confirm and select brain MIPs (FAFB hits only come from brain MIPs, not VNC ones). NB stores many redundant MIPs per line (broad lines have 50+), and neuronbridge_hits() calls plyr::rbind.fill() internally, which is quadratic in the number of result rows, so we cap at 10 MIPs per line. Multiple MIPs of the same line are slightly different stainings of the same GAL4, and slice_max() downstream takes the best score per neuron, so 10 is plenty.

panel_info <- list()
for (ln in driver_panel$line) {
  out <- try(neuronbridge_info(ln, dataset = "by_line",
                               version = NB_VERSION), silent = TRUE)
  if (inherits(out, "try-error") || is.null(out) || !nrow(out)) next
  brain <- out[out$anatomicalArea == "Brain", , drop = FALSE]
  if (!nrow(brain)) next
  brain <- head(brain, 10)
  brain$line <- ln
  panel_info[[ln]] <- brain
}
panel_info <- bind_rows(panel_info)
# In v3_9_0: SS32423 (16 brain MIPs), R65D05 (75), R65D06 (60), R65D07
# (15), R65E01 (48), VT019900 (21); R65D04 absent. Capped at 10 → 60
# MIPs total.
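
As a quick check of the arithmetic in the comment above, the per-line brain-MIP counts can be capped and summed in a few lines of base R. This is only a sketch: mip_counts is a hand-typed vector copied from that comment, not an object returned by neuronbridger, and MIP_CAP is a name introduced here for illustration.

```r
# Brain-MIP counts per line, copied by hand from the comment above
# (hypothetical helper vector, not a neuronbridger object).
mip_counts <- c(SS32423 = 16, R65D05 = 75, R65D06 = 60,
                R65D07 = 15, R65E01 = 48, VT019900 = 21)
MIP_CAP <- 10   # same cap as head(brain, 10) in the loop

capped         <- pmin(mip_counts, MIP_CAP)
total_uncapped <- sum(mip_counts)   # hit queries needed without the cap
total_capped   <- sum(capped)       # hit queries needed with the cap

print(c(uncapped = total_uncapped, capped = total_capped))
```

Every line in the panel has at least 10 brain MIPs, so the cap bites on all six and the number of neuronbridge_hits() calls drops from 235 to 60.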

Step 2: Pull NeuronBridge FAFB-v783 hits per line

SS32423 ships 16 brain MIPs in v3_9_0. The dominant pattern is a bilateral SEZ-soma pair with arbours ascending dorso-medially — the cell the line was designed to label. A few MIPs are MCFO stochastic single-cell renders, so inspect the full set before choosing a representative MIP:

SS32423 — all 16 brain MIPs in NB v3_9_0

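For each driver, fetch its colour-depth hits, restrict to FlyWire-FAFB (libraryName "FlyWire_FAFB_v783_realign"; the code below matches it with the looser grepl("FlyWire", ...)), and collect the results per line. The reproducer also caches each line's hits to inst/extdata/asta_sez/cache/hits_*.rds.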

KEEP_COLS <- c("publishedName","libraryName","normalizedScore",
               "matchingPixels","alignmentSpace","anatomicalArea","gender")
hits_list <- list()
for (ln in unique(panel_info$line)) {
  ln_mips <- panel_info[panel_info$line == ln, ]
  rows <- list()
  for (i in seq_len(nrow(ln_mips))) {
    h <- try(neuronbridge_hits(ln_mips$nb.id[i], version = NB_VERSION),
             silent = TRUE)
    if (inherits(h, "try-error") || is.null(h) || !nrow(h)) next
    h <- h[grepl("FlyWire", h$libraryName), , drop = FALSE]
    if (!nrow(h)) next
    h <- as.data.frame(h[, intersect(KEEP_COLS, colnames(h)), drop = FALSE])
    h$normalizedScore <- as.numeric(h$normalizedScore)
    h$query_line <- ln
    h$query_mip  <- ln_mips$nb.id[i]
    rows[[length(rows)+1L]] <- h
  }
  hits_list[[ln]] <- if (length(rows)) bind_rows(rows) else tibble()
}
hits_raw <- bind_rows(hits_list)
hits_raw$root_783 <- sub("^flywire_fafb:v783:", "", hits_raw$publishedName)
# Live numbers (cap=10 MIPs/line):
#   SS32423   6148 hits      R65D05   9845        R65D06  14314
#   R65D07   10515           R65E01   8807        VT019900 10038
#   total: 59667 FlyWire hit-rows across the panel.

hits_best <- hits_raw |>
  group_by(query_line, root_783) |>
  slice_max(normalizedScore, n = 1, with_ties = FALSE) |>
  ungroup()
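
One detail worth keeping in mind when joining on root_783: FlyWire root IDs are 18-digit integers, larger than 2^53, the limit up to which R doubles represent every integer exactly. The base-R sketch below shows why the IDs are handled as character strings. The first ID is the CB0602 root quoted later in this vignette; the second is a made-up neighbour that differs only in its last digit.

```r
id_a <- "720575940632295751"   # CB0602 root ID quoted later in this vignette
id_b <- "720575940632295752"   # hypothetical ID, differs in the last digit only

# As character strings the two IDs are distinct.
distinct_as_chr <- id_a != id_b

# As doubles they collapse onto the same value (both exceed 2^53).
collide_as_dbl <- as.numeric(id_a) == as.numeric(id_b)

# Stripping the publishedName prefix with sub() leaves a character ID.
parsed <- sub("^flywire_fafb:v783:", "", paste0("flywire_fafb:v783:", id_a))

print(c(distinct_as_chr = distinct_as_chr, collide_as_dbl = collide_as_dbl))
print(parsed)
```

If an ID column arrives as a double from some upstream table, converting it with as.character() afterwards cannot recover the lost digits, so it is worth checking the column type at read time.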

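Take a per-line top-N with the elbow cutoff helper: after sorting by score, candidate k and everything below it are dropped once its score falls below 75% of candidate k-1 or below 20% of the top score, and the list is capped at 25. The helper is reused from abdominal_peripheral_targets.Rmd.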

top_with_elbow <- function(df, cap = 25,
                           drop_frac  = 0.75,
                           floor_frac = 0.20) {
  df <- df[order(df$normalizedScore, decreasing = TRUE), , drop = FALSE]
  if (!nrow(df)) return(df)
  keep <- TRUE
  if (nrow(df) > 1) {
    ratios <- df$normalizedScore[-1] / df$normalizedScore[-nrow(df)]
    keep <- c(TRUE,
              cumprod(ratios >= drop_frac) == 1 &
              df$normalizedScore[-1] / df$normalizedScore[1] >= floor_frac)
  }
  head(df[keep, , drop = FALSE], cap)
}

hits_top <- hits_best |>
  group_by(query_line) |>
  group_modify(~ top_with_elbow(.x, cap = 25)) |>
  ungroup()
# 150 line-neuron pairs across 148 unique candidate FlyWire neurons.
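
To make the two cutoffs concrete, here is a toy run of the same rule on plain score vectors. elbow_keep() is a hypothetical vector-level stand-in written for this illustration (it mirrors the defaults of top_with_elbow() but takes a numeric vector, not a data frame), and the scores are invented.

```r
# Vector-level version of the elbow rule (hypothetical helper, same defaults
# as top_with_elbow(): drop_frac = 0.75, floor_frac = 0.20, cap = 25).
elbow_keep <- function(scores, cap = 25, drop_frac = 0.75, floor_frac = 0.20) {
  s <- sort(scores, decreasing = TRUE)
  if (length(s) < 2) return(s)
  step_ok  <- s[-1] / s[-length(s)] >= drop_frac   # score k vs score k-1
  floor_ok <- s[-1] / s[1] >= floor_frac           # score k vs top score
  keep <- c(TRUE, cumprod(step_ok) == 1 & floor_ok)
  head(s[keep], cap)
}

# Sharp elbow: 25000 / 40000 = 0.625 < 0.75, so the list stops after three.
sharp <- elbow_keep(c(50000, 45000, 40000, 25000, 24000, 9000))

# Gentle decay (each step is about 0.8 of the previous one): no step trips
# drop_frac, so the 20%-of-top floor ends the list (18 / 100 < 0.20).
gentle <- elbow_keep(c(100, 80, 64, 52, 42, 34, 27, 22, 18))

print(sharp)
print(gentle)
```

The cumprod() term is what makes the step rule sticky: once one step fails, nothing further down the list can re-enter, even if later steps are small.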

Step 3: Soft soma-neuropil annotation (not a hard filter)

The soma-DCV feather classifies each vesicle by neuropil. Three practical points: regions can be comma-joined on neuropil boundaries ("AL_R,GNG,SAD"); outside_<NEUROPIL> tags mark the cortex rind where somas actually sit (so a strict neuropil == "GNG" filter would drop most genuine GNG-soma neurons); and ~⅓ of FAFB-v783 neurons have no soma DCVs at all (ascending cells with VNC somas). We tokenise on commas, accept inner SEZ neuropils plus their outside_* rinds, score by fraction of soma-DCVs hitting any SEZ token, and keep borderline / unknown cases rather than silently dropping them. The distribution is bimodal so a 0.5 threshold is safe and 0.1 is a generous safety net.

# Periesophageal SEZ — the broad Ito-2014 definition. The narrow
# 6-neuropil set (GNG / SAD / AMMC / FLA only) misses cells like
# CB0108 whose soma rind is `outside_IPS_L` but whose hemilineage
# (LB19) is canonically SEZ. Add PRW, IPS, SPS, VES, WED.
SEZ_INNER <- c("GNG","SAD","AMMC_L","AMMC_R","FLA_L","FLA_R",
               "PRW","IPS_L","IPS_R","SPS_L","SPS_R",
               "VES_L","VES_R","WED_L","WED_R")
SEZ_TOK   <- c(SEZ_INNER, paste0("outside_", SEZ_INNER))

soma_dcv <- read_feather(SOMA_DCV_FEATHER,
                         col_select = c("root_783", "neuropil"))
sez_tok <- function(np, set) {
  vapply(strsplit(np, ",", fixed = TRUE),
         function(x) any(x %in% set), logical(1))
}
soma_dcv <- soma_dcv |>
  mutate(any_sez   = sez_tok(neuropil, SEZ_TOK),
         inner_sez = sez_tok(neuropil, SEZ_INNER))

soma_class <- soma_dcv |>
  group_by(root_783) |>
  summarise(n_dcv          = n(),
            frac_sez       = mean(any_sez),
            frac_sez_inner = mean(inner_sez),
            top_token      = names(sort(table(neuropil), decreasing = TRUE))[1],
            .groups = "drop") |>
  mutate(soma_zone = case_when(
    frac_sez >= 0.5  ~ "SEZ",
    frac_sez >= 0.1  ~ "SEZ_borderline",
    TRUE             ~ "non_SEZ"
  ))
table(soma_class$soma_zone)
# Live: SEZ ~3960  borderline ~250  non_SEZ ~103,000.
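
The tokenise-then-score logic is easier to follow on a handful of rows. The sketch below uses invented neuropil labels for three made-up neurons and a cut-down token set; has_token() is a hypothetical copy of sez_tok(), and base R stands in for dplyr so that it runs without extra packages.

```r
# Toy token sets (hypothetical subset of SEZ_INNER / SEZ_TOK).
inner_toy <- c("GNG", "SAD")
tok_toy   <- c(inner_toy, paste0("outside_", inner_toy))

# TRUE if any comma-separated token of the label is in `set`.
has_token <- function(np, set) {
  vapply(strsplit(np, ",", fixed = TRUE),
         function(x) any(x %in% set), logical(1))
}

# One row per soma DCV; labels are invented for illustration.
toy <- data.frame(
  root_783 = rep(c("a", "b", "c"), times = c(4, 2, 4)),
  neuropil = c("outside_GNG", "outside_GNG", "AL_R,GNG,SAD", "SMP_L",
               "SMP_L", "outside_SMP_L",
               "outside_SLP_R", "outside_SLP_R", "outside_SLP_R", "outside_GNG")
)
toy$any_sez <- has_token(toy$neuropil, tok_toy)

# A strict equality filter matches none of these rows.
strict_hits <- sum(toy$neuropil == "GNG")

# Fraction of soma DCVs with an SEZ token, then the same three-way split.
frac_sez  <- vapply(split(toy$any_sez, toy$root_783), mean, numeric(1))
soma_zone <- ifelse(frac_sez >= 0.5, "SEZ",
             ifelse(frac_sez >= 0.1, "SEZ_borderline", "non_SEZ"))

print(strict_hits)
print(data.frame(frac_sez, soma_zone))
```

Neuron a is a clear SEZ soma even though none of its labels is exactly "GNG", which is the case a strict neuropil == "GNG" filter would miss.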

Step 4: DCV-density percentiles (annotation, not gate)

The soma_dcv_density column of the metadata table is the canonical metric (DCV voxels / soma voxels). The threshold is computed over central-brain neurons only: the optic-lobe DCV distribution differs and would otherwise drag the percentile downwards.

meta <- read_feather(META_FEATHER) |>
  mutate(root_783 = as.character(fafb_783_id))
cb <- meta |> filter(region == "central_brain", !is.na(soma_dcv_density))
dcv_thr <- quantile(cb$soma_dcv_density, probs = 0.90, na.rm = TRUE)
ecdf_cb <- ecdf(cb$soma_dcv_density)
meta <- meta |> mutate(dcv_pct  = ecdf_cb(soma_dcv_density),
                       dcv_rich = soma_dcv_density >= dcv_thr)
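
A small worked example of the percentile step, with invented densities, may help when reading dcv_pct and dcv_rich later on. It uses the same quantile() and ecdf() calls as above on a ten-value reference vector; cb_density and query are made-up numbers, not values from the feather.

```r
# Invented central-brain densities (the reference distribution).
cb_density <- c(0, 0.5, 1, 1.5, 2, 3, 4, 6, 10, 20)
thr     <- quantile(cb_density, probs = 0.90, names = FALSE)  # p90 cutoff
ecdf_cb <- ecdf(cb_density)                                   # percentile lookup

# Query neurons, including one with no density estimate.
query <- c(0.2, 5, 50, NA)
res <- data.frame(soma_dcv_density = query,
                  dcv_pct  = ecdf_cb(query),
                  dcv_rich = query >= thr)
print(res)
```

Two things to note: a neuron denser than anything in the reference set simply gets dcv_pct = 1, and a missing density propagates as NA in both columns, which is why Step 5 recodes NA in dcv_rich to FALSE before scoring.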

Step 5: Score and rank

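Score per neuron, then sort. The only hard filter in the code below is sez_ok, which drops clearly non_SEZ somas and keeps SEZ, SEZ_borderline, and unknown. SS32423 membership (in_ss32423) is the leading sort key; everything else (consensus, DCV-rich, R65D05 membership) is a nudge added to rank_score.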

cand <- hits_top |>
  left_join(soma_class, by = "root_783") |>
  left_join(meta |> select(root_783, cell_class, cell_type, super_class,
                           hemilineage, side, neurotransmitter_predicted,
                           neuropeptide_verified,
                           soma_dcv_density, dcv_pct, dcv_rich),
            by = "root_783")

ranked <- cand |>
  mutate(in_ss32423 = query_line == "SS32423",
         in_R65D05  = query_line == "R65D05") |>
  group_by(root_783) |>
  summarise(
    n_lines       = n_distinct(query_line),
    in_ss32423    = any(in_ss32423),
    in_R65D05     = any(in_R65D05),
    lines         = paste(sort(unique(query_line)), collapse = ","),
    best_score    = max(normalizedScore, na.rm = TRUE),
    score_sum     = sum(pmin(normalizedScore, 50000), na.rm = TRUE),
    soma_zone     = soma_zone[1], frac_sez = frac_sez[1],
    cell_type = cell_type[1], hemilineage = hemilineage[1],
    super_class = super_class[1], nt = neurotransmitter_predicted[1],
    np_verified = neuropeptide_verified[1],
    soma_dcv_density = soma_dcv_density[1],
    dcv_pct = dcv_pct[1], dcv_rich = dcv_rich[1],
    .groups = "drop"
  ) |>
  mutate(soma_zone = ifelse(is.na(soma_zone), "unknown", soma_zone),
         dcv_rich  = ifelse(is.na(dcv_rich), FALSE, dcv_rich),
         sez_ok    = soma_zone %in% c("SEZ","SEZ_borderline","unknown"),
         rank_score = (n_lines * 1.0)
                    + (best_score / 50000)
                    + ifelse(in_ss32423, 0.5, 0)
                    + ifelse(dcv_rich, 0.5, 0)
                    + ifelse(in_R65D05, 0.5, 0)) |>
  filter(sez_ok) |>
  arrange(desc(in_ss32423), desc(n_lines), desc(rank_score))

Sanity check — did we keep SS32423’s own hits?

If after all the joins and filters none of SS32423’s NB hits survived, the pipeline has a bug. Always print this before trusting the cross-line consensus.

ss <- ranked |> filter(in_ss32423)
cat(sprintf("SS32423 candidates retained: %d\n", nrow(ss)))
cat(sprintf("  ... SEZ-soma %d  borderline %d  unknown %d  non_SEZ %d\n",
            sum(ss$soma_zone == "SEZ"),
            sum(ss$soma_zone == "SEZ_borderline"),
            sum(ss$soma_zone == "unknown"),
            sum(ss$soma_zone == "non_SEZ")))
cat(sprintf("  ... DCV-rich (>=p90): %d\n", sum(ss$dcv_rich, na.rm = TRUE)))
stopifnot("Pipeline lost all SS32423 hits — soma_zone filter probably too tight." = nrow(ss) > 0)
# Live: 22 SS32423 candidates retained — 18 SEZ, 0 borderline, 4 unknown.

Step 6: The ranked candidates

The 9 SEZ-soma SS32423 candidates from the live run (cap=10 MIPs/line). Rows are sorted by SS32423 membership first, then by the number of consensus lines, then by rank score:

# Top SS32423 ∩ SEZ-soma candidates (cap=10):
   cell_type   hemilineage      super_class             n_lines lines             best_score dcv_pct dcv_rich np
 1 CB0602      putative_primary central_brain_intrinsic       2 R65D05,SS32423         42344   0.866   FALSE    -
 2 CB0239      LB11             central_brain_intrinsic       2 SS32423,VT019900       41537   0.402   FALSE    -
 3 DNg22       putative_primary descending                    1 SS32423                37532   1.000   TRUE     FMRFa
 4 CB0456      putative_primary central_brain_intrinsic       1 SS32423                44783   0.839   FALSE    -
 5 CB0544      putative_primary central_brain_intrinsic       1 SS32423                39122   0.296   FALSE    -
 6 DNge046     LB5__prim        descending                    1 SS32423                36493   0.437   FALSE    -
 7 CB1475      LB23             central_brain_intrinsic       1 SS32423                35328   0.648   FALSE    -
 8 CB3901      MX0__prim        central_brain_intrinsic       1 SS32423                33566   0.753   FALSE    -
 9 CB3902      LB0_anterior     central_brain_intrinsic       1 SS32423                33229   0.487   FALSE    -
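
To see how the Step 5 weights produce this ordering, the rank_score arithmetic can be replayed on the first four rows. This is a sketch: rank_score_of() is a hypothetical wrapper around the same expression, and the inputs are typed in by hand from the table above.

```r
# Same weights as the rank_score expression in Step 5.
rank_score_of <- function(n_lines, best_score, in_ss32423, dcv_rich, in_R65D05) {
  n_lines * 1.0 +
    best_score / 50000 +
    ifelse(in_ss32423, 0.5, 0) +
    ifelse(dcv_rich,   0.5, 0) +
    ifelse(in_R65D05,  0.5, 0)
}

# Values typed in by hand from rows 1-4 of the table above.
toy <- data.frame(
  cell_type  = c("CB0602", "CB0239", "DNg22", "CB0456"),
  n_lines    = c(2, 2, 1, 1),
  best_score = c(42344, 41537, 37532, 44783),
  dcv_rich   = c(FALSE, FALSE, TRUE, FALSE),
  in_R65D05  = c(TRUE, FALSE, FALSE, FALSE)
)
# All four rows are SS32423 hits, so in_ss32423 is TRUE throughout.
toy$rank_score <- with(toy,
  rank_score_of(n_lines, best_score, TRUE, dcv_rich, in_R65D05))

print(toy[, c("cell_type", "rank_score")])
```

The DCV-rich bonus is what lifts DNg22 above CB0456 despite its lower NeuronBridge score, and the R65D05 bonus is what separates CB0602 from CB0239 within the two-line tier.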

A further 16 SS32423 hits have non-SEZ somas (e.g. soma in SMP / LH / SLP cortex rind) and are kept in the full ranked CSV but excluded from the SEZ-OK figures. These are real SS32423 hits but a poor fit for “soma in the SEZ”; they may be off-target labelling of the line or genuine non-SEZ targets (see Step 10).

Two graphical summaries — first the dot plot of best NB score versus consensus, second the ranked bar with DCV-rich highlight:

Top 15 candidate match scores by consensus and DCV percentile

Cross-line consensus, top 15

The driver-panel hit matrix shows which of our six AstA-locus drivers hit each top-25 candidate — useful for seeing at a glance who’s a consensus winner and who’s a one-line hit:

Driver-panel hit matrix, top 25
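
A matrix of this kind can be tabulated from a long-format hit list with base R's table(). The sketch below is not the plotting code behind the figure: toy_hits is a hand-typed stand-in whose line memberships are copied from the lines column of the first three table rows.

```r
# Toy long-format hit list; memberships copied by hand from the `lines`
# column of the Step 6 table (first three rows only).
toy_hits <- data.frame(
  cell_type  = c("CB0602", "CB0602", "CB0239", "CB0239", "DNg22"),
  query_line = c("SS32423", "R65D05", "SS32423", "VT019900", "SS32423")
)
panel <- c("SS32423", "R65D05", "R65D06", "R65D07", "R65E01", "VT019900")

# factor() with explicit levels keeps lines with no hits as all-zero columns.
hit_matrix <- table(cell_type = toy_hits$cell_type,
                    line      = factor(toy_hits$query_line, levels = panel))
n_lines <- rowSums(hit_matrix > 0)   # consensus count per candidate

print(hit_matrix)
print(n_lines)
```

Fixing the column order through the factor levels keeps the six drivers in the same position from run to run, which makes successive versions of the matrix easier to compare by eye.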

Step 7: Reading the table — what do these candidates mean?

A few things stand out:

CB0602 is the only neuron hit by both SS32423 and R65D05 (the canonical AstA-GAL4) at high score. It’s a central_brain_intrinsic cell with soma in the SEZ (frac_sez = 1.0), putative_primary hemilineage, predicted ACh, and dcv_pct = 0.866 (just below the strict p90 DCV-rich threshold). FlyWire root 720575940632295751 (left side).
CB0108 (LB19) is hit by SS32423 alone but is massively DCV-rich (soma_dcv_density = 16.4, dcv_pct ≈ 0.99); its soma sits at the dorsal outside_IPS_L rind. With the broad peri-esophageal SEZ definition (above) it classifies as SEZ. FlyWire root 720575940620093019 (right side).
These two cell types are a bilateral-pair match for the 4-cell SEZ AstA pattern Hergarden et al. 2012 reported (PMC3309792): the meta has both left+right copies of CB0602 and CB0108, and all four show up in our SS32423 raw cache. The right CB0602 and left CB0108 just slip below the per-line elbow cap (best score ~10–15k vs SS32423’s MIP top of ~50k) — see inst/extdata/asta_sez/cache/hits_*.rds for the unfiltered hits. See Step 7b below.
CB0239 (LB11 hemilineage, ACh, dcv_pct = 0.402) is hit by both SS32423 and VT019900 (SS32423’s AD half). Lower DCV signal, but LB11 is an SEZ-relevant lineage. Worth visually verifying. FlyWire root 720575940634548194.
DNg22 turns up in SS32423 alone — confirmed np_verified = "FMRFa", very DCV-rich (dcv_pct = 1.0, soma_dcv_density = 50.1). This is a known peptidergic descending neuron rather than the canonical Hergarden-type SEZ AstA cell, but the SS32423 colour-MIP almost certainly contains the DNg22 dendritic arbour and that’s why it ranks. Treat as a secondary hit, not the target.
A cluster of central-brain intrinsic cells in SEZ-relevant hemilineages (CB1475 in LB23, CB3901 in MX0__prim, CB3902 in LB0_anterior, CB0456 / CB0544 in putative_primary) sit in the next rank tier. These are all SEZ-soma cells with various NTs and mid-range DCV percentiles — worth a 3-D look for any morphology match.

Step 7b: The 4-cell SEZ AstA pattern (Hergarden 2012)

Hergarden, Tayler & Anderson (2012; PMC3309792) describe roughly four AstA-positive neurons clustered around the SEZ — two bilateral pairs. Our SS32423 hits map to this directly:

CB0602 — ventral SEZ pair (putative_primary lineage, ACh):
- left = 720575940632295751 (top hit, SS32423 ∩ R65D05)
- right = 720575940640469848 (in SS32423 + R65D05 + R65D06 raw caches; falls below the per-line elbow cap)
CB0108 — dorsal/IPS pair (LB19 lineage, ACh):
- right = 720575940620093019 (top non-SEZ-strict hit, dcv_pct 0.99)
- left = 720575940624028855 (in SS32423 + R65D06 raw caches)

Render all four meshes (inst/scripts/asta_sez_4cell.R):

The 4 canonical SEZ AstA cells — CB0108 dorsal pair above CB0602 ventral pair

The bilateral symmetry is unambiguous and the dorsal-ventral split matches the figures in Hergarden 2012. CB0602 + CB0108 together = the 4 SEZ-AstA cells that motivated SS32423 in the first place.

Of course, none of this is news to the FlyWire community: these large projection neurons are already annotated as the AstA1 cell type in Codex. The point of the pipeline above isn’t to discover AstA1 from scratch, but to show that SS32423 + AstA-locus driver consensus + DCV-density soma scoring converges on AstA1 without using the AstA1 annotation as input. The same recipe should generalise to other peptidergic populations that don’t yet have a canonical FlyWire label.

The 16 SS32423 hits with non-SEZ somas (visible in asta_sez_ranked_full.csv) belong to a different cluster — see Step 10 below.

The ranked CSV with all 100+ candidates and full annotations is in inst/extdata/asta_sez/asta_sez_ranked_full.csv. The top-30 cut is asta_sez_ranked_top30.csv.

Step 8: Visualise the top two candidates with nat.ggplot

nat.ggplot::geom_neuron() plots a neuron / neuronlist / mesh as a 2-D ggplot — the same idiom the lee-lab DCV repo uses for its FAFB figures (R/visualise/fafb_flange.R, fig_2_dcv_predictions_fafb.Rmd).

We fetch the FlyWire meshes for the top 2 candidates (~22 MB each, ~10 s/neuron from CloudVolume) and project their vertices to 2-D as a low-alpha point cloud — that produces a denser, MIP-like volumetric render than the L2 skeleton paths. The FAFB14 brain and SEZ neuropil meshes from elmr provide context.

library(nat); library(nat.ggplot); library(elmr); library(fafbseg)
library(ggplot2); library(patchwork)
options(fafbseg.flywire_dataset = "flywire_fafb_public")

CB0602 <- "720575940632295751"
CB0239 <- "720575940634548194"
m <- fafbseg::read_cloudvolume_meshes(c(CB0602, CB0239))

brain    <- elmr::FAFB14.surf
sez_surf <- subset(elmr::FAFB14NP.surf,
                   c("GNG","SAD","FLA_R","FLA_L","AMMC_R","AMMC_L"))

# Project mesh vertices to 2-D as a low-alpha point cloud — this is
# what produces the volumetric "MIP-like" arbour density.
mesh_xy <- function(mesh) {
  v <- nat::xyzmatrix(mesh)
  data.frame(X = v[,1], Y = v[,2], Z = v[,3])
}
mk_panel <- function(mesh, title, neuron_col) {
  ggplot() +
    geom_neuron(brain,    cols = "grey75", alpha = 0.30) +
    geom_neuron(sez_surf, cols = "#3182bd", alpha = 0.40) +
    geom_point(data = mesh_xy(mesh), aes(x = X, y = Y),
               colour = neuron_col, alpha = 0.04, size = 0.15) +
    coord_fixed() + scale_y_reverse() +
    labs(title = title) +
    theme_void(base_size = 11) +
    theme(plot.title = element_text(hjust = 0.5))
}
(mk_panel(m[[CB0602]], "CB0602 — SS32423 ∩ R65D05",   "magenta") |
 mk_panel(m[[CB0239]], "CB0239 — SS32423 ∩ VT019900", "darkorange"))

The result (inst/images/asta_sez_candidates_natggplot.png):

CB0602 (magenta) and CB0239 (orange) over FAFB14 brain with SEZ neuropils

CB0602 has the soma in the SEZ neuropil (left side) and arbours extending dorso-medially into central brain — exactly the bilateral SEZ-soma → SMP-arbour pattern visible across the SS32423 montage. CB0239 is a tighter, more SEZ-local cell.

Step 9: Side-by-side colour-depth MIP

Stack the SS32423 NB brain MIP next to depth-encoded renders of the two SEZ AstA cell types — CB0602 (ventral SEZ pair) and CB0108 (dorsal/IPS pair) — using the same blue→cyan→green→yellow→orange→red→magenta ramp NeuronBridge uses, so the three panels read consistently:

SS32423 NB MIP next to CB0602 and CB0108 depth-encoded skeletons

The reproducer for this figure (and the geom_neuron panel above) is inst/scripts/asta_sez_meshes.R — fetches the FlyWire meshes once (~10 s/neuron, ~22 MB each, cached to inst/extdata/asta_sez/cache/meshes_*.rds) and renders both the 2-panel geom_neuron figure and this 3-row depth-encoded panel.
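
The depth-encoding idea itself fits in a few lines of base R. The sketch below is a toy illustration only, not the code in asta_sez_meshes.R and not the Janelia ColorMIP algorithm (which also carries brightness): for each xy pixel it finds the z-slice of the brightest voxel and paints the pixel with that slice's colour from the seven-stop ramp named above. The volume and its three lit voxels are invented.

```r
# Seven-stop ramp named in the text, interpolated to one colour per z-slice.
depth_ramp <- colorRampPalette(c("blue", "cyan", "green", "yellow",
                                 "orange", "red", "magenta"))

# Toy 4 x 4 x 5 volume (x, y, z) with three lit voxels.
vol <- array(0, dim = c(4, 4, 5))
vol[2, 2, 1] <- 200
vol[2, 2, 4] <- 255   # brighter than z = 1, so z = 4 wins at pixel (2, 2)
vol[3, 3, 5] <- 120

max_val <- apply(vol, c(1, 2), max)         # brightest value per xy pixel
max_z   <- apply(vol, c(1, 2), which.max)   # slice index of that value

slice_cols <- depth_ramp(dim(vol)[3])
mip <- matrix("#000000", nrow = 4, ncol = 4)   # black background
lit <- max_val > 0
mip[lit] <- slice_cols[max_z[lit]]

print(mip)
```

Because only the brightest voxel along z contributes, two arbours that overlap in xy at different depths show up as a single colour per pixel; this is one reason a skeleton render and a stained MIP can look different even when the panels share a ramp.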

For a true NeuronBridge-style colour-depth MIP from a FlyWire mesh (i.e. the Janelia ColorMIP algorithm applied to a re-rendered NRRD in the NeuronBridge-compatible JRC2018U_HR grid: 1210 × 566 × 174 at 0.519 × 0.519 × 1.0 µm), nrrd_to_mip_direct() does the whole thing in pure R — no FIJI launch required:

sez_hits <- c(CB0602 = "720575940632295751",   # ventral SEZ pair
              CB0108 = "720575940620093019")   # dorsal IPS pair
savefolder <- "~/asta_sez_mips"

# 1) Render each FlyWire mesh into a JRC2018U_HR-sized NRRD
for (id in sez_hits) {
  root_id_to_nrrd(id,
                  reference  = "JRC2018U_HR",
                  savefolder = savefolder)
}

# 2) Colour-depth MIP every NRRD in the folder, in pure R
nrrd_to_mip_direct(savefolder, target_space = "brain", format = "png")
# Output: <savefolder>/color_mips/*_colormip.png

The colormip_direct_vs_fiji vignette walks through the algorithm, contrasts nrrd_to_mip_direct() against the FIJI-launching nrrd_to_mip(), and validates byte-for-byte against Jasper Phelps’s Python port in the BANC connectome package.

Step 10: Beyond the SEZ — the other AstA neurons SS32423 labels

Inspecting the SS32423 montage (Step 2) more carefully, row 1 columns 2–4 show clear bright cells outside the SEZ, in the upper-medial brain. Hentze et al. 2015 mapped AstA expression to several non-SEZ groups (median protocerebrum / pars intercerebralis, pars lateralis, optic lobe), so we should expect SS32423 to label some of those too even if the canonical AstA-GAL4 (R65D05) doesn’t.

Pulling the 16 SS32423 hits with non-SEZ somas from asta_sez_ranked_full.csv confirms the pattern: they cluster overwhelmingly in the SMP / SLP with several bilateral cell-type recurrences (strong evidence these are real, consistent SS32423 targets rather than colour-MIP noise):

cell_type  hemilineage        NT         DCV pct    recurrence
SMP202     putative_primary   ACh        0.95–0.97  bilateral pair
SMP540     SMPpd2             glutamate  0.87–0.92  bilateral pair (×3 hits)
CB2082     SMPp&v1_posterior  glutamate  0.86–0.93  bilateral pair
CL176      SMPp&v1_posterior  glutamate  0.80–0.88  bilateral pair

(CB0108 initially appeared here too because the strict tokeniser classified its outside_IPS_L soma rind as non-SEZ, but biologically it belongs to the SEZ AstA group — see Step 7b — and we’ve moved it back there.)

Render the strongest non-SEZ candidates with the same mesh-vertex point-cloud pipeline (see inst/scripts/asta_sez_meshes.R for the mk_panel() definition):

others <- c(
  SMP202_R = "720575940626412563",
  SMP202_L = "720575940628248326",
  SMP540_R = "720575940623972792",
  SMP540_L = "720575940616330011",
  CB2082_R = "720575940623952679"
)
m_other <- fafbseg::read_cloudvolume_meshes(unname(others))
# ...same mk_panel() as in step 8...

Non-SEZ SS32423 candidates — SMP202 (×2), SMP540 (×2), CB2082

The figure makes two things clear:

SMP202 (bilateral, ACh) has somas in the upper-medial brain (SMP cortex rind) and dense bilateral arbours fanning across SMP / SLP — the exact cell pattern visible in row 1 cols 2–4 of the SS32423 montage.
SMP540 (bilateral, glutamate) is a tighter dorsal-medial cell in the SMPpd2 hemilineage — co-labels in the same SS32423 pattern, alongside SMP202.

Important caveat on AstA-specificity for the SMP group

None of these non-SEZ candidates are hit by R65D05 (the canonical Hergarden AstA-GAL4; see the driver-panel hit matrix in Step 6). That means we cannot assert from this analysis alone that they are AstA-positive; only that they are consistent SS32423 targets which happen to be DCV-rich (hence likely peptidergic). The interpretation is one of:

They are AstA-positive non-SEZ cells (consistent with the PI/PL groups Hentze 2015 reported), and R65D05 simply doesn’t capture them because that Pfeiffer fragment is a sub-region of AstA flanking sequence selected to label SEZ-AstA cells. Other AstA reporters (the Hentze 2015 P{AstA-GAL4.2.1}, CRISPR AstA[SK1], MiMIC T2A-GAL4) would be the test — none are in NB, so this requires direct immunostaining or LM imagery from those lines.
They are off-target cells of SS32423: one or both hemidrivers (VT019900, R65D06) drive in the SMP independently of AstA expression, and the split happens to intersect them. Bilateral cell-type recurrence makes this less likely (random off-target hits don’t usually mirror left↔︎right) but it’s still on the table.

The next step in either direction is to render these candidates as proper colour-depth MIPs against the SS32423 NB MIPs from row 1 cols 2–4 of the montage and judge visual agreement.

Caveats

SS32423 is a “weak” labeller of the AstA SEZ neuron — the colour-depth scores are only mid-range, and the bright cell at the top of the SS32423 MIP is a different (stronger) cell of the line’s pattern. Trust the visual MIP / 3-D match, not the rank order alone.
DCV density is a peptidergic proxy, not an AstA-specific marker. CB0602 sitting at dcv_pct = 0.866 is consistent with peptide release but does not pin down which neuropeptide. The np_verified column will catch AstA-positive neurons that have already been annotated; absence there is “not yet verified”, not “negative”.
Soma-neuropil via DCV-vesicle tokenisation is approximate. Comma-joined boundary labels and the outside_* rind classes are both included on purpose, and unknown (no soma DCVs) is kept rather than dropped. Only clearly non_SEZ somas are filtered out of the ranked list; the ordering itself is driven by in_ss32423 and cross-line consensus.
MIP cap = 10/line is a quadratic-cost trade-off. With more MIPs per line we’d expect more cross-line consensus (currently only 2 candidates have n_lines >= 2). If you need that, raise the cap in Step 1 and tolerate a longer pipeline run; the inst/extdata/asta_sez/cache/ per-line .rds files mean a re-run only does the new lines.
NeuronBridge data version. This vignette is written against v3_9_0, the first release with the corrected FlyWire-v783 alignment. Earlier releases will give worse top-N hit lists.
AstA drivers not in NeuronBridge. P{AstA-GAL4.2.1} (Hentze 2015), AstA[SK1] CRISPR, MiMIC T2A-GAL4 — all not in NB and so cannot contribute to the consensus here.
Hemidriver self-reinforcement. SS32423’s two halves are themselves AstA-locus enhancers and appear in the panel — i.e. the cross-line consensus partly self-reinforces. This is fine for finding the AstA cell, but the test for “is this an AstA neuron?” is whether R65D05 (the canonical AstA-GAL4) hits it, not just whether SS32423’s halves do. By that test, CB0602 wins.

Citations

NeuronBridge: Clements, J., Goina, C., Kazimiers, A., Otsuna, H., Svirskas, R., Rokicki, K. (2020). NeuronBridge Codebase.
SS32423 / aMulet: Sterne, G.R., Otsuna, H., Dickson, B.J., Scott, K. (2021). eLife 10:e71679.
AstA-GAL4 (R65D05): Hergarden, A.C., Tayler, T.D., Anderson, D.J. (2012). PNAS 109(10):3967–3972.
AstA in feeding circuits: Pool, A.-H. et al. (2014). Neuron 83(1):164–177.
Janu-AstA / thirst: Landayan, D., Wang, B.P., Zhou, J., Wolf, F.W. (2021). eLife 10:e66286.
AstA biology: Chen, J. et al. (2016) PLoS Genet 12(8):e1006346; Hentze, J.L. et al. (2015) Sci Rep 5:11680.
FlyWire / FAFB-v783: Dorkenwald, S. et al. (2024); Schlegel, P. et al. (2024). Nature.
DCV detection / soma DCV density: work from Wei-Chung Allen Lee’s lab (Harvard Medical School), in preparation.