FlyWire and R

FlyWire is an open neuron reconstruction environment for the full adult fly brain ( FAFB ) EM data set.

Unlike with more manual reconstruction environments like CATMAID, users concatenate volumetric segmentations to build neurons as 3D meshes. The segmentation was produced by the Seung group. With the package hemibrainr we can read flywire neurons as both meshes and skeletons. Meshes can be read directly from flywire, but skeletons need to be built from these meshes. Our package fafbseg can enable users to skeletonise neurons in R (e.g. fafbseg::skeletor), whereas hemibrainr exists to help users get hold of neuron data and match flywire neurons to hemibrain neurons.

On the hemibrainr Google team drive for the Drosophila Connectomics Group. If you do not have access to this team drive and would like to use it, to make the most our of hemibrainr, please get in contact. You will need top to have access to the drive and have Google filestream mounted. Then you will be able to:

Read thousands of pre-skeletonised flywire neurons from Google Drive
Read flywire NBLASTs and NBLASTs to hemibrain neurons
Read flywire neurons that are pre-transformed into a variety of brainspaces

Which is all useful stuff. In order to connect R to this Google drive, you have a few options. Please see this article. The pipeline that produced this data can be found here. It is run nightly on a machine at the MRC LMB.

Authorisation

To access flywire data on our hemibrainr Google team drive, you only need access to the drive.

To access flywire data directly from FlyWire, you will need to have an account with them and you will need to use your ‘secret’ associated with that account. Visit https://globalv1.flywire-daf.com/auth/api/v1/refresh_token to get your token. Note that it changes each time you visit this link.

Copy it and then:

# remotes::install_github("natverse/fafbseg"), if you don't have the package
fafbseg::flywire_set_token("PASTE_TOKEN_HERE")

It is best if you have access to the flywire production node. Otherwise, you can look at just the base segmentation (useful but a lot less useful).

And now, let’s test by looking at some few information available through hemibrainr:

# Load package
library(hemibrainr)

# Else, it wants to see it on the mounted team drive, here
options("remote_connectome_data")

# See matches, you can do this without hemibrain Google Team Drive access
View(hemibrain_matched)

# Get fresh matches, you cannot do this without access
## You will be prompted to log-in through your browser
hemibrain_matched_new <- hemibrain_matches() 
## NOTE: includes hemibrain<->flywire matches!
### You require access to this Google sheet: https://docs.google.com/spreadsheets/d/1OSlDtnR3B1LiB5cwI5x5Ql6LkZd8JOS5bBr-HTi0pOw/edit#gid=2090252460

Tutorial

We’ll need a few more packages:

# Load some libraries to help with these transformations
if (!require("remotes")) install.packages("remotes")
if (!require("nat.jrcbrains")) remotes::install_github("flyconnectome/nat.jrcbrains")
if (!require("fafbseg")) remotes::install_github("flyconnectome/fafbseg")
library(nat.jrcbrains)
library(fafbseg)

Get flywire data

With hemibrainr you can easily get thousands of skeletons for flywire neurons. These skeletons have been built using skeletor, which is a python module. You can also use it directly in R with fafbseg::skeletor. It has been run on thousands of neurons, which have then been stored on Google Drive, at: hemibrainr/flywire_neurons/ as nat::neuronlistfh objects. This means that you can use either get flywire skeletons for neurons using the drive, or by making them from meshes yourself. Let’s demonstrate:

Access flywire data on the hemibrainr Google drive

From the the hemibrainr Google drive we can see which flywire IDs have been read from FlyWire and skeletonised. We can also see some meta data about them, and get a data frame that tells us which users have built these neurons - useful for assigning credit!

# All flywire IDs for neurons that have a split precomputed
fw.ids = flywire_ids(sql=FALSE)
length(fw.ids)

# For these flywire IDs, their meta data:
fw.meta = flywire_meta()
head(fw.meta)

# For flywire IDs, which users contributed what:
fw.edits = flywire_contributions()
head(fw.edits)

# For flywire IDs which do not have available skeletons on the google drive:
## but have been flaghged for processing (i.e. they errored)
fw.failed = flywire_failed()
head(fw.failed)

Now, excitingly we can read the neurons themselves! And not only that, we can read them mirrored (flipped to the other hemisphere of the brain) or in a different brainspace (from a small selection) if we wish. We read these neurons as nat::neuronlistfh objects. This means that a data frame for the neurons and information specifying where to find each neuron’s data is read into R - but not the whole, huge nat::neuronlist object. This saves on memory for your R session. When an operation that requires the actual neuron data is performed, neurons are read into R.

# Get all aflywire neurons
fw.neurons = flywire_neurons() # these are in FlyWire space, which is ~FAFB14 space
length(fw.neurons) # That's a lot of neurons!
head(fw.neurons) # There is some meta data already attached

# Let's see a selection. All the neurons in the DL1_dorsal lineage
dl1.dorsal = subset(fw.neurons, ito_lee_hemilineage == "DL1_dorsal")
## There may be some mistakes early on here
### As the project develops, they should get ironed out ....

# And plot!
nat::nopen3d()
plot3d(dl1.dorsal, soma = 5000, col = side) # soma gives the size to plot the root, radius in nm
plot3d(FAFB14, alpha = 0.1, col = "grey", add = TRUE)

# We can also get the mirrored neurons!
fw.neurons.m = flywire_neurons(mirror = TRUE)
dl1.dorsal.m = fw.neurons.m[intersect(names(fw.neurons.m),names(dl1.dorsal))]
plot3d(dl1.dorsal.m, col = 'lightgrey', soma = 5000)

# Let's plot in hemibrain space....
fw.neurons.hemi = flywire_neurons(brain = "JRCFIB2018F")
dl1.dorsal.hemi = fw.neurons.hemi[intersect(names(fw.neurons.hemi),names(dl1.dorsal))]
nat::nopen3d()
plot3d(dl1.dorsal.hemi, col = 'darkgrey', soma = 5000)
plot3d(hemibrain_microns.surf, alpha = 0.1, col = "grey", add = TRUE)

Get flywire neuron IDs

You might want to look at specific flywire neurons. You could choose them based on the meta data that you can see with flywire.neurons[,]. However, you may also want to choose neurons based on a specific location in the data set. You can do this by giving locations to functions in fafbseg and turning them into coordinates. You can either give points in nanometres or raw voxel space. Note that the flywire dataset has a different registration than FAFB14 (used in CATMAID and also some other published data sources).

# Set the package fafbseg to look at the flywire segmentation
fafbseg::choose_segmentation("flywire")
# OR: choose_segmentation("flywire-sandbox") for the base segmentation

# Let's say we want the ID for a neuron at these voxel coordinates:
pos = matrix(c(165630,24412,3765),ncol=3) # Can be copied
# from the button at the top of flywire https://ngl.flywire.ai/
pos.nm = pos*c(4,4,40) # this is what it would be in nanometres

# Find the flywire ID!
ids = fafbseg::flywire_xyz2id(pos, rawcoords = TRUE)
# OR: ids = fafbseg::flywire_xyz2id(pos.nm, rawcoords = FALSE)

You can also get flywire IDs from coordinates in FAFB14 space. This is the coordinate space of a popular CATMAID project for FAFB data:

# What if we want to find the flywire ID associated with a neuron
## From FAFBv14, e.g. from the 'walled garden' CATMAID instance?
### Transform and fine IDs
nx = nat.templatebrains::xform_brain(elmr::dense_core_neurons, ref="FlyWire", sample="FAFB14")
xyz = nat::xyzmatrix(nx)
ids = unique(fafbseg::flywire_xyz2id(xyz[sample(1:nrow(xyz),100),]))

Skeletonise flywire neurons yourself

We can also skeletonise neurons ourselves using skeletor pipeline in R. This pipeline reads meshes from flywire, contracts them for accurate skeletonisation, and produces a skeletonised mesh. In R, we also try to re-root the neuron at it’s estimated soma location.

You will need to install the python modules related to skeletor to get it to work. Please follow the instructions here for the python version. Here, we provide a guide to get everything working in R (in short reticulate is used to run python code in a specific, stable conda environment we make for R to use).

Once everything is installed, this ought to work:

neurons = fafbseg:::skeletor(segments = ids, brain = elmr::FAFB14.surf) # The brain arguments helps
## Re-rootd to the soma, roughly

# Plot in 3D!
plot3d(neurons) # note, in flywire space
plot3d(nx, col="black", lwd  =2) # note, in flywire space

Add flywire neurons to the Google Drive

These functions can take a long time to run. For example, if adding to the hemibrain-flywire NBLAST, all ~25k flywire neurons need to be read in order to calculate the NBLAST. You can also request that your flywire neurons are added to the precomputations we try to make en mass periodically for the Google drive, if you do not need your answer ASAP. You can do this as so:

# you can add xyz locations as so:
flywire_request(request = pos)

# You can also give a neuron in flywire space
flywire_request(request = neurons)

# Or you can give an ID, but then fafbseg::skeletor is called
flywire_request(request = ids)

# Now look at the Google sheet, and you should see new entries with
## the fields fw.x, gw.y and fw.z fileld out. Don't worry about the rest:
### Ghseet: https://docs.google.com/spreadsheets/d/1rzG1MuZYacM-vbW7100aK8HeA-BY6dWAVXQ7TB6E2cQ/edit#gid=0

Match flywire neurons to each other and the hemibrain

The neuron matching pipeline reads neuron skeletons from Google Drive and and displays them. FlyWire skeletons are shown in dark grey (their mirrored equivalents in light grey) and potential hemibrain_matches in red. This depends also on the precomputed NBLASTs on the hemibrain team drive. For full details, see our match_making vignette.

Matching pipelines

Matches are stored on a Google sheet. This sheet can be modified using interactive pipelines in R. These pipelines use the results of a nightly NBLAST of thousands of neurons, to visually present you with the best candidate matches in 3D:

# As simple as:
fafb_matching(ids, repository = "flywire", overwrite = "mine")

# Or if you just want to go after entries flagegd for your User:
fafb_matching(repository = "flywire", overwrite = "mine_empty")

# You can also match Left/Right side flywire neurons to their other-side cognates:
LR_matching(overwrite = "mine_empty")

# See the results of matching
hemibrain_matched_new = hemibrain_matches()

What if the flywire neuron you want, is not available to match?

For it to be matchable, we need to add the skeleton to Google drive and then add this entry into the NBLAST computation. Those steps were covered above. Because there are a lot of flywire neurons and hemibrain neurons, they are rather slow and cumbersome. For this reason, we also run a ~daily compute of skeletons and NBLASTs and put them on our Google drive. In order to flag a neuron in flywire for skeletonisation and NBLASTing, its coordinates need to be either in the FAFB tab of em_matching or flywire_interest. We use XYZ coordinates in flywire raw voxel space, rather than flywire IDs, because the latter frequently change as a consequence of merging and splitting neurons. FlyWire is, at the time of writing (pandemic 2020), an active tracing project.

Manage the Google sheet

Matches between FAFB neurons (both from FAFBv14 CATMAID and flywire) are recorded on our master Google sheet for matching, em_matching. See the match_making vignette for details. Because flywire IDs change frequently as a consequence of tracing, and because we keep adding new skeletons to our Google drive dump, it is nice to update the master Google sheet for matching every now and then. We also want to make sure that FAFBv14->flywire matches are recapitulated on the ‘hemibrain’ matches sheet. This can be done as follows, but can take a long time to run to best to leave it over night:

flywire_matching_rewrite() # bring all the flywire information up to date

Full Example

Let us get some neurons from the ‘flywire interest’ googlesheet:

# Get all the flywire IDs (updated regularly based on position information) users have flagged
gs = googlesheets4::read_sheet(ss = options()$flywire_flagged_gsheet, sheet = "flywire")

# Get IDs for User 'Tots'
gs.chosen = subset(gs, User == "Tots")

# Load processed flywire neurons
fw = flywire_neurons()

# Get those for Tots
fw.tots = fw[names(fw)%in%gs.chosen$root_id]

# Examine their meta data
fw.tots[,]