R-CMD-check Codecov test coverage Lifecycle: experimental

The goal of hemibrainr is to provide useful code for preprocessing and analysing data from the Janelia FlyEM hemibrain project. It uses the natverse R package neuprintr to get hemibrain data from neuPrint, the project's connectome analysis and data hosting service. The dataset has been described here. Using this R package in concert with the natverse ecosystem is highly recommended.

The hemibrain connectome comprises the region of the fly brain depicted below. It contains ~21,662 roughly complete neurons and 9.5 million synapses, and is about 35% complete in this region:

hemibrain

Get started with hemibrainr

Installation

# install
if (!require("remotes")) install.packages("remotes")
remotes::install_github("natverse/hemibrainr")

# use 
library(hemibrainr)

Using hemibrainr

hemibrainr contains tools with which to quickly work with hemibrain and FlyWire neurons, and match up neurons within and between data sets.

If you can connect to the hemibrainr Google shared drive, this package puts thousands of hemibrain and FlyWire neurons at your fingertips, as well as information on their compartments (e.g. axons versus dendrites), their synapses and connectivity, and neuron-to-neuron matches between data sets. You can:

  • Read thousands of pre-skeletonised FlyWire/hemibrain neurons from Google Drive
  • Read FlyWire/hemibrain NBLASTs and NBLASTs to hemibrain neurons
  • Read FlyWire/hemibrain neurons that are pre-transformed into a variety of brainspaces

You can explore our articles for more detailed information on what the package can do and how to set it up with the data stored on Google Drive, or take a quick tour here:

# Load package
library(hemibrainr)

# By default, data are expected on the mounted team drive, at this path
options("Gdrive_hemibrain_data")

# We can load all hemibrain neurons, with their meta data
db = hemibrain_neurons()

# The neurons are read from the drive when we plot or analyse them
hemibrain_view()
plot3d(hemibrain.surf, col = "grey", alpha = 0.1)
plot3d(db[1:10])

See which neurons have been matched up:

# See matches, you can do this without hemibrain Google Team Drive access
View(hemibrain_matched)

# Get fresh matches, you cannot do this without access
## You will be prompted to log-in through your browser
hemibrain_matched_new <- hemibrain_matches() 
## NOTE: includes hemibrain<->FlyWire matches!

neuPrint authentication

To use neuprintr, which fetches the data we want to use with hemibrainr, you will need to be able to log in to a neuPrint server and access its underlying Neo4j database.

You may need an authenticated account, or you may be able to register your @gmail address without an authentication process. Navigate to a neuPrint website, e.g. https://neuprint.janelia.org, and hit ‘login’. Sign in using an @gmail account. If you have authentication, or the server is public, you will now be able to see your access token by going to ‘Account’:

access your bearer token

To make life easier, you can then edit your .Renviron file to record the neuPrint server you want to talk to, your token, and the dataset hosted by that server that you want to read. A convenient way to do this is to run

usethis::edit_r_environ()

and then edit the file that pops up, adding a section like

neuprint_server="https://neuprint.janelia.org"
# nb this token is a dummy
neuprint_token="asBatEsiOIJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6ImIsImxldmVsIjoicmVhZHdyaXRlIiwiaW1hZ2UtdXJsIjoiaHR0cHM7Ly9saDQuZ29vZ2xldXNlcmNvbnRlbnQuY29tLy1QeFVrTFZtbHdmcy9BQUFBQUFBQUFBDD9BQUFBQUFBQUFBQS9BQ0hpM3JleFZMeEI4Nl9FT1asb0dyMnV0QjJBcFJSZlI6MTczMjc1MjU2HH0.jhh1nMDBPl5A1HYKcszXM518NZeAhZG9jKy3hzVOWEU"

Make sure you have a blank line at the end of your .Renviron file. For further information about neuprintr login, see the help for neuprint_login().

Finally, you can also log in from the command line once per session, like so:
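
After saving the file and restarting R, a quick check along these lines can confirm that the variables are visible to the session. This is a minimal sketch in base R; the helper name missing_env_vars is hypothetical and is not part of neuprintr or hemibrainr.

```r
# Hypothetical helper: return the names of environment variables
# that are unset or empty in the current R session
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# The variable names used in the .Renviron section above
missing <- missing_env_vars(c("neuprint_server", "neuprint_token"))
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
} else {
  message("neuPrint server and token are both set")
}
```

If a variable is reported as not set, check the spelling in .Renviron and restart R, since the file is only read at startup.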

conn = neuprintr::neuprint_login(server= "https://neuprint.janelia.org/",
   token= "asBatEsiOIJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6ImIsImxldmVsIjoicmVhZHdyaXRlIiwiaW1hZ2UtdXJsIjoiaHR0cHM7Ly9saDQuZ29vZ2xldXNlcmNvbnRlbnQuY29tLy1QeFVrTFZtbHdmcy9BQUFBQUFBQUFBDD9BQUFBQUFBQUFBQS9BQ0hpM3JleFZMeEI4Nl9FT1asb0dyMnV0QjJBcFJSZlI6MTczMjc1MjU2HH0.jhh1nMDBPl5A1HYKcszXM518NZeAhZG9jKy3hzVOWEU")

This is also the approach that you would take if you were working with more than one neuPrint server.

Connect to hemibrainr Google team drive

For this, you need access to the hemibrainr Google team drive. Authentication is through an email account. Once you have access, there are two basic ways to mount the data for use:

Option 1: mount your Google drives using Google filestream. For this to work you will need Google Workspace, Google’s monthly subscription offering for businesses and organizations. Once the Google filestream application is running, you should be able to see your drives mounted like an external hard drive, like so:

google_filestream

Then, this should work:

# Set a new Google drive, can be the team drive name or a path to the correct drive
hemibrainr_set_drive("hemibrainr") # No need to run this each time though, this is the default. Use if you want to use a different name drive.

# Now just get the name of your default team drive.
## This will be used to locate your team drive using the R package googledrive
hemibrainr_team_drive()

Option 2: this is free. You still need authenticated access to the hemibrainr Google team drive, which can then be mounted using rclone. First, download rclone for your operating system. You can also install it from your system’s command line (e.g. from terminal) and then configure it for the drive:

# unix/macosx
curl https://rclone.org/install.sh | sudo bash
rclone config

And now check this has worked:

# mounts in working directory
hemibrainr_rclone()

# Now hemibrain neurons are read from this mount
db = hemibrain_neurons() # read from the google drive
length(db)
plot3d(db[1:10])

# Specifically, from here
options("Gdrive_hemibrain_data")

# unmounts
hemibrainr_rclone_unmount()

# And now we are back to:
options("Gdrive_hemibrain_data")

For more detailed instructions, see this article.
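
Before calling hemibrainr_rclone(), it can help to confirm that the rclone executable is on the PATH that R sees. The following is a minimal base R sketch; has_rclone is a hypothetical helper name and not a hemibrainr function.

```r
# Hypothetical helper: TRUE if an executable called 'rclone' is found on the PATH
has_rclone <- function() {
  nzchar(Sys.which("rclone"))[[1]]
}

if (has_rclone()) {
  message("rclone found at: ", Sys.which("rclone"))
} else {
  message("rclone not found; install it and run 'rclone config' first")
}
```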

Example: ‘splitting’ neurons

Let’s get started with a useful function for splitting a neuron into its axon and dendrite:

# Choose neurons
## These neurons are some 'tough' examples from the hemibrain:v1.0.1
### They will split differently depending on the parameters you use.
tough = c("5813056323", "579912201", "5813015982", "973765182", "885788485",
"915451074", "5813032740", "1006854683", "5813013913", "5813020138",
"853726809", "916828438", "5813078494", "420956527", "486116439",
"573329873", "5813010494", "5813040095", "514396940", "665747387",
"793702856", "451644891", "482002701", "391631218", "390948259",
"390948580", "452677169", "511262901", "422311625", "451987038"
)

# Get neurons
neurons = neuprint_read_neurons(tough)

# Now make sure the neurons have a soma marked
## Some hemibrain neurons do not, as the soma was chopped off
neurons.checked = hemibrain_skeleton_check(neurons, meshes = hemibrain.rois)

# Split neuron
## These are the recommended parameters for hemibrain neurons
neurons.flow = flow_centrality(neurons.checked,
                               polypre = TRUE,
                               mode = "centrifugal",
                               split = "distance")

# Plot the split to check it
nat::nopen3d()
nlscan_split(neurons.flow, WithConnectors = TRUE)
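
To summarise a split numerically rather than visually, the compartment codes can be tabulated. The sketch below uses a toy node table rather than real data. It assumes that a split neuron's node table has a Label column using the codes listed under Data fields (1 = soma, 2 = axon, 3 = dendrite, 4 = primary.dendrite, 7 = primary.neurite). The names label_names and count_compartments are hypothetical.

```r
# Compartment codes as listed in the Data fields table
label_names <- c("1" = "soma", "2" = "axon", "3" = "dendrite",
                 "4" = "primary.dendrite", "7" = "primary.neurite")

# Hypothetical helper: count nodes per named compartment in a node table
count_compartments <- function(nodes) {
  compartment <- unname(label_names[as.character(nodes$Label)])
  compartment[is.na(compartment)] <- "unlabelled"  # any code not listed above
  table(compartment)
}

# Toy node table standing in for the swc/d data of one split neuron
toy <- data.frame(PointNo = 1:8, Label = c(1, 7, 7, 4, 2, 2, 3, 3))
counts <- count_compartments(toy)
print(counts)
```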

Data fields

Here are some of the most useful column entries across these data. Please let me know if I have missed something important:

| column | description | data entity |
|---|---|---|
| flywire.xyz | A cardinal point in raw FlyWire voxel space that defines a neuron | many |
| flywire.id | The unique 16 digit ID for the FlyWire neuron; changes when the neuron is edited | many |
| flywire.svid | The supervoxel ID that corresponds to flywire.xyz | flywire_meta |
| cell.type | If the neuron is matched to a hemibrain neuron, its cell type | flywire_meta |
| side | The hemisphere of the brain that the neuron’s soma, or else root-point, is thought to be on | flywire_meta |
| status | A manually assigned ‘status’ from a Drosophila Connectomics Group tracer, indicating the quality of the neuron | flywire_meta |
| skid | The neuron’s skeleton ID in CATMAID, if it exists. The CATMAID neuron may have manually annotated synapses | flywire_meta |
| FAFB.xyz | A cardinal point in FAFB14 voxel space that defines a neuron | flywire_meta |
| ItoLee_Hemilineage | A meaningful biological category, the developmental hemilineage to which we think the neuron belongs, in the nomenclature of Ito et al. 2013 | flywire_meta |
| ItoLee_Lineage | A meaningful biological category, the developmental lineage (comprises multiple hemilineages) to which we think the neuron belongs, in the nomenclature of Ito et al. 2013 | flywire_meta |
| Hartenstein_Hemilineage | A meaningful biological category, the developmental hemilineage to which we think the neuron belongs, in the nomenclature of Lovick/Wong et al. 2013; can be matched to similar names in the larval animal | flywire_meta |
| Hartenstein_Lineage | A meaningful biological category, the developmental lineage (comprises multiple hemilineages) to which we think the neuron belongs, in the nomenclature of Lovick/Wong et al. 2013; can be matched to similar names in the larval animal | flywire_meta |
| gsheet | A Drosophila Connectomics Group googlesheet on which the flywire.id is recorded | flywire_meta |
| hemibrain.match | A semi-manually matched hemibrain neuron’s bodyID | flywire_meta |
| hemibrain.match.quality | A manually assigned quality for the hemibrain match | flywire_meta |
| FAFB.hemisphere.match | A cardinal point for a neuron on the other hemisphere that has semi-manually been found to match the given neuron | flywire_meta |
| FAFB.hemisphere.match.quality | A manually assigned quality for the FAFB hemisphere match | flywire_meta |
| dataset | The dataset to which the given neuron belongs | hemibrain_matches |
| total.outputs | The total number of output links (presynapses) from the neuron | flywire_meta |
| axon.outputs | The total number of axonal output links (presynapses) from the neuron | flywire_meta |
| dend.outputs | The total number of dendritic output links (presynapses) from the neuron | flywire_meta |
| total.outputs.density | The total number of output links (presynapses) from the neuron, per micron of cable | flywire_meta |
| axon.outputs.density | The total number of axonal output links (presynapses) from the neuron, per micron of cable | flywire_meta |
| dend.outputs.density | The total number of dendritic output links (presynapses) from the neuron, per micron of cable | flywire_meta |
| total.inputs | The total number of input links (postsynapses) onto the neuron | flywire_meta |
| axon.inputs | The total number of axonal input links (postsynapses) onto the neuron | flywire_meta |
| dend.inputs | The total number of dendritic input links (postsynapses) onto the neuron | flywire_meta |
| total.inputs.density | The total number of input links (postsynapses) onto the neuron, per micron of cable | flywire_meta |
| axon.inputs.density | The total number of axonal input links (postsynapses) onto the neuron, per micron of cable | flywire_meta |
| dend.inputs.density | The total number of dendritic input links (postsynapses) onto the neuron, per micron of cable | flywire_meta |
| total.length | The total cable length, in microns, for the neuron | flywire_meta |
| axon.length | The axonal cable length, in microns, for the neuron | flywire_meta |
| dend.length | The dendritic cable length, in microns, for the neuron | flywire_meta |
| pd.length | The primary dendrite (linker) cable length, in microns, for the neuron | flywire_meta |
| pnt.length | The primary neurite (cell body fibre) cable length, in microns, for the neuron | flywire_meta |
| segregation_index | An entropy score for how segregated the neuron’s synapses are into axon and dendrite, see Schneider-Mizell et al. 2016, eLife | flywire_meta |
| root | The treenode ID (position in .swc file) of the neuron’s root | flywire_meta |
| nodes | The number of nodes in the neuron | flywire_meta |
| segments | The number of segments in the neuron | flywire_meta |
| branchpoints | The number of branchpoints in the neuron | flywire_meta |
| endpoints | The number of endpoints in the neuron | flywire_meta |
| nTrees | The number of trees in the neuron, should be 1 | flywire_meta |
| connectors | The number of synapses in the neuron | flywire_meta |
| postsynapse_side_index | The side of the neuron that receives the most input, calculated as: (no. postsynaptic links on right - no. on left)/total | flywire_meta |
| presynapse_side_index | The side of the neuron that makes the most output, calculated as: (no. presynaptic links on right - no. on left)/total | flywire_meta |
| axon_postsynapse_side_index | Same as postsynapse_side_index, but only for the axon | flywire_meta |
| axon_presynapse_side_index | Same as presynapse_side_index, but only for the axon | flywire_meta |
| dendrite_postsynapse_side_index | Same as postsynapse_side_index, but only for the dendrite | flywire_meta |
| dendrite_presynapse_side_index | Same as presynapse_side_index, but only for the dendrite | flywire_meta |
| top.nt | The most prominent predicted transmitter for the neuron’s presynapses (the mode, across all presynaptic links, of the best nt prediction for each synapse with a cleft_score above 50) | flywire_meta |
| offset | The index for the Buhmann synapse in the original .sql table | connectors |
| x,y,z | The position of the connection in FlyWire space | connectors |
| scores | The Buhmann prediction score for the synapse; definition uncertain | connectors |
| top.p | The probability of the top transmitter prediction for the presynaptic end of this connection | connectors |
| top.nt | The top transmitter prediction for the presynaptic end of this connection | connectors |
| gaba | The synister prediction score for gaba | connectors |
| glutamate | The synister prediction score for glutamate | connectors |
| acetylcholine | The synister prediction score for acetylcholine | connectors |
| octopamine | The synister prediction score for octopamine | connectors |
| serotonin | The synister prediction score for serotonin | connectors |
| dopamine | The synister prediction score for dopamine | connectors |
| prepost | Whether the synapse is pre- (0, i.e. output synapse) or post- (1, i.e. input synapse) | connectors |
| segmentid_pre | The segment (?) for the presynaptic side of the link | connectors |
| segmentid_post | The segment (?) for the postsynaptic side of the link | connectors |
| pre_svid | The FlyWire supervoxel ID for the presynaptic side of the link | connectors |
| post_svid | The FlyWire supervoxel ID for the postsynaptic side of the link | connectors |
| pre_id | The flywire.id for the presynaptic side of the link | connectors |
| post_id | The flywire.id for the postsynaptic side of the link | connectors |
| treenode_id | The treenode in the corresponding swc/d to which this synapse is best attached | connectors |
| strahler_order | The Strahler order of the branch on which the synapse is positioned | connectors |
| Label | The compartment of the neuron on which the synapse is positioned: 2 = axon, 3 = dendrite, 4 = primary.dendrite, 7 = primary.neurite, 1 = soma | connectors |
| PointNo | The ID for each point in the neuron | swc/d |
| X,Y,Z | The FlyWire coordinate of the point | swc/d |
| W | The width of the point, generally not used | swc/d |
| parent | The parent node, -1 means root | swc/d |
| post | The number of postsynapses attached to the node | swc/d |
| pre | The number of presynapses attached to the node | swc/d |
| flow.cent | The synaptic flow at this node, see Schneider-Mizell et al. 2016 | swc/d |
| strahler_order | The Strahler order of this node | swc/d |
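
As a worked example of the side index formula given for postsynapse_side_index and presynapse_side_index, the following base R sketch computes (right - left)/total from link counts. The function name side_index and the counts are illustrative only. The sketch assumes by default that the total is right plus left, which the table does not state explicitly, so the total can also be passed in directly.

```r
# Illustrative helper: side index = (links on right - links on left) / total
# n_total defaults to right + left (an assumption; supply it if defined otherwise)
side_index <- function(n_right, n_left, n_total = n_right + n_left) {
  if (n_total == 0) return(NA_real_)  # undefined when there are no links
  (n_right - n_left) / n_total
}

side_index(30, 10)  # 0.5: mostly right-sided
side_index(0, 25)   # -1: entirely left-sided
```

Values near 1 indicate links concentrated on the right, values near -1 on the left, and values near 0 an even split.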

Data

Acknowledging the tools

neuPrint comprises a set of tools for loading connectome data into a Neo4j database and analyzing it there. You can analyze and explore connectome data stored in Neo4j using the neuPrint ecosystem: neuPrintHTTP, neuPrintExplorer, Python API.

This package was created by Alexander Shakeel Bates and Gregory Jefferis. You can cite this package as:

citation(package = "hemibrainr")

Bates AS, Jefferis GSXE (2020). hemibrainr: Code for working with data from Janelia FlyEM’s hemibrain project. R package version 0.1.0. https://github.com/natverse/hemibrainr