Somas have been manually tagged by the FlyEM project. However, in some cases the roots are wrong, or the somas are outside of the volume. The Fly Connectome team at the University of Cambridge has manually tagged better root points for these neurons. Soma locations updated by this work are saved as hemibrain_somas in this package. A user can read somas from here, or get a bleeding-edge version from the Google Sheet. This information is used by hemibrain_read_neurons to re-root hemibrain neurons read from neuPrint.

hemibrain_somas

Format

An object of class data.frame with 22560 rows and 9 columns.

Value

A data.frame with the 'best case' root points for each large hemibrain neuron. This is typically the soma, where one has been tagged by the FlyEM project. However, for many neurons the soma has been truncated, and a leaf node on the cell body fiber has been chosen instead. The columns of this data.frame are:

  • "bodyid" - the unique body ID identifier for a neuron.

  • "position" - the node ID of the root point. This is for the skeleton associated with bodyid, as read from neuPrint.

  • "X" - x coordinate for the root point.

  • "Y" - y coordinate for the root point.

  • "Z" - z coordinate for the root point.

  • "soma.edit" - logical. If the soma is 300 voxels away from the soma/root of the naive neuPrint skeleton, we consider the soma to have been 'edited' for the better, i.e. these are the somas that were fixed by Cambridge Drosophila Connectomics using hemibrain_adjust_saved_somas.

  • "fixed" - notes whether there was a problem identifying the soma, e.g. the neuron is only a fragment, or only the tract to the soma is visible.

  • "cellBodyFiber" - the cell body fiber for a neuron, as read from neuPrint and annotated by a team under Kei Ito. Where the cell body fiber is unknown, possibly because it is an older neuron born in the embryo, coarse clustering was done to sort these neurons into loose groups. Here, "unknown_number" denotes the cluster.
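The columns above can be explored with ordinary data.frame operations. The following is a minimal sketch on a toy stand-in for hemibrain_somas: every value in it (body IDs, node IDs, coordinates, notes and cell body fiber names) is invented for illustration, and the column types, such as a character bodyid, are assumptions. With the package installed, the same subsetting should apply to the real hemibrain_somas object.

```r
# Toy stand-in for hemibrain_somas. All values are made up; only the
# column names follow the documentation above.
somas <- data.frame(
  bodyid = c("1001", "1002", "1003", "1004"),
  position = c(12L, 845L, 3L, 77L),
  X = c(15200, 22110, 9800, 30450),
  Y = c(31050, 18400, 27600, 12900),
  Z = c(20480, 9900, 14200, 25100),
  soma.edit = c(FALSE, TRUE, FALSE, TRUE),
  fixed = c(NA, NA, "fragment", NA),
  cellBodyFiber = c("ADL01", "unknown_12", "ADL01", "PDM03"),
  stringsAsFactors = FALSE
)

# Neurons whose root point was moved away from the naive skeleton root
edited <- somas[somas$soma.edit, ]

# Neurons without a cell body fiber annotation ("unknown_<number>" clusters)
no_cbf <- somas[grepl("^unknown_", somas$cellBodyFiber), ]

# Body IDs grouped by cell body fiber
by_cbf <- split(somas$bodyid, somas$cellBodyFiber)

# Root point coordinates for a single body ID
root_xyz <- as.matrix(somas[match("1002", somas$bodyid), c("X", "Y", "Z")])
```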

Details

An interactive root point fixing pipeline, hemibrain_adjust_saved_somas, was used to identify correct and incorrect soma positions within a group of neurons. Specifically, the DBSCAN algorithm is used. Neurons in the hemibrain data set are grouped by their cell body fiber (CBF); neurons without a CBF annotation are instead grouped into morphologically similar groups by an all-by-all NBLAST clustering. Within each group, DBSCAN clusters the soma positions, returning one or more clusters of actual soma positions and a set of ‘noise’ positions that do not fit any cluster. The user can then choose to manually fix the noise somas, an entire cluster, or a subset of clusters if multiple potential soma clusters are returned.
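To illustrate the clustering step, here is a minimal base-R sketch of DBSCAN applied to the soma positions of one group. It is not the code used by hemibrain_adjust_saved_somas: the function simple_dbscan, the parameter values eps and min_pts, and the toy coordinates are all hypothetical, chosen only to show how dense clusters of somas are separated from isolated 'noise' positions.

```r
# Minimal DBSCAN sketch (illustrative only, hypothetical helper).
# Returns an integer cluster label per row; 0 marks noise.
simple_dbscan <- function(xyz, eps, min_pts) {
  d <- as.matrix(dist(xyz))  # pairwise Euclidean distances
  n <- nrow(d)
  # Neighbours within eps of each point (each point counts itself)
  neighbours <- lapply(seq_len(n), function(i) which(d[i, ] <= eps))
  is_core <- lengths(neighbours) >= min_pts
  cluster <- integer(n)      # 0 = unassigned / noise
  current <- 0L
  for (i in seq_len(n)) {
    if (cluster[i] != 0L || !is_core[i]) next
    current <- current + 1L  # start a new cluster from an unassigned core point
    cluster[i] <- current
    queue <- neighbours[[i]]
    while (length(queue) > 0) {
      j <- queue[1]
      queue <- queue[-1]
      if (cluster[j] != 0L) next
      cluster[j] <- current
      # Only core points extend the cluster to their own neighbours
      if (is_core[j]) queue <- c(queue, neighbours[[j]])
    }
  }
  cluster
}

# Invented soma positions for one group: two tight clusters and one outlier
somas <- data.frame(
  bodyid = 1:10,
  X = c(100, 105, 98, 110, 102, 500, 505, 495, 510, 2000),
  Y = c(100, 95, 103, 108, 99, 500, 498, 507, 503, 50),
  Z = c(100, 102, 97, 104, 101, 500, 502, 499, 506, 900)
)
somas$cluster <- simple_dbscan(somas[, c("X", "Y", "Z")], eps = 50, min_pts = 3)

# Candidate somas to review manually: those labelled as noise
noise_ids <- somas$bodyid[somas$cluster == 0L]
```

In this toy group the first five and the next four positions form two clusters, and the last position is labelled as noise, which is the kind of soma a user would be asked to inspect.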

If an error has been made when manually labelling a soma position, the soma can be fixed using:

hemibrainr::hemibrain_adjust_saved_somas(bodyid = #####)

Select the neuron method when prompted, then follow the pipeline instructions provided. When asked whether you wish to try using DBSCAN on the neurons, say no. The pipeline will then try to suggest a soma position based on a global clustering of those neurons which are already identified as correct.