Accessing the chunked graph server
Source: vignettes/articles/accessing-graphene-server.Rmd
tl;dr
To set up programmatic access to FlyWire, call flywire_set_token() and follow the on-screen prompts.
Background
The main goal of connectomics is to trace (segment) the branching morphology of neurons in any 3D volume (typically based on electron microscopy) and to identify the connections (synapses) between them that together define a connectome.
Neuron segmentation can be either skeleton-based (in which only the centre line defining each part of the neuronal arbour is traced out) or voxel-based, where a filled 3D structure is defined by labelling every voxel with an integer id representing its membership of a particular object; this is typically then displayed as a 3D mesh surface. Although skeleton-based segmentation is more efficient for manual segmentation by humans, voxel-wise segmentation is more natural for machine learning algorithms. Furthermore, it generates a richer representation of the neuron (capturing more of its structural features) and, crucially, also resolves ambiguities about object identity. For example, when two users manually trace a neuron skeleton, they will place nodes in different positions, and it is not computationally trivial to determine whether those nodes are part of the same object or how to join independently traced pieces together. In contrast, in a voxel-wise segmentation every xyz location in the image is explicitly defined to be part of a specified object.
In a voxel-wise segmentation, each neuron can be considered as a collection of voxels grouped together as connected components. Machine learning based algorithms try to identify such connected components (e.g. using flood-filling networks). However, machines are not perfect and many segments are wrongly connected. Hence we need a step where humans proofread the segments and alter the connections. These connected components can be readily represented as vertices and the connections between them as edges; the human proofreader essentially modifies these edges in the graph structure. Note that this graph representation of object segmentation is distinct from the binary tree that represents the neuronal morphology or the connection graph of neurons within a neural network.
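As a rough illustration of this idea (a minimal sketch, not the FlyWire or PyChunkedGraph implementation), the snippet below treats segmented fragments as vertices and treats each object as a connected component of the graph. The fragment names and the `connected_components` helper are hypothetical and exist only for this example. Proofreading then amounts to removing a false edge (a split) or adding a missing one (a merge).

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the set of connected components; each one stands for an object."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = set()
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.add(frozenset(component))
    return components


# Hypothetical fragments; suppose the automatic segmentation wrongly
# joined "c" to "b" and missed the true connection between "c" and "d".
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"), ("b", "c")}
before = connected_components(nodes, edges)

# Proofreading: a split removes the false edge, a merge adds the missing one.
proofread_edges = (edges - {("b", "c")}) | {("c", "d")}
after = connected_components(nodes, proofread_edges)
```

Only the edge set changes during proofreading; the vertices themselves are never altered.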
This graph based representation of segmentation also has advantages for synchronous editing, which can be a major technical issue if many users are to generate or proof-read segmentation at the same time. The CATMAID web application (for skeleton based tracing) approaches this by allowing multiple users to edit all neurons using a shared database. This implementation reduces conflicts by the simple expedient of allowing users to see each other’s work in real time. However, an explicit merge step is needed when conflicting annotations are created (usually because editing has happened in different databases), and this can be computationally complex. The Seung Lab has developed a data structure called the chunked graph that lets users modify 3D voxel-wise segmentations in real time and allows access to earlier versions of the segmentation. A brief overview of the system is published on Medium.
In practice, an initial over-segmentation is generated, consisting of many “supervoxels”. These supervoxels, which are small collections of perhaps tens to thousands of individual voxels, are effectively immutable but are almost certain to belong to the same object. It is these supervoxels which are the leaves of the graph database representation of segmented objects. The use of supervoxels reduces the size of the graph database compared with the situation in which every individual voxel has to be represented. So long as human proof-reading is based on the same base segmentation, there are strategies to merge proof-reading even if this happens asynchronously. However, merging proof-reading based on different base segmentations defined by different underlying supervoxels is still complex; a similar challenge exists if it is necessary to map proof-reading based on one base segmentation to a newer (presumably improved) base segmentation.
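The two-level idea can be sketched as a pair of lookup tables: a fixed voxel-to-supervoxel table (the base segmentation) and an editable supervoxel-to-object table (the proofreading state). This is a deliberately simplified illustration under stated assumptions. All ids and the `root_at` and `merge` helpers are hypothetical, and the real chunked graph has more levels and assigns object ids differently.

```python
# Immutable base segmentation: voxel (x, y, z) -> supervoxel id.
voxel_to_supervoxel = {
    (0, 0, 0): 101, (1, 0, 0): 101,
    (2, 0, 0): 102, (3, 0, 0): 102,
    (4, 0, 0): 103,
}

# Editable proofreading state: supervoxel id -> object (root) id.
supervoxel_to_root = {101: 1, 102: 1, 103: 2}


def root_at(xyz, roots):
    """Look up the object id at a location by going through its supervoxel."""
    return roots[voxel_to_supervoxel[xyz]]


def merge(roots, keep, absorb):
    """Return a new mapping in which object `absorb` is folded into `keep`."""
    return {sv: (keep if root == absorb else root) for sv, root in roots.items()}


# A merge touches 3 supervoxel entries rather than 5 voxel entries, and the
# previous mapping is left intact, so the earlier version stays available.
edited = merge(supervoxel_to_root, keep=1, absorb=2)
```

Because edits only rewrite the small supervoxel-level table, two sets of edits made against the same base segmentation refer to the same supervoxel ids and can be compared directly.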
Goal
One of the goals of the fafbseg package is to provide access to, and mapping between, different 3D segmentations of the FAFB dataset. Currently available segmentations include FlyWire (based on 3D CNNs: U-Net) and GoogleBrain (based on recursive CNNs: flood-filling networks). A further goal is to enable location transfer and comparison with CATMAID (skeleton based) manual reconstruction. To achieve this, one needs access to the 3D segmentations, which are stored in a graph database, Graphene/PyChunkedGraph. Its developers also provide a Python tool called CloudVolume, which gives programmatic access to the graph database.
ChunkedGraph token
Simple Instructions
You gain access to the Graphene/ChunkedGraph server by using a token, which is granted to you after you authenticate via a Google account. The simplest way to do this is to call flywire_set_token(). This will offer to open your browser, generate a new token and save it in a standard location. Note that this will invalidate your previous token!
Details
The manual steps that flywire_set_token() tries to automate are as follows:
- Visit https://globalv1.flywire-daf.com/auth/api/v1/refresh_token
- Copy the token present there; let’s say it was "xxyyzz"
- Create a file named cave-secret.json at location ~/.cloudvolume/secrets/cave-secret.json. For example, in Terminal on macOS:
    mkdir -p ~/.cloudvolume/secrets/
    touch ~/.cloudvolume/secrets/cave-secret.json
    open -e ~/.cloudvolume/secrets/cave-secret.json
- Paste in the following contents, substituting your own token: {"token": "xxyyzz"}
- Save the file
- Now you have access to the chunked graph server
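The manual steps above could also be scripted. The following is a minimal sketch in Python (the language of CloudVolume), not what flywire_set_token() does internally. The `write_cave_secret` helper and its `home` argument are hypothetical, and the file layout (a JSON object with a single "token" key) is an assumption based on the secret format CloudVolume reads.

```python
import json
from pathlib import Path


def write_cave_secret(token, home=None):
    """Write `token` to <home>/.cloudvolume/secrets/cave-secret.json."""
    home = Path(home) if home is not None else Path.home()
    secret_dir = home / ".cloudvolume" / "secrets"
    # Equivalent of `mkdir -p`: create missing parents, ignore existing dirs.
    secret_dir.mkdir(parents=True, exist_ok=True)

    secret_file = secret_dir / "cave-secret.json"
    # Assumed layout: a JSON object with a single "token" key.
    secret_file.write_text(json.dumps({"token": token}, indent=2) + "\n")
    # Restrict the file to the current user, since the token is a credential.
    secret_file.chmod(0o600)
    return secret_file


if __name__ == "__main__":
    # "xxyyzz" is the placeholder used above; substitute the token you copied.
    print(write_cave_secret("xxyyzz"))
```

Passing an explicit `home` makes it possible to try the function in a scratch directory before overwriting an existing secret file.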