Flood-Filling Networks (FFNs) are a class of neural networks designed for instance segmentation of complex and large shapes, particularly in volume EM datasets of brain tissue.
For more details, see the related publications:
This is not an official Google product.
No installation is required. To install the necessary dependencies, run:
pip install -r requirements.txt
The code has been tested on an Ubuntu 16.04.3 LTS system equipped with a Tesla P100 GPU.
FFN networks can be trained with the train.py script, which expects a TFRecord file of coordinates at which to sample data from input volumes.
There are two scripts to generate training coordinate files for a labeled dataset stored in HDF5 files: compute_partitions.py and build_coordinates.py.
compute_partitions.py transforms the label volume into an intermediate volume where the value of every voxel A corresponds to the quantized fraction of voxels labeled identically to A within a subvolume of radius lom_radius centered at A. lom_radius should normally be set to (fov_size // 2) + deltas (where fov_size and deltas are FFN model settings). Every such quantized fraction is called a partition.
Sample invocation:
python compute_partitions.py \
--input_volume third_party/neuroproof_examples/validation_sample/groundtruth.h5:stack \
--output_volume third_party/neuroproof_examples/validation_sample/af.h5:af \
--thresholds 0.025,0.05,0.075,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 \
--lom_radius 24,24,24 \
--min_size 10000
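As a quick check of the rule above, the following sketch computes lom_radius from the model settings used in the training example of this README (fov_size 33, deltas 8) and shows one plausible way a fraction could be mapped to a bin using the thresholds list. The helper names are hypothetical, and the binning via np.digitize is an assumption; compute_partitions.py may quantize differently.

```python
import numpy as np

THRESHOLDS = [0.025, 0.05, 0.075, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def suggested_lom_radius(fov_size, deltas):
    """Return (fov_size // 2) + deltas for each axis."""
    return [f // 2 + d for f, d in zip(fov_size, deltas)]


def quantize_fraction(fraction, thresholds=THRESHOLDS):
    """Map a fraction in [0, 1] to a bin index (assumed scheme, not the script's)."""
    return int(np.digitize(fraction, thresholds))


lom_radius = suggested_lom_radius([33, 33, 33], [8, 8, 8])
print(lom_radius)  # [24, 24, 24]
```

The result matches the --lom_radius 24,24,24 value used in the sample invocation.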
build_coordinates.py uses the partition volume from the previous step to produce a TFRecord file of coordinates in which every partition is represented approximately equally frequently. Sample invocation:
python build_coordinates.py \
--partition_volumes validation1:third_party/neuroproof_examples/validation_sample/af.h5:af \
--coordinate_output third_party/neuroproof_examples/validation_sample/tf_record_file \
--margin 24,24,24
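To illustrate what "represented approximately equally frequently" can mean, here is a minimal sketch of balanced sampling with inverse-frequency weights. It is not how build_coordinates.py is implemented; the function name and the toy data are hypothetical.

```python
import numpy as np


def balanced_sample(partition_ids, n_samples, seed=0):
    """Draw indices so that every partition id is picked about equally often."""
    rng = np.random.default_rng(seed)
    partition_ids = np.asarray(partition_ids)
    ids, counts = np.unique(partition_ids, return_counts=True)
    count_of = dict(zip(ids.tolist(), counts.tolist()))
    # Weight each coordinate by the inverse size of its partition.
    weights = np.array([1.0 / count_of[p] for p in partition_ids.tolist()])
    weights /= weights.sum()
    return rng.choice(len(partition_ids), size=n_samples, replace=True, p=weights)


# Toy data: partition 1 is nine times more common than partition 2.
partition_ids = np.array([1] * 900 + [2] * 100)
picks = balanced_sample(partition_ids, 10000)
share_rare = float(np.mean(partition_ids[picks] == 2))
print(share_rare)  # close to 0.5 despite the 9:1 imbalance
```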
We provide a sample coordinate file for the FIB-25 validation1 volume included in third_party. Due to its size, that file is hosted in Google Cloud Storage. If you haven't used it before, you will need to install the Google Cloud SDK and set it up with:
gcloud auth application-default login
You will also need to create a local copy of the labels and image with:
gsutil rsync -r -x ".*.gz" gs://ffn-flyem-fib25/ third_party/neuroproof_examples
Once the coordinate files are ready, you can start training the FFN with:
python train.py \
--train_coords gs://ffn-flyem-fib25/validation_sample/fib_flyem_validation1_label_lom24_24_24_part14_wbbox_coords-*-of-00025.gz \
--data_volumes validation1:third_party/neuroproof_examples/validation_sample/grayscale_maps.h5:raw \
--label_volumes validation1:third_party/neuroproof_examples/validation_sample/groundtruth.h5:stack \
--model_name convstack_3d.ConvStack3DFFNModel \
--model_args "{\"depth\": 12, \"fov_size\": [33, 33, 33], \"deltas\": [8, 8, 8]}" \
--image_mean 128 \
--image_stddev 33
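The escaped quotes in --model_args are easy to get wrong by hand. A small sketch, assuming the flag value is parsed as JSON (as the sample value suggests), builds the string from a Python dict and quotes it for the shell:

```python
import json
import shlex

model_args = {"depth": 12, "fov_size": [33, 33, 33], "deltas": [8, 8, 8]}

# Serialize once, then let shlex handle shell quoting.
encoded = json.dumps(model_args)
flag = "--model_args " + shlex.quote(encoded)
print(flag)
```

The printed flag can be pasted into the train.py command in place of the hand-escaped version.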
Note that both training and inference with the provided model are
computationally expensive processes. We recommend a GPU-equipped machine
for best results, particularly when using the FFN interactively in a Jupyter
notebook. Training the FFN as configured above requires a GPU with 12 GB of RAM.
You can reduce the batch size, model depth, fov_size, or number of features in the convolutional layers to reduce the memory usage.
The training script is not configured for multi-GPU or distributed training. For instructions on how to set this up, see the documentation on Distributed TensorFlow.
We provide two examples of how to run inference with a trained FFN model.
For a non-interactive setting, you can use the run_inference.py script:
python run_inference.py \
--inference_request="$(cat configs/inference_training_sample2.pbtxt)" \
--bounding_box 'start { x:0 y:0 z:0 } size { x:250 y:250 z:250 }'
which will segment the training_sample2 volume and save the results in the results/fib25/training2 directory. Two files will be produced: seg-0_0_0.npz and seg-0_0_0.prob. Both are in the npz format and contain a segmentation map and quantized probability maps, respectively.
In Python, you can load the segmentation as follows:
from ffn.inference import storage
seg, _ = storage.load_segmentation('results/fib25/training2', (0, 0, 0))
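Since both output files are in the npz format, they can also be inspected with plain NumPy. The helper below is a hypothetical sketch that lists the arrays in an archive without assuming any particular array names.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for the arrays stored in an .npz file."""
    summary = {}
    with np.load(path) as archive:
        for name in archive.files:
            try:
                arr = archive[name]
                summary[name] = (arr.shape, str(arr.dtype))
            except ValueError:
                # Object arrays need allow_pickle=True; enable it only for trusted files.
                summary[name] = "pickled object (not loaded)"
    return summary
```

For example, describe_npz('results/fib25/training2/seg-0_0_0.npz') would report the shape and dtype of each stored array, assuming the file exists at that path.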
We provide sample segmentation results in results/fib25/sample-training2.npz.
For the training2 volume, segmentation takes ~7 min with a P100 GPU.
For an interactive setting, check out ffn_inference_colab_demo.ipynb.
This Colab notebook shows how to segment a single object with an explicitly defined
seed and visualize the results while inference is running.
Both examples are configured to use a 3d convstack FFN model trained on the validation1 volume of the FIB-25 dataset from the FlyEM project at Janelia.
Please see doc/manual.md.
ffn's Issues
TypeError: string indices must be integers
The statement
import ffn
prices = ffn.get('^IBEX', start='2010-12-30', end='2019-10-29')
prices
returns the error
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_2612/2489336543.py in
1 import ffn
2
----> 3 prices = ffn.get('aapl,msft', start='2010-01-01')
~/anaconda3/envs/enri/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
230 if not kwsyntax:
231 args, kw = fix(args, kw, sig)
--> 232 return caller(func, *(extras + args), **kw)
233 fun.__name__ = func.__name__
234 fun.__doc__ = func.__doc__
~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/utils.py in _memoize(func, *args, **kw)
32 return cache[key]
33 else:
---> 34 cache[key] = result = func(*args, **kw)
35 return result
36
~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/data.py in get(tickers, provider, common_dates, forward_fill, clean_tickers, column_names, ticker_field_sep, mrefresh, existing, **kwargs)
74 # call provider - check if supports memoization
75 if hasattr(provider, "mcache"):
---> 76 data[ticker] = provider(ticker=t, field=f, mrefresh=mrefresh, **kwargs)
77 else:
78 data[ticker] = provider(ticker=t, field=f, **kwargs)
~/anaconda3/envs/enri/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
230 if not kwsyntax:
231 args, kw = fix(args, kw, sig)
--> 232 return caller(func, *(extras + args), **kw)
233 fun.__name__ = func.__name__
234 fun.__doc__ = func.__doc__
~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/utils.py in _memoize(func, *args, **kw)
32 return cache[key]
33 else:
---> 34 cache[key] = result = func(*args, **kw)
35 return result
36
~/anaconda3/envs/enri/lib/python3.9/site-packages/ffn/data.py in yf(ticker, field, start, end, mrefresh)
138 field = "Adj Close"
139
--> 140 tmp = pdata.get_data_yahoo(ticker, start=start, end=end)
141
142 if tmp is None:
~/anaconda3/envs/enri/lib/python3.9/site-packages/pandas_datareader/data.py in get_data_yahoo(*args, **kwargs)
78
79 def get_data_yahoo(*args, **kwargs):
---> 80 return YahooDailyReader(*args, **kwargs).read()
81
82
~/anaconda3/envs/enri/lib/python3.9/site-packages/pandas_datareader/base.py in read(self)
251 # If a single symbol, (e.g., 'GOOG')
252 if isinstance(self.symbols, (string_types, int)):
--> 253 df = self._read_one_data(self.url, params=self._get_params(self.symbols))
254 # Or multiple symbols, (e.g., ['GOOG', 'AAPL', 'MSFT'])
255 elif isinstance(self.symbols, DataFrame):
~/anaconda3/envs/enri/lib/python3.9/site-packages/pandas_datareader/yahoo/daily.py in _read_one_data(self, url, params)
151 try:
152 j = json.loads(re.search(ptrn, resp.text, re.DOTALL).group(1))
--> 153 data = j["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
154 except KeyError:
155 msg = "No data fetched for symbol {} using {}"
TypeError: string indices must be integers
What could be the cause? I would appreciate any help.
Share trained weights of SNEMI3D dataset?
The SNEMI3D website appears to be down, and I am trying to do a quick test using FFN.
Could anyone share weights trained on this dataset? Thanks!
AxisError with numpy 1.18
I got the following error when I tried to run FFN training with numpy 1.18:
numpy.AxisError: axis 4 is out of bounds for array of dimension 4
Looks like it is due to a recent change in numpy.expand_dims. See https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html.
Here is a patch to fix the error:
diff --git a/ffn/training/inputs.py b/ffn/training/inputs.py
index d1b5c31..b9a9c7e 100644
--- a/ffn/training/inputs.py
+++ b/ffn/training/inputs.py
@@ -152,7 +152,7 @@ def load_from_numpylike(coordinates, volume_names, shape, volume_map,
if data.ndim == 4:
data = np.rollaxis(data, 0, start=4)
else:
- data = np.expand_dims(data, 4)
+ data = np.expand_dims(data, data.ndim)
# Add flat batch dim and return.
data = np.expand_dims(data, 0)
--
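The effect of the patch can be checked in isolation. For a 3-D input, np.expand_dims accepts insertion positions 0 through 3, so data.ndim appends a trailing axis, whereas the hard-coded 4 is out of bounds for the 4-D result. The array below is a stand-in, not FFN data.

```python
import numpy as np

data = np.zeros((5, 6, 7), dtype=np.float32)  # stand-in 3-D volume

# Append a trailing channel axis; valid positions for a 3-D input are 0..3.
with_channel = np.expand_dims(data, data.ndim)
print(with_channel.shape)  # (5, 6, 7, 1)

# axis=-1 is an equivalent way to append the last axis.
same = np.expand_dims(data, -1)
```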
Issue with build_coordinates.py
Hello, I installed ffn and downloaded the sample data without issue. I am trying to train the sample model and got an error.
I ran this:
(ffn) user:~/ffn-master$ runffn.sh
#!/bin/bash
python compute_partitions.py \
--input_volume ~/third_party/neuroproof_examples/validation_sample/groundtruth.h5:stack \
--output_volume ~/third_party/neuroproof_examples/validation_sample/af.h5:af \
--thresholds 0.025,0.05,0.075,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 \
--lom_radius 24,24,24 \
--min_size 10000
python build_coordinates.py \
--partition_volumes ~/third_party/neuroproof_examples/validation_sample/af.h5:af \
--coordinate_output ~/third_party/neuroproof_examples/validation_sample/tf_record_file \
--margin 24,24,24
compute_partitions.py runs fine, but with build_coordinates I get this:
Traceback (most recent call last):
File "build_coordinates.py", line 110, in
app.run(main)
File "/home/ncmir-lab/.conda/envs/ffn/lib/python3.6/site-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/home/ncmir-lab/.conda/envs/ffn/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "build_coordinates.py", line 62, in main
name, path, dataset = partvol.split(':')
ValueError: not enough values to unpack (expected 3, got 2)
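The traceback points at a three-way split on ':'. The README sample passes --partition_volumes as validation1:path:dataset, while the command above omits the leading volume name, which leaves only two fields. A minimal sketch of that parsing with a more readable error (parse_volume_spec is a hypothetical helper, not part of the repository):

```python
def parse_volume_spec(spec):
    """Split 'name:path:dataset' and fail with a readable message otherwise."""
    parts = spec.split(':')
    if len(parts) != 3:
        raise ValueError(
            "expected 'name:path:dataset', got %r (%d field(s))" % (spec, len(parts)))
    name, path, dataset = parts
    return name, path, dataset


# Three fields parse cleanly; the volume name comes first.
print(parse_volume_spec('validation1:validation_sample/af.h5:af'))
```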
Question about blank areas in inference labels
Hi,
I have been working on FFN training for a couple of weeks, but the inference results are still not ideal. Large neurons are detected properly, while many tiny neurons are left blank. Is there any method to fill in these small blank areas?
I would appreciate any suggestions and insights.
Combine multiple sess.run into one?
Undefined names: 'sampling' and '_required'
flake8 testing of https://github.com/google/ffn on Python 3.7.1
$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics
./ffn/inference/resegmentation_analysis.py:256:7: F821 undefined name 'sampling'
sampling, result.eval.from_a)
^
./ffn/inference/resegmentation_analysis.py:259:7: F821 undefined name 'sampling'
sampling, result.eval.from_b)
^
./ffn/utils/bounding_box.py:374:17: F821 undefined name '_required'
yield _required(self.start_to_box((x, y, z)))
^
./ffn/utils/bounding_box.py:395:11: F821 undefined name '_required'
_required(self.index_to_sub_box(i)) for i in range(i_begin, i_end))
^
4 F821 undefined name 'sampling'
4
E901, E999, F821, F822, and F823 are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. These 5 are different from most other flake8 issues, which are merely "style violations": useful for readability, but they do not affect runtime safety.
- F821: undefined name name
- F822: undefined name name in __all__
- F823: local variable name referenced before assignment
- E901: SyntaxError or IndentationError
- E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree
Realignment and Irregular section substitution
In the paper "Automated Reconstruction of a Serial-Section EM Drosophila Brain with Flood-Filling Networks and Local Realignment", a local realignment step and an irregular section substitution step are mentioned as running before the FFN. Is that code available somewhere? I couldn't find it in this repository.
build_coordinates generating TFRecords so slowly
Why is 'build_coordinates' generating TFRecords so slowly? It takes 4 hours to create TFRecords for 500x500x200 images and 3 days for 2000x2000x200. Moreover, the performance utilization is very low; the server with an A100 GPU and the personal computer with an A2000 GPU have nearly identical speeds.
Training model does not use GPU
Ubuntu16.04
tensorflow-gpu 1.12
cuda 9.0
cudnn 7.4
TF can only detect CPU:
/job:localhost/replica:0/task:0/device:CPU:0
Training model does not use GPU.
syntax error in _required definition
Hi, I am suddenly getting a syntax error when I'm running an FFN inference script with the latest build. I think it originated from this commit: 3608a17
Here is the traceback:
[2019-11-02 16:06:06,460] {docker_operator.py:244} INFO - Traceback (most recent call last):
File "run_inference.py", line 31, in <module>
from ffn.inference import inference
File "/ffn/ffn/inference/inference.py", line 38, in <module>
[2019-11-02 16:06:06,460] {docker_operator.py:244} INFO - from . import align
File "/ffn/ffn/inference/align.py", line 22, in <module>
from ..utils import bounding_box
File "/ffn/ffn/utils/bounding_box.py", line 192
def _required(bbox: Optional[BoundingBox]) -> BoundingBox:
^
SyntaxError: invalid syntax
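Function annotations such as bbox: Optional[BoundingBox] are Python 3 syntax, so a SyntaxError at that position typically means the script was run with a Python 2 interpreter. A small guard like the sketch below (a hypothetical helper, not part of the repository) makes the requirement explicit before any annotated module is imported.

```python
import sys


def supports_annotations(version_info=sys.version_info):
    """Function annotations parse only on Python 3 and later."""
    return tuple(version_info[:2]) >= (3, 0)


if not supports_annotations():
    raise SystemExit("Python 3 is required; found %d.%d" % tuple(sys.version_info[:2]))
```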
TensorFlow record files are corrupted
I'm trying to train an FFN, and the first 2 steps (computing partitions and building the coordinate file) seem to go fine, but training throws Key Value errors. On further inspection (using TFRecord Viewer), I get this error:
tensorflow.python.framework.errors_impl.DataLossError: corrupted record at 0
Any help would be super appreciated!
TF version 1.13.2, and here are the exact calls I'm making:
For computing partitions:
python ../../ffn/compute_partitions.py \
--input_volume ../training_data_img.h5:label \
--output_volume training_data2.h5:af \
--thresholds 0.025,0.05,0.075,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9 \
--lom_radius 3,3,3 \
--min_size 10000
For building the TFRecord file:
python ../../ffn/build_coordinates.py \
--partition_volumes validation:training_data2.h5:af \
--coordinate_output tf_record_fileyw \
--margin 3,3,3
and for training:
python ../../ffn/train.py \
--train_coords tf_record_fileyw \
--data_volumes validation1:../training_data_img.h5:image \
--label_volumes validation1:../training_data_img.h5:label \
--model_name convstack_3d.ConvStack3DFFNModel \
--model_args "{\"depth\": 2, \"fov_size\": [2, 2, 2], \"deltas\": [2, 2, 2]}" \
--image_mean 72 \
--image_stddev 33
Multi-GPU utilization
I am looking for a way to correctly use multiple GPUs for training and inference.
I am currently using multiple GPUs to run inference on separate parts of a large volume. The labels generated on each GPU turn out to be independent, and it is still unclear to me how to combine them into one large segmentation.
I would really appreciate any help or insights.
What about consensus and agglomeration steps?
Hello,
Can someone please explain the code corresponding to the consensus and agglomeration steps (as in the nature paper) that follow inference?
https://github.com/google/ffn/blob/master/ffn/inference/consensus.py and https://github.com/google/ffn/blob/master/ffn/inference/resegmentation.py seem to contain routines associated with these two steps. However, there are no instructions on how to use these modules. Has anyone succeeded in running these two steps to produce the so-called ffn-c output?
Thanks,
Evaluating the segmentation
In the paper "High-Precision Automated Reconstruction of Neurons with Flood-filling Networks", the result is evaluated through edge accuracy and expected run length. I noticed that the docs mention that the segmentation evaluation code is currently not part of the FFN repository. Is that code available somewhere?
Train error, maybe something wrong in bounding_box.py
ffn_inference_colab_demo issue
IndexError Traceback (most recent call last)
in <cell line: 6>()
4 else:
5 vis_update = 1
----> 6 canvas.segment_at((125, 125, 125), dynamic_image=inference.DynamicImage(),vis_update_every=vis_update)
1 frames
/content/ffn/ffn/inference/inference.py in update_at(self, pos, start_pos)
411 end = start + self._input_seed_size
412 logit_seed = np.array(
--> 413 self.seed[[slice(s, e) for s, e in zip(start, end)]])
414 init_prediction = np.isnan(logit_seed)
415 logit_seed[init_prediction] = np.float32(self.options.pad_value)
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
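The failing line indexes an array with a list of slice objects, which recent NumPy releases reject. Converting the list to a tuple selects the intended sub-block. The sketch below reproduces the pattern with stand-in arrays; seed, start, and end here are illustrative values, not the canvas state.

```python
import numpy as np

seed = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)  # stand-in for self.seed
start = np.array([1, 1, 1])
end = start + 2

# A tuple of slices selects a sub-block; a list of slices is not a valid index.
selector = tuple(slice(s, e) for s, e in zip(start, end))
logit_seed = np.array(seed[selector])
print(logit_seed.shape)  # (2, 2, 2)
```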
Parameter Optimisation Recommendations
Hi!
I am running your FFN model on some data from the Allen Brain Atlas. The dataset is 100x100x100 pixels.
See the data here: https://drive.google.com/drive/folders/1TnhPA7sqIJj_KKC1bHS4zOrOdKKHY0vm?usp=sharing
Do you have any recommendations for which parameters in the model I should change in the FFN code to accommodate the differences of my dataset compared to the example EM dataset (such as the smaller size)?
Thank you!
Distinct samples of training data
Is it possible to train the ffn with several distinct pieces of data with each having its own bounding box?