
tf_deformable_net's Issues

After changing the gcc version

After installing gcc 4.9 the old issue was solved, but now I run into a new problem:

gpu2@gpu2-PowerEdge-R730:~/OWFO/TF_Deformable_Net$ python ./faster_rcnn/demo.py --model tf_deformable_net/restore_output/Resnet50_iter_145000.ckpt
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
filename: /home/gpu2/OWFO/TF_Deformable_Net/lib/psroi_pooling_layer/psroi_pooling.so


/home/gpu2/OWFO/TF_Deformable_Net/lib/psroi_pooling_layer/psroi_pooling.so
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
E tensorflow/stream_executor/cuda/cuda_driver.cc:509] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:145] kernel driver does not appear to be running on this host (gpu2-PowerEdge-R730): /proc/driver/nvidia/version does not exist
Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("pool1:0", shape=(?, ?, ?, 64), dtype=float32)
Tensor("bn2a_branch1/batchnorm/add_1:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("bn2a_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("res2a_relu:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("bn2b_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("res2b_relu:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("bn2c_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("res2c_relu:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("bn3a_branch1/batchnorm/add_1:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("bn3a_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res3a_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("bn3b_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res3b_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("bn3c_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res3c_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("bn3d_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res3d_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("bn4a_branch1/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("bn4a_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4a_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("bn4b_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4b_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("bn4c_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4c_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("bn4d_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4d_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("bn4e_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4e_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("bn4f_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4f_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("rpn_conv/3x3/Relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/BiasAdd:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("Reshape_2:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_bbox_pred/BiasAdd:0", shape=(?, ?, ?, 36), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
Tensor("res4f_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res4f_relu:0", shape=(?, ?, ?, 1024), dtype=float32)
Tensor("res5a_branch2a_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res5a_branch2b_offset/BiasAdd:0", shape=(?, ?, ?, 72), dtype=float32)
Tensor("transpose:0", shape=(?, 512, ?, ?), dtype=float32) Tensor("res5a_branch2b/weights/read:0", shape=(512, 512, 3, 3), dtype=float32) Tensor("transpose_1:0", shape=(?, 72, ?, ?), dtype=float32)
Tensor("bn5a_branch1/batchnorm/add_1:0", shape=(?, ?, ?, 2048), dtype=float32)
Tensor("bn5a_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 2048), dtype=float32)
Tensor("res5b_branch2a_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res5b_branch2b_offset/BiasAdd:0", shape=(?, ?, ?, 72), dtype=float32)
Tensor("transpose_2:0", shape=(?, 512, ?, ?), dtype=float32) Tensor("res5b_branch2b/weights/read:0", shape=(512, 512, 3, 3), dtype=float32) Tensor("transpose_3:0", shape=(?, 72, ?, ?), dtype=float32)
Tensor("res5a_relu:0", shape=(?, ?, ?, 2048), dtype=float32)
Tensor("bn5b_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 2048), dtype=float32)
Tensor("res5c_branch2a_relu:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("res5c_branch2b_offset/BiasAdd:0", shape=(?, ?, ?, 72), dtype=float32)
Tensor("transpose_4:0", shape=(?, 512, ?, ?), dtype=float32) Tensor("res5c_branch2b/weights/read:0", shape=(512, 512, 3, 3), dtype=float32) Tensor("transpose_5:0", shape=(?, 72, ?, ?), dtype=float32)
Tensor("res5b_relu:0", shape=(?, ?, ?, 2048), dtype=float32)
Tensor("bn5c_branch2c/batchnorm/add_1:0", shape=(?, ?, ?, 2048), dtype=float32)
Tensor("conv_new_1_relu:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
Tensor("conv_new_1_relu:0", shape=(?, ?, ?, 256), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
Tensor("offset_reshape:0", shape=(?, 2, 7, 7), dtype=float32)
Tensor("fc_new_2/fc_new_2:0", shape=(?, 1024), dtype=float32)
Tensor("fc_new_2/fc_new_2:0", shape=(?, 1024), dtype=float32)
Loading network Resnet50_test... Traceback (most recent call last):
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
return fn(*args)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1000, in _run_fn
self._extend_graph()
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1049, in _extend_graph
self._session, graph_def.SerializeToString(), status)
File "/home/gpu2/anaconda3/lib/python3.6/contextlib.py", line 89, in exit
next(self.gen)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))

tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'DeformConvOp' with these attrs. Registered devices: [CPU], Registered kernels:

device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_FLOAT]

 [[Node: res5a_branch2b/DeformConvOp = DeformConvOp[T=DT_FLOAT, data_format="NCHW", deformable_group=4, num_groups=1, padding="SAME", rates=[1, 1, 2, 2], strides=[1, 1, 1, 1]](transpose, res5a_branch2b/weights/read, transpose_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./faster_rcnn/demo.py", line 136, in
saver.restore(sess,ckpt)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1439, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 965, in _run
feed_dict_string, options, run_metadata)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
target_list, options, run_metadata)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'DeformConvOp' with these attrs. Registered devices: [CPU], Registered kernels:
device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_FLOAT]

 [[Node: res5a_branch2b/DeformConvOp = DeformConvOp[T=DT_FLOAT, data_format="NCHW", deformable_group=4, num_groups=1, padding="SAME", rates=[1, 1, 2, 2], strides=[1, 1, 1, 1]](transpose, res5a_branch2b/weights/read, transpose_1)]]

Caused by op 'res5a_branch2b/DeformConvOp', defined at:
File "./faster_rcnn/demo.py", line 126, in
net = get_network(args.demo_net)
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/factory.py", line 36, in get_network
return Resnet50_test()
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/Resnet50_test.py", line 21, in init
self.setup()
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/Resnet50_test.py", line 221, in setup
.deform_conv(3, 3, 512, 1, 1, biased=False, rate=2, relu=False, num_deform_group=4, name='res5a_branch2b')
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/network.py", line 41, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/network.py", line 192, in deform_conv
dconv = trans2NHWC(dconvolve(data, kernel, offset))
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/network.py", line 182, in
i, k, o, strides = [1, 1, s_h, s_w], rates=[1, 1, rate, rate], padding=padding, num_groups=num_groups, deformable_group=num_deform_group)
File "", line 45, in deform_conv_op
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1264, in init
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'DeformConvOp' with these attrs. Registered devices: [CPU], Registered kernels:
device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_FLOAT]

 [[Node: res5a_branch2b/DeformConvOp = DeformConvOp[T=DT_FLOAT, data_format="NCHW", deformable_group=4, num_groups=1, padding="SAME", rates=[1, 1, 2, 2], strides=[1, 1, 1, 1]](transpose, res5a_branch2b/weights/read, transpose_1)]]

It seems CUDA has an error. Do you have any advice?
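For what it's worth, the two messages higher up ("failed call to cuInit: CUDA_ERROR_UNKNOWN" and "/proc/driver/nvidia/version does not exist") suggest the NVIDIA kernel driver is not loaded, so TensorFlow only registers CPU devices, while DeformConvOp has GPU-only kernels (the error itself lists only device='GPU'). A quick check (TF 1.x, as in the log) of which devices TensorFlow can actually see:

# If no 'GPU' device appears here, the GPU-only DeformConvOp cannot run.
from tensorflow.python.client import device_lib

devices = device_lib.list_local_devices()
for d in devices:
    print(d.name, d.device_type)
print('GPU visible:', any(d.device_type == 'GPU' for d in devices))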

training problems

Normalizing targets
done
Solving...
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

data (1, 600, 687, 3)
im_info (1, 3)
gt_boxes (4, 5)
gt_ishard (4,)
dontcare_areas (0, 4) []
2018-02-08 15:40:32.394938: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: ValueError: could not broadcast input array from shape (4) into shape (0)
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in exit
next(self.gen)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: ValueError: could not broadcast input array from shape (4) into shape (0)
[[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois/_97, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_4/_99)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train_net.py", line 116, in
restore=bool(int(args.restore)))
File "/home/commaai02/TF_Deformable_Net/lib/fast_rcnn/train.py", line 412, in train_net
sw.train_model(sess, max_iters, restore=restore)
File "/home/commaai02/TF_Deformable_Net/lib/fast_rcnn/train.py", line 264, in train_model
cls_prob, bbox_pred, rois = sess.run(fetches=fetch_list, feed_dict=feed_dict)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ValueError: could not broadcast input array from shape (4) into shape (0)
[[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois/_97, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_4/_99)]]

Caused by op 'roi-data/PyFunc', defined at:
File "train_net.py", line 106, in
network = get_network(args.network_name)
File "/home/commaai02/TF_Deformable_Net/lib/networks/factory.py", line 29, in get_network
return VGGnet_train()
File "/home/commaai02/TF_Deformable_Net/lib/networks/VGGnet_train.py", line 17, in init
self.setup()
File "/home/commaai02/TF_Deformable_Net/lib/networks/VGGnet_train.py", line 80, in setup
.proposal_target_layer(n_classes,name = 'roi-data'))
File "/home/commaai02/TF_Deformable_Net/lib/networks/network.py", line 41, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "/home/commaai02/TF_Deformable_Net/lib/networks/network.py", line 372, in proposal_target_layer
[tf.float32,tf.float32,tf.float32,tf.float32,tf.float32])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/script_ops.py", line 198, in py_func
input=inp, token=token, Tout=Tout, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_script_ops.py", line 38, in _py_func
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): ValueError: could not broadcast input array from shape (4) into shape (0)
[[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois/_97, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_4/_99)]]

Batchsize issue

Is it possible to have more than one sample per batch? A batch size greater than one triggers an assertion in lib/rpn_msr/proposal_layer.py (line 67); a sketch of that check is given below.
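For context, a sketch of the kind of single-image check meant here (the exact code in lib/rpn_msr/proposal_layer.py may read differently; this mirrors the py-faster-rcnn original):

import numpy as np

def proposal_layer_sketch(rpn_cls_prob_reshape, rpn_bbox_pred, im_info):
    # Only single-item batches are supported by the RPN proposal layer;
    # a batch dimension larger than 1 trips this assertion.
    assert rpn_cls_prob_reshape.shape[0] == 1, \
        'Only single item batches are supported'
    # ... anchor enumeration, bbox decoding and NMS would follow here ...
    return np.zeros((0, 5), dtype=np.float32)  # placeholder rois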

Can it be used with g++ 4.8?

I get this error with g++ 4.8:
/usr/local/lib/python2.7/dist-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:24:31: fatal error: cuda/include/cuda.h: No such file or directory

Looking forward to hearing from you.
@Zardinality @JiahuiYu

A question when training

Hey, many thanks for your great work; I am following it now.
But I met a problem when training:
iter: 0 / 100000, total loss: 5.0174, rpn_loss_cls: 1.2912, rpn_loss_box: 0.4618, loss_cls: 3.1485, loss_box: 0.1159, lr: 0.001000
speed: 3.746s / iter
cudaCheckError() failed : invalid device function.
That is to say, after the first iteration it throws an error.
I do not know how to deal with it. Can you help me?

device: K80, CUDA8.0, cudnn5.1

Training error while running faster_rcnn/train_net.py

I was running the following command:
python faster_rcnn/train_net.py --gpu 0 --weights ./data/pretrain_model/Resnet50.npy --imdb voc_2007_trainval --iters 70000 --cfg ./experiments/cfgs/faster_rcnn_end2end_resnet.yml --network Resnet50_train --set EXP_DIR exp_dir

and then I got this error:
File "faster_rcnn/train_net.py", line 29, in
from lib.datasets.factory import get_imdb
File "/home/aditi/Documents/PhD/Steel_Seg/Deformable-ConvNets/TF_Deformable_Net-master/lib/datasets/init.py", line 13, in
from .imagenet3d import imagenet3d
File "/home/aditi/Documents/PhD/Steel_Seg/Deformable-ConvNets/TF_Deformable_Net-master/lib/datasets/imagenet3d.py", line 1, in
from . import imagenet3d
ImportError: cannot import name imagenet3d

In the lib folder, make runs fine with a few warnings.

deform_conv.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringB5cxx11Ev

Has anyone else met this problem?


#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
NSYNC_INC=$TF_INC"/external/nsync/public"
# please modify $ARCH according to the following list and your gpu model.
ARCH=sm_37
echo $TF_INC


# If coming across: cudaCheckError() failed : invalid device function. change -arch=sm_xx accordingly.

# Which CUDA capabilities do we want to pre-build for?
# https://developer.nvidia.com/cuda-gpus
#   Compute/shader model   Cards
#   6.1		      P4, P40, Titan X so CUDA_MODEL = 61
#   6.0                    P100 so CUDA_MODEL = 60
#   5.2                    M40
#   3.7                    K80
#   3.5                    K40, K20
#   3.0                    K10, Grid K520 (AWS G2)
#   Other Nvidia shader models should work, but they will require extra startup
#   time as the code is not pre-optimized for them.
# CUDA_MODELS=30 35 37 52 60 61



CUDA_HOME=/usr/local/cuda/

if [ ! -f $TF_INC/tensorflow/stream_executor/cuda/cuda_config.h ]; then
    cp ./cuda_config.h $TF_INC/tensorflow/stream_executor/cuda/
fi

cd roi_pooling_layer

nvcc -std=c++11 -c --expt-relaxed-constexpr -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
	-I $TF_INC -I $NSYNC_INC -D GOOGLE_CUDA=1 -L $CUDA_HOME/lib64 -x cu -Xcompiler -fPIC -D GOOGLE_CUDA -arch=$ARCH

## if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
#g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
#	roi_pooling_op.cu.o -I $TF_INC -I $NSYNC_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64 -L $TF_LIB -ltensorflow_framework -D_GLIBCXX_USE_CXX11_ABI=0 

# for gcc5-built tf
g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
 	roi_pooling_op.cu.o -I $TF_INC -I $NSYNC_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64 -L $TF_LIB -ltensorflow_framework  -D_GLIBCXX_USE_CXX11_ABI=0
cd ..


# build the psroi_pooling layer
cd psroi_pooling_layer
nvcc -std=c++11 -c --expt-relaxed-constexpr -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -D GOOGLE_CUDA -arch=$ARCH


## if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
#g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
#	psroi_pooling_op.cu.o -I $TF_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64
# for gcc5-built tf
g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
 	psroi_pooling_op.cu.o -I $TF_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64 -D_GLIBCXX_USE_CXX11_ABI=0

cd ..

cd deform_psroi_pooling_layer
nvcc -std=c++11 -c --expt-relaxed-constexpr -o deform_psroi_pooling_op.cu.o deform_psroi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -D GOOGLE_CUDA -arch=$ARCH

## if you install tf using already-built binary, or gcc version 4.x, uncomment the three lines below
#g++ -std=c++11 -shared -o deform_psroi_pooling.so deform_psroi_pooling_op.cc deform_psroi_pooling_op.cu.o -I \
#    $TF_INC -fPIC -lcudart -L $CUDA_HOME/lib64 -D GOOGLE_CUDA=1 -Wfatal-errors -I \
#    $CUDA_HOME/include
# for gcc5-built tf
g++ -std=c++11 -shared -o deform_psroi_pooling.so deform_psroi_pooling_op.cc deform_psroi_pooling_op.cu.o \
   -I $TF_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64 -D_GLIBCXX_USE_CXX11_ABI=0
cd ..

cd deform_conv_layer
nvcc -std=c++11 -ccbin=/usr/bin/g++-5 -c --expt-relaxed-constexpr -o deform_conv.cu.o deform_conv.cu.cc -I $TF_INC -I $NSYNC_INC -D\
          GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -L /usr/local/cuda-8.0/lib64/ --expt-relaxed-constexpr -arch=$ARCH
## if you install tf using already-built binary, or gcc version 4.x, uncomment the three lines below
#g++ -std=c++11 -shared -o deform_conv.so deform_conv.cc deform_conv.cu.o -I\
#      $TF_INC -I $NSYNC_INC -fPIC -lcudart -L $CUDA_HOME/lib64 -D GOOGLE_CUDA=1 -Wfatal-errors \
#      -L $TF_LIB -ltensorflow_framework -D_GLIBCXX_USE_CXX11_ABI=0 
# for gcc5-built tf
g++ -std=c++11 -shared -o deform_conv.so deform_conv.cc deform_conv.cu.o \
   -I $TF_INC -I $NSYNC_INC -fPIC -D GOOGLE_CUDA -lcudart -L $CUDA_HOME/lib64 -L $TF_LIB -ltensorflow_framework -D_GLIBCXX_USE_CXX11_ABI=0

cd ..
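That undefined symbol (...NewStringB5cxx11Ev) is the classic sign of a C++11 ABI mismatch between the compiled op and the TensorFlow binary, which is what the -D_GLIBCXX_USE_CXX11_ABI=0 flag in the script above is meant to address. After rebuilding, a quick way to check is simply to try loading the library (the relative path below is an assumption):

# If the ABI flag or the -ltensorflow_framework link is still wrong,
# tf.load_op_library fails with a NotFoundError naming the missing symbol.
import os.path as osp
import tensorflow as tf

so_path = osp.join('deform_conv_layer', 'deform_conv.so')  # assumed path, relative to lib/
deform_conv_module = tf.load_op_library(so_path)
print([name for name in dir(deform_conv_module) if 'deform' in name.lower()])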

Can we add deformable convolution at different places?

Hello,

Can we add deformable convolution at different places? More specifically, for ResNet50, can we add it to the 4th stage, i.e. res4a, 4b, 4c, etc.? Do we simply add the deformable convolution operation, or do we also have to move the RPN and PSRoI proposal to before the 4th stage? Thank you.
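For illustration only, this is roughly what swapping one stage-4 conv for the repo's deformable layer could look like, reusing the .deform_conv(...) signature seen in the Resnet50_test.py traceback earlier on this page; the .feed chaining and the res4b branch names are assumptions, not a verified recipe, and the RPN / PSRoI layers stay where they are:

# Hypothetical: a 3x3 deformable conv in a res4 block (inside the network's
# setup() method), mirroring the res5a_branch2b call from Resnet50_test.py
# with 256 output channels and dilation rate 1.
(self.feed('res4b_branch2a_relu')
     .deform_conv(3, 3, 256, 1, 1, biased=False, rate=1, relu=False,
                  num_deform_group=4, name='res4b_branch2b'))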

compile error

This is my make.sh
cd deform_conv_layer
nvcc -std=c++11 -ccbin=/usr/bin/g++-4.8 -c -o deform_conv.cu.o deform_conv.cu.cc -I $TF_INC -I $NSYNC_INC -D \GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -L /usr/lib32/ --expt-relaxed-constexpr -arch=$ARCH
g++-4.8 -std=c++11 -shared -o deform_conv.so deform_conv.cc deform_conv.cu.o -I
$TF_INC -I $NSYNC_INC -fPIC -lcudart -L $CUDA_HOME/lib32 -D GOOGLE_CUDA=1 -Wfatal-errors -L $TF_LIB -ltensorflow_framework -D_GLIBCXX_USE_CXX11_ABI=0

Errors are as follows:
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/lib/bfloat16/bfloat16.h(484): error: no suitable constructor exists to convert from "float" to "tensorflow::bfloat16"

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/lib/bfloat16/bfloat16.h(485): error: no suitable constructor exists to convert from "float" to "tensorflow::bfloat16"

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/tensorflow/core/lib/bfloat16/bfloat16.h(486): error: no suitable constructor exists to convert from "float" to "tensorflow::bfloat16"

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/absl/strings/string_view.h(496): error: constexpr function return is non-constant

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/google/protobuf/arena_impl.h(55): warning: integer conversion resulted in a change of sign

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/google/protobuf/arena_impl.h(309): warning: integer conversion resulted in a change of sign

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/google/protobuf/arena_impl.h(310): warning: integer conversion resulted in a change of sign

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/absl/strings/str_cat.h(259): error: expression must have a constant value

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/absl/strings/str_cat.h(259): error: expression must have a constant value

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(177): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::lgamma_impl::run(Scalar) [with Scalar=double]"
(2060): here
instantiation of "Eigen::internal::lgamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::lgamma(const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(32): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(1230): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::polygamma_impl::run(Scalar, Scalar) [with Scalar=float]"
(2078): here
instantiation of "Eigen::internal::polygamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::polygamma(const Scalar &, const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(67): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(1230): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::polygamma_impl::run(Scalar, Scalar) [with Scalar=double]"
(2078): here
instantiation of "Eigen::internal::polygamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::polygamma(const Scalar &, const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(74): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(411): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::erf_impl::run(Scalar) [with Scalar=double]"
(2084): here
instantiation of "Eigen::internal::erf_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::erf(const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(87): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(444): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::erfc_impl::run(Scalar) [with Scalar=float]"
(2090): here
instantiation of "Eigen::internal::erfc_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::erfc(const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(94): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(444): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::erfc_impl::run(Scalar) [with Scalar=double]"
(2090): here
instantiation of "Eigen::internal::erfc_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::erfc(const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(101): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(820): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(2096): here
instantiation of "Eigen::internal::igamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma(const Scalar &, const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(110): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(820): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=double, mode=Eigen::internal::VALUE]"
(2096): here
instantiation of "Eigen::internal::igamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma(const Scalar &, const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(120): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(820): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::DERIVATIVE]"
(2102): here
instantiation of "Eigen::internal::igamma_der_a_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma_der_a(const Scalar &, const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(127): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(820): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=double, mode=Eigen::internal::DERIVATIVE]"
(2102): here
instantiation of "Eigen::internal::igamma_der_a_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma_der_a(const Scalar &, const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(135): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(820): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::SAMPLE_DERIVATIVE]"
(2108): here
instantiation of "Eigen::internal::gamma_sample_der_alpha_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::gamma_sample_der_alpha(const Scalar &, const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(143): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(820): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=double, mode=Eigen::internal::SAMPLE_DERIVATIVE]"
(2108): here
instantiation of "Eigen::internal::gamma_sample_der_alpha_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::gamma_sample_der_alpha(const Scalar &, const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(154): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(721): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igammac_impl::run(Scalar, Scalar) [with Scalar=float]"
(2114): here
instantiation of "Eigen::internal::igammac_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igammac(const Scalar &, const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(163): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(721): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::igammac_impl::run(Scalar, Scalar) [with Scalar=double]"
(2114): here
instantiation of "Eigen::internal::igammac_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igammac(const Scalar &, const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(173): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(1279): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::betainc_impl::run(Scalar, Scalar, Scalar) [with Scalar=float]"
(2120): here
instantiation of "Eigen::internal::betainc_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::betainc(const Scalar &, const Scalar &, const Scalar &) [with Scalar=float]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(181): here

/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(1279): error: static assertion failed with "THIS_TYPE_IS_NOT_SUPPORTED"
detected during:
instantiation of "Scalar Eigen::internal::betainc_impl::run(Scalar, Scalar, Scalar) [with Scalar=double]"
(2120): here
instantiation of "Eigen::internal::betainc_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::betainc(const Scalar &, const Scalar &, const Scalar &) [with Scalar=double]"
/home/yangfan/anaconda2/lib/python2.7/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(191): here

22 errors detected in the compilation of "/tmp/tmpxft_00008541_00000000-7_deform_conv.cu.cpp1.ii".
g++-4.8: error: deform_conv.cu.o: No such file or directory

How can I solve this?

Testing problem

Hello,

I followed the tutorial to train the network with the ResNet50 setting. The training ran for 125K iterations and the losses decreased (the final loss is around 1.9). But when I test the model, there are negative APs on multiple classes:

Results:
-1.000
-1.000
-1.000
-1.000
0.000
-1.000
0.003
-1.000
-1.000
-1.000
-1.000
-1.000
-1.000
-1.000
0.045
-1.000
-1.000
-1.000
-1.000
-1.000
-0.848

Do you know what the problem could be? Thank you very much.

Problems in installation

Thanks a lot for the great lib! I'm trying to follow your work, yet I'm facing some problems. When I tried to compile, the following error was shown:
/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include
/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(390): error: identifier "hexp" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(400): error: the global scope has no "hlog"

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(413): error: identifier "hsqrt" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(435): error: identifier "hfloor" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(442): error: identifier "hceil" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(530): error: the global scope has no "hlog"

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/PacketMathHalf.h(291): error: identifier "h2log" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/PacketMathHalf.h(296): error: identifier "h2exp" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/PacketMathHalf.h(301): error: identifier "h2sqrt" is undefined

/home/siyangyuan/anaconda3/envs/tf14/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/PacketMathHalf.h(306): error: identifier "h2rsqrt" is undefined

10 errors detected in the compilation of "/tmp/tmpxft_00003e41_00000000-7_deform_conv.cu.cpp1.ii".

Which is very confusing. Do you have any thoughts about what might cause this? I'm using gcc 4.9.3, TensorFlow 1.4 installed from the binary, CUDA 8.0, cuDNN 6, and a Titan X. Thanks a lot!

DCN V2

Recently MSRA published their DCN v2 paper and released the code for the v2 ops. Would you consider porting it to a TF version? Also, when I initialize a DCN DeepLab model with the parameters learned by the original DeepLab model, the results differ, which is abnormal given that the offset values are initialized to zero. I only replaced conv2d with DCN, and I wonder whether this implementation of the DCN op differs from the original MXNet one.

Offsets Subnet Details

Hi,

I was wondering about the details of the subnet that predicts the offsets used by the sampling operation, as I couldn't work them out from the paper.
This subnet takes the feature map and adds conv layers, parallel to the core network, to predict the offsets that will be used (a rough sketch of the usual arrangement is given after the questions below):

  1. How many conv layers are used to get the offsets from the features?
  2. What are their kernel size and activation function?
  3. Is batch norm used in this subnet?

Thanks for your great work.
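For reference, the logs earlier on this page (e.g. the res5a_branch2b_offset/BiasAdd tensor with 2 * 3 * 3 * 4 = 72 channels) suggest the offsets come from a single extra conv applied to the same feature map. A rough TF 1.x sketch of that arrangement; in the paper the offset conv is zero-initialized, shares the kernel size and dilation of the deformable conv, and uses no activation or batch norm, but treat the exact hyperparameters here as assumptions:

import tensorflow as tf

def offset_branch(feature_map, kh=3, kw=3, num_deform_group=4, rate=2,
                  name='branch2b_offset'):
    # One conv producing 2 * kh * kw * num_deform_group channels:
    # an (x, y) pair per kernel position per deformable group.
    offset_channels = 2 * kh * kw * num_deform_group
    return tf.layers.conv2d(
        feature_map, offset_channels, (kh, kw), padding='same',
        dilation_rate=(rate, rate), activation=None,
        kernel_initializer=tf.zeros_initializer(), name=name)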

How to use this on Windows

@Zardinality First I want to thank you for your work. Now I have a question: I want to use deformable convolution and deformable pooling on Windows 10. I saw that you use the following code to load a .so file as a module:
filename = osp.join(osp.dirname(__file__), 'deform_conv.so')
_deform_conv_module = tf.load_op_library(filename)
deform_conv_op = _deform_conv_module.deform_conv_op
deform_conv_grad_op = _deform_conv_module.deform_conv_backprop_op
I know that the .so file is compiled from C++/CUDA sources, but I have no idea how to compile them on Windows so that they can be loaded the same way as on Linux.
Any suggestions will be appreciated.

About the data format in deformable psroi pooling

In lib/networks/network.py, in the function "deform_psroi_pool",
I think the input data format has already been converted to [NCHW], so the output data format of this function is still [NCHW]?
If so, the following code in the "fc" function
if data_format == "NCHW":
    feed_in = tf.reshape(tf.transpose(input, [0, 3, 1, 2]), [-1, dim])
will convert the data to [N, W, C, H]?
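For what it's worth, a tiny shape-bookkeeping sketch of what that permutation does (numpy only, hypothetical shapes):

import numpy as np

# tf.transpose(x, [0, 3, 1, 2]) moves the last axis to position 1:
nhwc = np.zeros((1, 7, 7, 256))                  # N, H, W, C
nchw = np.zeros((1, 256, 7, 7))                  # N, C, H, W
print(np.transpose(nhwc, (0, 3, 1, 2)).shape)    # (1, 256, 7, 7) -> NCHW
print(np.transpose(nchw, (0, 3, 1, 2)).shape)    # (1, 7, 256, 7) -> N, W, C, H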

Is it compatible with cuda9.1 and cudnn7?

As the title says.

I am trying to make it work with CUDA 9.1 and cuDNN 7. I tried gcc 4.9 and -D_GLIBCXX_USE_CXX11_ABI=0.

When loading the generated .so file, I get this error:
undefined symbol: _ZTIN10tensorflow8OpKernelE

Do you have any clues on this?

Thanks
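The undefined symbol _ZTIN10tensorflow8OpKernelE usually shows up when the shared object is not linked against libtensorflow_framework.so (needed for TF 1.4+, independent of the CUDA/cuDNN version), so it is worth double-checking that the -L $TF_LIB -ltensorflow_framework flags from make.sh really point at your installation. A quick check, using the same calls make.sh already relies on:

import tensorflow as tf

# These are the paths the -I / -L flags in make.sh should be using;
# libtensorflow_framework.so lives under get_lib().
print('TF_INC =', tf.sysconfig.get_include())
print('TF_LIB =', tf.sysconfig.get_lib())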

Question about the psroi_pooling.so

Hi, thank you for your amazing work.
Following the guide, when I run the demo I hit the following issue:
gpu2@gpu2-PowerEdge-R730:~/OWFO/TF_Deformable_Net$ python ./faster_rcnn/demo.py --model mypath
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
filename: /home/gpu2/OWFO/TF_Deformable_Net/lib/psroi_pooling_layer/psroi_pooling.so


/home/gpu2/OWFO/TF_Deformable_Net/lib/psroi_pooling_layer/psroi_pooling.so
Traceback (most recent call last):
File "./faster_rcnn/demo.py", line 21, in
from lib.networks.factory import get_network
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/init.py", line 8, in
from .VGGnet_train import VGGnet_train
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/VGGnet_train.py", line 2, in
from .network import Network
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/networks/network.py", line 11, in
from ..psroi_pooling_layer import psroi_pooling_op as psroi_pooling_op
File "/home/gpu2/OWFO/TF_Deformable_Net/lib/psroi_pooling_layer/psroi_pooling_op.py", line 8, in
_psroi_pooling_module = tf.load_op_library(filename)
File "/home/gpu2/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: /home/gpu2/OWFO/TF_Deformable_Net/lib/psroi_pooling_layer/psroi_pooling.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE

The other operations, such as training and testing, give the same error as above. I have read your installation description; I wonder if I need to do more during installation, especially regarding psroi_pooling.

Compile Error

@leinxx @Zardinality @JiahuiYu
At the beginning, these two errors appear:
[screenshot]

After changing the two header files, the following error occurs:
[screenshot]
This problem has puzzled me for a long time. Is there any suggestion? Thank you.

cannot locate cuda/cuda_config.h with binary-installed tensorflow

I cannot locate cuda/cuda_config.h with a binary-installed TensorFlow when compiling deform_conv.cc. I tried to locate cuda_config.h with sudo find / -name cuda_config.h and it yielded nothing. I suspect the cuda_config.h file is only available when TF is compiled from source, generated from this template. Do you use a self-compiled TF?

➜  deform_conv_layer git:(master) ✗ g++ -std=c++11 -shared -o deform_conv.so deform_conv.cc deform_conv.cu.o -I$TF_INC -fPIC -lcudart -L$CUDA_PATH/lib64 -DGOOGLE_CUDA=1 -Wfatal-errors -D_GLIBCXX_USE_CXX11_ABI=0
In file included from /home/jiamingsun/anaconda3/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/default/stream_executor.h:26:0,
                 from /home/jiamingsun/anaconda3/lib/python3.6/site-packages/tensorflow/include/tensorflow/core/platform/stream_executor.h:24,
                 from deform_conv.cc:41:
/home/jiamingsun/anaconda3/lib/python3.6/site-packages/tensorflow/include/tensorflow/stream_executor/dso_loader.h:32:30: fatal error: cuda/cuda_config.h: No such file or directory
compilation terminated.

BTW there's a typo in make.sh:
https://github.com/Zardinality/TF_Deformable_Net/blob/master/lib/make.sh#L45
where $CUDA_HOME should be $CUDA_PATH, as should the following instances.

Thanks!

Error when executing command: make

make.sh: 35: make.sh: nvcc: not found
g++: error: roi_pooling_op.cu.o: No such file or directory
make.sh: 50: make.sh: nvcc: not found
g++: error: psroi_pooling_op.cu.o: No such file or directory
make.sh: 64: make.sh: nvcc: not found
g++: error: deform_psroi_pooling_op.cu.o: No such file or directory
make.sh: 77: make.sh: nvcc: not found
g++: error: deform_conv.cu.o: No such file or directory

config.py parameter problems

preclude rois intersected with dontcare areas above the value

__C.TRAIN.DONTCARE_AREA_INTERSECTION_HI = 0.5
__C.TRAIN.PRECLUDE_HARD_SAMPLES = True

What is this about, and what is a dontcare area?
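For what it's worth, "dontcare" areas are annotation regions (KITTI-style) marking objects that were not labelled; the comment above says RoIs whose intersection with such an area exceeds DONTCARE_AREA_INTERSECTION_HI are excluded from training. A small illustrative sketch of that kind of filtering (the helper and box layout are assumptions, not the repo's exact code):

import numpy as np

def intersection_fraction(roi, area):
    # Fraction of the RoI covered by a dontcare box; boxes are (x1, y1, x2, y2).
    ix1, iy1 = max(roi[0], area[0]), max(roi[1], area[1])
    ix2, iy2 = min(roi[2], area[2]), min(roi[3], area[3])
    iw, ih = max(0.0, ix2 - ix1 + 1), max(0.0, iy2 - iy1 + 1)
    roi_area = (roi[2] - roi[0] + 1) * (roi[3] - roi[1] + 1)
    return (iw * ih) / roi_area

def preclude_dontcare_rois(rois, dontcare_areas, hi=0.5):
    # Keep only RoIs whose overlap with every dontcare area stays below `hi`.
    if len(dontcare_areas) == 0:
        return rois
    keep = [i for i, roi in enumerate(rois)
            if max(intersection_fraction(roi, dc) for dc in dontcare_areas) <= hi]
    return rois[np.array(keep, dtype=np.int64)]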

Meaning of num_groups in offsets

What is the meaning of num_groups? As I understand it, the output of res5a_branch2b_offset has shape (?, 14, 14, 72), where (14, 14) is the spatial dimension of the input and 72 = 2 (for x and y) * 3 * 3 (the 3x3 kernel) * 4 (num_groups). But I can't understand the meaning of the num_groups variable here.
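For what it's worth, the op attributes printed in the tracebacks earlier on this page show deformable_group=4 and num_groups=1, so the factor 4 in the 72 offset channels presumably comes from deformable_group (independent offset sets per group of input channels), while num_groups is the ordinary grouped-convolution factor. In short:

# Offset channel count, using the attribute values visible in the logs above.
kh, kw = 3, 3                 # kernel size
deformable_group = 4          # independent (x, y) offset sets per channel group
num_groups = 1                # plain grouped-convolution factor (unrelated)
offset_channels = 2 * kh * kw * deformable_group
print(offset_channels)        # 72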
