sukritiramesh commented on May 19, 2024

Hi @chenghuige, a couple of quick clarifications:

  • Is this the entire failure log? Could you also run bazel with --verbose_failures to see if the error message is more helpful?
  • Also, can you check the bazel version you are using? (You could use bazel version to do this)

from serving.

chenghuige avatar chenghuige commented on May 19, 2024

@sukritiramesh like below
gezi:/other/serving$ bazel build tensorflow_serving/... --verbose_failures
......
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.io/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
INFO: Found 168 targets...
ERROR: /home/gezi/other/serving/tensorflow_serving/session_bundle/example/BUILD:34:1: Executing genrule //tensorflow_serving/session_bundle/example:half_plus_two failed: bash failed: error executing command
(cd /home/gezi/.cache/bazel/_bazel_gezi/90eddd4c58c33600d0766ab4c9609dbb/serving &&
exec env -
PATH=/usr/local/cuda/bin:/home/gezi/bin:/home/gezi/tools:/home/gezi/tools/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/gezi/.local/bin:/home/gezi/bin
/bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; rm -rf /tmp/half_plus_two; bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two; cp -r /tmp/half_plus_two/* bazel-out/local-fastbuild/genfiles/tensorflow_serving/session_bundle/example/half_plus_two'): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 245.
INFO: Elapsed time: 14.431s, Critical Path: 7.75s
gezi:/other/serving$ bazel version
Build label: 0.2.2
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Apr 21 13:01:41 2016 (1461243701)
Build timestamp: 1461243701
Build timestamp as int: 1461243701

chenghuige commented on May 19, 2024

It seems someone else is also facing this problem:
tensorflow/tensorflow#559
They mention it in the last comment.

vinuraja commented on May 19, 2024

@chenghuige Just a wild stab. Could you try doing
pip install protobuf
and see if it works for you.

chenghuige commented on May 19, 2024

hi @vinuraja I have already installed protobuf 3.0 and I can bazel build tensorflow from source.

chrisolston commented on May 19, 2024

Hi,

This may be an instance of the bazel issue described here:
bazelbuild/bazel#665

Perhaps try the bazel option --genrule_strategy=standalone ?

Also, did you verify that you have a bash binary at /bin/bash?

-Chris

chenghuige commented on May 19, 2024

@chrisolston I added this option, but I still get the same error.
/bin/bash exists.

chrisolston commented on May 19, 2024

Hmmmm. Sorry, I'm a bit out of ideas. Perhaps you can try running those commands manually (the cd, rm, bash, ...) and try to narrow down which one is failing and why.
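
One way to follow this suggestion is to run each step of the genrule command separately and stop at the first one that fails. The following is a minimal Python sketch under stated assumptions: `first_failing_step` is a hypothetical helper (not part of Bazel or TensorFlow Serving), the step list is copied from the failure log earlier in this thread, and the steps are run through the system default shell rather than /bin/bash.

```python
import subprocess

def first_failing_step(steps, cwd=None):
    """Run shell steps in order; return (step, returncode) for the first failure, or None."""
    for step in steps:
        # shell=True uses the default shell (/bin/sh on POSIX); the genrule itself uses /bin/bash.
        result = subprocess.run(step, shell=True, cwd=cwd)
        if result.returncode != 0:
            return step, result.returncode
    return None

# Steps copied from the genrule command in the log above; paths will differ per machine.
GENRULE_STEPS = [
    "rm -rf /tmp/half_plus_two",
    "bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two",
    "cp -r /tmp/half_plus_two/* "
    "bazel-out/local-fastbuild/genfiles/tensorflow_serving/session_bundle/example/half_plus_two",
]

# Example call (cwd should be the directory from the log's `cd` command):
# print(first_failing_step(GENRULE_STEPS, cwd="/path/to/bazel/execroot"))
```

On POSIX systems, a negative return code from `subprocess.run` means the process was terminated by that signal number, which would help distinguish a crash from an ordinary error exit.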

agupta83 commented on May 19, 2024

Hi,
[commit hash b4e9815]
I am having a similar issue to @chenghuige's. To narrow it down, I tried commenting out example:half_plus_two from all the BUILD files, and compilation succeeded. Then I ran bazel test tensorflow_serving/...; the log is below (2 tests fail).

..........
INFO: Found 122 targets and 44 test targets...
INFO: From Compiling external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:
external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc: In member function 'virtual tensorflow::Status tensorflow::GrpcServer::Start()':
external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:213:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc: In member function 'virtual tensorflow::Status tensorflow::GrpcServer::Stop()':
external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:233:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc: In member function 'virtual tensorflow::Status tensorflow::GrpcServer::Join()':
external/tf/tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:250:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
INFO: From Compiling tensorflow_serving/core/basic_manager_test.cc:
In file included from ./tensorflow_serving/core/basic_manager.h:32:0,
                 from tensorflow_serving/core/basic_manager_test.cc:16:
./tensorflow_serving/core/loader_harness.h: In function 'std::ostream& tensorflow::serving::operator<<(std::ostream&, tensorflow::serving::LoaderHarness::State)':
./tensorflow_serving/core/loader_harness.h:254:1: warning: no return statement in function returning non-void [-Wreturn-type]
 }
 ^
FAIL: //tensorflow_serving/session_bundle:gc_test (see /home/ashisgupta/.cache/bazel/_bazel_ashisgupta/bdc8575e25b4488b7808cf506fd82fca/serving/bazel-out/local_linux-fastbuild/testlogs/tensorflow_serving/session_bundle/gc_test/test.log).
FAIL: //tensorflow_serving/session_bundle:exporter_test (see /home/ashisgupta/.cache/bazel/_bazel_ashisgupta/bdc8575e25b4488b7808cf506fd82fca/serving/bazel-out/local_linux-fastbuild/testlogs/tensorflow_serving/session_bundle/exporter_test/test.log).
INFO: Elapsed time: 283.523s, Critical Path: 272.38s
//tensorflow_serving/batching:basic_batch_scheduler_test                 PASSED in 0.2s
//tensorflow_serving/batching:batch_scheduler_retrier_test               PASSED in 0.4s
//tensorflow_serving/batching:batch_scheduler_test                       PASSED in 0.1s
//tensorflow_serving/batching:shared_batch_scheduler_test                PASSED in 0.8s
//tensorflow_serving/batching:streaming_batch_scheduler_test             PASSED in 4.1s
//tensorflow_serving/batching/test_util:puppet_batch_scheduler_test      PASSED in 0.2s
//tensorflow_serving/core:aspired_versions_manager_benchmark             PASSED in 29.8s
//tensorflow_serving/core:aspired_versions_manager_builder_test          PASSED in 0.6s
//tensorflow_serving/core:aspired_versions_manager_test                  PASSED in 2.2s
//tensorflow_serving/core:availability_helpers_test                      PASSED in 0.1s
//tensorflow_serving/core:basic_manager_test                             PASSED in 3.2s
//tensorflow_serving/core:caching_manager_test                           PASSED in 0.4s
//tensorflow_serving/core:eager_load_policy_test                         PASSED in 0.2s
//tensorflow_serving/core:eager_unload_policy_test                       PASSED in 0.2s
//tensorflow_serving/core:loader_harness_test                            PASSED in 0.2s
//tensorflow_serving/core:manager_test                                   PASSED in 0.1s
//tensorflow_serving/core:servable_data_test                             PASSED in 0.2s
//tensorflow_serving/core:servable_id_test                               PASSED in 0.2s
//tensorflow_serving/core:servable_state_monitor_test                    PASSED in 0.2s
//tensorflow_serving/core:simple_loader_test                             PASSED in 0.2s
//tensorflow_serving/core:source_adapter_test                            PASSED in 1.1s
//tensorflow_serving/core:source_router_test                             PASSED in 2.2s
//tensorflow_serving/core:static_manager_test                            PASSED in 0.1s
//tensorflow_serving/core:static_source_router_test                      PASSED in 0.2s
//tensorflow_serving/core:storage_path_test                              PASSED in 0.2s
//tensorflow_serving/resources:resource_tracker_test                     PASSED in 0.3s
//tensorflow_serving/resources:resource_util_test                        PASSED in 0.2s
//tensorflow_serving/servables/hashmap:hashmap_source_adapter_test       PASSED in 0.2s
//tensorflow_serving/session_bundle:signature_test                       PASSED in 0.1s
//tensorflow_serving/sources/storage_path:file_system_storage_path_source_test PASSED in 0.2s
//tensorflow_serving/sources/storage_path:static_storage_path_source_test PASSED in 0.2s
//tensorflow_serving/util:any_ptr_test                                   PASSED in 0.1s
//tensorflow_serving/util:cleanup_test                                   PASSED in 0.1s
//tensorflow_serving/util:event_bus_test                                 PASSED in 0.1s
//tensorflow_serving/util:fast_read_dynamic_ptr_benchmark                PASSED in 27.4s
//tensorflow_serving/util:fast_read_dynamic_ptr_test                     PASSED in 0.5s
//tensorflow_serving/util:inline_executor_test                           PASSED in 0.1s
//tensorflow_serving/util:observer_test                                  PASSED in 0.4s
//tensorflow_serving/util:optional_test                                  PASSED in 0.1s
//tensorflow_serving/util:periodic_function_test                         PASSED in 0.7s
//tensorflow_serving/util:threadpool_executor_test                       PASSED in 0.4s
//tensorflow_serving/util:unique_ptr_with_deps_test                      PASSED in 0.1s
//tensorflow_serving/session_bundle:exporter_test                        FAILED in 1.3s
  /home/ashisgupta/.cache/bazel/_bazel_ashisgupta/bdc8575e25b4488b7808cf506fd82fca/serving/bazel-out/local_linux-fastbuild/testlogs/tensorflow_serving/session_bundle/exporter_test/test.log
//tensorflow_serving/session_bundle:gc_test                              FAILED in 1.3s
  /home/ashisgupta/.cache/bazel/_bazel_ashisgupta/bdc8575e25b4488b7808cf506fd82fca/serving/bazel-out/local_linux-fastbuild/testlogs/tensorflow_serving/session_bundle/gc_test/test.log

Executed 44 out of 44 tests: 42 tests pass and 2 fail locally.

Below is the content of the log files for the two failed tests.

  • tensorflow_serving/session_bundle/exporter_test/test.log
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
-----------------------------------------------------------------------------
external/bazel_tools/tools/test/test-setup.sh: line 52: 22315 Segmentation fault      (core dumped) "$@"
  • tensorflow_serving/session_bundle/gc_test/test.log
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
-----------------------------------------------------------------------------
external/bazel_tools/tools/test/test-setup.sh: line 52: 22453 Segmentation fault      (core dumped) "$@"

Now when I run bazel-bin/tensorflow_serving/example/mnist_export /tmp/mnist_model, I get
Segmentation fault (core dumped)

Not sure how to go forward from here.

kirilg commented on May 19, 2024

Hmm, it's odd that none of us are able to reproduce this issue with half_plus_two. What OS/version are you using? What's your Bazel version?

Just to check, are you able to build and run export_half_plus_two? In a clean client without any modifications, try the following:
bazel build tensorflow_serving/session_bundle/example:export_half_plus_two
rm -rf /tmp/half_plus_two
bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two

Does this work? Does it generate the right files in /tmp/half_plus_two?

If you're able to run the above commands, let's try running the genrule on its own and include the full subcommand bazel uses (please paste the full output in your response):
bazel build --subcommands tensorflow_serving/session_bundle/example:half_plus_two

agupta83 commented on May 19, 2024

OS:

Kernel        : Linux 3.16.0-71-generic (x86_64)
Compiled      : #92~14.04.1-Ubuntu SMP Thu May 12 23:31:46 UTC 2016
Default C Compiler  : GNU C Compiler version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) 
Distribution        : Ubuntu 14.04.4 LTS

bazel:

Build label: 0.2.1
Build target: bazel-out/local_linux-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Mar 31 19:30:01 2016 (1459452601)
Build timestamp: 1459452601
Build timestamp as int: 1459452601

I was able to run (in a fresh clone)
bazel build tensorflow_serving/session_bundle/example:export_half_plus_two with following output (omitted some INFOs)

.........
INFO: Found 1 target...
INFO: From Compiling external/tf/tensorflow/contrib/tensor_forest/core/ops/finished_nodes_op.cc:

Target //tensorflow_serving/session_bundle/example:export_half_plus_two up-to-date:
  bazel-bin/tensorflow_serving/session_bundle/example/export_half_plus_two
INFO: Elapsed time: 375.691s, Critical Path: 313.90s

Failed to run
bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two.
Error: Segmentation fault (core dumped)

Output from
bazel build --subcommands tensorflow_serving/session_bundle/example:half_plus_two
The log is too large (8k+ lines). Please see the gist file

kirilg commented on May 19, 2024

Ok, so if bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two crashes, it suggests you actually can't export the example model, and it's not a problem with the genrule or bazel. Is "Error: Segmentation fault (core dumped)" the only output you get? On my machine I get

$ bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two
copying asset files to: /tmp/half_plus_two/00000123-tmp/assets
copying asset file: hello2.txt
copying asset file: hello1.txt

That target just runs this simple Python script and I don't see why it would seg fault. Since exporter_test was failing, my guess is it's something in exporter.py. I'm using Bazel 0.2.2 and Ubuntu 14.04 and I can't reproduce this failure. Jenkins uses Bazel 0.2.1 and also passes. Could you try playing around with those two python files and try to isolate what could be the source of the segfault?

(Not a fix, but just to unblock you, I think if you go through the docker setup, it should hopefully work for you. Though if possible to debug and isolate the root cause, it would be helpful for us to fix)
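
As an aside, the "status 245" that Bazel reports is numerically consistent with the segmentation fault seen here: 256 - 11 = 245, and 11 is SIGSEGV on Linux and OS X. Reading the status this way is an interpretation, not something confirmed in this thread. A small sketch, where `signal_from_wrapped_status` is a hypothetical helper name:

```python
import signal

def signal_from_wrapped_status(status):
    """Interpret an 8-bit exit status as a negated signal number, if one matches."""
    signum = 256 - status
    try:
        return signal.Signals(signum).name
    except ValueError:
        # No signal with that number on this platform.
        return None

print(signal_from_wrapped_status(245))  # SIGSEGV on Linux and OS X
```

If the status you see does not map to a signal this way (the helper returns None), it is more likely an ordinary error exit from one of the commands in the genrule.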

agupta83 commented on May 19, 2024

The only message I get while running the command is Segmentation fault (core dumped). I will mess around with exporter.py and the test file. In the meantime, I will follow your suggestion and try docker.

Thank you.

GilesColclough commented on May 19, 2024

I have hit exactly the same error.

I ran ./configure in the tensorflow directory, set up for Python 3.5 and CUDA 8.0 on Mac OS X El Capitan, then attempted to build with bazel release 0.2.3-homebrew:

$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

.
.
.
ERROR: /Users/gilesc/src/tensorflow/tensorflow/contrib/session_bundle/example/BUILD:38:1: Executing genrule //tensorflow/contrib/session_bundle/example:half_plus_two failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 245.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.8.0.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.8.0.dylib locally

kmhofmann commented on May 19, 2024

I opened a similar issue a few days ago:
tensorflow/tensorflow#2816
Which could be related...

parnell commented on May 19, 2024

I'm also having compilation problems that come from half_plus_two.

OS: Darwin 15mbp-10584 15.5.0 Darwin Kernel Version 15.5.0
Python 3.5.1
CUDA 7.5, CUDNN 5
TensorFlow: From GitHub as of 2 days ago whichever version that is

Exact Error

ERROR: /Users/parn/src/tensorflow/tensorflow/contrib/session_bundle/example/BUILD:38:1: Executing genrule //tensorflow/contrib/session_bundle/example:half_plus_two failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 245.

nathanielatom commented on May 19, 2024

I'm getting the same error on

OS X 10.11.5,
using zsh,
Anaconda Python 2.7.11,
CUDA 7.5, CUDNN 5.1 RC
Protobuf 3.0.0b2.post2
Bazel 0.2.3-homebrew
Tensorflow master as of Sat June 25th

builds successfully for CPU only

using --genrule_strategy=standalone doesn't work

using --verbose_failures gives:

`ERROR: /Users/Atom/tensorflow/tensorflow/contrib/session_bundle/example/BUILD:38:1: Executing genrule //tensorflow/contrib/session_bundle/example:half_plus_two failed: bash failed: error executing command

(cd /private/var/tmp/_bazel_Atom/27cd00aa56ce2cc50980e2ae03b3938f/tensorflow &&
exec env -
PATH=/Users/Atom/.anaconda/bin:/Library/TeX/Distributions/TeXLive-2014-Basic.texdist/Contents/Programs/x86_64:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/local/sbin:/opt/X11/bin:/usr/local/git/bin:/opt/local/bin:/opt/local/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/usr/local/git/bin:/usr/texbin
TMPDIR=/var/folders/h0/j056qfyd7_s0_n1hj2z_716h0000gp/T/
/bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; rm -rf /tmp/half_plus_two; /Users/Atom/.anaconda/bin/python bazel-out/host/bin/tensorflow/contrib/session_bundle/example/export_half_plus_two; cp -r /tmp/half_plus_two/* bazel-out/local_darwin-opt/genfiles/tensorflow/contrib/session_bundle/example/half_plus_two'):

com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 245.

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.7.5.dylib locally

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.5.dylib locally

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.7.5.dylib locally

Target //tensorflow/tools/pip_package:build_pip_package failed to build`

Looking at the line the build fails on, and trying to break it up, it seems like /tmp/half_plus_two never gets written. It seems like bazel-out/host/bin/tensorflow/contrib/session_bundle/example/export_half_plus_two is meant to generate the file but fails silently. Therefore cp -r /tmp/half_plus_two/* bazel-out/local_darwin-opt/genfiles/tensorflow/contrib/session_bundle/example/half_plus_two gets an error since nothing exists.

Digging a little further it looks like bazel-out/host/bin/tensorflow/contrib/session_bundle/example/export_half_plus_two calls bazel-out/host/bin/tensorflow/contrib/session_bundle/example/export_half_plus_two.runfiles/org_tensorflow/tensorflow/contrib/session_bundle/example/export_half_plus_two.py where /tmp/half_plus_two actually gets written with tensorflow_serving.session_bundle.export.export.

Adding some print statements, it looks like import tensorflow as tf fails.

The printed output of import tensorflow as tf on a Linux box using google's pre-built GPU binary for tensorflow-0.9, cuda-7.5, cudnn-4 is:

`I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
`

Comparing to the output on OS X (above in the giant messy verbose output), it looks like libcuda.so and libcurand.so are not successfully opened.
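
The comparison above can be done mechanically by extracting the library names from the two logs and taking the set difference. A minimal sketch, using the dso_loader lines quoted in this comment as input; `opened_cuda_libs` is a hypothetical helper, and stripping everything after the first dot is an assumption made so that `.so` and versioned `.dylib` names compare equal:

```python
import re

OPENED = re.compile(r"successfully opened CUDA library (\S+) locally")

def opened_cuda_libs(log_text):
    """Return base library names (e.g. 'libcublas') reported as opened in a log."""
    names = set()
    for match in OPENED.finditer(log_text):
        # Drop version and extension: 'libcublas.7.5.dylib' -> 'libcublas'.
        names.add(match.group(1).split(".")[0])
    return names

linux_log = """
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
"""

osx_log = """
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.7.5.dylib locally
"""

missing = sorted(opened_cuda_libs(linux_log) - opened_cuda_libs(osx_log))
print(missing)  # ['libcuda', 'libcurand']
```

A library missing from the log only shows that no "successfully opened" line was printed for it before the process stopped; it does not by itself prove that loading that library is what failed.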

Hope this is helpful.

Update June 27 2016
I did a fresh clone of the tensorflow repo, and downloaded cuDNN 5 to replace cuDNN 5.1 RC.
Same error.

Also I tried bazel build tensorflow_serving/session_bundle/example:export_half_plus_two at the repo root and got:
ERROR: no such package 'tensorflow_serving/session_bundle/example': BUILD file not found on package path.

Mazecreator commented on May 19, 2024

I am having the same problem with the latest master. I have Ubuntu 15.10, CUDA 7.5, and cuDNN (tried both 4 and 5.1). I have no problem compiling the "r0.9" branch of "Tensorflow", but with Tensorflow-serving I cannot complete a build, as described above.

I did do a few tests. I can build export_half_plus_two without GPU support:
bazel build tensorflow_serving/session_bundle/example:export_half_plus_two

Once I add the cuda flag:
bazel build -c opt --config=cuda tensorflow_serving/session_bundle/example:export_half_plus_two

I get an error that shows where the build crashes:
ERROR: /home/greg/.cache/bazel/_bazel_greg/46040455ac0d0cc3199d34afa462e3f6/external/org_tensorflow/tensorflow/core/kernels/BUILD:779:1: undeclared inclusion(s) in rule '@org_tensorflow//tensorflow/core/kernels:resize_nearest_neighbor_op_gpu':
this rule is missing dependency declarations for the following files included by 'external/org_tensorflow/tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.cu.cc':
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_runtime.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/host_config.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/builtin_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/host_defines.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/driver_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/surface_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/texture_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/vector_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/channel_descriptor.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_runtime_api.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_device_runtime_api.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/driver_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/vector_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/vector_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/common_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/math_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/math_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/math_functions_dbl_ptx3.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/math_functions_dbl_ptx3.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_surface_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_texture_types.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_atomic_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_atomic_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_double_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_double_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_20_atomic_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_20_atomic_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_32_atomic_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_32_atomic_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_35_atomic_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_20_intrinsics.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_20_intrinsics.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_30_intrinsics.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_30_intrinsics.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_32_intrinsics.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_32_intrinsics.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/sm_35_intrinsics.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/surface_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/surface_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/texture_fetch_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/texture_fetch_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/texture_indirect_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/texture_indirect_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/surface_indirect_functions.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/surface_indirect_functions.hpp'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/device_launch_parameters.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda_fp16.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/math_constants.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_kernel.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_discrete.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_precalc.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_mrg32k3a.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_mtgp32_kernel.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/cuda.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_mtgp32.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_philox4x32_x.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_globals.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_uniform.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_normal.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_normal_static.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_lognormal.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_poisson.h'
'/usr/local/cuda-7.5/targets/x86_64-linux/include/curand_discrete2.h'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
Target //tensorflow_serving/session_bundle/example:export_half_plus_two failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 163.412s, Critical Path: 157.43s

Reading the "tensorflow/tensorflow" GitHub issues, this may be caused by a CROSSTOOL issue where these files are not included. The problem has been fixed in tensorflow, but the fix may not have been pulled into tensorflow-serving yet.
tensorflow/tensorflow#2109

I am stuck right now trying to get this to compile. Has anyone found a solution?

UPDATE: This is the exact error I get when building tensorflow_serving:

ERROR: /home/greg/serving/tensorflow_serving/session_bundle/example/BUILD:36:1: Executing genrule //tensorflow_serving/session_bundle/example:half_plus_two failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
Traceback (most recent call last):
  File "/home/greg/.cache/bazel/_bazel_greg/46040455ac0d0cc3199d34afa462e3f6/execroot/serving/bazel-out/host/bin/tensorflow_serving/session_bundle/example/export_half_plus_two.runfiles/tf_serving/tensorflow_serving/session_bundle/example/export_half_plus_two.py", line 32, in <module>
    import tensorflow as tf
  File "/home/greg/tensorflow/_python_build/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/home/greg/tensorflow/_python_build/tensorflow/python/__init__.py", line 48, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/greg/tensorflow/_python_build/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/home/greg/tensorflow/_python_build/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: libcudart.so.7.5: cannot open shared object file: No such file or directory
INFO: Elapsed time: 16.823s, Critical Path: 12.37s
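
The ImportError above means the dynamic loader could not find libcudart.so.7.5 when the script ran. One possible cause (an assumption, not confirmed in this thread) is that the CUDA library directory is not on the library search path inside the genrule, since the verbose logs earlier in the thread show the command running under `env -` with only PATH set. A minimal sketch for checking an LD_LIBRARY_PATH-style string; `find_on_library_path` is a hypothetical helper:

```python
import os

def find_on_library_path(lib_name, search_path=None):
    """Return the directories in an LD_LIBRARY_PATH-style string that contain lib_name."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.exists(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits

# An empty list means the library is not reachable through LD_LIBRARY_PATH in this environment.
print(find_on_library_path("libcudart.so.7.5"))
```

This only inspects LD_LIBRARY_PATH. The loader also consults other locations (for example the ldconfig cache), so an empty result here is a hint rather than a diagnosis.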

danieleghisi commented on May 19, 2024

Exactly the same error on Mac OS X 10.10.
Does anyone have a clue?

Last lines of log:
ERROR: /Users/danieleghisi/LibrariesAndStuff/tensorflow/tensorflow/contrib/session_bundle/example/BUILD:38:1: Executing genrule //tensorflow/contrib/session_bundle/example:half_plus_two failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 245.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.7.5.dylib locally
Target //tensorflow/tools/pip_package:build_pip_package failed to build

Mazecreator commented on May 19, 2024

My issue with half_plus_two was resolved with the last update to the master.

I still had compile issues with the GPU option enabled. I added the following around line 62 of /third_party/gpus/crosstool/CROSSTOOL, and that fixed the GPU issue. You may need to modify the path depending on the error you get from bazel.

cxx_builtin_include_directory: "/usr/local/cuda-7.5/include"
cxx_builtin_include_directory: "/usr/local/cuda-7.5/targets/x86_64-linux/include"

danieleghisi commented on May 19, 2024

Still not working for me.
I'm on Mac OS, so I tried to add
cxx_builtin_include_directory: "/usr/local/cuda/include"
which exists (I have found no "target" subfolder on the other hand.)
Re-configured, and re-bazeled, still getting the same error.

kirilg commented on May 19, 2024

Thanks everyone for reporting these issues to help track down the root cause.

General notes:
There seem to be multiple different issues here and some confusion about tensorflow/contrib/session_bundle vs. tensorflow_serving/session_bundle which are in different repos.
tensorflow_serving/session_bundle is the original session_bundle that we copied over to tensorflow/contrib since it's a more appropriate location for it long-term. In the near future we'll update all uses of tensorflow_serving/session_bundle and delete it, keeping only the version in contrib. The two should be functionally identical.

It's possible that at least some of these problems are caused by proto version issues that @kmhofmann mentioned (thanks!). Please try the solution mentioned here

Several people mentioned they're using OS X which we currently don't support. It's still worth investigating since the problems reported are session_bundle related, but just mentioning it here because even once fixed, there may still be other Mac specific issues that are outside the scope of this issue.

Several people mentioned compiling with CUDA. Please make sure to update Tensorflow's CROSSTOOL file as mentioned in this issue. Specifically, add the line cxx_builtin_include_directory: "/usr/local/cuda-7.0/include" (or the path where your cuda is installed) to tensorflow/third_party/gpus/crosstool/CROSSTOOL. This issue tracks automatically configuring the CROSSTOOL file, but for now you'll have to change it manually.


@parnell I recall there were issues with genrules using Python 3.5 necessitating Jenkins to update to Ubuntu 16.04, but don't know the details. Perhaps Python3.4 will work for this genrule (it does in tensorflow/contrib), though you may hit some gRPC issues that we see in tensorflow_serving since they don't support python 3 at all, so perhaps 2.7 would work even better for you.

@nathanielatom it looks like you're compiling the tensorflow repo (tensorflow/contrib/session_bundle) and not tensorflow_serving/session_bundle, is that correct? This would explain why it can't find tensorflow_serving/session_bundle/example:export_half_plus_two. The errors you mentioned (can't run import tensorflow as tf, and libcuda.so and libcurand.so not being found) would impact more than just session_bundle, and are worth checking with the Tensorflow team (tensorflow/tensorflow repo).

@Mazecreator You're hitting the CUDA issue mentioned above. Did you modify the crosstool file to point to the right CUDA directory? If not, please give that a try (instructions mentioned earlier in this post).

from serving.

kirilg avatar kirilg commented on May 19, 2024

Just noticed the last two posts, sorry.

You should only need the line
cxx_builtin_include_directory: "/usr/local/cuda-7.5/include"
@danieleghisi "/usr/local/cuda/include" may not work if it's a symlink, so please use the specific one like "/usr/local/cuda-7.5/include" or whatever version you're using.
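To find the symlink-free path for that line, something like the following might help. It is only a minimal sketch: the helper name crosstool_include_line is made up for illustration, and the default path is an assumption about where CUDA is installed.

```python
import os


def crosstool_include_line(cuda_include="/usr/local/cuda/include"):
    """Return a CROSSTOOL entry that uses the symlink-free form of the path.

    Hypothetical helper; the default path is an assumption, so pass the
    include directory of your own CUDA install.
    """
    # realpath follows every symlink in the path, so a /usr/local/cuda
    # symlink is replaced by the versioned directory it points at.
    resolved = os.path.realpath(cuda_include)
    return 'cxx_builtin_include_directory: "%s"' % resolved


if __name__ == "__main__":
    print(crosstool_include_line())
```

The printed line can then be pasted into tensorflow/third_party/gpus/crosstool/CROSSTOOL by hand.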

from serving.

kirilg avatar kirilg commented on May 19, 2024

One more thing to try:
Please try cloning Tensorflow directly and compiling:
bazel build tensorflow/contrib/session_bundle/example:all

If that succeeds when tensorflow_serving fails, we'll have something to go on and we can work on a fix. In that case I think we'd need to configure tensorflow_serving/session_bundle/example/BUILD to be more like Tensorflow's (or better yet, speed up deprecation of serving/session_bundle and switch to contrib). We could try something like:

  1. Add srcs_version = "PY2AND3", to export_half_plus_two
  2. Add $(PYTHON_BIN_PATH) to the genrule which will ensure the genrule is run with the configured python version. Since PYTHON_BIN_PATH is configured by the ./configure script, we'll want to manually replace it with the contents of tensorflow/tools/bazel.rc (e.g. PYTHON_BIN_PATH=/usr/bin/python) but not yet sure how we can do this in a more generic way.
  3. We'll probably need to copy some additional params from tensorflow/tools/bazel.rc to tensorflow_serving/tools/bazel.rc like
    build --define PYTHON_BIN_PATH=/usr/bin/python
    test --define PYTHON_BIN_PATH=/usr/bin/python
    run --define PYTHON_BIN_PATH=/usr/bin/python
    the multiple --host_force_python=py$PYTHON_MAJOR_VERSION entries, and possibly others.
    Again, this one is also hard to make generic since the bazel.rc file is configured by the ./configure script.
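As a rough illustration of step 3, the three --define lines could be generated for whichever interpreter is in use rather than hard-coded. This is a sketch only: the function name bazelrc_python_defines is hypothetical, it is not what ./configure actually does, and it leaves out the --host_force_python entries.

```python
import sys


def bazelrc_python_defines(python_bin=sys.executable):
    """Build the three --define lines listed above for one interpreter path.

    Hypothetical helper; defaults to the interpreter running this script.
    """
    return [
        "%s --define PYTHON_BIN_PATH=%s" % (command, python_bin)
        for command in ("build", "test", "run")
    ]


if __name__ == "__main__":
    # Print the lines so they can be appended to a bazel.rc file by hand.
    print("\n".join(bazelrc_python_defines()))
```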

from serving.

Mazecreator avatar Mazecreator commented on May 19, 2024

Just an update:
I loaded the latest Tensorflow_serving and had this problem again after adding the CROSSTOOL corrections.

I needed to run the following command to pre-compile TensorFlow before TensorFlow_serving compiled properly:
bazel build tensorflow/contrib/session_bundle/example:all

Also, I do not have "tensorflow_serving/tools/bazel.rc" on my development box, there is just a "docker" directory in the tools folder. Not sure if this is part of the issue?

What would it take to deprecate this distribution and combine it into the Tensorflow distribution? Is this something I can help with?

from serving.

kirilg avatar kirilg commented on May 19, 2024

Within the TF Serving repo it's just under tools/bazel.rc. I added those steps just as an fyi listing what could be done if compiling tensorflow/contrib/session_bundle works (from within the TF repo, not from tensorflow_serving) but not tensorflow_serving.

I'm a bit confused by the pre-compile statement. Precompiling shouldn't matter since Bazel would compile dependencies first anyway using the same command, so I think something else likely changed. Is it the case that you can now compile tensorflow_serving/session_bundle/...? You mentioned before that you could compile tensorflow_serving/session_bundle/example just fine, but it failed with --config=cuda. In your latest comment you used bazel build; did you mean to use bazel build --config=cuda?

from serving.

Mazecreator avatar Mazecreator commented on May 19, 2024

Okay, found the bazel.rc file.

I compiled "Tensorflow" as I have that installed as the development source code. I had never compiled any of the examples (or code) just the PIP package in the past.

It was right after I ran the bazel command in the "tensorflow" repository that "serving" then compiled and went past the "half_plus_two" failure. I have not tried "serving" this time with the --config=cuda flag, so it was just the basic build I was trying to get to succeed:
bazel build tensorflow_serving/...

What needs to be done to merge this into the Tensorflow repository? Can I help? With each update the two projects seem to fall out of sync.

from serving.

kirilg avatar kirilg commented on May 19, 2024

There is still some internal cleanup for us to do before we can delete tensorflow_serving/session_bundle and move to using session_bundle in tensorflow/contrib. This would need to be done inside Google, so unfortunately there is no way to help with that. It's not really a merge since right now the two are roughly equivalent, it's just a change of dependencies that should be a no-op.

We'd appreciate help in tracking down this issue though since none of us are able to reproduce locally and session_bundle works in both repos. Are you able to compile bazel build tensorflow_serving/... now?

from serving.

Mazecreator avatar Mazecreator commented on May 19, 2024

Yes, I can build tensorflow_serving now and compile my code.

Let me know if there is anything I can do to expedite the integration. I might be able to try a test version of TF with session_bundle integrated. Just let me know.

from serving.

kirilg avatar kirilg commented on May 19, 2024

We moved some more things to tensorflow/contrib and are reusing some of the code from there (e.g. manifest.proto and half_plus_two), instead of having copies in both. This should hopefully mean that the half_plus_two genrule runs with the right python environment configured by TensorFlow.

Please sync TensorFlow Serving (making sure to recursively also update submodules) to head and try the previously failing tests again, which will hopefully work now.

from serving.

drcege avatar drcege commented on May 19, 2024

Hi, I still cannot compile the latest code with GPU support.
I have added

cxx_builtin_include_directory: "/usr/local/cuda-7.5/include"
cxx_builtin_include_directory: "/usr/local/cuda-7.5/targets/x86_64-linux/include"

to the CROSSTOOL file, and I used bazel build -c opt --config=cuda tensorflow_serving/... on an Ubuntu 14.04 server.

What else should I do?

from serving.

kirilg avatar kirilg commented on May 19, 2024

@gecece not sure if it's a related issue, but let's keep the current issue strictly about problems compiling session_bundle. If this isn't related to that, could you please open a new GitHub issue? Include the Bazel output you get, including all errors, and what version of Bazel you're using.

from serving.

kirilg avatar kirilg commented on May 19, 2024

As an update, we finished moving session_bundle to tensorflow/contrib and it's no longer in the TensorFlow Serving repo. Closing this bug since there were no followup reports of any issues, but please re-open or file a new issue if you see the same problem at head in tensorflow/contrib/session_bundle.

from serving.
