Comments (4)
Steps and issues encountered while installing the CPU PJRT plugin:
01: Install torch_xla [Success]
pip install torch_xla
02: Build or install the CPU plugin [Failed]
# Build wheel
pip wheel plugins/cpu -v
# Or install directly
pip install plugins/cpu -v
The failure was similar to the issue mentioned in #7184 (comment).
03: Install bazel [Success]
brew install bazel
04: Resolve bazel version mismatch [Success]
ERROR: The project you're trying to build requires Bazel 6.5.0 (specified in /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelversion), but it wasn't found in /opt/homebrew/Cellar/bazel/7.1.2/libexec/bin.
cd "/opt/homebrew/Cellar/bazel/7.1.2/libexec/bin" && curl -fLO https://releases.bazel.build/6.5.0/release/bazel-6.5.0-darwin-arm64 && chmod +x bazel-6.5.0-darwin-arm64
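The mismatch in step 04 can be detected before starting a build. A minimal sketch, assuming the repository pins its version in a `.bazelversion` file (as the error message indicates) and that `bazel --version` prints something like `bazel 7.1.2`. The helper names are hypothetical and not part of torch_xla:

```python
import re
from pathlib import Path


def required_bazel_version(repo_root: str) -> str:
    """Return the version pinned in <repo_root>/.bazelversion."""
    return Path(repo_root, ".bazelversion").read_text().strip()


def parse_bazel_version(version_output: str) -> str:
    """Extract 'X.Y.Z' from output such as 'bazel 7.1.2'."""
    match = re.search(r"(\d+\.\d+\.\d+)", version_output)
    if match is None:
        raise ValueError(f"no version found in {version_output!r}")
    return match.group(1)


def versions_match(repo_root: str, version_output: str) -> bool:
    """True when the installed Bazel matches the pinned version."""
    return required_bazel_version(repo_root) == parse_bazel_version(version_output)
```

The `version_output` argument would typically come from running `bazel --version` through `subprocess`; it is passed in as a string here so the check can be tried without Bazel installed.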
05: C++ standard version mismatch [Success]
The following was added to .bazelrc:
build --cxxopt=-std=gnu++17
build --host_cxxopt=-std=gnu++17
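If the `.bazelrc` change needs to be repeated on another checkout, it can be applied idempotently. This is only a sketch: the function name is hypothetical, and it assumes that appending the two lines at the end of the file is acceptable.

```python
from pathlib import Path

# The two lines added to .bazelrc in step 05.
CXX17_FLAGS = (
    "build --cxxopt=-std=gnu++17",
    "build --host_cxxopt=-std=gnu++17",
)


def ensure_bazelrc_lines(bazelrc_path: str, lines=CXX17_FLAGS) -> list[str]:
    """Append any of `lines` missing from the file; return the lines added."""
    path = Path(bazelrc_path)
    text = path.read_text() if path.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [line for line in lines if line not in present]
    if missing:
        # Start on a fresh line in case the file lacks a trailing newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with path.open("a") as f:
            f.write(prefix + "\n".join(missing) + "\n")
    return missing
```

Running it a second time returns an empty list and leaves the file unchanged.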
06: Retry installing the CPU plugin [Failed]
$ pip install plugins/cpu -v
Using pip 23.3.1 from /Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip (python 3.11)
Processing ./plugins/cpu
Running command pip subprocess to install build dependencies
Collecting setuptools
Using cached setuptools-70.0.0-py3-none-any.whl.metadata (5.9 kB)
Using cached setuptools-70.0.0-py3-none-any.whl (863 kB)
Installing collected packages: setuptools
Successfully installed setuptools-70.0.0
Installing build dependencies ... done
Running command Getting requirements to build wheel
bazel build //plugins/cpu:pjrt_c_api_cpu_plugin.so --symlink_prefix=/Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/bazel- --remote_default_exec_properties=cache-silo-key=dev
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelrc:
Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelrc:
'build' options: --announce_rc --nocheck_visibility --enable_platform_specific_config --experimental_cc_shared_library --define=no_aws_support=true --define=no_hdfs_support=true --define=no_hdfs_support=true --define=no_kafka_support=true --define=no_ignite_support=true --define=grpc_no_ares=true -c opt --config=short_logs --action_env=CC=gcc --action_env=CXX=g++ --spawn_strategy=standalone --incompatible_strict_action_env --noremote_upload_local_results --java_runtime_version=remotejdk_11 --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1 --define framework_shared_object=false --define tsl_protobuf_header_only=false --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --define=with_xla_support=true --noincompatible_remove_legacy_whole_archive --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility --cxxopt=-std=gnu++17 --host_cxxopt=-std=gnu++17
INFO: Found applicable config definition build:short_logs in file /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
Loading:
Loading:
Loading: 0 packages loaded
INFO: Build options --cxxopt and --host_cxxopt have changed, discarding analysis cache.
Analyzing: target //plugins/cpu:pjrt_c_api_cpu_plugin.so (0 packages loaded, 0 targets configured)
INFO: Analyzed target //plugins/cpu:pjrt_c_api_cpu_plugin.so (1 packages loaded, 10840 targets configured).
checking cached actions
INFO: Found 1 target...
[1 / 5] [Prepa] BazelWorkspaceStatusAction stable-status.txt
[249 / 1,676] Compiling llvm/lib/Demangle/RustDemangle.cpp [for tool]; 1s local ... (7 actions, 6 running)
[381 / 1,889] Compiling src/google/protobuf/compiler/zip_writer.cc [for tool]; 1s local ... (7 actions, 6 running)
[1,621 / 3,679] Compiling src/google/protobuf/compiler/code_generator.cc [for tool]; 2s local ... (7 actions, 6 running)
[2,685 / 5,997] Compiling src/google/protobuf/compiler/python/helpers.cc [for tool]; 2s local ... (6 actions running)
[2,952 / 6,589] Compiling xla/ef57.cc; 2s local ... (7 actions running)
[2,956 / 6,589] Compiling src/google/protobuf/compiler/python/pyi_generator.cc [for tool]; 3s local ... (5 actions running)
[6,588 / 6,589] Linking plugins/cpu/pjrt_c_api_cpu_plugin.so; 0s local
ERROR: /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/BUILD:17:14: Linking plugins/cpu/pjrt_c_api_cpu_plugin.so failed: (Exit 1): cc_wrapper.sh failed: error executing command (from target //plugins/cpu:pjrt_c_api_cpu_plugin.so) external/local_config_cc/cc_wrapper.sh @bazel-out/darwin_arm64-opt/bin/plugins/cpu/pjrt_c_api_cpu_plugin.so-2.params
ld: unknown options: --version-script --no-undefined
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Target //plugins/cpu:pjrt_c_api_cpu_plugin.so failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 9.527s, Critical Path: 4.55s
INFO: 27 processes: 2 internal, 25 local.
FAILED: Build did NOT complete successfully
Traceback (most recent call last):
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/pip-build-env-3vcjdfr7/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/pip-build-env-3vcjdfr7/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "/private/var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/pip-build-env-3vcjdfr7/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 10, in <module>
File "/Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/../../build_util.py", line 67, in bazel_build
subprocess.check_call(bazel_argv, stdout=sys.stdout, stderr=sys.stderr)
File "/Users/tej/anaconda3/envs/PyTorch/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['bazel', 'build', '//plugins/cpu:pjrt_c_api_cpu_plugin.so', '--symlink_prefix=/Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/bazel-', '--remote_default_exec_properties=cache-silo-key=dev']' returned non-zero exit status 1.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /Users/tej/anaconda3/envs/PyTorch/bin/python /Users/tej/anaconda3/envs/PyTorch/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py get_requires_for_build_wheel /var/folders/wc/rkrv7ck92zd4f_3qgk8q2gn00000gn/T/tmpn1xqffzo
cwd: /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
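Most of the output above is pip and Bazel scaffolding; the lines that identify the failure come from Bazel and the linker. A small sketch for pulling those lines out of a saved log, assuming the prefixes seen in this particular log (`ERROR:`, `ld:`, `clang: error:`) are the interesting ones. The function name is hypothetical:

```python
def extract_errors(log_text: str,
                   prefixes=("ERROR:", "ld:", "clang: error:")) -> list[str]:
    """Return the stripped log lines that start with one of `prefixes`."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith(prefixes)
    ]
```

Applied to the log above, this would keep the `Linking plugins/cpu/pjrt_c_api_cpu_plugin.so failed` line and the two linker messages that follow it, and drop the pip traceback.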
Machine Specs
$ python -V
Python 3.11.9
$ pip list
Package Version
------------------------- --------------
accelerate 0.30.0.dev0
aiohttp 3.9.5
aiosignal 1.3.1
anyio 4.3.0
appnope 0.1.4
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
async-lru 2.0.4
attrs 23.2.0
audioread 3.0.1
Babel 2.14.0
beautifulsoup4 4.12.3
bitsandbytes 0.42.0
bleach 6.1.0
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
comm 0.2.2
contourpy 1.2.1
cycler 0.12.1
datasets 2.19.1
debugpy 1.8.1
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.8
executing 2.0.1
fastjsonschema 2.19.1
filelock 3.13.4
fonttools 4.51.0
fqdn 1.5.1
frozenlist 1.4.1
fsspec 2024.3.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.22.2
idna 3.7
ipykernel 6.29.4
ipython 8.24.0
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.3
joblib 1.4.2
json5 0.9.25
jsonpointer 2.4
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
jupyter_client 8.6.1
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.0
jupyter_server_terminals 0.5.3
jupyterlab 4.1.8
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.1
kiwisolver 1.4.5
lazy_loader 0.4
librosa 0.10.2
llvmlite 0.42.0
MarkupSafe 2.1.5
matplotlib 3.8.4
matplotlib-inline 0.1.7
mistune 3.0.2
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nbclient 0.10.0
nbconvert 7.16.3
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.3
notebook 7.1.3
notebook_shim 0.2.4
numba 0.59.1
numpy 1.26.4
overrides 7.7.0
packaging 24.0
pandas 2.2.2
pandocfilters 1.5.1
parso 0.8.4
pexpect 4.9.0
pillow 10.3.0
pip 23.3.1
platformdirs 4.2.1
pooch 1.8.1
prometheus_client 0.20.0
prompt-toolkit 3.0.43
psutil 5.9.8
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 16.0.0
pyarrow-hotfix 0.6
pycparser 2.22
Pygments 2.17.2
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-json-logger 2.0.7
pytube 15.0.0
pytz 2024.1
PyYAML 6.0.1
pyzmq 26.0.2
referencing 0.35.0
regex 2024.5.10
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.18.0
safetensors 0.4.3
scikit-learn 1.4.2
scipy 1.13.0
seaborn 0.13.2
Send2Trash 1.8.3
sentencepiece 0.2.0
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
soundfile 0.12.1
soupsieve 2.5
soxr 0.3.7
stack-data 0.6.3
sympy 1.12
terminado 0.18.1
threadpoolctl 3.5.0
tinycss2 1.3.0
tokenizers 0.19.1
torch 2.3.0
torch-xla 1.0
torchaudio 2.3.0
torchvision 0.18.0
tornado 6.4
tqdm 4.66.2
traitlets 5.14.3
transformers 4.40.2
types-python-dateutil 2.9.0.20240316
typing_extensions 4.11.0
tzdata 2024.1
uri-template 1.3.0
urllib3 2.2.1
wcwidth 0.2.13
webcolors 1.13
webencodings 0.5.1
websocket-client 1.8.0
wheel 0.41.2
xgboost 2.0.3
xxhash 3.4.1
yarl 1.9.4
$ system_profiler SPSoftwareDataType SPHardwareDataType
Software:
System Software Overview:
System Version: macOS 14.4.1 (23E224)
Kernel Version: Darwin 23.4.0
Boot Volume: Macintosh HD
...
Hardware:
Hardware Overview:
Model Name: MacBook Air
Chip: Apple M1
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 8 GB
...
/assigntome
@duncantech May I know if this is what's expected? Or is there something wrong with what I'm doing?
The real error seems to be:
ERROR: /Users/tej/Documents/GitHub-Repositories/MachineLearning/Docathon-2024/xla/plugins/cpu/BUILD:17:14: Linking plugins/cpu/pjrt_c_api_cpu_plugin.so failed: (Exit 1): cc_wrapper.sh failed: error executing command (from target //plugins/cpu:pjrt_c_api_cpu_plugin.so) external/local_config_cc/cc_wrapper.sh @bazel-out/darwin_arm64-opt/bin/plugins/cpu/pjrt_c_api_cpu_plugin.so-2.params
ld: unknown options: --version-script --no-undefined
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I asked Bard and it told me:
"Platform incompatibility: These options might be specific to certain platforms or linkers. For example, --no-undefined is generally used with the GNU linker, and it may not be supported on other linkers like the one Apple uses for macOS. Similarly, --version-script is used to control symbol versions and might not be available on all platforms."
I am guessing the ARM CPU build does not work out of the box and requires us to tweak the build config.
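To illustrate the kind of tweak this would involve, here is a sketch of choosing link options per platform. It assumes the intent of the two failing options is to restrict exported symbols and to reject undefined symbols; Apple's linker has `-exported_symbols_list` and `-undefined error` for those purposes. The function name and the symbols file are hypothetical, and nothing in this thread establishes how the plugin's BUILD file should actually be changed.

```python
import platform


def symbol_export_linkopts(system: str, symbols_file: str) -> list[str]:
    """Return compiler-driver link options that restrict exported symbols.

    `symbols_file` is a GNU version script on Linux and an exported-symbols
    list on macOS; the two file formats differ and are not interchangeable.
    """
    if system == "Darwin":
        # Apple's ld rejects --version-script and --no-undefined.
        return [
            f"-Wl,-exported_symbols_list,{symbols_file}",
            "-Wl,-undefined,error",
        ]
    # GNU ld style options, as seen in the failing link command.
    return [
        f"-Wl,--version-script={symbols_file}",
        "-Wl,--no-undefined",
    ]
```

A caller would pass `platform.system()` as `system`, which returns `"Darwin"` on macOS. In a Bazel BUILD file the same branching is usually expressed with `select()` rather than in Python.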