Comments (11)
Hi! We've released thinc 8.1.2 today, which restricts blis to <0.8.0 again instead of <0.10.0. Thanks again for your report!
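For anyone who wants to check their own environment against the restored pin, here is a minimal sketch. The helper names `blis_within_pin` and `installed_blis_version` are hypothetical and not part of thinc or blis; the check only compares the major and minor components of a plain `X.Y.Z` version string against the <0.8.0 bound.

```python
from importlib.metadata import PackageNotFoundError, version


def blis_within_pin(version_string, upper=(0, 8)):
    """Return True if the major.minor of version_string is below `upper`.

    Hypothetical helper: only handles versions that start with 'X.Y'.
    """
    major, minor = (int(part) for part in version_string.split(".")[:2])
    return (major, minor) < upper


def installed_blis_version():
    """Return the installed blis version string, or None if blis is absent."""
    try:
        return version("blis")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_blis_version()
    if found is None:
        print("blis is not installed")
    else:
        print(f"blis {found}: within <0.8.0 pin = {blis_within_pin(found)}")
```

With the two versions discussed in this thread, `blis_within_pin("0.7.8")` is true and `blis_within_pin("0.9.1")` is false.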
from thinc.
Hi @danieldk and @njsmith, thanks for the quick responses. That's a nice idea re blis 0.7.8, we shall do that now and come back to you.
from thinc.
I've run some more CI runs here. The results are as follows:
- CI #430 to #460 is with thinc 8.1.1 and blis 0.7.8. There were no failures in 30 runs.
- CI #462 to #491 is with thinc 8.1.1 and blis 0.9.1. There were 5 failures in 30 runs.
The error is rather intermittent, so this is not 100% conclusive, but it does suggest an issue with more recent blis versions.
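As a rough sanity check on how suggestive these numbers are, the sketch below asks how likely 30 clean runs would be if the blis 0.7.8 batch actually failed at the same rate as the blis 0.9.1 batch. It assumes each CI run fails independently with a fixed probability, which is a simplification since the runners are not identical.

```python
# Observed in the blis 0.9.1 batch: 5 failures in 30 runs.
failures, runs = 5, 30
failure_rate = failures / runs

# Probability of 30 consecutive passes if every run failed independently
# at that same rate (an assumption, since CI runners differ).
p_all_pass = (1 - failure_rate) ** runs

print(f"estimated failure rate: {failure_rate:.3f}")
print(f"chance of 30 clean runs at that rate: {p_all_pass:.4f}")
```

Under that assumption the chance of seeing no failures by luck comes out well under 1%, which supports the suspicion without proving it.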
from thinc.
@ascillitoe Could you reproduce the crash with the following env variable and post the stdout output of the run: BLIS_ARCH_DEBUG=1? This will help us narrow down which BLIS kernel is faulty - you should see something along the lines of libblis: selecting sub-configuration 'haswell'. in the log.
Sure thing, I'll have a go!
from thinc.
Thanks for the quick response! Yeah, the OOB access is in one of the haswell kernels, which confirms our suspicion.
from thinc.
As further info, looking at the traceback again:
File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\spacy\ml\staticvectors.py", line 56 in forward
@RobertSamoilescu has noticed the failure appears to originate in staticvectors.py here. We're therefore wondering if the recent change to gemm (explosion/cython-blis#72) is the cause? Please let us know if it would be better to open an issue over on cython-blis, or if we can provide more info.
from thinc.
@RobertSamoilescu has noticed the failure appears to originate in staticvectors.py here. We're therefore wondering if the recent change to gemm (explosion/cython-blis#72) is the cause?
Unlikely: erroneously using uninitialized memory wouldn't cause segfaults, only garbage output. We found a bunch of issues in BLIS 0.9 where gemm would read out of bounds. These were fixed upstream and the fixes were added to cython-blis. Since we thought the issues in BLIS were fixed, we changed the upper pin to include cython-blis 0.9.x in Thinc 8.1.1. But it seems like there are more memory issues 😢.
Thanks for reporting this!
from thinc.
Can you check whether you still see the same issue in an environment that has thinc 8.1.1 + blis 0.7.8? That would let us confirm for sure whether the issue is coming from thinc or blis.
from thinc.
Thanks for the update @svlandeg!
from thinc.
@ascillitoe Could you reproduce the crash with the following env variable and post the stdout output of the run: BLIS_ARCH_DEBUG=1? This will help us narrow down which BLIS kernel is faulty - you should see something along the lines of libblis: selecting sub-configuration 'haswell'. in the log.
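One way to do this from a script is sketched below. `run_with_blis_debug` is a hypothetical helper, not part of thinc or blis, and the command in the `__main__` block is only a placeholder for whatever test invocation triggers the crash.

```python
import os
import subprocess
import sys


def run_with_blis_debug(command):
    """Run `command` with BLIS_ARCH_DEBUG=1 set and return (returncode, stdout)."""
    # Copy the current environment and add the debug variable on top of it.
    env = dict(os.environ, BLIS_ARCH_DEBUG="1")
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Placeholder command: substitute the test invocation that crashes.
    code, out = run_with_blis_debug([sys.executable, "-c", "print('hello')"])
    print("exit code:", code)
    print(out)
```

The variable is set only for the child process, so the rest of the CI job is unaffected.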
from thinc.
Hi again. I've set some more CI runs going here (508 to 518).
Interesting, it looks like all the runs that fail (like this one) have:
libblis: selecting sub-configuration 'haswell'.
whereas the runs that pass (like this one) have:
libblis: Hardware has 2 FMA units; using 'skx' sub-config.
libblis: selecting sub-configuration 'skx'.
So it looks like it might be the haswell kernels?
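To sort a larger batch of CI logs by kernel, something like the sketch below could pull the selected sub-configuration out of each log. It assumes the log line has exactly the format quoted above; `selected_subconfig` is a hypothetical helper name.

```python
import re

# Matches lines such as: libblis: selecting sub-configuration 'haswell'.
SELECTION = re.compile(r"libblis: selecting sub-configuration '([^']+)'")


def selected_subconfig(log_text):
    """Return the first sub-configuration named in log_text, or None."""
    match = SELECTION.search(log_text)
    return match.group(1) if match else None


failing_log = "libblis: selecting sub-configuration 'haswell'.\n"
passing_log = (
    "libblis: Hardware has 2 FMA units; using 'skx' sub-config.\n"
    "libblis: selecting sub-configuration 'skx'.\n"
)

print(selected_subconfig(failing_log))
print(selected_subconfig(passing_log))
```

Grouping pass/fail results by the returned name would show whether every failure really lands on haswell.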
from thinc.