Comments (1)
I suggest the following to solve the problem:
We could take a peek at the memory size of one vector and derive a better-suited chunk size from that figure. This would align the chunks better with the cache sizes commonly found in modern processor architectures, resulting in more consistent performance.
A defensive approach would be to simply assume that 1M of cache memory is available on nearly every machine in use nowadays. Then we determine the stride size from that:
numStrideSize = max(1, 1048576 // np.empty((self.numCols, ), dtype=self.dtype).nbytes)
Since np.empty does not initialize the data section of that vector, the overhead should be negligible, making it preferable to hard-to-read explicit low-level architecture bean-counting in this context.
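To make the proposal concrete, here is a minimal standalone sketch of the stride computation. The function name `stride_size` and its parameters are hypothetical and not part of fastmat; the guard against a zero-length vector is an addition that the one-liner above does not have.

```python
import numpy as np

# Assumed cache budget of 1 MiB, taken from the suggestion above.
CACHE_BYTES = 1048576


def stride_size(num_cols, dtype, cache_bytes=CACHE_BYTES):
    """Return how many vectors of length num_cols fit into cache_bytes (at least 1)."""
    # np.empty allocates the buffer without initializing it, so this stays cheap.
    vector_bytes = np.empty((num_cols,), dtype=dtype).nbytes
    # Guard against an empty vector (0 bytes), which would divide by zero.
    vector_bytes = max(1, vector_bytes)
    return max(1, cache_bytes // vector_bytes)


if __name__ == "__main__":
    # 1024 float64 entries occupy 8192 bytes, so 128 vectors fit into 1 MiB.
    print(stride_size(1024, np.float64))
    # A vector larger than the budget still yields a stride of 1.
    print(stride_size(100000, np.complex128))
```

Inside the class, `num_cols` and `dtype` would correspond to `self.numCols` and `self.dtype`.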
For better adaptability, the magic number 1048576 should go into the flags object, such that this can be controlled by the user, perhaps even initialized from reading out the actual cache sizes.
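One way this could look is sketched below. The `Flags` class, `detect_cache_bytes` and `parse_cache_size` are hypothetical names for illustration and do not reflect fastmat's actual flags object. The sysfs path is the Linux location of the L2 cache size on many systems; on other platforms, or if the file is missing, the sketch falls back to the 1M default.

```python
import re
from dataclasses import dataclass
from pathlib import Path

# Defensive default from the suggestion above: assume 1 MiB of cache.
DEFAULT_CACHE_BYTES = 1048576


def parse_cache_size(text):
    """Parse a size string such as '512K' or '8M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip())
    if match is None:
        raise ValueError("unrecognized cache size: %r" % (text,))
    factor = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}[match.group(2)]
    return int(match.group(1)) * factor


def detect_cache_bytes(path="/sys/devices/system/cpu/cpu0/cache/index2/size"):
    """Try to read the cache size from sysfs (Linux only), else use the default."""
    try:
        return parse_cache_size(Path(path).read_text())
    except (OSError, ValueError):
        return DEFAULT_CACHE_BYTES


@dataclass
class Flags:
    """Hypothetical flags container holding the user-adjustable cache budget."""
    cache_bytes: int = DEFAULT_CACHE_BYTES


if __name__ == "__main__":
    flags = Flags(cache_bytes=detect_cache_bytes())
    print(flags.cache_bytes)
```

A user could then override the budget by setting `flags.cache_bytes` explicitly, and the stride computation would read it instead of the hard-coded constant.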
Opinions? Objections?