Comments (9)
Holy cow, that's indeed not good!
I was afraid of this, but quite honestly didn't think to test it on such large volumes. Thanks for this input.
I very much agree with your diagnosis of the problem: for sure the issue is that we keep everything we scanned in memory. I like your proposed direction. How about keeping a certain "radius" of the tree in memory (a depth level most users won't need to go past, e.g. 5, though that may be too much or too little and we'd have to play around with it to find out) and triggering a rescan of a branch when the user goes into it (or into the folder one level before it, ideally)?
I'm very happy to hear you like diskonaut. Your contributions are always very welcome. Will be looking forward to them (for this or other issues). Give me a shout for absolutely anything you need.
from diskonaut.
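To make the "radius" idea above concrete, here is a minimal sketch in Python (not diskonaut's Rust code, and not how diskonaut is actually implemented). The names `MAX_DEPTH`, `Node`, `scan`, `total_size` and `enter` are all hypothetical, and it sums `st_size` (apparent size) purely for simplicity. Folders beyond the depth limit are stored as a single node with a total size and no file names, and are rescanned only when entered.

```python
import os
from dataclasses import dataclass, field

MAX_DEPTH = 5  # hypothetical "radius"; the right value would need experimenting


@dataclass
class Node:
    path: str
    size: int = 0
    children: dict = field(default_factory=dict)  # entry name -> Node
    truncated: bool = False  # True: size is known, but contents are not in memory


def total_size(path):
    """Sum file sizes below path without keeping any names in memory."""
    total = 0
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        total += total_size(entry.path)
                    else:
                        total += entry.stat(follow_symlinks=False).st_size
                except OSError:
                    continue  # unreadable entry: skip it in this sketch
    except OSError:
        pass
    return total


def scan(path, depth=MAX_DEPTH):
    """Build a tree for path, storing only `depth` levels of subfolders."""
    node = Node(path)
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        if depth > 0:
                            child = scan(entry.path, depth - 1)
                        else:
                            # Past the radius: keep one summary node only.
                            child = Node(entry.path, total_size(entry.path),
                                         truncated=True)
                    else:
                        size = entry.stat(follow_symlinks=False).st_size
                        child = Node(entry.path, size)
                except OSError:
                    continue
                node.children[entry.name] = child
                node.size += child.size
    except OSError:
        pass
    return node


def enter(node, name, depth=MAX_DEPTH):
    """Descend into a child folder, rescanning it if it was only summarized."""
    child = node.children[name]
    if child.truncated:
        child = scan(child.path, depth)
        node.children[name] = child
    return child
```

Note that this sketch does not propagate a changed size back up to the parents after a rescan, and it rescans only when the user enters a summarized folder rather than one level earlier.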
Just a thought: would an sqlite in-memory DB help offload this? I have scanned a huge tree of files/dirs and the DB file is ~200 MB. You can store path/parent/size in three columns, add indexes, and then compute dir size based on the "parent" column. Or even inodes. Sorry if that's exactly what your app already does.
from diskonaut.
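A rough sketch of the path/parent/size layout suggested above, using Python's built-in `sqlite3` module. The table name `entries`, the index name, the helper names and the sample rows are all made up for illustration; nothing here is taken from diskonaut. Directory totals are computed by following the `parent` column with a recursive query.

```python
import sqlite3


def build_db(rows, location=":memory:"):
    """Create the three-column table plus an index on the parent column."""
    conn = sqlite3.connect(location)
    conn.execute(
        "CREATE TABLE entries ("
        " path TEXT PRIMARY KEY,"
        " parent TEXT,"
        " size INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_entries_parent ON entries(parent)")
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def dir_size(conn, path):
    """Total size of path and everything below it, found via parent links."""
    row = conn.execute(
        """
        WITH RECURSIVE subtree(path, size) AS (
            SELECT path, size FROM entries WHERE path = ?
            UNION ALL
            SELECT e.path, e.size
            FROM entries AS e
            JOIN subtree AS s ON e.parent = s.path
        )
        SELECT COALESCE(SUM(size), 0) FROM subtree
        """,
        (path,),
    ).fetchone()
    return row[0]


# Hypothetical sample data: directories are stored with size 0.
SAMPLE_ROWS = [
    ("/data", None, 0),
    ("/data/a.log", "/data", 100),
    ("/data/sub", "/data", 0),
    ("/data/sub/b.bin", "/data/sub", 250),
    ("/data/sub/deep", "/data/sub", 0),
    ("/data/sub/deep/c.bin", "/data/sub/deep", 50),
]

conn = build_db(SAMPLE_ROWS)
print(dir_size(conn, "/data"))      # 400
print(dir_size(conn, "/data/sub"))  # 300
```

Passing a file name instead of `":memory:"` as `location` gives the on-disk variant. Whether this would actually reduce memory use compared to the current approach would have to be measured.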
Hmm, interesting idea. I wonder how sqlite would deal with the String issue. :)
Tbh though, I feel this can be solved without introducing another dependency. Thanks for the suggestion though.
from diskonaut.
Or use an on-disk sqlite DB. I don't see how the in-memory one helps; it simply swaps one form of memory storage for another. On-disk sqlite gives you ultra-fast access with seamless spill onto disk.
from diskonaut.
@pm100 an on-disk dependency means the app will have to deal with DB versions, migrations and whatnot. My suggestion was meant to help by using a battle-tested piece of software that has very efficient data structures for storage and querying (as long as it is configured and used correctly).
@imsnif what String issue? The multibyte characters (#54)?
from diskonaut.
I will make an on-disk sqlite version for kicks; I know sqlite very well.
from diskonaut.
@pm100 Feel free to fork this repo and do whatever you feel is best.
from diskonaut.
Ah no, sorry. I meant what OP mentioned, which is likely where this issue comes from.
What (we assume) the issue here is, is that the names of all the files are kept in memory as strings, and for extremely large volumes (which I believe OP was talking about) this can take up quite a lot of memory.
We may be able to solve this with sqlite, but sqlite would surely need to solve this same problem as well (same memory and same strings, after all). While I don't know how/if sqlite solves an issue like this, I assume it involves some sort of caching on the HD. I think it would serve us better to implement something similar ourselves (as I suggested in my comment above), seeing as it would allow us to specialize and even improve the app: e.g. allow for an HD cache per subfolder that can also be reused between runs and busted when the last scan of that particular subfolder is older than X (or upon user request; we could show "time since last scan" for each subfolder in the UI, as an example).
That said, I make a lot of assumptions here. I'm of course willing to take a look at a version with sqlite, but I must honestly say that I'm not convinced about its benefits in this case.
from diskonaut.
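For illustration, the per-subfolder HD cache with an age limit described above could look roughly like this in Python. It is only a sketch under assumptions: the JSON file format, hashing the folder path to get a file name, and the names `store_scan` and `load_scan` are all invented here, and only the total size is cached rather than the full listing.

```python
import hashlib
import json
import os
import time


def _cache_file(cache_dir, folder):
    """One cache file per scanned folder, named by a hash of its absolute path."""
    digest = hashlib.sha256(os.fsencode(os.path.abspath(folder))).hexdigest()
    return os.path.join(cache_dir, digest + ".json")


def store_scan(cache_dir, folder, size, now=None):
    """Record the scan result for folder together with the time of the scan."""
    os.makedirs(cache_dir, exist_ok=True)
    record = {"size": size, "scanned_at": time.time() if now is None else now}
    with open(_cache_file(cache_dir, folder), "w", encoding="utf-8") as fh:
        json.dump(record, fh)


def load_scan(cache_dir, folder, max_age_seconds, now=None):
    """Return (size, age_in_seconds), or None if there is no fresh cache entry."""
    now = time.time() if now is None else now
    try:
        with open(_cache_file(cache_dir, folder), encoding="utf-8") as fh:
            record = json.load(fh)
    except (OSError, ValueError):
        return None  # missing or unreadable cache file: caller must rescan
    age = now - record["scanned_at"]
    if age > max_age_seconds:
        return None  # older than X: treat the entry as busted
    return record["size"], age
```

The returned age is what a "time since last scan" label in the UI could be built from. A rescan on user request would simply skip `load_scan` and call `store_scan` again with the fresh result.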
I just had this happen too.
from diskonaut.
Related Issues (20)
- thread 'main' panicked at 'index out of bounds: the len is 3600 but the index is 3600 HOT 12
- Bug: When deleting files or entering folders with multibyte characters, the app might crash
- Feature: implement `--apparent-size` HOT 1
- Bug: Small files hidden from view HOT 7
- Option to delete without confiration? HOT 2
- failure="0.1" not available, the current version is 0.1.8 HOT 4
- An option not to cross filesystems HOT 1
- most dirs are missing HOT 4
- Feature Request: Give an option to exclude directory HOT 1
- offer prebuild deb files in a ppa
- Update jwalk dependency to 0.6
- aarch64 and armhf doest pass tests ?
- The folder takes up too much space (7091231513.3G) HOT 2
- Feature Request: Ability to not cross filesystem boundaries. HOT 2
- Graphics break down when running over SSH to Windows HOT 1
- Feature: 'explode' a directory
- Numbers aren't adding up
- Feature: option `--ignore` to skip certain directory/directories HOT 2
- OneDrive on-demand - files showing full size