Comments (7)
MooseFS should be deleting chunks even when in high speed rebalance. Possible reasons for not deleting chunks in the system include:
- a disconnected chunk server in maintenance mode
- a connecting/disconnecting chunk server (still in the chunk registration/de-registration phase; be aware that de-registration is done by the master, so even if the chunk server process has already shut down, the master may need a couple of minutes to finish if the chunk server held a lot of chunks)
- an operation limit has been reached: either the deletion limit is set so low that you don't see the effect of deletions because new chunks ready for deletion keep appearing, or the overall number of other jobs is so high that there are not enough resources left for deletions (deletions are quite low on the priority scale)
Are you sure none of the above applies in your case? If you are, please tell us the version of MooseFS you are using and any config settings that differ from the defaults (you can omit custom instance name and custom pathnames).
from moosefs.
Yes, absolutely sure. Latest MooseFS release (3.0.117). One disk marked for rebalance ("<") in preparation for its removal (to move its chunks to other disks).
from moosefs.
Is the one disk marked "<" in the same chunk server that has high speed rebalance on, or in another one? You wrote "A chunk server is not deleting", but just making sure: do you have only one chunk server in high speed rebalance, or more? If more, how many? And how many chunk servers are there in total in this instance?
I want to re-create your setup in our lab and run tests.
from moosefs.
The same chunkserver, obviously. Only one chunkserver is in high-speed rebalance -- the very chunkserver that is not deleting chunks (until rebalance is finished).
A dozen chunkservers total, but only one is in active high-speed replication mode, because one of its HDDs is marked "<" to empty it by relocating its chunks to other HDDs, together with HDD_HIGH_SPEED_REBALANCE_LIMIT = 3.
Total number of chunkservers hardly matters, as long as there is more than one...
A particular chunkserver is busy, with its load highlighted with <N> in the "Servers" view, and the corresponding "Server Chart" indicates internal high-speed rebalance.
from moosefs.
Hi, a just barely related question:
If one HDD (or more) is marked "<" to empty it, why does that particular chunkserver see more traffic than the others? I would have assumed that MooseFS would copy the chunks from other replicas on other chunkservers, to parallelise the rebalance operation so that it finishes sooner.
from moosefs.
@inkdot7 I'm not 100% sure I understand your question, but maybe this will explain: the master is not aware of a chunk server's internal goings-on, and that includes internal disk rebalance. For the master it's a normal chunk server. If a chunk server is in high speed rebalance mode and the high speed "tempo" is high, the chunk server may tell the master that it is overloaded, and the master will try not to send it any tasks for a while.
The only thing the master is aware of is disks marked for removal, but that's the "*" designation in mfshdd.cfg, not "<" or ">". Marked for removal is different in that the disk is considered damaged in some way, so replications try not to use it at all, i.e. if a chunk needs to be replicated because it is on an MFR disk, but a copy of this chunk also exists elsewhere, that other copy will be used as the source of replication.
MFR ("*") is for replicating data on an endangered disk, and the whole instance takes part in that; the disk itself is spared I/O whenever possible. Internal rebalance, whether "organic" or forced using "<" and/or ">", is done 100% internally and the rest of the system doesn't know and doesn't care about it.
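To make the distinction above concrete, here is a small illustrative model in Python. It is a sketch, not MooseFS code: the names classify_hdd_entry, ChunkCopy and pick_replication_source are hypothetical, the disk paths are made up, and the meaning given to ">" (move chunks onto the disk) is an assumption, since this thread only spells out "*" and "<". Real mfshdd.cfg entries can carry options this sketch ignores, so check the man page for your version.

```python
from dataclasses import dataclass

# Prefixes discussed in this thread for mfshdd.cfg entries.
# "*" and "<" follow the descriptions above; ">" is an assumption.
MARKERS = {
    "*": "marked_for_removal",  # master-aware; disk is spared I/O
    "<": "move_chunks_off",     # internal rebalance, master unaware
    ">": "move_chunks_onto",    # internal rebalance, master unaware
}


def classify_hdd_entry(line: str) -> tuple[str, str]:
    """Split one mfshdd.cfg-style entry into (role, remainder of the line)."""
    entry = line.strip()
    if entry and entry[0] in MARKERS:
        return MARKERS[entry[0]], entry[1:].strip()
    return "normal", entry


@dataclass(frozen=True)
class ChunkCopy:
    server: str        # hypothetical chunk server name
    on_mfr_disk: bool  # True if this copy lives on a "*" disk


def pick_replication_source(copies: list[ChunkCopy]) -> ChunkCopy:
    """Prefer a copy that is not on an MFR disk; fall back to any copy."""
    if not copies:
        raise ValueError("chunk has no copies to replicate from")
    not_on_mfr = [c for c in copies if not c.on_mfr_disk]
    return (not_on_mfr or copies)[0]


if __name__ == "__main__":
    print(classify_hdd_entry("</mnt/hdd3"))
    print(pick_replication_source(
        [ChunkCopy("cs1", True), ChunkCopy("cs2", False)]
    ))
```

The fallback in pick_replication_source reflects the "whenever possible" wording: if the only copy is on the MFR disk, that disk still has to serve as the source.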
from moosefs.
@onlyjob I tried to replicate your issue, but I can't, my instance deleted all the unnecessary chunks while one of the chunk servers was in high speed rebalance mode. I want to try to test it again on a larger scale (the instance that I used for testing was small and the internal rebalance on the one chunk server was completed in minutes), but I need to wait for completion of some other test we are currently running before I can try to do that.
from moosefs.