
Comments (6)

garvct commented on July 26, 2024

Try to maximize the number of disks working on the I/O operation. beegfs-df can help to see what disks/targets are active.
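For example, a quick way to see the targets and how full each one is (a sketch; exact output columns vary by BeeGFS version):

# Show free and used space per metadata and storage target.
beegfs-df

# Alternative view with per-target space info via beegfs-ctl.
beegfs-ctl --listtargets --nodetype=storage --spaceinfo

Targets that stay empty while the others fill up are not taking part in the I/O.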


garvct commented on July 26, 2024

The following procedure worked for me:

  • Have an existing BeeGFS cluster (4 x L8s_v2), and want to increase the number of storage and metadata servers from 4 to 6:

    azhpc-resize beegfssm 6

[hpcadmin@beegfsm beegfs]$ beegfs-check-servers
Management

beegfsm [ID: 1]: reachable at 10.34.4.14:8008 (protocol: TCP)

Metadata

beegfa57e000000 [ID: 1]: reachable at 10.34.4.4:8005 (protocol: TCP)
beegfa57e000004 [ID: 2]: reachable at 10.34.4.8:8005 (protocol: TCP)
beegfa57e000003 [ID: 3]: reachable at 10.34.4.7:8005 (protocol: TCP)
beegfa57e000001 [ID: 4]: reachable at 10.34.4.5:8005 (protocol: TCP)
beegfa57e000006 [ID: 5]: reachable at 10.34.4.12:8005 (protocol: TCP)
beegfa57e000005 [ID: 6]: reachable at 10.34.4.6:8005 (protocol: TCP)

Storage

beegfa57e000001 [ID: 1]: reachable at 10.34.4.5:8003 (protocol: TCP)
beegfa57e000003 [ID: 2]: reachable at 10.34.4.7:8003 (protocol: TCP)
beegfa57e000004 [ID: 3]: reachable at 10.34.4.8:8003 (protocol: TCP)
beegfa57e000000 [ID: 4]: reachable at 10.34.4.4:8003 (protocol: TCP)
beegfa57e000006 [ID: 5]: reachable at 10.34.4.12:8003 (protocol: TCP)
beegfa57e000005 [ID: 6]: reachable at 10.34.4.6:8003 (protocol: TCP)

We can see that 2 extra storage and metadata servers have been added.
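To double-check that the new servers are actually usable, the target states can also be listed (a sketch; exact flags and state names may differ slightly between BeeGFS versions):

# List storage targets with their reachability and consistency state;
# the two new targets should report Online and Good.
beegfs-ctl --listtargets --nodetype=storage --state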


lmiroslaw commented on July 26, 2024

It worked, thanks. However, it is strange that I don't see a performance improvement when doubling the size of beegfsm. I am testing the performance by copying a 24 GB folder between two locations: time cp -R sim sim3
The folder contains ca. 120 directories with several files in each, in the MB range (2.2 MB, 119 MB, 47 MB).

For the small and the bigger beegfsm I get the same result:

real    2m26.809s
user    0m0.461s
sys     0m29.615s

vs.

real    2m32.859s
user    0m0.440s
sys     0m28.253s

I/O pattern: 55k reads, 50k writes, together accounting for 90% of the execution time.

I also tried changing the chunk size with beegfs-ctl --setpattern --chunksize=1m --numtargets=8 /beegfs/chunksize_1m_4t, testing 1 MB, 64 kB, and 4 MB chunk sizes with 8, 1, and 8 targets, respectively.

This did not affect the results much.
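For reference, each test directory was prepared along these lines (a reconstruction of the steps above, using the 4 MB/8-target case; the directory name is illustrative, and only files created after --setpattern pick up the new pattern):

# Create a test directory and assign it a 4 MB chunk size across 8 targets.
mkdir /beegfs/chunksize_4m_8t
beegfs-ctl --setpattern --chunksize=4m --numtargets=8 /beegfs/chunksize_4m_8t

# Verify the pattern that new files in this directory will inherit.
beegfs-ctl --getentryinfo /beegfs/chunksize_4m_8t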


garvct commented on July 26, 2024

Have you tried multiple cp's, perhaps each to a different target? You may need to determine whether the source data is spread over 4 storage targets or more, and whether it is reading or writing that slows the performance down.
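A quick way to check the striping of the source data (a sketch; $sourcedir stands for the folder being copied, and some_file is a placeholder):

# Show the stripe pattern (chunk size, number of storage targets) of the
# source directory; existing files keep the pattern they were created
# with, even after extra targets are added to the cluster.
beegfs-ctl --getentryinfo $sourcedir

# The same command on an individual file also lists the storage targets
# that hold its chunks.
beegfs-ctl --getentryinfo $sourcedir/some_file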


lmiroslaw commented on July 26, 2024

First feedback: this is my first attempt at parallelizing the cp operation:

N=119                   # index of the last processor directory (ca. 120 dirs)
for i in $(seq 0 $N)    # brace expansion {0..N} does not work with a variable
do
  mkdir -p "$destination/processor$i"
  cp -r "$sourcedir/processor$i/"* "$destination/processor$i" &
done
wait  # wait for all background cp processes to finish

With this code I was able to reduce the copy time from 1m41s to 58s. Now I will test the same code after doubling the size of the cluster.
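An equivalent variant with a bounded number of workers (a sketch, not part of the original test; assumes GNU xargs, the same processor* layout, and an arbitrary choice of 8 parallel jobs):

# Copy the processor directories in parallel, at most 8 cp processes
# at a time, instead of one background job per directory.
find "$sourcedir" -maxdepth 1 -name 'processor*' | \
    xargs -P8 -I{} cp -r {} "$destination/"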


garvct commented on July 26, 2024

Closed

