Comments (6)
Try to maximize the number of disks working on the I/O operation. beegfs-df can help you see which disks/targets are active.
from azurehpc.
The following procedure worked for me.
- Starting point: an existing beegfs (4xL8sv2) whose storage and metadata servers should grow from 4 to 6.
- azhpc-resize beegfssm 6
[hpcadmin@beegfsm beegfs]$ beegfs-check-servers
Management
beegfsm [ID: 1]: reachable at 10.34.4.14:8008 (protocol: TCP)
Metadata
beegfa57e000000 [ID: 1]: reachable at 10.34.4.4:8005 (protocol: TCP)
beegfa57e000004 [ID: 2]: reachable at 10.34.4.8:8005 (protocol: TCP)
beegfa57e000003 [ID: 3]: reachable at 10.34.4.7:8005 (protocol: TCP)
beegfa57e000001 [ID: 4]: reachable at 10.34.4.5:8005 (protocol: TCP)
beegfa57e000006 [ID: 5]: reachable at 10.34.4.12:8005 (protocol: TCP)
beegfa57e000005 [ID: 6]: reachable at 10.34.4.6:8005 (protocol: TCP)
Storage
beegfa57e000001 [ID: 1]: reachable at 10.34.4.5:8003 (protocol: TCP)
beegfa57e000003 [ID: 2]: reachable at 10.34.4.7:8003 (protocol: TCP)
beegfa57e000004 [ID: 3]: reachable at 10.34.4.8:8003 (protocol: TCP)
beegfa57e000000 [ID: 4]: reachable at 10.34.4.4:8003 (protocol: TCP)
beegfa57e000006 [ID: 5]: reachable at 10.34.4.12:8003 (protocol: TCP)
beegfa57e000005 [ID: 6]: reachable at 10.34.4.6:8003 (protocol: TCP)
We can see that 2 extra storage and metadata servers have been added.
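To check the server counts without reading the listing by eye, the output can be tallied per section. This is only a sketch: the helper name count_servers is made up for illustration, and it assumes the output keeps the layout shown above (a bare section name on its own line, then one "reachable at" line per server).

```shell
# count_servers: hypothetical helper, not part of BeeGFS or azurehpc.
# Reads beegfs-check-servers output on stdin and prints how many
# reachable servers are listed under each section.
count_servers() {
    awk '
        # A section header is a single word on its own line.
        NF == 1 && $1 ~ /^(Management|Metadata|Storage)$/ { section = $1; next }
        # Every reachable server line is counted for the current section.
        /reachable at/ { count[section]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage (on a node where beegfs-check-servers is installed):
#   beegfs-check-servers | count_servers
```

For the listing above this would print Management 1, Metadata 6 and Storage 6, one per line.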
from azurehpc.
It worked. Thanks. However, it is strange that I see no performance improvement after doubling the size of beegfsm. I am testing performance by copying a 24 GB folder between two locations: time cp sim sim3 -R
The folder contains about 120 directories, each holding several files in the MB range (2.2 MB, 119 MB, 47 MB).
The smaller and the larger beegfsm give the same result.
real 2m26.809s
user 0m0.461s
sys 0m29.615s
vs
real 2m32.859s
user 0m0.440s
sys 0m28.253s
IO Pattern: 55k reads, 50k writes, summing up to 90% of execution time.
I also tried changing the chunk size with beegfs-ctl, for example: beegfs-ctl --setpattern --chunksize=1m --numtargets=8 /beegfs/chunksize_1m_4t. I tested chunk sizes of 1m, 64kB and 4m with 8, 1 and 8 targets, respectively.
This did not affect the results much.
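One thing worth ruling out in such a test: a stripe pattern set on a directory applies to files created in it afterwards, not to files that already exist, so the pattern has to be in place before the copy starts. A minimal sketch of that order of operations follows. The function name make_striped_dir and the example path are illustrative; the beegfs-ctl modes used are --setpattern (as in the command above) and --getentryinfo.

```shell
# make_striped_dir: hypothetical helper for setting up a test directory.
# $1 = directory, $2 = chunk size (e.g. 1m), $3 = number of targets
make_striped_dir() {
    local dir=$1 chunksize=$2 numtargets=$3
    # Create the directory first; the pattern is a property of the directory.
    mkdir -p "$dir" || return 1
    # Set the pattern before any data is copied in.
    beegfs-ctl --setpattern --chunksize="$chunksize" --numtargets="$numtargets" "$dir" || return 1
    # Print the entry info so the active pattern can be confirmed.
    beegfs-ctl --getentryinfo "$dir"
}

# Usage (path is an example only):
#   make_striped_dir /beegfs/stripe_test 1m 4
#   time cp -R sim /beegfs/stripe_test/
```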
from azurehpc.
Have you tried running multiple cp's at once, perhaps each one writing to a different target? You may need to determine whether the source data sits on 4 storage targets or more, and whether it is reading or writing that limits the performance.
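One rough way to separate the two is to time a read-only pass and a write-only pass on their own. The sketch below uses only standard tools; the function names and paths are illustrative, and the numbers are indicative at best because client-side caching can flatter the read pass. It assumes GNU dd (for conv=fsync).

```shell
# read_only_pass: read every file under $1 and discard the data.
read_only_pass() {
    find "$1" -type f -exec cat {} + > /dev/null
}

# write_only_pass: write $1 MiB of zeros to file $2 and flush it to storage.
write_only_pass() {
    dd if=/dev/zero of="$2" bs=1048576 count="$1" conv=fsync 2>/dev/null
}

# Usage (paths are examples only):
#   time read_only_pass /beegfs/sim
#   time write_only_pass 1024 /beegfs/write_test.bin
```

If the read pass alone takes most of the cp time, adding write targets is unlikely to help, and the reverse holds as well.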
from azurehpc.
First feedback: this is my first attempt at parallelizing the cp operation:
for i in {0..N}
do
    cp -r "$sourcedir/processor$i"/* "$destination/processor$i" &
done
wait # wait for all background cp processes to finish
With this code the copy time dropped from 1m41s to 58s. Next I will test the same code after doubling the size of the cluster.
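The loop above starts one cp per directory all at once. With about 120 directories it may be worth capping the number of concurrent copies, so the degree of parallelism can be varied between runs. A sketch using find and xargs -P follows; the function name parallel_copy and the default of 8 jobs are arbitrary choices, not something from this thread.

```shell
# parallel_copy: copy each top-level subdirectory of $1 into $2,
# running at most $3 (default 8) cp processes at a time.
parallel_copy() {
    local src=$1 dst=$2 jobs=${3:-8}
    mkdir -p "$dst" || return 1
    # NUL-separated names keep unusual directory names intact;
    # xargs -P bounds how many cp processes run concurrently.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} cp -r {} "$dst"/
}

# Usage (paths are examples only):
#   time parallel_copy "$sourcedir" "$destination" 16
```

Files sitting directly in the source directory are not copied by this sketch, only its subdirectories.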
from azurehpc.
Closed
from azurehpc.