
mpiFileUtils

mpiFileUtils provides both a library called libmfu and a suite of MPI-based tools to manage large datasets, which may vary from large directory trees to large files. High-performance computing users often generate large datasets with parallel applications that run with many processes (millions in some cases). However, those users are then stuck with single-process tools like cp and rm to manage their datasets. This suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for such datasets, providing speedups of up to 20-30x. It also provides a library that simplifies the creation of new tools or that can be used directly in applications.

Documentation is available on ReadTheDocs.

DAOS Support

mpiFileUtils supports a DAOS backend for dcp, dsync, and dcmp. Custom serialization and deserialization for DAOS containers to and from a POSIX filesystem is provided with daos-serialize and daos-deserialize. Details and usage examples are provided in DAOS Support.

Contributors

We welcome contributions to the project. For details on how to help, see our Contributor Guide.

Copyrights

Copyright (c) 2013-2015, Lawrence Livermore National Security, LLC. Produced at the Lawrence Livermore National Laboratory CODE-673838

Copyright (c) 2006-2007,2011-2015, Los Alamos National Security, LLC. (LA-CC-06-077, LA-CC-10-066, LA-CC-14-046)

Copyright (2013-2015) UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the Department of Energy.

Copyright (c) 2015, DataDirect Networks, Inc.

All rights reserved.



Issues

dfind build is broken

dfind isn't compiling on Ubuntu 13.10:

dfind.c:81:42: error: ‘struct stat’ has no member named ‘st_mtimespec’
pred_add(pred_newer, (void *)(statbuf.st_mtimespec.tv_sec));

Reporting job progress during DCP transfer

During our data migration from the ORNL Spider 1 file system to Spider 2, it is not uncommon to have a transfer job that lasts more than a day. One question users and operations staff ask is: how much has been transferred so far, and how much is left? Is there a solution to this problem? Thanks.

dcmp: support light-weight meta data comparison

Like rsync, let's add a mode to dcmp --sync where we can assume the source and target files are identical if their sizes and mod times are the same, but different otherwise. In this mode, we can avoid reading the full contents of each file.

We could perhaps use this mode by default, again like rsync, and execute the full check only if the user passes an additional option.
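The quick check itself is tiny, rsync-style: assume identical when size and mtime agree. A sketch (the helper name is hypothetical, not dcmp's actual code):

```c
#include <stdbool.h>
#include <sys/stat.h>

/* rsync-style quick check: treat two files as identical when their
 * sizes and modification times agree, without reading contents.
 * Hypothetical helper, not dcmp's actual implementation. */
static bool quick_same(const struct stat* src, const struct stat* dst)
{
    return src->st_size  == dst->st_size &&
           src->st_mtime == dst->st_mtime;
}
```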

Determine underlying file systems

Different underlying file systems can require different approaches, especially at scale. We need a way to identify the underlying file system as well as its properties (e.g., default stripe size and width). This will enable optimizations specific to each file system.

For example, we'd like to know whether a target file system is Lustre, GPFS, or NFS, and in the case of Lustre, we'd like to know how many object servers it has. This info will help to maximize parallelism and determine file chunking such that we can mitigate Lustre lock thrashing.
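On Linux, one way to identify the file system type is statfs(2) and its f_type magic number. A hedged sketch; the magic values below come from linux/magic.h and Lustre headers and should be verified against the target systems:

```c
#include <sys/vfs.h>    /* statfs(2) on Linux */

/* f_type magic numbers; the Lustre and GPFS values are taken from
 * their respective headers and are worth double-checking. */
#define MAGIC_LUSTRE 0x0BD00BD0UL
#define MAGIC_NFS    0x6969UL
#define MAGIC_GPFS   0x47504653UL

static const char* fs_type_name(const char* path)
{
    struct statfs fs;
    if (statfs(path, &fs) != 0) {
        return "unknown";
    }
    switch ((unsigned long)fs.f_type) {
        case MAGIC_LUSTRE: return "lustre";
        case MAGIC_NFS:    return "nfs";
        case MAGIC_GPFS:   return "gpfs";
        default:           return "other";
    }
}
```

Stripe counts and OST layout would still need a file-system-specific API (e.g., Lustre's llapi) once the type is known.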

dbcast: make stripe size an option

dbcast takes a parameter that specifies the segment size at which to slice a file for parallel access. It makes sense to default to the stripe size of the file if the file is on a parallel file system (or otherwise to some hard-coded value). Let's add an option to use this default but allow the user to override it.
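A minimal sketch of the proposed override using getopt_long; the option name and helper are hypothetical, not dbcast's actual interface:

```c
#include <getopt.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a hypothetical --size override; fall back to dflt, which the
 * caller would set from the file's stripe size on a parallel file
 * system, or from a hard-coded value otherwise. */
static uint64_t parse_segment_size(int argc, char** argv, uint64_t dflt)
{
    static struct option longopts[] = {
        { "size", required_argument, NULL, 's' },
        { NULL,   0,                 NULL,  0  }
    };
    uint64_t size = dflt;
    int c;
    while ((c = getopt_long(argc, argv, "s:", longopts, NULL)) != -1) {
        if (c == 's') {
            size = strtoull(optarg, NULL, 10);
        }
    }
    return size;
}
```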

dtar build is broken

dtar isn't integrated into autotools yet.

I don't want to rip out the build before moving the tests into the common /test dir.

dbcast: add support to bcast directory recursively

Currently, dbcast broadcasts a single file. Some people want to broadcast entire directories.

A workaround is to tar up the directory, bcast the tar file, and then untar it in parallel. That works, but it's ugly.

We could walk the directory, and then we'd need the writer rank on each node to recreate the directory structure, from top down. We then need to handle reading and writing files in parallel (efficiently).

LICENSE file needs to be updated with release info.

Currently, we need some kind of confirmation from the folks at the following locations that this is ok to release.

  • LANL
  • LLNL
  • ORNL
  • DDN Japan

Please add anyone to this list who may need something added to the LICENSE file before a public release.

report missing files for items listed on command line

The tools silently ignore missing files on the command line, which can lead the user to believe the command succeeded in updating a file in which they may have had a typo in the pathname. We should print errors or warnings for any items explicitly listed on the command line that we can't find, as that's probably a mistake by the user.

dcp: add periodic progress message

Could do this with non-blocking MPI collectives, e.g., every 10 seconds send a non-blocking bcast from rank 0 to start a non-blocking allreduce. Use the non-blocking allreduce to sum files and bytes copied. All procs periodically test outstanding collective calls in their work loops.

drm: don't walk every path if filter option was used

We may not need to walk every path if some can be excluded based on the --match or --exclude options that were specified. For instance, if the user is searching for *.txt files in the current working directory do we need to walk all of the subdirectories?
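The match test itself is a one-liner over fnmatch(3); the pruning question is then whether the pattern can ever match inside a subdirectory. A sketch with a hypothetical helper:

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Shell-style wildcard test, as a --match/--exclude filter might use.
 * If a pattern such as "*.txt" applies to basenames in the starting
 * directory only, the walk could skip subtrees entirely. */
static bool matches_pattern(const char* name, const char* pattern)
{
    return fnmatch(pattern, name, 0) == 0;
}
```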

Broken build ?

All -

Clean head from a fresh clone; every dependency has been installed to the default /usr/local.
The build stops at the following line (the same result from the ./buildme scripts). The platform is Ubuntu 12.04 LTS and 14.04 LTS ... since no one else is complaining, I am not sure ...

checking for libDTCMP... /usr/local
checking for library containing DTCMP_Init... no
configure: error: couldn't find a suitable libdtcmp, use --with-dtcmp=PATH
make: *** [config.status] Error 1

GitHub email hook for notification

hi Adam/Jon,

Right now, by watching a GitHub repo, you get notified when issues are opened or comments are made, but not when a new commit is made to the code repo, which I'd like to know about so I can properly rebase the code if needed.

GitHub provides such a hook to send "git diff" output to a mailing list. Do you think you can enable this for a -dev mailing list? I hope I am not the only one who likes to get notified about this :-)

Thanks

Feiyi

If you made a commit, add your name to the AUTHORS file

If you've written code or documentation and have committed it to the repo, please add your name to the AUTHORS file. A single line of code is enough to have your name in the AUTHORS file.

Please keep the list in alphabetical order.

If you have a first and last name, please keep the file in this format:

Lastname, Firstname [email protected]

Otherwise, put it in whatever format you feel is right.

This issue may be closed after the following folks are listed or have confirmed that they don't want their name listed:

buildme dependencies failures

I just ran into the following build issues:

  • --prefix=/ccs/home/fwang2/fileutils/install
    ./buildme_dependencies: line 71: --prefix=/ccs/home/fwang2/fileutils/install: No such file or directory
  • '[' 127 -ne 0 ']'
  • echo 'failed to configure, build, or install libcircle'
    failed to configure, build, or install libcircle
  • exit 1

documentation for contributions

We need documentation for how contributions should be accepted. This will mostly be instructions on how to create a branch, run tests, create a merge request, etc.

dsh: request to filter ls and rm options

For directories that have lots of files, it would be nice to filter entries with wildcarding/regex or allowing the user to limit the number of items printed. We might also provide different sorting methods.

Build error on Cray platform

The following build error seems to be specific to Cray XK login nodes (SUSE-based, I think). I am not sure if their build environment is any different; I explicitly switched to the GNU tool chain instead of the default PGI and still experience the errors. Any idea?

make[2]: Entering directory `/autofs/na3_techint/home/fwang2/fileutils-atlas/build/src/common'
mpicc -DHAVE_CONFIG_H -I. -I../../../src/common -I../.. -I/opt/sw/xk6/ompi/1.7.1/sles11.1_gnu4.7.2/include/openmpi/opal/mca/hwloc/hwloc151/hwloc/include -I/opt/sw/xk6/ompi/1.7.1/sles11.1_gnu4.7.2/include/openmpi/opal/mca/event/libevent2019/libevent -I/opt/sw/xk6/ompi/1.7.1/sles11.1_gnu4.7.2/include/openmpi/opal/mca/event/libevent2019/libevent/include -I/opt/sw/xk6/ompi/1.7.1/sles11.1_gnu4.7.2/include -I/opt/sw/xk6/ompi/1.7.1/sles11.1_gnu4.7.2/include/openmpi -I/opt/cray/xe-sysroot/4.1.40/usr/include -I/opt/cray/xe-sysroot/4.1.40/usr/include -I/ccs/techint/home/fwang2/fileutils-atlas/install/include/ -std=gnu99 -ggdb -W -pedantic -Wall -Wextra -Wconversion -Wformat=2 -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wuninitialized -Wunknown-pragmas -Wstrict-aliasing -Wfloat-equal -Wundef -Wbad-function-cast -Wcast-qual -Wcast-align -Wstrict-prototypes -Wmissing-prototypes -Wredundant-decls -Winline -Wdisabled-optimization -Wshadow -Wwrite-strings -I/ccs/techint/home/fwang2/fileutils-atlas/src/common -I/ccs/techint/home/fwang2/fileutils-atlas/install/include -MT libfileutils_common_a-bayer_param_path.o -MD -MP -MF .deps/libfileutils_common_a-bayer_param_path.Tpo -c -o libfileutils_common_a-bayer_param_path.o `test -f 'bayer_param_path.c' || echo '../../../src/common/'`bayer_param_path.c
In file included from ../../../src/common/bayer_util.h:17:0,
from ../../../src/common/bayer.h:19,
from ../../../src/common/bayer_param_path.c:1:

cc1: warning: -Wuninitialized is not supported without -O
cc1: warning: -funit-at-a-time is required for inlining of functions that are only called once
In file included from ../../../src/common/bayer_util.h:17,
from ../../../src/common/bayer.h:19,
from ../../../src/common/bayer_param_path.c:1:
/opt/cray/xe-sysroot/4.1.40/usr/include/stdio.h:427: warning: redundant redeclaration of ‘fscanf’
/opt/cray/xe-sysroot/4.1.40/usr/include/stdio.h:430: warning: redundant redeclaration of ‘scanf’
/opt/cray/xe-sysroot/4.1.40/usr/include/stdio.h:432: warning: redundant redeclaration of ‘sscanf’
/opt/cray/xe-sysroot/4.1.40/usr/include/stdio.h:478: warning: redundant redeclaration of ‘vfscanf’
/opt/cray/xe-sysroot/4.1.40/usr/include/stdio.h:483: warning: redundant redeclaration of ‘vscanf’
/opt/cray/xe-sysroot/4.1.40/usr/include/stdio.h:486: warning: redundant redeclaration of ‘vsscanf’
../../../src/common/bayer_param_path.c:12: warning: function declaration isn’t a prototype
../../../src/common/bayer_param_path.c: In function ‘bayer_stat_pack’:
../../../src/common/bayer_param_path.c:38: warning: conversion to ‘size_t’ from ‘long int’ may change the sign of the result
../../../src/common/bayer_param_path.c: In function ‘bayer_stat_unpack’:
../../../src/common/bayer_param_path.c:72: error: ‘blksize_t’ undeclared (first use in this function)
../../../src/common/bayer_param_path.c:72: error: (Each undeclared identifier is reported only once
../../../src/common/bayer_param_path.c:72: error: for each function it appears in.)
../../../src/common/bayer_param_path.c:72: error: expected ‘;’ before ‘val’
../../../src/common/bayer_param_path.c:95: warning: conversion to ‘size_t’ from ‘long int’ may change the sign of the result
make[2]: *** [libfileutils_common_a-bayer_param_path.o] Error 1
make[2]: Leaving directory `/autofs/na3_techint/home/fwang2/fileutils-atlas/build/src/common'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory `/autofs/na3_techint/home/fwang2/fileutils-atlas/build/src'
make: *** [install-recursive] Error 1

How to set it up

I've cloned the repo onto a Red Hat Linux machine. I do not know how to set it up and make it run.

Please advise, or point me to some documentation.

Thanks,

Creation of v0.0.1-alpha.2 release

Please use this issue to manage the release v0.0.1-alpha.2.

The bar for a pre-release does not have to be high.

"because I want to" is probably valid enough reason for a pre-release.

Install tools on CORAL EA systems

Verify that mpiFileUtils builds and runs on CORAL EA systems (like ray).

Then prepare a public install in either /usr/global or via a TCE package using 0.6 release.

Update package for 0.7 release when it's ready.

Help users find and restripe files in dstripe

Recursively search directory tree, report, and optionally restripe:

  1. any files that are larger than some threshold that only live on one stripe

  2. any directories in which there are many smallish files that are not well balanced across OSTs -- consider a checkpoint directory in which each process writes a file of roughly some percentage of memory. For this set of files, we should likely ensure each file is on one OST and make sure the set of files in the directory is well balanced across the available OSTs

It'd be nice if dstripe can help users detect and fix up these problems if they exist.

libmfu: Develop cache file format using variable length fields

The current cache file format uses fixed length records. It encodes the full path to each file, and to produce a fixed length field, it uses the longest path name of any file. So when writing the cache file for a large number of files in which there may be a really long file name, the output file size is inflated.

To conserve space, let's store this using variable length records. To support that efficiently, we'll either need a table in the header describing the offset of each record, or we'll need to add some end-of-record marker between records.
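One possible variable-length layout uses a length prefix in place of a separate offset table or end-of-record marker. A sketch; field names and widths here are assumptions, not the actual cache format:

```c
#include <stdint.h>
#include <string.h>

/* A length-prefixed record: a fixed header followed by path_len bytes
 * of path. The prefix lets a reader skip records sequentially without
 * a header table. Hypothetical layout. */
typedef struct {
    uint32_t path_len;  /* bytes of path following the header */
    uint64_t size;      /* file size */
    uint64_t mtime;     /* modification time */
} record_header;

/* Pack one record into buf; returns total bytes written. */
static size_t pack_record(char* buf, const char* path,
                          uint64_t size, uint64_t mtime)
{
    record_header h;
    h.path_len = (uint32_t)strlen(path);
    h.size = size;
    h.mtime = mtime;
    memcpy(buf, &h, sizeof(h));
    memcpy(buf + sizeof(h), path, h.path_len);
    return sizeof(h) + h.path_len;
}
```

Sequential readers pay no space overhead beyond the prefix; random access would still want a header table of record offsets, which is the trade-off the text describes.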
