Giter VIP home page Giter VIP logo

hucx's Introduction

# HUCX README

-   [Introduction]
-   [License]
-   [Requirements]
-   [Installation Instructions]
    -   [Installation(Source)]
    -   [Installation(Binary)]
    -   [Testing HUCX]

-   [Upgrade]
-   [Examples]
-   [Feature and Rollback Description]

## Introduction

HUCX is developed based on ucx and provides algorithms with independent research and development.

Based on performance considerations, HUCX **DO NOT** provide the functionalities related to transmission security.

New algorithms are used for three collective operations:

- MPI_Allreduce
- MPI_Bcast
- MPI_Barrier

New algorithms are as follows:

- Binomial tree
- Ring
- Recursive
- Topo-aware Binomial tree
- Topo-aware K-nomial tree
- Topo-aware Recursive + Binomial(intra)
- Topo-aware Recursive + K-nomial(intra)

Use environment variables to select the certain algorithm, here is the algorithm table.

Bcast:

|  UCX_BUILTIN_BCAST_ALGORITHM   | Algorithm  |
| ---- | ---- |
| 1  | Binomial tree |
| 2  | Topo-aware Binomial tree |
| 3  | Topo-aware K-nomial tree |
| 4  | Topo-aware K-nomial tree + Binomial tree(intra) |

Allreduce:

|  UCX_BUILTIN_ALLREDUCE_ALGORITHM   | Algorithm  |
|  ----  | ----  |
| 1  | Recursive |
| 2  | Topo-aware Recursive + Binomial(intra)(Node) |
| 3  | Topo-aware Recursive + Binomial(intra)(Socket) |
| 4  | Ring |
| 5  | Topo-aware Recursive + K-nomial (intra)(Node) |
| 6  | Topo-aware Recursive + K-nomial (intra)(Socket) |
| 7  | Topo-aware K-nomial(Node) |
| 8  | Topo-aware K-nomial(Socket) |

Barrier:

|  UCX_BUILTIN_BARRIER_ALGORITHM   | Algorithm  |
|  ----  | ----  |
| 1  | Recursive |
| 2  | Topo-aware Recursive + Binomial(intra)(Node) |
| 3  | Topo-aware Recursive + Binomial(intra)(Socket) |
| 4  | Topo-aware Recursive + K-nomial (intra)(Node) |
| 5  | Topo-aware Recursive + K-nomial (intra)(Socket) |
| 6  | Topo-aware K-nomial(Node) |
| 7  | Topo-aware K-nomial(Socket) |

## License

HUCX is licensed as BSD3.

## Requirements

* Operating System: 
  * CentOS 7.6  4.14.0-115.el7a.0.1.aarch64 version
* gnu:
  * gnu7
  * gnu8
  * gnu9

## Installation Instructions

### Installation (Source)

Clone HUCX from Github at the following location:
```
git clone https://github.com/kunpengcompute/hucx.git
```
By default, we usually install HUCX as follows:
```
cd  /where/to/download/ucx
./autogen.sh
./contrib/configure-opt  --prefix=/where/to/install/ucx
make -j32
make -j32 install
```
export /where/to/install/ucx/lib to LD_LIBRARY_PATH.

### Installation (Binary)

Refer to 'Installation (Binary)' in HMPI.  

### Testing HUCX

Refer to 'Testing HMPI' in HMPI.  

## Upgrade

Remove the installation directory:
```
rm -rf /where/to/install/ucx
```
Then refer to 'Installation Instructions' and install again.

## Examples

Refer to 'Testing HMPI' in HMPI.  

## Feature and Rollback Description  

The following scenarios are supported only by some algorithms:


* Scenario 1: Unbalanced PPN

Unbalanced processes per node (PPN) may occur if the PPN is not specified or a rankfile is used when mpirun is used to submit tasks. Node-aware algorithms related to the binomial tree (such as Bcast algorithm 2 and Allreduce algorithm 2) support unbalanced PPN. Other topo-aware algorithms do not support unbalanced process distribution among sockets or nodes
If the specified algorithm does not support unbalanced PPN, the algorithm is changed and the following information is displayed:
```
Current algorithm don't support ppn unbalance case, switch to default algorithm
```

* Scenario 2: Unbalanced Socket Processes

All socket-aware algorithms require that the number of socket processes be balanced. If a socket-aware algorithm is specified and the number of socket processes is unbalanced, the algorithm is changed to a node-aware algorithm and the following information is displayed:
```
Warning: process number in every socket must be same in socket-aware algorithm, please make sure ppn must be even and '--map-by socket' included.
Switch to corresponding node-aware algorithm already.
```

* Scenario 3: Non-Commutative Operations

The MPI standard allows users to define mathematical operations in MPI_Allreduce in addition to built-in mathematical operations such as sum (MPI_SUM) and product (MPI_PROC) provided by the MPI. User-defined operations such as matrix multiplication may be non-commutative. Allreduce algorithms 1 and 4 support these operations.

If the specified algorithm is not algorithm 1 or algorithm 4, HMPI changes the algorithm to algorithm 1 (for small- and medium-sized packets) or algorithm 4 (for large-sized packets) based on the data size and displays the following information:
```
Current algorithm does not support non commutative operation, and switch to Recursive doubling or Ring Algorithm which may have unexpected performance
```

* Scenario 4: Ultra-Long Data Types

The MPI standard allows users to define data types in addition to built-in data types such as integer (MPI_INT) and double (MPI_DOUBLE) provided by the MPI. User-defined data types may be too long and may trigger an error. Allreduce algorithms 1 and 4 support these operations. The following information is displayed:
```
Current algorithm does not support large datatype, and switch to Recursive doubling or Ring Algorithm which may have unexpected performance
```

* Scenario 5: Allreduce Algorithm 8 with Only One Process in the Socket

Due to the special implementation of algorithm 8, when algorithm 8 is specified and there is only one process in the socket, the algorithm is changed to algorithm 7.

* Scenario 6: Topo-Aware Algorithm with Inconsecutive Process IDs

Only non-topo-aware algorithms, Allreduce algorithm 2, and Bcast algorithm 2 support this scenario. If the specified algorithm is not among the preceding algorithms, the algorithm is automatically changed and the following information is displayed:
```
Current algorithm demand rank number is continous. Switch default algorithm whose performance may be not the best
```

hucx's People

Contributors

nsosnsos avatar derekpush avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.