oap-mllib's Issues

[Release] Error when installing intel-oneapi-dal-devel-2021.1.1 intel-oneapi-tbb-devel-2021.1.1

#  sh dev/install-build-deps-centos.sh
Installing oneAPI components ...
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 * base: ftp.sjtu.edu.cn
 * epel: epel.mirror.angkasa.id
 * extras: ftp.sjtu.edu.cn
 * updates: mirror.lzu.edu.cn
oneAPI/signature                                                                                                                                                                                                                                                |  287 B  00:00:00
oneAPI/signature                                                                                                                                                                                                                                                | 1.5 kB  00:00:00 !!!
Resolving Dependencies
--> Running transaction check
---> Package intel-oneapi-dal-devel.x86_64 0:2021.1.1-79 will be installed
---> Package intel-oneapi-dal-devel-2021.1.1.x86_64 0:2021.1.1-79 will be installed
--> Processing Dependency: intel-oneapi-dal-2021.1.1 for package: intel-oneapi-dal-devel-2021.1.1-2021.1.1-79.x86_64
--> Processing Dependency: intel-oneapi-common-licensing for package: intel-oneapi-dal-devel-2021.1.1-2021.1.1-79.x86_64
--> Processing Dependency: intel-oneapi-common-vars for package: intel-oneapi-dal-devel-2021.1.1-2021.1.1-79.x86_64
--> Processing Dependency: intel-oneapi-condaindex for package: intel-oneapi-dal-devel-2021.1.1-2021.1.1-79.x86_64
--> Processing Dependency: intel-oneapi-dal-common-devel-2021.1.1 for package: intel-oneapi-dal-devel-2021.1.1-2021.1.1-79.x86_64
---> Package intel-oneapi-tbb-devel.x86_64 0:2021.1.1-119 will be installed
---> Package intel-oneapi-tbb-devel-2021.1.1.x86_64 0:2021.1.1-119 will be installed
--> Processing Dependency: intel-oneapi-tbb-common-devel-2021.1.1 for package: intel-oneapi-tbb-devel-2021.1.1-2021.1.1-119.x86_64
--> Processing Dependency: intel-oneapi-tbb-2021.1.1 for package: intel-oneapi-tbb-devel-2021.1.1-2021.1.1-119.x86_64
--> Running transaction check
---> Package intel-oneapi-common-licensing-2021.1.1.noarch 0:2021.1.1-60 will be installed
---> Package intel-oneapi-common-vars.noarch 0:2021.2.0-195 will be installed
---> Package intel-oneapi-condaindex.x86_64 0:2021.2.0-94 will be installed
---> Package intel-oneapi-dal-2021.1.1.x86_64 0:2021.1.1-79 will be installed
--> Processing Dependency: intel-oneapi-compiler-dpcpp-cpp-runtime for package: intel-oneapi-dal-2021.1.1-2021.1.1-79.x86_64
--> Processing Dependency: intel-oneapi-dal-common-2021.1.1 for package: intel-oneapi-dal-2021.1.1-2021.1.1-79.x86_64
---> Package intel-oneapi-dal-common-devel-2021.1.1.noarch 0:2021.1.1-79 will be installed
---> Package intel-oneapi-tbb-2021.1.1.x86_64 0:2021.1.1-119 will be installed
--> Processing Dependency: intel-oneapi-tbb-common-2021.1.1 for package: intel-oneapi-tbb-2021.1.1-2021.1.1-119.x86_64
---> Package intel-oneapi-tbb-common-devel-2021.1.1.noarch 0:2021.1.1-119 will be installed
--> Running transaction check
---> Package intel-oneapi-compiler-dpcpp-cpp-runtime.x86_64 0:2021.2.0-610 will be installed
--> Processing Dependency: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0 for package: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0-610.x86_64
---> Package intel-oneapi-dal-common-2021.1.1.noarch 0:2021.1.1-79 will be installed
---> Package intel-oneapi-tbb-common-2021.1.1.noarch 0:2021.1.1-119 will be installed
--> Running transaction check
---> Package intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0.x86_64 0:2021.2.0-610 will be installed
--> Processing Dependency: intel-oneapi-compiler-shared-runtime-2021.2.0 for package: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0-2021.2.0-610.x86_64
--> Processing Dependency: intel-oneapi-common-licensing-2021.2.0 for package: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0-2021.2.0-610.x86_64
--> Processing Dependency: intel-oneapi-tbb-2021.2.0 for package: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0-2021.2.0-610.x86_64
--> Running transaction check
---> Package intel-oneapi-common-licensing-2021.2.0.noarch 0:2021.2.0-195 will be installed
---> Package intel-oneapi-compiler-shared-runtime-2021.2.0.x86_64 0:2021.2.0-610 will be installed
--> Processing Dependency: intel-oneapi-compiler-shared-common-runtime-2021.2.0 for package: intel-oneapi-compiler-shared-runtime-2021.2.0-2021.2.0-610.x86_64
--> Processing Dependency: intel-oneapi-openmp-2021.2.0 for package: intel-oneapi-compiler-shared-runtime-2021.2.0-2021.2.0-610.x86_64
---> Package intel-oneapi-tbb-2021.2.0.x86_64 0:2021.2.0-357 will be installed
--> Processing Dependency: intel-oneapi-tbb-common-2021.2.0 for package: intel-oneapi-tbb-2021.2.0-2021.2.0-357.x86_64
--> Running transaction check
---> Package intel-oneapi-compiler-shared-common-runtime-2021.2.0.noarch 0:2021.2.0-610 will be installed
---> Package intel-oneapi-openmp-2021.2.0.x86_64 0:2021.2.0-610 will be installed
--> Processing Dependency: intel-oneapi-openmp-common-2021.2.0 for package: intel-oneapi-openmp-2021.2.0-2021.2.0-610.x86_64
---> Package intel-oneapi-tbb-common-2021.2.0.noarch 0:2021.2.0-357 will be installed
--> Running transaction check
---> Package intel-oneapi-openmp-common-2021.2.0.noarch 0:2021.2.0-610 will be installed
--> Processing Conflict: intel-oneapi-compiler-shared-common-runtime-2021.2.0-2021.2.0-610.noarch conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-openmp-common-2021.2.0-2021.2.0-610.noarch conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0-2021.2.0-610.x86_64 conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0-2021.2.0-610.x86_64 conflicts intel-oneapi-tbb < 2021.2.0
--> Processing Conflict: intel-oneapi-tbb-2021.2.0-2021.2.0-357.x86_64 conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-tbb-2021.2.0-2021.2.0-357.x86_64 conflicts intel-oneapi-tbb-common < 2021.2.0
--> Processing Conflict: intel-oneapi-openmp-2021.2.0-2021.2.0-610.x86_64 conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-common-vars-2021.2.0-195.noarch conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-compiler-shared-runtime-2021.2.0-2021.2.0-610.x86_64 conflicts intel-oneapi-common-licensing < 2021.2.0
--> Processing Conflict: intel-oneapi-tbb-common-2021.2.0-2021.2.0-357.noarch conflicts intel-oneapi-common-licensing < 2021.2.0
--> Finished Dependency Resolution
Error: intel-oneapi-compiler-shared-common-runtime-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-openmp-common-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0 conflicts with intel-oneapi-tbb-2021.1.1-2021.1.1-119.x86_64
Error: intel-oneapi-common-vars conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-openmp-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-compiler-dpcpp-cpp-runtime-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-compiler-shared-runtime-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-tbb-2021.2.0 conflicts with intel-oneapi-tbb-common-2021.1.1-2021.1.1-119.noarch
Error: intel-oneapi-tbb-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
Error: intel-oneapi-tbb-common-2021.2.0 conflicts with intel-oneapi-common-licensing-2021.1.1-2021.1.1-60.noarch
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

[ALS] Use user ID and item ID instead of matrix indices for ALS

The current ALS data input uses a row index and a column index to locate each rating in the rating matrix. In real use cases, ratings are usually keyed by user ID and item ID, which are represented as integers or strings. For example:

Using strings as IDs:

“User1”, “Item1”, 1.0
“User2”, “Item2”, 2.0

Or using integers as IDs:

1234, 4567, 1.0
4321, 5678, 2.0

=> Row / column index for oneDAL

0, 1, 1.0
1, 1, 2.0

Other frameworks, such as Spark MLlib ALS, handle these string or integer IDs out of the box. We need an extra data-preparation step that maps user IDs and item IDs to row and column indices before calling oneDAL ALS, and maps the indices back to IDs afterwards.

Also refer to oneapi-src/oneDAL#1514
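
The mapping step described above could be sketched roughly as follows. This is a minimal, single-process illustration in plain Python, not the project's implementation: the helper names (build_index, encode_ratings, invert) are hypothetical, indices are assigned in first-seen order as an assumption, and a real Spark job would need a distributed equivalent.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Rating = Tuple[Hashable, Hashable, float]


def build_index(ids: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Assign consecutive 0-based indices to IDs in first-seen order."""
    index: Dict[Hashable, int] = {}
    for key in ids:
        if key not in index:
            index[key] = len(index)
    return index


def encode_ratings(ratings: List[Rating]):
    """Map (user ID, item ID, rating) triples to (row, column, rating)."""
    user_index = build_index(u for u, _, _ in ratings)
    item_index = build_index(i for _, i, _ in ratings)
    encoded = [(user_index[u], item_index[i], r) for u, i, r in ratings]
    return encoded, user_index, item_index


def invert(index: Dict[Hashable, int]) -> List[Hashable]:
    """Build the reverse lookup: position p holds the ID mapped to index p."""
    ids: List[Hashable] = [None] * len(index)
    for key, pos in index.items():
        ids[pos] = key
    return ids
```

After training on the encoded triples, invert(user_index)[row] and invert(item_index)[column] recover the original IDs, whether they were strings or integers.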

Cannot compile the master branch of Intel-MLlib

We tried to compile the master branch of Intel-MLlib and hit the following error:
[image]
The Maven command was "mvn clean package -DskipTests -Pspark-3.1.1", and oneAPI and oneCCL had been installed successfully.

[Release] Hang when running the PCA algorithm

In our automated tests for the OAP product, we found that running the PCA and K-means algorithms with Intel-MLlib often hangs, which blocks the whole workflow. The symptom is shown in the picture below:
[image]

[PIP] Misc improvements and refactor code

CI:

Fix CI test bugs

Code Style:

Fix lint_scala bug
Apply styles for Java, Scala and C++
Add license header
Rename all fit functions to train

Kmeans:

Improve data conversion, caching and unpersisting
Refactor code for each profile

ALS:

Remove redundant code

OneCCL:

Improve error message output

[Optimization] Use ccl::all2all to simulate gather

Problem: We currently use ccl::allgather to implement gather. Network bandwidth is wasted because each rank's data is also transferred to the worker ranks, which do not need it.
Solution: Implement gather with ccl::all2all so that data is transferred only to the root rank, avoiding the unnecessary transfers.
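
The idea can be illustrated with a small single-process model. This sketch does not call oneCCL: alltoallv below is a hypothetical stand-in for a variable-count all-to-all exchange, and the traffic figures simply count elements that must arrive at a rank other than the one that owns them, under the assumption that every non-root destination gets an empty send buffer.

```python
from typing import List, Tuple


def alltoallv(send_bufs: List[List[list]]) -> List[List[list]]:
    """Model an all-to-all exchange among n ranks.

    send_bufs[src][dst] is what rank src sends to rank dst.
    The result recv_bufs[dst][src] is what rank dst received from rank src.
    """
    n = len(send_bufs)
    return [[list(send_bufs[src][dst]) for src in range(n)] for dst in range(n)]


def gather_via_alltoall(local_data: List[list], root: int = 0) -> Tuple[list, int]:
    """Gather every rank's data on root, sending nothing to other ranks."""
    n = len(local_data)
    # Each rank sends its whole buffer to root and an empty buffer elsewhere.
    send_bufs = [
        [local_data[src] if dst == root else [] for dst in range(n)]
        for src in range(n)
    ]
    recv_bufs = alltoallv(send_bufs)
    # Root concatenates the chunks in rank order.
    gathered = [x for chunk in recv_bufs[root] for x in chunk]
    # Count elements that cross from one rank to a different rank.
    transferred = sum(
        len(send_bufs[s][d]) for s in range(n) for d in range(n) if s != d
    )
    return gathered, transferred


def allgather_traffic(local_data: List[list]) -> int:
    """Elements delivered to other ranks when every rank receives everything."""
    n = len(local_data)
    return sum(len(buf) * (n - 1) for buf in local_data)
```

In this model, with four ranks holding two elements each, the gather through all-to-all delivers only the non-root ranks' elements to the root, while an allgather delivers every element to every other rank. Actual savings depend on the collective algorithm and network, which this sketch does not capture.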

Reorganize Spark version specific code structure

To serve as a template for profile-based, compile-time source code organization, we need a more standard way to organize the version-specific source code. Here is the proposal:

project-root
--src
----main -> common code entry
----test -> common test code entry
--spark-3.0.1 -> version specific entry
----main -> version specific code entry
----test -> version specific test code entry
--spark-3.1.1
----main
----test
...

We will use build-helper-maven-plugin to add the version-specific source code to the build.
As part of this refactor, we will also switch from maven-scala-plugin (old) to scala-maven-plugin (new).
