Giter VIP home page Giter VIP logo

batchgcd's Introduction

(C++)+GMP BatchGCD algorithm

Description

(C++)+GMP implementation of the Batch GCD algorithm, by Daniel Bernstein. This algorithm, described in How To Find Smooth Parts Of Integers, allows to compute pairwise GCDs of a list of integers in quasilinear time and memory. See e.g. factorable.net, which also provides source code.

Updated patch for GMP binary I/O

For large datasets (~ 8M 2048-bit moduli), current gmp won't swap some intermediate values correctly. The reason is an undocumented raw binary I/O limit of 2**31 bytes.

The implementation of factorable.net patched this for gmp-5.0.5, and we provide a similar patch for the current latest (gmp-6.1.2), which is a more optimized and C++ friendly version. This implementation is slightly faster than the one given in factorable.net. Note that mpz_inp_raw and mpz_out_raw have changed since gmp-5.0.5, which makes the factorable.net patch incompatible with current versions of gmp. This patch has not been tested in 32-bit systems. Both mpz_inp_raw and mpz_out_raw are modified with the patch to admit integers up to 2**63 bytes; consequently, this breaks compatibility with regular gmp binary files.

Requirements

Dependencies

You need g++, curl, m4, gmp, boost::thread:

apt-get install curl m4 libgmp3-dev libboost-thread-dev

Input file

You need a csv file, containing integers in the following format

<ID>,<modulus in base 16>\n

You can also use

<ID>,<modulus in base 10>\n

and set the -base10 option when running.

Usage

These instructions have been tested in Ubuntu 18.04, 19.10 and Debian GNU/Linux 9.9 (stretch).

Install

Clone and cd to the repo. Run

make install

This will download gmp-6.1.2, check the sha1 fingerprint, and apply our updated patch.

Test

Perform a toy test run with

make test

This should expose 8 compromised moduli and 2 duplicates out of the 15 samples.

Run

Compile with make batchgcd and run with

./batchgcd /path/to/csv/file [-base10]

Without the -base10 option, the default is base 16. Results contain no factors, only IDs are stored in compromised.csv, duplicates.csv.

If there are duplicates, you may want to filter them out before running the algorithm, since any number sharing all of its factors may appear as duplicate without really being duplicate. For instance; if n = pq, m = pr, h = qr all of them will appear as duplicates, and more work is needed to factorise (of course, naïve pairwise GCDs will do).

batchgcd's People

Contributors

zugzwang avatar hannob avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.