tilmanneumann / java-math-library Goto Github PK

A Java math library focused on number theory and integer factorization in particular.

License: GNU General Public License v3.0

Java 100.00%

integer-factorization quadratic-sieve elliptic-curve-factorization pollard-rho lehman-factorization hart-factorization squfof cfrac logarithm exponential-function

java-math-library's Introduction

java-math-library

This library is quite focused on number theory and particularly integer factorization, but not necessarily limited to it.

It provides some pretty fast implementations of various factoring algorithms, including the classes

TDiv31Barrett: Trial division for numbers < 32 bit using long valued Barrett reduction
Hart_Fast2Mult: Highly optimized "Hart's one-line factorizer" for numbers <= 62 bit
Lehman_Fast, Lehman_CustomKOrder: Fast Lehman implementations for numbers <= 62 bit
SquFoF31Preload, SquFoF63: SquFoF implementations for numbers <= 52 rsp. 90 bit
PollardRhoBrentMontgomery64_MHInlined: Highly optimized Pollard-Rho for numbers <= 62 bit.
TinyEcm64_MHInlined: Highly optimized Java version of YaFu's tinyEcm.c for numbers <= 62 bit.
CFrac63, CFrac: CFrac implementations working on longs rsp. BigIntegers internally.
SIQS: Single-threaded self-initializing quadratic sieve (SIQS).
PSIQS: Multi-threaded SIQS.
PSIQS_U: Faster multi-threaded SIQS, using native memory access via sun.misc.Unsafe.

The factoring methods are used to implement a fast sumOfDivisors() function.

Another prominent subject in this library is prime generation and testing. For example, you can find

a port of Kim Walisch's primesieve (basic for him, pretty fast for most others)
SSOZJ, a fast twin prime sieve by Jabari Zakiya
a BPSW probable prime test implementation, and
state-of-the-art bound computations for the n.th prime and prime counting functions.

Other noteworthy parts of this library are

sqrt(), nth_root(), ln() and exp() functions for BigDecimals.
Gaussian integer and (Hurwitz) quaternion arithmetics including gcd's
an implementation of Pollack and Treviño's four squares finding algorithm
a fast generator for the partitions of multipartite numbers
implementations of smooth number sequences like CANs (colossally abundant numbers) and SHCNs (superior highly composite numbers).

Releases

v1.3.1: Fixed two bugs that may lead to factoring failures when trying to factor arguments in the range 32..62 bit using class CombinedFactorAlgorithm. These bugs were introduced 2021-07-12 shortly after release 1.1, so they affect release 1.2 and 1.3.
v1.3: Implemented Gaussian integer and quaternion arithmetics including gcd's, and a four-square finder using them.
v1.2: Implemented SIQS with three large primes (but with the current parametrization, 3-partials are not found for N<=400 bit)
v1.1: Faster sieve for large N, speedup close to factor 2 at 360 bit inputs. Improved Gaussian solvers (by Dave McGuigan), including a parallel Gaussian solver that outperforms Block-Lanczos until about 375 bit on a Ryzen 3900X with 20 threads. From now on, Java 10 is required!
v1.0: Integrated and adjusted Dario Alpern's ECM in class CombinedFactorAlgorithm.
v0.9.11: Added SSOZJ, a fast twin prime sieve; guard analysis code by final static booleans, so that the code is removed by the compiler when the boolean is set to false.
v0.9.10: Added port of Ben Buhrow's tinyecm.c.
v0.9.9.3: Added Hart's "one line factorizer"; simplified FactorAlgorithm type hierarchy.
v0.9.9: Significantly faster trial division and Pollard-Rho.
v0.9.8: Fixed bug in SquFoF for N not coprime with multipliers.
v0.9.6: New Pollard-Rho-Brent implementation with Montgomery multiplication in longs; improved Lehman, trial division, EEA31, Gcd31.
v0.9.5: Work on Lehman's algorithm, refactorings.
v0.9.1: Implemented Peter Luschny's swinging prime factorial.
v0.9: Thread-safe AutoExpandingPrimesArray, some refactorings.
v0.8: The first revision containing all the stuff I wanted to add initially.

Getting Started

Clone the repository, create a plain Java project importing it, make sure that 'src' is the source folder of your project, and add the jars from the lib-folder to your classpath.

You will need Java 10 or higher for the project to compile. (Java 10 is required to support intrinsics for Math.multiplyHigh())

There is no documentation and no support, so you should be ready to start exploring the source code.

Testing and comparing factoring algorithms

The main class for this purpose is class FactorizerTest. Here you have many options:

Choose the algorithms to run/compare by commenting in our out the appropriate lines in the constructor.
Choose the number of test numbers, their bit range, step size etc. by setting the static variables N_COUNT, START_BITS, INCR_BITS, MAX_BITS and so on.
Adjusting the static variables TEST_NUMBER_NATURE and TEST_MODE lets you choose the nature of test numbers (random, semi-prime, etc.) and if you want a complete factorization or only the first factor.

The amount of analysis and logging can be influenced by setting the static variables in the GlobalFactoringOptions interface. Typically one wants to have all those options set to false if N_COUNT > 1.

Note that for factoring very large numbers with multi-threaded algorithms like PSIQS, PSIQS_U, CombinedFactorAlgorithm or BatchFactorizer, the number of threads should not exceed the number of physical cores of your computer. The number size bound where this effect sets in seems to depend mostly on the L3 cache of your computer. The cause is explained well in SMT disadvantages.

Factoring records

My current factoring record is the 400 bit (121 decimal digits) hard semiprime 1830579336380661228946399959861611337905178485105549386694491711628042180605636192081652243693741094118383699736168785617 = 785506617513464525087105385677215644061830019071786760495463 * 2330444194315502255622868487811486748081284934379686652689159

Its factorization took less than 22 hours on a Ryzen 9 3900X with 12 sieve threads using jml 1.1. See factoring report on mersenneforum.org.

Authors

Tilman Neumann

License

This project is licensed under the GPL 3 License - see the LICENSE file for details

Credits

Big thanks to

Dario Alpern for the permission to use his Block-Lanczos solver under GPL 3
Graeme Willoughby for his great comments on the BigInteger algorithms in the SqrtInt, SqrtExact, Root and PurePowerTest classes
Thilo Harich for a great collaboration and his immense improvements on the Lehman factoring method
Ben Buhrow for his free, open source tinyecm.c and his comments on mersenneforum.org that helped a lot to improve the performance of my Java port
Dave McGuigan, who contributed a parallel Gaussian solver and even sped up my single-threaded Gaussian solver by a remarkable factor

Some (other) third-party software reused in this library:

Dario Alpern's ECM implementation, refactored by Axel Kramer (and by myself)
Kim Walisch's primesieve (ported by myself)
SSOZJ, a fast twin prime sieve by Jabari Zakiya, Java port by Pascal Pechard

java-math-library's People

Contributors

Stargazers

Watchers

Forkers

tabidots pascal66 kchatr sipln archmageirvine xgnosisx koichitazaki

java-math-library's Issues

Increase the usage of compound assignment operators

👀 Some source code analysis tools can help to find opportunities for improving software components.
💭 I propose to increase the usage of compound operators accordingly.

Would you like to integrate anything from a transformation result which can be generated by a command like the following?
(:point_right: Please check also for questionable change suggestions because of an evolving search pattern.)

lokal$ perl -p -i.orig -0777 -e 's#\b(?<target>\S+)\s*=\s*\k<target>[ \t]*(?<operator>[+\-*/%^]|&(?!&)|\|(?!\|)|<<|>>>?)#$+{target} $+{operator}=#gm' $(find ~/Projekte/java-math-library/lokal -name '*.java')

FactorAlgorithmBase#factor() don't add `BigInteger.ONE`

In FactorAlgorithmBase don't add BigInteger.ONE as factor only add prime factors:

	public SortedMultiset<BigInteger> factor(BigInteger N) {
		SortedMultiset<BigInteger> factors = new SortedMultiset_BottomUp<BigInteger>();
		// first get rid of case |N|<=1:
		if (N.abs().compareTo(I_1)<=0) {
			if (!N.equals(BigInteger.ONE)) {
				factors.add(N);
			}
			return factors;
		}
		// make N positive:

SIQS has unstable behaviour on small numbers

Hello,
I am iterating through a large sequence of primes starting from 5 and ending near 10 bln. and for specific primes call SIQR.factor(prime - 1). But after the very first iterations the loop hangs. After a short debugging found out that when SIQR.factor() is given 180 as a parameter, the algorithm behaves unpredictably. It sometimes returns the correct factorization (but it takes several seconds), but almost every time hangs forever. Probably there is a bug in factor_recurrent.
the code I use is as follows:
SIQS siqs = new SIQS(0.32F, 0.37F, null, 0.16F, new PowerOfSmallPrimesFinder(), new SIQSPolyGenerator(), new Sieve03g(), new TDiv_QS_1Large_UBI(), 10, new MatrixSolver01_Gauss(), false); siqs.factor(BigInteger.valueOf(180))
Are the parameters correct for the range of numbers specified above?

Best choice of CMult and MMult parameters

Under 250 bits
(Gauss is 1 and Lanczos is 2 for matSolver)

Benchmark       (CMult)  (MMult)                                                             (factor)  (matSolver)  (powFinder)  (thread)  Mode  Cnt         Score          Error  Units
Main.testPSIQS     0.32     0.37  5450273780896470537172691284382862523169832968962275136650623899499            1            1         5  avgt    3  10346558,322 �  2914226,901  us/op
Main.testPSIQS     0.32     0.37  5450273780896470537172691284382862523169832968962275136650623899499            1            2         5  avgt    3  14873205,600 � 26053211,702  us/op
Main.testPSIQS     0.32     0.37  5450273780896470537172691284382862523169832968962275136650623899499            1            3         5  avgt    3  16456946,500 � 14497259,632  us/op
Main.testPSIQS     0.32     0.37  5450273780896470537172691284382862523169832968962275136650623899499            2            1         5  avgt    3  36106049,100 � 60021813,164  us/op
Main.testPSIQS     0.32     0.37  5450273780896470537172691284382862523169832968962275136650623899499            2            2         5  avgt    3  48937724,533 � 92958624,123  us/op
Main.testPSIQS     0.32     0.37  5450273780896470537172691284382862523169832968962275136650623899499            2            3         5  avgt    3  43224890,800 � 23908449,152  us/op

Benchmark       (CMult)  (MMult)                                                    (factor)  (matSolver)  (powFinder)  (thread)  Mode  Cnt        Score         Error  Units
Main.testPSIQS     0.32     0.37  1525677429474767563519480218716359010286470390165764501631            1            1         5  avgt    3  1758609,817 � 3993393,600  us/op
Main.testPSIQS     0.32     0.37  1525677429474767563519480218716359010286470390165764501631            1            2         5  avgt    3  1701440,302 � 2784514,459  us/op
Main.testPSIQS     0.32     0.37  1525677429474767563519480218716359010286470390165764501631            1            3         5  avgt    3  1864215,804 � 1826230,812  us/op
Main.testPSIQS     0.32     0.37  1525677429474767563519480218716359010286470390165764501631            2            1         5  avgt    3  2295620,607 � 7476855,735  us/op
Main.testPSIQS     0.32     0.37  1525677429474767563519480218716359010286470390165764501631            2            2         5  avgt    3  2361659,479 �  502503,505  us/op
Main.testPSIQS     0.32     0.37  1525677429474767563519480218716359010286470390165764501631            2            3         5  avgt    3  1926058,002 �  600651,394  us/op

Importance of JVM parameters

With "-Xms4g", "-Xmx4g" Combined
->SIQS 1 thread: 2732664,400 µs
->PSIQS 7 threads: 2510790,210 µs
Without
->SIQS 1 thread: 3252015,594 µs
->PSIQS 7 threads: 3249436,100 µs

Using 2x 128bits 189527046769670452665699925639138782629 x 145320076287112015128258773072589293279
"27542084895039556995681746820634403757320668248657547717737190830239511650491"

Mavenenize project

Please create a Maven github action to automatically test and build a JAR for every commit:

https://docs.github.com/en/actions/guides/building-and-testing-java-with-maven

Define Maven pom.xml

Define Maven folder structure:

/src/main/java
/src/main/ressources
/src/test/java
/src/main/ressources

Move JUnit dependencies to Maven test scope only

FactorizerTest

After 63 bits :

2019-10-25 15:12:15,912 INFO  FactorizerTest(208) [main]: Test N with 63 bits, i.e. N >= 4611686018427387904
2019-10-25 15:13:25,361 INFO  FactorizerTest(314) [main]: #1: Algorithm SquFoF63 took 1s, 113ms
2019-10-25 15:13:25,362 INFO  FactorizerTest(314) [main]: #2: Algorithm CFrac63(all_i=true, ks_adjust=3, stop=(5, 1.5), C=0.152, maxSuSmoothExp=0.25, TDiv63-01) took 1s, 785ms
2019-10-25 15:13:25,362 INFO  FactorizerTest(314) [main]: #3: Algorithm HartLA63(C=0.2, maxSuSmoothExp=0.4, TDiv63-01) took 1m, 6s, 469ms
2019-10-25 15:13:25,493 INFO  FactorizerTest(208) [main]: Test N with 64 bits, i.e. N >= 9223372036854775808
2019-10-25 15:13:26,523 ERROR HartLA63(236) [main]: Hart_Fast: Failed to factor N=17530346973326274709. Either it has factors < cbrt(N) needing trial division, or the arrays are too small.
2019-10-25 15:13:27,509 ERROR HartLA63(236) [main]: Hart_Fast: Failed to factor N=16572762435927399431. Either it has factors < cbrt(N) needing trial division, or the arrays are too small.

Excess messages for big numbers

Using PSIQS to factor 330 bit numbers seems to bring in SIQS for internal factoring. Problem is the global ANALYZE causes the sub-SIQS stats to be displayed in endless messages. I put in an override in SIQS to get around the problem.

[Idea] GPU

Can be usefull for opencl gpu
https://www.logicbig.com/tutorials/misc/gpu-programming/aparapi/intro-with-example.html
https://github.com/Syncleus/aparapi

parallel-gauss-solver

Tilman, I have written a new solver that does the gaussian elimination in parallel. Furthermore, I believe it is possible to create a solver that operates asynchronously and distributed over a wide network while the congruences are collected. I am hoping to get some collaboration as integrating it (for demonstrations purposes, it will still be in the single java VM). I haven't been able to find your email so I'm hopping you'll write me at [email protected]. I'd like to get your thoughts. Dave McGuigan

Use EllipticCurveMethod in CombinedFactorAlgorithm.factor()

Use de.tilman_neumann.jml.factor.ecm.EllipticCurveMethod in CombinedFactorAlgorithm.factor()

SIQS 2 times faster below 230 bits

I dont know exactly the limit between 222 and 256 for now.
Can you test it ? (Changing 3 lines in AParamGenerator01

UnSafe Sun package

Maybe the next part to clean ? Or keep for historical purpose ?
With Unsafe :
966983290915691193309978723256242679920691599725908954700676674631843021151 (250 bits) = 2166660942804222727904664493239497749 * 446301159453293757389122758418041256099 (factored in 58s, 152ms)
Without:
966983290915691193309978723256242679920691599725908954700676674631843021151 (250 bits) = 2166660942804222727904664493239497749 * 446301159453293757389122758418041256099 (factored in 57s, 519ms)