Comments (3)
However, the following works OK:
for (int i = 0; i < 100; i++) {
    // given
    Generex generex = new Generex("([0-9]){5}");
    // when
    Iterator iterator = generex.iterator();
    Stream<String> ids = Stream.generate(() -> iterator.next()).limit(100);
    // then
    Assertions.assertThat(ids.distinct().count()).isEqualTo(100);
}
Is this the only recommended way to generate random sequences?
from generex.
The behavior of the first code is normal, and the test is very likely to fail, so let's talk probability :)
You're using the pattern "([0-9]){5}", so the set of matching strings contains 10^5 elements. When we take 100 random elements, a rough estimate of the probability of getting 100 distinct elements is:
(10^5 - 100)/10^5 = 0.999
You are also using a loop that repeats the code 100 times, and the test passes only if every iteration yields 100 distinct elements, so by this estimate the probability of a passing test is about
(0.999)^100 ≈ 0.90
This estimate ignores collisions among the 100 draws themselves, so the real probability is considerably lower.
The second code passes the test because the iterator walks over the strings that match your pattern in lexicographical order, so it never returns the same string twice.
Thanks for the rapid feedback and your kind help.
Yes, doing the math leads to a 99.9% probability of uniqueness. However, empirically I get ~95% with the following test:
@Test
public void generatesUniqueSequence() throws Exception {
    int sequenceLength = 100;
    int tries = 100000;
    int successCounter = 0;
    for (int i = 0; i < tries; i++) {
        Generex generex = new Generex("([0-9]){5}");
        Stream<String> ids = Stream.generate(() -> generex.random()).limit(sequenceLength);
        if (ids.distinct().count() == sequenceLength) {
            successCounter++;
        }
    }
    Assertions.assertThat(successCounter).isEqualTo(tries);
}
For "([0-9]){6}", according to the math it's 99.99%, while empirically I get ~99.51% with the above test.
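The gap between the estimate and the measurements looks like the classic birthday problem: each new draw must avoid every string already drawn, so the chance that all n draws differ is the product of (N - k)/N for k = 0 .. n-1, not (N - n)/N. The sketch below evaluates that product. It assumes generex.random() picks uniformly and independently among the N matching strings, which this thread does not confirm, and the class and method names are made up for illustration.

```java
/**
 * Sketch only: probability that `draws` independent, uniform picks from a
 * pool of `poolSize` equally likely strings are all distinct.
 * Assumption (not confirmed in this thread): random() behaves this way.
 */
final class BirthdayEstimate {

    static double probabilityAllDistinct(long poolSize, int draws) {
        if (poolSize <= 0 || draws < 0) {
            throw new IllegalArgumentException("poolSize must be > 0 and draws >= 0");
        }
        if (draws > poolSize) {
            return 0.0; // more draws than strings: a repeat is unavoidable
        }
        double p = 1.0;
        for (int k = 0; k < draws; k++) {
            // the (k+1)-th draw must miss the k strings already seen
            p *= (double) (poolSize - k) / poolSize;
        }
        return p;
    }

    public static void main(String[] args) {
        // "([0-9]){5}" has 10^5 matches, "([0-9]){6}" has 10^6
        System.out.printf("N=10^5, n=100: %.4f%n", probabilityAllDistinct(100_000L, 100));   // about 0.9517
        System.out.printf("N=10^6, n=100: %.4f%n", probabilityAllDistinct(1_000_000L, 100)); // about 0.9951
    }
}
```

Under that assumption the results, about 95.2% and 99.51%, agree with the measurements above. Repeating the 5-digit check 100 times in a row would then pass only about 0.7% of the time (0.9517^100).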