Giter VIP home page Giter VIP logo

indexed-string-variation's Introduction

indexed-string-variation

npm version Build Status codecov.io

Experimental JavaScript module to generate all possible variations of strings over an alphabet using an n-ary virtual tree.

Install

With NPM:

npm install --save indexed-string-variation

Usage

Generally useful to create distributed brute-force password recovery tools or other software that might require distributed generation of all possible strings on a given alphabet.

const generator = require('indexed-string-variation').generator;
const variations = generator('abc1');

for (let i=0; i < 23; i++) {
    console.log(i, variations(i)); // generates the i-th string in the alphabet 'abc1'
}

Will print:

0 ''
1 'a'
2 'b'
3 'c'
4 '1'
5 'aa'
6 'ab'
7 'ac'
8 'a1'
9 'ba'
10 'bb'
11 'bc'
12 'b1'
13 'ca'
14 'cb'
15 'cc'
16 'c1'
17 '1a'
18 '1b'
19 '1c'
20 '11'
21 'aaa'
22 'aab'

API

The module indexed-string-variation exposes the following components:

  • generator (also aliased as default for ES2015 modules): the main generator function
  • defaultAlphabet: a constant string that contains the sequence of characters in the defaultAlphabet

As you can see in the usage example, the generator function takes as input the alphabet string (which is optional and it will default to defaultAlphabet if not provided) and returns a new function called variations which can be used to retrieve the indexed variation on the given alphabet. variations takes a non-negative integer as input which represents the index of the variations that we want to generate:

const variations = generator('XYZ');
console.log(variations(7123456789)); // "XYYZYZZZYYYZYZYXYYYYX"

How the algorithm works

The way the generation algorithm work is using an n-ary tree where n is the size of the alphabet. For example, if we have an alphabet containing only a, b and c and we want to generate all the strings with a maximum length of 3 the algorithm will use the following tree:

Sample ternary tree over abc alphabet

The tree is to be considered "virtual", because it's never generated in its integrity, so the used space in memory is minimal.

In brevity we can describe the algorithm as follows:

Given an index i over an alphabet of length n and it's corresponding n-ary tree, the string associated to i corresponds to the string obtained by concatenating all the characters found in the path that goes from the root node to the i-th node.

For example, with the alphabet in the image we can generate the following strings:

i generated string
0
1 a
2 b
3 c
4 aa
5 ab
6 ac
7 ba
8 bb
9 bc
10 ca
11 cb
12 cc

Important note: The alphabet is always normalized (i.e. duplicates are removed)

Use big-integer to avoid JavaScript big integers approximations

Integers with more than 18 digits are approximated (e.g. 123456789012345680000 === 123456789012345678901), so at some point the generator will start to generate a lot of duplicated strings and it will start to miss many cases.

To workaround this issue you can use indexes generated with the module big-integer. Internally the indexed-string-variation will take care of performing the correct operations using the library.

Let's see an example:

const bigInt = require('big-integer'); // install from https://npmjs.com/package/big-integer
const generator = require('indexed-string-variation').generator;
const variations = generator('JKWXYZ');

// generation using regular big numbers (same result)
console.log(variations(123456789012345678901)); // XJZJYXXXYYJKYZZJKZKYJWJJYW
console.log(variations(123456789012345680000)); // XJZJYXXXYYJKYZZJKZKYJWJJYW

// generation using big-integer numbers (correct results)
console.log(variations(bigInt('123456789012345678901'))); // XJZJYXXXYYJKYZZJKZKXZKJZZJ
console.log(variations(bigInt('123456789012345680000'))); // XJZJYXXXYYJKYZZJKZKXZWJJWK

Anyway, keep in mind that big-integers might have a relevant performance impact, so if you don't plan to use huge integers it's still recommended to use plain JavaScript numbers as indexes.

Contributing

Everyone is very welcome to contribute to this project. You can contribute just by submitting bugs or suggesting improvements by opening an issue on GitHub.

License

Licensed under MIT License. © Luciano Mammino.

indexed-string-variation's People

Contributors

ccarcaci avatar lmammino avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

indexed-string-variation's Issues

Bug when using big indexes

Sample code:

var isv = require('indexed-string-variation').default;

var i = 3511326342653452173122134435345;
console.log(i);
console.log(isv('ab')(i));

output:

3.511326342653452e+30
C:\Users\Luciano.Mammino\test\node_modules\indexed-string-variation\lib\index.js:38
      throw new TypeError('index must be a non-negative integer');
      ^

TypeError: index must be a non-negative integer
    at C:\Users\Luciano.Mammino\test\node_modules\indexed-string-variation\lib\index.js:38:13
    at Object.<anonymous> (C:\Users\Luciano.Mammino\test\test.js:5:22)
    at Module._compile (module.js:425:26)
    at Object.Module._extensions..js (module.js:432:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:313:12)
    at Function.Module.runMain (module.js:457:10)
    at startup (node.js:138:18)
    at node.js:974:3

Bug with alphabet filter to remove duplicates when using default alphabet

var isv = require('indexed-string-variation').default;

var i = 351132634265345217312;
console.log(i);
console.log(isv()(i));

output

351132634265345200000
C:\Users\Luciano.Mammino\test\node_modules\indexed-string-variation\lib\index.js:10
    return alphabet.split('').filter(function (item, pos, self) {
                   ^

TypeError: Cannot read property 'split' of undefined
    at cleanAlphabet (C:\Users\Luciano.Mammino\test\node_modules\indexed-string-variation\lib\index.js:10:20)
    at isv (C:\Users\Luciano.Mammino\test\node_modules\indexed-string-variation\lib\index.js:19:14)
    at Object.<anonymous> (C:\Users\Luciano.Mammino\test\test.js:5:13)
    at Module._compile (module.js:425:26)
    at Object.Module._extensions..js (module.js:432:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:313:12)
    at Function.Module.runMain (module.js:457:10)
    at startup (node.js:138:18)
    at node.js:974:3

Extend README

  • Describe virtual tree approach.
  • Add section about how to install from NPM

Improve exported modules

  • Expose defaultAlphabet in the module
  • Export the default function as both default and generator
  • Update tests
  • Update documentation

Validate input

  • Validate input alphabet (must be string)
  • Validate index (must be 0 or positive integer)
  • Update tests to provide cover check for errors

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.