the-grid / finitedomain
Finite Domain Constraint Solver
There is a lookup table that maps a variable index to the indexes of all propagators that use that variable. It is used to determine which propagators to step in the next propagation cycle.

We should clear the list for any variable that is already solved at the start of the search. This happens a lot. We can't do it afterwards since we don't recompute this list for every step of the search. But at least it would prevent some useless operations, since a solved var cannot affect other propagators unless it flat-out rejects by becoming empty.
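A minimal sketch of that cleanup; the names var_to_props and domain_isSolved are made up stand-ins for whatever the real internals look like:

function pruneSolvedVarLookups(config) {
  for (let varIndex = 0; varIndex < config.all_var_names.length; varIndex++) {
    // a solved var can no longer wake up other propagators (it can only
    // reject by becoming empty), so its propagator list is dead weight
    if (domain_isSolved(config.initial_domains[varIndex])) {
      config.var_to_props[varIndex] = [];
    }
  }
}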
C = solver.isEq(A, B);
D = solver.isEq(A, B);
The constraint is identical because whatever the value of C or D (or A and B, for that matter), the state will be propagated across them all eventually, regardless. And at the end of it C == D in any case.
We should be able to detect these constraints and add a hidden mapping from D to C. This way we can eliminate the second constraint AND the second anonymous variable D from the system.
Note that this same scrubbing should be done for any constraint, especially those with more than two vars or those where the constraint is commutative (a @ b == b @ a). In those cases sort the var indexes, if commutative, and drop the result vars to detect equivalent constraints. Use a string key on an object (or a Trie, if that turns out to be slow) to "color" them.
Another case to check is whether vars are already solved, since any constraint that uses the same variables but different constants can also be considered the same. This case should not happen very frequently since such constraints most likely result in a reject. But for all intents and purposes, any "solved" var as well as any constant should be considered "the same variable" in this context.
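A sketch of that "coloring", using a plain object keyed by a canonical string; all names here are illustrative, not the actual internals:

function constraintKey(op, varIndexes, isCommutative) {
  // sort the var indexes when the op is commutative so that
  // op(A, B) and op(B, A) produce the same key
  let indexes = isCommutative ? varIndexes.slice().sort((a, b) => a - b) : varIndexes;
  return op + ':' + indexes.join(',');
}

let seen = {}; // key -> result var index of the first occurrence
function dedupeConstraint(op, varIndexes, resultIndex, isCommutative, aliases) {
  let key = constraintKey(op, varIndexes, isCommutative);
  if (key in seen) {
    aliases[resultIndex] = seen[key]; // map D to C; drop D and its constraint
    return false; // caller should not add this constraint
  }
  seen[key] = resultIndex;
  return true;
}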
This does not take into account that the number of array elements of a domain (not values! just array.length) can grow during this operation.
Must update:
domain1[index] = domain2[index] = lo1;
domain1[index + 1] = domain2[index + 1] = hi;
To:
if (index >= p1) {
  // writing past the current end of domain1: grow the array in place
  domain1.splice(p1, 0, lo1, hi);
  len1 += PAIR_SIZE;
  p1 += PAIR_SIZE;
} else {
  domain1[index] = lo1;
  domain1[index + 1] = hi;
}
if (index >= p2) {
  // same for domain2
  domain2.splice(p2, 0, lo1, hi);
  len2 += PAIR_SIZE;
  p2 += PAIR_SIZE;
} else {
  domain2[index] = lo1;
  domain2[index + 1] = hi;
}
Use consistent prefixes for the different domain representations: domain_str (so names like domain_str_isValueStr become consistent with it), domain_num, domain_arr, and domain_any. Prefer these over typeof checks, and make the asserts representation-specific with something like ASSERT_NUMDOM(domain) instead of the generic ASSERT_DOMAIN stuff.
A distinct is perhaps the most expensive constraint that we have because it builds a "pyramid" of propagators. One distinct on n variables causes n*(n-1)/2 propagators because it has to create a neq for every pair of variables (but not a var with itself). Being able to eliminate some vars therefore means a potentially large reduction.
Now solved variables in a distinct are easily resolved. Say we have a var x which is solved: we search for the value of x in all the other vars, and if we encounter it, we remove it from that domain. Then we drop x from the list of vars for the distinct. Repeat for all solved vars.
In the real world this may not help much, but there's at least one case in the curator right now that does something "silly" like distinct([1,1], [2,2], [3,3], [4,4], [5,5], [6,6], [7,7], [8,8], [9,9], [10,10], [11,11], [12,12], [13,13], [14,14], [15,15], [16,16], [17,17], [18,18]). That is a distinct on 18 vars, causing 18*17/2 = 153 propagators for "obviously" no good reason. I mean, there's probably a reason, but finitedomain should definitely eliminate these cases as soon as possible, and the generator should reconsider whether it actually needs to do this in the first place.
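A sketch of that pruning step; the domain_* helpers are assumed stand-ins for whatever the real API offers:

function pruneDistinct(varIndexes, domains) {
  let remaining = [];
  for (let varIndex of varIndexes) {
    let domain = domains[varIndex];
    if (domain_isSolved(domain)) {
      let value = domain_getValue(domain);
      for (let otherIndex of varIndexes) {
        if (otherIndex !== varIndex) {
          // a solved value cannot appear in any other var of the distinct
          domains[otherIndex] = domain_removeValue(domains[otherIndex], value);
          if (domain_isEmpty(domains[otherIndex])) return REJECTED;
        }
      }
    } else {
      remaining.push(varIndex); // only unsolved vars still need the distinct
    }
  }
  return remaining; // build the neq pyramid over these vars only
}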
Right now the value distributors are applied during runtime and every step in the search tree has to figure out the next value over and over again. This is simple for min and max, of course, but less so for others.
What if we could somehow precompute the order of resolving?
Sounds good, but it's not that trivial; [0, SUP] or anything like that.

From the old readme
Right now, about 40~50% of the time spent is GC. We can't really control that directly so we must try to alleviate the pressure on it. I think by far the biggest pressure comes from generating lots of objects, especially the arrays for the domains. So let's change that.
One part has already been done here; domains only containing values lower than 31 will become bitfields. This already brought some huge savings.
The other part is the larger domains, which we can't efficiently express in terms of bitfields. Let's try to turn them into strings.
Currently the SUP is 100000000, which is well above 16 bits but well below 32 bits. This means every (range) value of a domain can't be encoded in one 16-bit char, but it does fit in two of them. Or maybe even better, use the UTF approach of using a high bit to signal multi-byte overflow.

I am worried that this will create a lot of string concat overhead of its own. Perhaps it turns out to be superduper fast though, so we'll just have to give this a spin. Not a trivial change to flush through the system, but it has high potential.
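A sketch of the two-chars-per-value idea (high 16 bits in one char, low 16 bits in the next); purely exploratory, not a committed design:

function encodeValue(value) {
  // first char carries the high word, second char the low word
  return String.fromCharCode(value >>> 16, value & 0xffff);
}
function decodeValue(str, charIndex) {
  return (str.charCodeAt(charIndex) << 16) | str.charCodeAt(charIndex + 1);
}
// the domain [[0, 5], [8, 10]] would then be the 8-char string:
// encodeValue(0) + encodeValue(5) + encodeValue(8) + encodeValue(10)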
This is related to multiverse and to how the input data from which the constraints are generated is modeled as a tree. In such a model certain branches fan out into other branches, but the model also makes unchosen branches completely irrelevant. Finitedomain will still want to solve all those constraints unless told otherwise; that is currently not possible in finitedomain, so the conditional doesn't improve anything. Right now such a conditional is emulated with something like lte(eq(A, B), eq(C, D)), meaning "if A=B then C=D, but if A!=B then don't care about C and D". This introduces three constraints where maybe one specific "implication constraint" could suffice, in a similar way as reifiers do. This constraint could then also contain a list of constraints that the condition omits when A != B. It should be trivial to search that list through the list of unsolved propagators or variables and prune them accordingly.

The main idea of solving domains is to reduce them as fast as possible. Often one step of the search affects only one or two constraints, especially after the first couple of nodes. We should add a var distributor that picks its next var based on how many constraints it affects. The more constraints are affected by a choice, the higher the number of changes should (or at least could) be in the same search node.
An lt is solved when the max of the left var is lower than the min of the right var (since then there are no values left that could falsify left < right). This is confusing, but it means that if you add lt(A, B) and the constraint is eliminated because all elements in A are already lower than those in B, then A and B should not be solved explicitly in the search. They should still be solved, but only when a solution is constructed should these vars be forced to any value that is still left in the domain. Probably we would need to apply the value distributor for every such var.
The main point is that the search should not create new spaces just so it can solve those variables whose constraints have been eliminated at compile time. External code may depend on an actual solution, but that doesn't mean that the search should be burdened with them because it doesn't need to. When variables have no constraints, any value in their domain is considered a valid outcome.
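The elimination check itself is cheap; a sketch with assumed domain_min/domain_max helpers:

function ltIsSolved(domainA, domainB) {
  // no assignment from these domains can falsify A < B anymore, so the
  // propagator (not the vars!) can be dropped; the vars go on a
  // "fill in when constructing the solution" list instead
  return domain_max(domainA) < domain_min(domainB);
}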
Var distributors are currently applied during runtime. So every time a new step in the search tree is created the var distributor is applied over and over again. What if instead of searching we could precompute the list of vars in the order as specified by the distribution(s) picked?
Where asm.js works with code annotations forcing/hinting the types to the browser, webasm is closer to classic assembly than anything else available right now. However, it is still in its infancy: browser support and tools are very limited, and it'll be quite some work to port the lib to webasm.
You would have full control, though. And in some parts of the code that could be a huge win. And if webasm is supported in node.js then we could already profit from it on the server side, reducing running costs considerably.
Additionally we could also decide to only port hot code to webasm, for example certain tricky things in domain.js where we know the browser has trouble optimizing for some reason or other. No need to port the entire library. At first...
Obviously coding in asm is on a completely different level from js or higher languages in general.
Right now the code will check whether a variable is rejected on every step of a propagator. This is somewhat of an optimization because it means propagation can stop right there immediately.

What if we stop doing that during propagation and only check afterwards, once none of the propagators make any changes? Then we could eliminate some of this overhead at the cost of propagating longer than absolutely needed. However, since the changes per space are often rather limited after the first few, this may not be a bad trade considering the overhead of checking for it explicitly right now.
space_propagate will propagate all propagators (the low level representation of constraints) until they either reject (when any domain becomes empty, the current path to a solution cannot lead to a solution and is aborted immediately) or no more changes apply. The latter does not mean a propagator is solved (take [0,1] == [0,1]), but simply that the propagator can't infer any further reductions on its own. Propagator steps currently return one of REJECTED, SOME_CHANGES, NOCHANGES. We could omit this check and simply return whether or not something changed. EMPTY...
We (now) use the Trie structure in two key positions in the library.

The first is mapping variable names to variable indexes (config.all_var_names). Internally all vars are only referenced by their index, but while setting up the solver the api is called with (original) var names. This means we need to be able to map these efficiently, and an indexOf in this area was bogging the system down with tens of thousands of vars.

The second: as an alternative strategy to "keep trying until nothing changes", the code (now) steps a propagator and, if it changed, records the variables that can be affected by that propagator. At the end of the loop, all these changed vars compile a new list of propagators to check. We use a Trie to dedupe the list of changed vars, because if we didn't the complexity would explode.
The initial implementation used a naive and simple tree of objects. That worked great; now we need to iterate on that and improve. Some things to do or try:
Ftr; I tried caching the trie (per propagation call or even globally) that tracks changed vars but that didn't really change anything in terms of perf. For the sake of simplicity I'm dropping that change.
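For reference, the naive object tree looked roughly like this (illustrative only, one object per character of the key):

function trie_create() { return {}; }
function trie_add(trie, key, value) {
  let node = trie;
  for (let i = 0; i < key.length; i++) {
    node = node[key[i]] || (node[key[i]] = {});
  }
  node.value = value; // e.g. the var index for this var name
}
function trie_get(trie, key) {
  let node = trie;
  for (let i = 0; i < key.length; i++) {
    node = node[key[i]];
    if (!node) return undefined;
  }
  return node.value;
}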
From the old readme
http://courses.cs.washington.edu/courses/cse326/03wi/lectures/RaoLect20.pdf
I don't think this is done yet, but the list distributor (that is, a variable defined with an ordered list of values to attempt) should remove any value from its domain that is not in the list. In fact, the domain should be the intersection of its initial domain and the values in the list.
Symmetrically, the list could be pruned of values not in the initial domain.
Should first check whether this is actually useful at all. Maybe list vars are already defined "cleanly" in our system.
Note that if the var has a backup distributor, the above trick cannot be applied, since non-list values in the domain may still be valid, just only after the list has been exhausted. The list itself could still be pruned of values not in the initial domain.
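A sketch of that cleanup, with assumed helpers; it intersects both ways:

function pruneListDistribution(domain, list) {
  // keep only listed values that actually occur in the domain
  let prunedList = list.filter(value => domain_containsValue(domain, value));
  // and shrink the domain to exactly those values (only safe when
  // there is no backup distributor!)
  return { domain: domain_fromList(prunedList), list: prunedList };
}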
Currently the propagation tracks all the changed variables and, when it completes a propagation cycle, it queues up all the propagators that are affected by the vars changed in the previous cycle. It repeats until no more variables change.

While this is super efficient on the one hand, it does mean having highly volatile objects during a search. Temporary queue objects are created and destroyed by the millions, which is very heavy on the GC. We probably want to take a look at some standard techniques in an attempt to mitigate that: recycle a queue, or use two and swap them after every cycle, etc.
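A sketch of the two-queue variant; the names are made up:

let currentQueue = [];
let nextQueue = [];
function propagateCycles(stepVar) {
  while (currentQueue.length) {
    for (let varIndex of currentQueue) {
      stepVar(varIndex, nextQueue); // pushes newly changed vars onto nextQueue
    }
    let swap = currentQueue;
    currentQueue = nextQueue;
    nextQueue = swap;
    nextQueue.length = 0; // recycle the buffer instead of allocating a new one
  }
}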
Another option is a preallocated Uint16Array. Together with a fixed length, the upper bound being the total number of variables, this could prevent a lot of GC related overhead.

Barring some very explicit value distributors like markov and list, the other distribution strategies are only guided by metrics that are not likely to be of any semantic relevance, like min, max, and middle.
What if we had a distributor that examined the propagators that a var affects and then tried to pick a value that leads to either a quick solve or a quick reject, rather than an arbitrary value? For example, consider gt([3, 100], [0, 3]) (and ignore the fact that it will be eliminated at compile time); if a value distributor would pick 100 as a value it would make the propagator pass, sure. But if it would pick a value that decides the propagator immediately, any further search nodes would eliminate the propagator permanently without having to check the other 96 values first.
Of course, if 100 would lead to a solution then picking 100 first would be faster than picking 3 first (and failing).
The idea of asm.js is to use certain operations to force values to be of a certain type. The overhead this causes is eliminated by the browser, which knows about these tricks; at the same time they work as type annotations which allow the browser to be more effective when optimizing. An example is let foo = getCount(); foo = foo|0; return foo;. The browser will eliminate the |0 operation and assume the returned value is an integer.
I'm not entirely sure how precise and complete these annotations have to be. You tell the browser you use asm.js with the "use asm" header and it will kick you into this mode.

There are various articles about asm.js on the web.

For finitedomain it could mean some improvements, though I'm not certain how much. Additionally it would mean we would need to translate the code to asmjs with a tool first; we don't want asmjs in our regular code base. I'm not sure how much effort is needed for that. But we should check it, because if the effort is low, or can for the most part be done through jsdocs, then we should totally do it. It'd basically be "free".
I've already started down this avenue but there's probably more fruit to be picked. Certain inputs can be immediately determined to be solved or rejected; others can at least be reduced in complexity. For example, sum([0, 1], 'x') is the same as sum([1], 'x'), which is the same as eq('x', 1), which means you can eliminate the constraint entirely and set 'x' to 1 if it contains 1, or to empty if it doesn't contain 1.

This type of optimization happens already and more can still be found. Simplifying the input can save a lot at runtime, and the overhead is only incurred once, as opposed to many times while searching.
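A sketch of that kind of constant folding at the constraint level (hypothetical helper; it assumes the solver api accepts a constant in eq):

function simplifySumOfConstants(solver, constants, resultName) {
  // sum([0, 1], 'x'): the total of the constants is 1,
  // so the whole thing collapses to eq('x', 1)
  let total = constants.reduce((a, b) => a + b, 0);
  solver.eq(resultName, total);
}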
The markov vars are relatively quite expensive, as their probability table is built up on every step of the way. We should take a look at this and see if we can precompute certain parts of this table, and for example prune entries that are never going to be used. The smaller the legend of a markov var, the faster it will run. Legend entries with a probability of 0 should be eliminated entirely; the domains should be pruned of these options.

We currently have two main performance problems in finitedomain when scaling up. They are both related to indexOf. The reason is variable names. We do a lot of lookups for translating variable names to their internal id. This is a side effect of other optimizations, in particular the one where all domains are stored in one large array. We currently translate the variables to the index of this array, so internally we only work with numbers, varIndexes. But this sometimes requires a lookup, and that was fine on "small" tests.
But when the number of variables exceeds a few thousand, to more than 20k, the indexOf overhead becomes a problem. To illustrate, here are two prints of profiling the curator test, which runs, at the time of writing this, in about 20s in total. This is after already optimizing it down from a runtime of 70s by eliminating obvious unnecessary indexOf occurrences.
There are two phases, the preparation phase and the solving phase, and there are two different ways in which indexOf is used, so the solutions are probably different for each.

Preparation phase (the top six items, sans the anon, are all internal to the browser's indexOf):
The preparation phase is haunted by indexOf because it needs to translate each variable name to its final index. And while this is "free" when declaring the variable, the usual pattern is something like solver.decl('A', 5); solver.decl('B', 6); solver.eq('A', 'B'). So for the last call the indexes need to be looked up, which incurs the indexOf wrath of a growing list of variable names.
The solving phase is slightly different. A recent optimization changed the way "propagation" works from actively maintaining a list of solved propagators to dynamically tracking a list of vars that changed and revisiting only those propagators that can be affected by those vars.
This last optimization works great on the small and relatively small test cases but went kapooya when we tried scaling up the number of variables.
The two cases are different. The first one is about an ever growing list of variables that never shrinks and goes all the way up to n. This list persists throughout the search, so calls later in the preparation phase always take more time than earlier calls. The solve phase uses constantly growing lists, which tend to be smaller than the big list most of the time, that are quite quickly discarded and regenerated (with different values). Also, the first list is an array of strings and the second is a list of numbers.
Here is a print of the second case. You're seeing the number of elements in the array where the culprit indexOf happens:
Let's try to solve this. The solution in both cases is simply using a different data structure. An unguided search in a flat list of pseudo unordered strings is obviously not going to scale well. It's like doing bubble sort on big data. Good luck.
The actual data model may be different, though maybe not; I'm not sure. For the main list of variables, which persists, we'll need a structure that performs well on search(). The delete() and findAll() cases are not very relevant for it since we never remove elements, and we could also track all vars in an actual array for a trivial findAll(). Since insert() only happens once per var per search, the (constant) overhead and GC concerns are not very relevant either.
This last bit is different for the solve time problem, because those arrays are constantly grown and destroyed. So it needs a structure that has good insert() and findAll() performance and which doesn't bog down the GC too much.
Anyways. There are a few different structures (somehow missing the Trie) to pick from, but I'm not convinced there's a clear winner between them, especially when implemented in JS, taking the memory and GC overhead into account. A worst case search of O(log(n)) is definitely better than O(n), so this must be figured out and tested asap.
Should prefix anonymous vars with underscores
We shouldn't use bare rng functions inside the project but instead an object that contains the meta data so we can reconstruct the function if needed.
Basically right now we accept a rng (like Math.random
, or something more reproducable for prod, or something fixed for tests) which is just a function. But once we have it we can't really trace back how it was created. This makes it harder to generate the code to reproduce a particular test case.
Something like this would solve it:
{
  _class: 'special_rng',            // identifies how the rng was created
  _params: 'fcxfds9fdsfua9uf98ds',  // whatever is needed to recreate it
  roll: function(){ ... },          // the actual rng function
}
Then we can figure out what created the function through _class, recreate it through the _params property, and just call it through rng.roll().
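A sketch of a factory producing such an rng; the seed derivation and the LCG here are stand-ins, not the real implementation:

function makeSpecialRng(params) {
  let seed = 0;
  for (let i = 0; i < params.length; i++) {
    seed = (Math.imul(seed, 31) + params.charCodeAt(i)) & 0x7fffffff; // derive a seed
  }
  return {
    _class: 'special_rng',
    _params: params, // enough to recreate this exact rng later
    roll: function() {
      seed = (Math.imul(seed, 1103515245) + 12345) & 0x7fffffff; // simple LCG
      return seed / 0x7fffffff;
    },
  };
}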
The idea of a cycle cut is to find secluded groups of variables such that the constraints that use them do not use variables outside that group. Smaller groups have smaller search spaces, which may lead to considerable improvement in solving time. These disjoint groups can be solved independently of each other, even in parallel, as a solution to one group will not affect the other group(s).

I've already tried to implement this once, and my conclusion at the time was that finding these cycles is rather expensive. To make matters worse, most configurations had little to no cycles in the first place; every var was somehow tied into every other var. And in the cases where something was found, the overhead outweighed any potential gains.
We should still revisit this as I'm not convinced that the approach is useless.
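A sketch of the grouping step with a union-find over the constraint var lists; illustrative only:

function findGroups(varCount, constraints) {
  let parent = Array.from({ length: varCount }, (_, i) => i);
  function find(x) {
    // walk to the root, compressing the path as we go
    return parent[x] === x ? x : (parent[x] = find(parent[x]));
  }
  for (let varIndexes of constraints) { // each constraint lists its var indexes
    for (let i = 1; i < varIndexes.length; i++) {
      parent[find(varIndexes[i])] = find(varIndexes[0]);
    }
  }
  let groups = {};
  for (let i = 0; i < varCount; i++) {
    (groups[find(i)] = groups[find(i)] || []).push(i);
  }
  return Object.values(groups); // each group can be solved independently
}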
This is a problem because currently solved vars are copied by reference. But if the domain of a var can end up empty, then it will end up empty up the search tree as well. I was afraid this was the case, and today I ran into an example:
it('should return REJECTED if domains resolved to same value', function() {
let A = fdvar_create('A', specDomainCreateRanges([SUP, SUP]));
let B = fdvar_create('B', specDomainCreateRanges([SUP, SUP]));
let R = fdvar_forceNeqInline(A, B);
expect(R).to.eql(REJECTED);
expect(A).to.eql(fdvar_create('A', specDomainCreateRanges([SUP, SUP])));
expect(B).to.eql(fdvar_create('B', specDomainSmallEmpty()));
});
The problem above lies in that B ends up empty. I suspect there are more ways this can happen, so we may have to revise the var copying strategy.
When inspecting the configuration of a large case (20k+ constraints), the vars are often boolean bound (or otherwise have an upper bound of about 5). We have a lot of code intended to be generic, but maybe it would pay off to handle boolean cases explicitly.
We already have "small" domains, represented as bitwise flags, this already helps considerably. We could simply compare domain === ONE
and act accordingly. For example, the eq([0, 1], [1, 1])
could be handled with a table. Now this is a bad example because eq
on small domains is dealt with through binary AND-ing. But the point is that not all propagators can get away with bitwise shortcuts. For them a simply solution table for booleans could make a lot of difference.
We have constants for the common small domains (EMPTY, ZERO, ONE, BOOL), and comparing against them is much faster than what arrays used to do. Simple lookup tables can lead to O(1) solutions for propagation steps, insofar as they aren't already.

From the old readme
This ticket has multiple parts;
Note that while this appears to be worthwhile for our main case, it's unsafe to do so because you don't really know how the boolean vars are used externally. For example, they are used in markov matrix conditions and we can't really detect that easily. Let alone use cases truly outside of finitedomain, in a callback or something. So for now this vector is not pursued and must be fixed at call sites.
isEq(A, 1) is equal to isEq(A, 1, [0,1]) and means C = A==1 ? 1 : 0, which we could simplify with an eq. Some mapping is in order for non-boolean-bound values.

This ticket is only for reference and explains the "trick" that removeGteNum and removeLteNum use.
Given a domain as a number, where each bit is a flag indicating whether that number is contained in the domain (for numbers 0 up to and including 30), return a domain with all flags that represent numbers greater than or equal to a given value turned off.
Example:
[[0, 5], [8, 10], [12, 20]] (or actually [0, 5, 8, 10, 12, 20]) only contains numbers <= 30, so it becomes this "small domain", represented as a single 32-bit number: 2094911, or in binary: 0000000000111111111011100111111. As you can see, there are three runs of ones, which correspond to the ranges defined.
Now given this domain, let's say we want to remove anything >= 10. That should result in the domain [0, 5, 8, 9], or 831, or 0000000000000000000001100111111. When we put these next to each other the pattern is slightly more obvious:
0000000000111111111011100111111
0000000000000000000001100111111
Anyways, there isn't a single simple bit hack available to do this, but we can get close to something like that. When we write out manual bit masks per input value we get something like this:
switch (value) {
  case 0: return 0x00000000;
  case 1: return domain & 0x00000001; // ZERO
  case 2: return domain & 0x00000003; // (ZERO | ONE)
  case 3: return domain & 0x00000007; // (ZERO | ONE | TWO)
  case 4: return domain & 0x0000000f; // (ZERO | ONE | TWO | THREE)
  case 5: return domain & 0x0000001f; // (ZERO | ONE | TWO | THREE | FOUR)
  case 6: return domain & 0x0000003f; // (ZERO | ONE | TWO | THREE | FOUR | FIVE)
  case 7: return domain & 0x0000007f; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX)
  case 8: return domain & 0x000000ff; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN)
  case 9: return domain & 0x000001ff; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT)
  case 10: return domain & 0x000003ff; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE)
  case 11: return domain & 0x000007ff; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE | TEN)
  case 12: return domain & 0x00000fff; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE | TEN | ELEVEN)
  case 13: return domain & 0x00001fff; // (ZERO | ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE | TEN | ELEVEN | TWELVE)
  // ... etc
}
The pattern we see here (note that the numbers are hex) is: 1 3 7 f 1f 3f 7f ff 1ff .... This pattern is equal to doubling 1 as many times as value and then subtracting 1 from the result.
You can double a number bitwise with a left shift (x << 1). The resulting bit mask can be ANDed with the domain to remove all those values, so that results in the compact:

domain & ((1 << value) - 1)
And we can drop the switch and other complexities. Yay.
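Wrapped up as a function, with the example from above (value is the lowest number that gets removed):

function domain_removeGteNum(domain, value) {
  return domain & ((1 << value) - 1); // keep only the bits below `value`
}
// domain_removeGteNum(2094911, 10) === 831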
Note that a jump table, like the switch above, is probably slightly faster when you can trust the runtime to actually use it as a jump table. As it stands, this is not the case. But perhaps we'll use that pattern for asm.js or wasm later; there we can actually rely on it becoming a proper jump table and it will be (slightly) cheaper.
When inspecting the configurations I see many cases like reifier[eq] -> [0,1] eq [1,1] = [0,1]. Also variations like reifier[eq] -> [0,0,2,2] eq [2,2] = [0,1], which may not be binary but follow the exact same pattern.
For one, the binary kind of reifiers can be simplified to an eq([0,1], [0,1]), or rather eq(left, result) or eq(right, result), when either left or right is already solved to [1,1]. Note that this trick doesn't work with non-binary domains, but we could construct a special propagator that does this mapping for us.

It may not be worth it, but perhaps it is, because you can simplify the whole thing with a lookup table.
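A sketch of the binary case; the helper names are assumptions:

function simplifyBoolReifier(solver, leftName, rightName, resultName, domains) {
  // C = (A == B) with B already solved to 1 and A bool-bound is just C = A
  if (domain_isSolvedTo(domains[rightName], 1) && domain_isBool(domains[leftName])) {
    solver.eq(leftName, resultName);
    return true; // the reifier and its propagators can be dropped
  }
  return false;
}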
Something like solver.addVar([0, 1]).name('something to debug'), or more likely solver.sum(vars).name('sum_of_stuff').