Comments (11)
For info, I created a PR that can reduce the number of plans generated by the optimizer rule "interchange-adjacent-enumerations" in some cases:
#20201
For the example query provided in this issue, the change would reduce the number of generated plans from 128 to 32. That should make plan optimization considerably faster than it is right now for such queries.
Maybe even more can be achieved by adjusting the optimizer further; I will need to check.
from arangodb.
@matcho : I remember there were issues with index hints in some specific situations, which were fixed a while ago. If you are running the queries on ArangoDB version 3.8, I guess those issues are still present there. If the index hint issues still remain in recent versions of ArangoDB, they should be reported so we can fix them.
I think that, at least in theory, an index hint could lead to the optimizer producing fewer plans, for example if multiple indexes could be used to execute the query and the optimizer creates a different plan for each candidate index.
I say "in theory" here because I don't know off the top of my head if this is what is actually happening. I think at least in some cases the optimizer will pick the index it finds most appropriate and then go ahead with it, without creating additional plans for other indexes. But this may differ between optimizer rules.
It may also be that a different optimizer rule creates the additional plans here; that is not necessarily related to indexes and thus index hints.
from arangodb.
@matcho : thanks for reporting back.
I also checked our code and found that there are just two optimizer rules which will create additional plans: the one you mentioned, and another one that can modify COLLECT statements (specialize-collect). The latter rule doesn't apply here.
I don't think that disabling the interchange-adjacent-enumerations optimizer rule globally is a good idea, although it may help to limit the optimizer runtime of complex queries. There are at least some cases in which the rule can lead to better execution plans, although this is not guaranteed. However, if it helps your specific case and you don't see any negative side effects of disabling the rule, why not. But this is not really future-proof in case new queries are added or existing ones are changed; these may benefit from the optimizer rule running.
Maybe it would be a good idea to introduce additional limits or heuristics for plan creation. I need to look into it in more detail first though.
from arangodb.
As we don't want to change the defaults or behavior for the 3.11 stable branch too much, we will go ahead with the above change (#20201) only from 3.12 onwards. That change should reduce the number of query execution plans for this particular case and a few others.
Until then, it is still possible to adjust the maximum number of plans the optimizer generates, either by setting the per-query option (maxPlans) or by adjusting it globally via the startup option --query.optimizer-max-plans. There is also the option to turn off the optimizer rule "interchange-adjacent-enumerations" for certain queries or globally.
However, I wouldn't recommend doing either of these things without testing the effects on query runtimes first.
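For reference, a sketch of how the global cap would be set at server start (the value 64 is just an illustrative example, not a recommendation):

```shell
arangod --query.optimizer-max-plans 64
```

The per-query option is usually preferable, because the startup option affects every AQL query on the server.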
from arangodb.
Hi,
the Est. column of the table shows the number of documents the optimizer estimates will be passed from one node to the next, 857829800 at peak in your example.
If you use a later version, query profiling allows you to browse the actual figures of your query.
The resulting performance is a question of keeping these figures in range, and perhaps optimizing them by reworking the query. Some of that may be achievable through the optimizer, some not.
from arangodb.
@matcho : It's a bit hard to tell from just the execution plan where the query spends most of its time.
Can you also attach the profile output of the query? The profile can be retrieved from the web UI, or from the arangosh by running the command db._profileQuery({ query, bindVars }); in the respective database.
One thing that is striking from the explain output already is that the optimizer created 128 different query execution plans. This is the default maximum number of plans. So the optimizer may spend a lot of time trying to work through all the different execution plans, but this is currently a guess, because I don't know how much time was spent for query optimization and how much was spent for query execution. These numbers should also be included in the query profile, so hopefully it can shed at least a bit of light on the issue.
from arangodb.
Thank you very much Wilfried and Jan for your quick answers. @jsteemann here's the profile output. It seems that you're right: most of the execution time is consumed by "optimizing plan". Is there a workaround you could think of?
Query String (6748 chars - truncated..., cacheable: false):
LET project_pn_key = "bdtgf_jfm_sync_v1"
LET r = []
FOR o IN observations
FILTER o._key == '1003316854'
LET taxo = (
FOR t IN taxa
FILTER t.name == o.computed.current_name
FILTER project_pn_key IN t.sources
FOR sp IN t.species_list
FILTER sp.nameAccordingTo == project_pn_key
RETURN {
users_common_names: (
for ve in vernaculars
filter ve.species == t.name
filter ve.lang == 'fr'
collect term = ve.term with count into c
filter c > 1
sort c desc
return {term, c}
)
}
)
LET syno = (LENGTH(taxo)) ? [] : (
FOR t in taxa
FILTER o.computed.cu...
Execution plan:
Id NodeType Calls Items Runtime [s] Comment
1 SingletonNode 1 1 0.00001 * ROOT
195 IndexNode 1 1 0.00019 - FOR o IN observations /* primary index scan */
125 CalculationNode 1 1 0.00001 - LET #155 = o.`computed`.`votes` /* attribute expression */ /* collections used: o : observations */
231 SubqueryStartNode 1 2 0.00002 - LET partners_votes = ( /* subquery begin */
126 EnumerateListNode 1 2 0.00002 - FOR v IN #155 /* list iteration */
233 SubqueryStartNode 1 3 0.00001 - LET sp = ( /* subquery begin */
206 IndexNode 1 3 0.00048 - FOR t IN taxa /* hash index scan */
194 LimitNode 1 3 0.00001 - LIMIT 0, 1
132 CalculationNode 1 3 0.00001 - LET #159 = t.`sources` /* attribute expression */ /* collections used: t : taxa */
237 SubqueryStartNode 1 4 0.00025 - LET #56 = ( /* subquery begin */
133 EnumerateListNode 1 40 0.00004 - FOR s IN #159 /* list iteration */
208 IndexNode 1 4 0.00021 - FOR p IN projects /* hash index scan, projections: `_key` */
137 LimitNode 1 4 0.00000 - LIMIT 0, 1
193 LimitNode 1 4 0.00000 - LIMIT 0, 1
138 CalculationNode 1 4 0.00001 - LET #163 = p.`_key` /* attribute expression */ /* collections used: p : projects */
238 SubqueryEndNode 1 3 0.00001 - RETURN #163 ) /* subquery end */
235 SubqueryStartNode 1 4 0.00023 - LET #61 = ( /* subquery begin */
207 IndexNode 1 100 0.00111 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
147 CalculationNode 1 100 0.00004 - LET #169 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
148 CollectNode 1 13 0.00003 - COLLECT term = #169 AGGREGATE c = LENGTH() /* hash */
149 CalculationNode 1 13 0.00001 - LET #171 = (c > 1) /* simple expression */
150 FilterNode 1 11 0.00001 - FILTER #171
151 SortNode 1 11 0.00001 - SORT c DESC /* sorting strategy: standard */
152 CalculationNode 1 11 0.00001 - LET #173 = { "term" : term, "c" : c } /* simple expression */
236 SubqueryEndNode 1 3 0.00001 - RETURN #173 ) /* subquery end */
155 CalculationNode 1 3 0.00001 - LET #175 = { "species" : { "project" : #56[0] }, "users_common_names" : #61 } /* simple expression */
234 SubqueryEndNode 1 2 0.00000 - RETURN #175 ) /* subquery end */
158 CalculationNode 1 2 0.00001 - LET #177 = { "obs_id" : o.`_key`, "val" : v.`name`, "sp" : sp[0], "plus" : v.`plus` } /* simple expression */ /* collections used: o : observations */
232 SubqueryEndNode 1 1 0.00001 - RETURN #177 ) /* subquery end */
162 CalculationNode 1 1 0.00000 - LET #179 = o.`computed`.`tags` /* attribute expression */ /* collections used: o : observations */
227 SubqueryStartNode 1 2 0.00000 - LET obs_tags = ( /* subquery begin */
163 EnumerateListNode 1 1 0.00004 - FOR tag IN #179 /* list iteration */
205 IndexNode 1 1 0.00005 - FOR tc IN tags_catalog /* persistent index scan, projections: `public_display`, `target`, `values` */ FILTER ((tc.`target` == "observation") && (tc.`public_display` == true)) /* early pruning */
172 CalculationNode 1 1 0.00000 - LET #187 = tc.`values` /* attribute expression */ /* collections used: tc : tags_catalog */
229 SubqueryStartNode 1 1 0.00000 - LET #72 = ( /* subquery begin */
173 FilterNode 1 1 0.00000 - FILTER #187
175 EnumerateListNode 1 1 0.00006 - FOR tv IN #187 /* list iteration */
176 CalculationNode 1 1 0.00001 - LET #191 = (tv.`name` == tag.`value`) /* simple expression */
177 FilterNode 1 1 0.00000 - FILTER #191
190 LimitNode 1 1 0.00000 - LIMIT 0, 1
178 CalculationNode 1 1 0.00000 - LET #193 = { "key" : tv.`name`, "label" : (tv.`description`.`fr` || tv.`description`.`en`) } /* simple expression */
230 SubqueryEndNode 1 1 0.00000 - RETURN #193 ) /* subquery end */
181 CalculationNode 1 1 0.00000 - LET #195 = { "value" : #72[0] } /* simple expression */
228 SubqueryEndNode 1 1 0.00000 - RETURN #195 ) /* subquery end */
217 SubqueryStartNode 1 2 0.00000 - LET obs_votes = ( /* subquery begin */
200 IndexNode 1 4 0.00018 - FOR v IN votes /* persistent index scan */
225 SubqueryStartNode 1 7 0.00001 - LET #47 = ( /* subquery begin */
204 IndexNode 2 7 0.00010 - FOR u IN users /* primary index scan, projections: `email` */
191 LimitNode 2 7 0.00001 - LIMIT 0, 1
118 CalculationNode 2 7 0.00001 - LET #151 = { "email" : u.`email` } /* simple expression */ /* collections used: u : users */
226 SubqueryEndNode 1 4 0.00005 - RETURN #151 ) /* subquery end */
219 SubqueryStartNode 1 7 0.00001 - LET #43 = ( /* subquery begin */
201 IndexNode 1 5 0.00041 - FOR t IN taxa /* hash index scan */
82 CalculationNode 1 5 0.00001 - LET #129 = t.`sources` /* attribute expression */ /* collections used: t : taxa */
223 SubqueryStartNode 1 6 0.00020 - LET #36 = ( /* subquery begin */
83 EnumerateListNode 1 42 0.00005 - FOR s IN #129 /* list iteration */
203 IndexNode 1 43 0.00161 - FOR p IN projects /* hash index scan, projections: `_key`, `_pn_key`, `roles` */
85 CalculationNode 1 43 0.00003 - LET #131 = ((p.`_pn_key` == s) && ((LENGTH(p.`roles`) == 0) || (LENGTH(p.`roles`[* FILTER false RETURN #32_]) > 0))) /* simple expression */ /* collections used: p : projects */
86 FilterNode 1 41 0.00002 - FILTER #131
90 CalculationNode 1 41 0.00001 - LET order = ((p.`_key` == "guyane") ? 0 : 1) /* simple expression */ /* collections used: p : projects */
91 SortNode 1 6 0.00002 - SORT order ASC /* sorting strategy: constrained heap */
92 LimitNode 1 6 0.00001 - LIMIT 0, 1
192 LimitNode 1 6 0.00001 - LIMIT 0, 1
93 CalculationNode 1 6 0.00001 - LET #135 = p.`_key` /* attribute expression */ /* collections used: p : projects */
224 SubqueryEndNode 1 5 0.00001 - RETURN #135 ) /* subquery end */
221 SubqueryStartNode 1 6 0.00021 - LET #41 = ( /* subquery begin */
202 IndexNode 1 102 0.00087 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
102 CalculationNode 1 102 0.00003 - LET #141 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
103 CollectNode 1 15 0.00003 - COLLECT term = #141 AGGREGATE c = LENGTH() /* hash */
104 CalculationNode 1 15 0.00001 - LET #143 = (c > 1) /* simple expression */
105 FilterNode 1 13 0.00001 - FILTER #143
106 SortNode 1 13 0.00001 - SORT c DESC /* sorting strategy: standard */
107 CalculationNode 1 13 0.00001 - LET #145 = { "term" : term, "c" : c } /* simple expression */
222 SubqueryEndNode 1 5 0.00001 - RETURN #145 ) /* subquery end */
110 CalculationNode 1 5 0.00001 - LET #147 = { "species" : { "project" : #36[0] }, "users_common_names" : #41 } /* simple expression */
220 SubqueryEndNode 1 4 0.00001 - RETURN #147 ) /* subquery end */
121 CalculationNode 1 4 0.00002 - LET #153 = { "obs_id" : v.`observation_id`, "img_id" : v.`image_id`, "user_id" : v.`user_id`, "user" : #47[0], "type" : v.`type`, "val" : v.`value`, "sp" : ((v.`type` == "determination") ? #43 : null)[0] } /* simple expression */ /* collections used: v : votes */
218 SubqueryEndNode 1 1 0.00001 - RETURN #153 ) /* subquery end */
213 SubqueryStartNode 1 2 0.00001 - LET #20 = ( /* subquery begin */
198 IndexNode 1 1 0.00012 - FOR t IN taxa /* hash index scan */ FILTER ("bdtgf_jfm_sync_v1" IN t.`sources`) /* early pruning */
215 SubqueryStartNode 1 1 0.00001 - LET #18 = ( /* subquery begin */
199 IndexNode 1 1 0.00005 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
52 CalculationNode 1 1 0.00000 - LET #113 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
53 CollectNode 1 1 0.00001 - COLLECT term = #113 AGGREGATE c = LENGTH() /* hash */
54 CalculationNode 1 1 0.00001 - LET #115 = (c > 1) /* simple expression */
55 FilterNode 1 1 0.00000 - FILTER #115
56 SortNode 1 1 0.00001 - SORT c DESC /* sorting strategy: standard */
57 CalculationNode 1 1 0.00000 - LET #117 = { "term" : term, "c" : c } /* simple expression */
216 SubqueryEndNode 1 1 0.00000 - RETURN #117 ) /* subquery end */
40 CalculationNode 1 1 0.00000 - LET #103 = t.`species_list` /* attribute expression */ /* collections used: t : taxa */
60 CalculationNode 1 1 0.00000 - LET #119 = { "users_common_names" : #18 } /* simple expression */
41 EnumerateListNode 1 1 0.00005 - FOR sp IN #103 /* list iteration */
42 CalculationNode 1 1 0.00000 - LET #105 = ((sp.`nameAccordingTo` == "bdtgf_jfm_sync_v1") && (o.`computed`.`current_name` IN sp.`synonyms`)) /* simple expression */ /* collections used: o : observations */
43 FilterNode 1 1 0.00000 - FILTER #105
214 SubqueryEndNode 1 1 0.00000 - RETURN #119 ) /* subquery end */
209 SubqueryStartNode 1 2 0.00001 - LET taxo = ( /* subquery begin */
196 IndexNode 1 1 0.00010 - FOR t IN taxa /* hash index scan */ FILTER ("bdtgf_jfm_sync_v1" IN t.`sources`) /* early pruning */
211 SubqueryStartNode 1 1 0.00000 - LET #8 = ( /* subquery begin */
197 IndexNode 1 1 0.00006 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
23 CalculationNode 1 1 0.00000 - LET #91 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
24 CollectNode 1 1 0.00001 - COLLECT term = #91 AGGREGATE c = LENGTH() /* hash */
25 CalculationNode 1 1 0.00000 - LET #93 = (c > 1) /* simple expression */
26 FilterNode 1 1 0.00000 - FILTER #93
27 SortNode 1 1 0.00001 - SORT c DESC /* sorting strategy: standard */
28 CalculationNode 1 1 0.00000 - LET #95 = { "term" : term, "c" : c } /* simple expression */
212 SubqueryEndNode 1 1 0.00000 - RETURN #95 ) /* subquery end */
13 CalculationNode 1 1 0.00000 - LET #83 = t.`species_list` /* attribute expression */ /* collections used: t : taxa */
31 CalculationNode 1 1 0.00000 - LET #97 = { "users_common_names" : #8 } /* simple expression */
14 EnumerateListNode 1 1 0.00007 - FOR sp IN #83 /* list iteration */
15 CalculationNode 1 1 0.00000 - LET #85 = (sp.`nameAccordingTo` == "bdtgf_jfm_sync_v1") /* simple expression */
16 FilterNode 1 1 0.00001 - FILTER #85
210 SubqueryEndNode 1 1 0.00001 - RETURN #97 ) /* subquery end */
63 CalculationNode 1 1 0.00000 - LET syno = (LENGTH(taxo) ? [ ] : #20) /* simple expression */
64 CalculationNode 1 1 0.00001 - LET taxodata = (LENGTH(taxo) ? taxo[0] : syno[0]) /* simple expression */
184 CalculationNode 1 1 0.00001 - LET #197 = { "taxo" : taxodata, "votes" : obs_votes, "partners_votes" : partners_votes, "obs_tags" : obs_tags } /* simple expression */
185 ReturnNode 1 1 0.00001 - RETURN #197
Indexes used:
By Name Type Collection Unique Sparse Selectivity Fields Ranges
195 primary primary observations true false 100.00 % [ `_key` ] (o.`_key` == "1003316854")
206 idx_10219655942 hash taxa true false 100.00 % [ `name` ] (t.`name` == v.`name`)
208 idx_49132 hash projects false false 97.06 % [ `_pn_key` ] (p.`_pn_key` == s)
207 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
205 idx_1722945121587036160 persistent tags_catalog true false 100.00 % [ `name` ] (tc.`name` == tag.`name`)
200 idx_1723108440400723968 persistent votes true false 100.00 % [ `observation_id`, `user_id`, `image_id`, `type`, `tag` ] (v.`observation_id` == "1003316854")
204 primary primary users true false 100.00 % [ `_key` ] (u.`_key` == v.`user_id`)
201 idx_10219655942 hash taxa true false 100.00 % [ `name` ] (t.`name` == v.`value`)
203 idx_49132 hash projects false false 97.06 % [ `_pn_key` ] (p.`_pn_key` == s)
203 idx_49132 hash projects false false 97.06 % [ `_pn_key` ] (p.`_pn_key` == s)
202 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
198 idx_49216 hash taxa false false 100.00 % [ `synonyms[*]` ] (o.`computed`.`current_name` IN t.`synonyms`)
199 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
196 idx_10219655942 hash taxa true false 100.00 % [ `name` ] (t.`name` == o.`computed`.`current_name`)
197 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
Optimization rules applied:
Id RuleName
1 move-calculations-up
2 move-filters-up
3 remove-redundant-calculations
4 remove-unnecessary-calculations
5 remove-redundant-sorts
6 optimize-subqueries
7 move-calculations-up-2
8 move-filters-up-2
9 propagate-constant-attributes
10 use-indexes
11 remove-filter-covered-by-index
12 remove-unnecessary-calculations-2
13 move-calculations-down
14 fuse-filters
15 sort-limit
16 reduce-extraction-to-projection
17 move-filters-into-enumerate
18 splice-subqueries
Query Statistics:
Writes Exec Writes Ign Scan Full Scan Index Filtered Peak Mem [b] Exec Time [s]
0 0 0 243 7 19365888 0.51132
Query Profile:
Query Stage Duration [s]
initializing 0.00000
parsing 0.00035
optimizing ast 0.00005
loading collections 0.00002
instantiating plan 0.00045
optimizing plan 0.50284
executing 0.00760
finalizing 0.00017
from arangodb.
You could use e.g. index hints, if you haven't already?
from arangodb.
@matcho : the number of execution plans can be limited globally by adjusting the server startup option --query.optimizer-max-plans. That startup option defaults to 128, which means that the optimizer is allowed to generate up to 128 query execution plans for each AQL query.
Adjusting the value of this option will affect all AQL queries.
Then there is also the option to override the maximum number of plans per query.
This can be done by passing the maxNumberOfPlans attribute when executing the query. In the arangosh, it would look like this (https://docs.arangodb.com/3.11/aql/how-to-invoke-aql/with-arangosh/#maxnumberofplans):
maxNumberOfPlans = 64; /* whatever value */
db._query(query, bindVars, { maxNumberOfPlans });
The option can also be passed via the HTTP API (https://docs.arangodb.com/3.11/develop/http-api/queries/aql-queries/#create-a-cursor_body_options_maxNumberOfPlans).
Potentially this option is also supported by client drivers, but I don't know which one is in use.
While adjusting the maximum number of plans will at least lead to fewer plans being generated, the question remains why the optimizer creates that many plans in the first place.
The plan explosion must be caused by one or more of the applied optimizer rules:
1 move-calculations-up
2 move-filters-up
3 remove-redundant-calculations
4 remove-unnecessary-calculations
5 remove-redundant-sorts
6 optimize-subqueries
7 move-calculations-up-2
8 move-filters-up-2
9 propagate-constant-attributes
10 use-indexes
11 remove-filter-covered-by-index
12 remove-unnecessary-calculations-2
13 move-calculations-down
14 fuse-filters
15 sort-limit
16 reduce-extraction-to-projection
17 move-filters-into-enumerate
18 splice-subqueries
It would be interesting to know which one it is, because maybe it can be tuned to generate fewer plans.
If you want to investigate, you can run the query with specific optimizer rules turned off, just to check if the number of plans goes down then.
With the arangosh, the way to do this would be:
ruleName = "fuse-filters"; /* whatever rule to exclude */
db._explain(query, bindVars, { optimizer: { rules: ["-" + ruleName] } });
Running that will explain the query with the specific rule turned off. If turning off any of the rules leads to substantially fewer plans being generated, then this is a good indicator. Note that turning off some of the rules may still be unwanted, e.g. "use-indexes": while turning off that rule may lead to fewer execution plans, it may also mean that indexes will be used in fewer cases, which can have detrimental effects on the query runtime later.
So I suggest turning off optimizer rules only to find out which rules lead to the plan explosion.
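The rule-by-rule check described above can also be scripted. The sketch below uses a hypothetical helper: `explainPlans` stands for a function that explains the query with one rule disabled and returns the resulting plan count (in arangosh it could wrap db._explain(query, bindVars, { optimizer: { rules: ["-" + rule] } })); here it is replaced by a toy lookup table so the example is self-contained:

```javascript
// Given a function that returns the number of plans created when one rule
// is disabled, report every rule whose removal shrinks the plan count
// below the baseline (the count with all rules enabled).
function findExplodingRules(rules, explainPlans, baseline) {
  return rules.filter((rule) => explainPlans(rule) < baseline);
}

// Toy stand-in data: only one rule reduces the plan count in this example.
const fakeCounts = {
  "interchange-adjacent-enumerations": 16,
  "fuse-filters": 128,
  "sort-limit": 128,
};
const culprits = findExplodingRules(
  Object.keys(fakeCounts),
  (rule) => fakeCounts[rule],
  128
);
console.log(culprits); // → [ 'interchange-adjacent-enumerations' ]
```

With a real `explainPlans` wired to db._explain, this loop automates exactly the manual elimination process suggested above.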
from arangodb.
@dothebart do you mean that using indexHints could help the optimizer generate fewer plans? I tried to add an indexHint to every collection in the query, but the execution time did not change. I also tried adding forceIndexHint, and it changes nothing. By the way, the indexHint was refused for 2 collections in the query (projects line 62 and tags_catalog line 139), although when running explain it clearly shows that those 2 indexes will be used anyway… strange.
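For reference, the hints were added via the OPTIONS clause of FOR, along these lines (a sketch; the index name is just the one shown in the explain output above):

```aql
FOR p IN projects OPTIONS { indexHint: "idx_49132", forceIndexHint: true }
  FILTER p._pn_key == s
  RETURN p._key
```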
from arangodb.
First, limiting the number of plans indeed works:
db._query(q1, {}, { maxNumberOfPlans: 64 })
→ 297ms
db._query(q1, {}, { maxNumberOfPlans: 32 })
→ 123ms
db._query(q1, {}, { maxNumberOfPlans: 16 })
→ 88ms
db._query(q1, {}, { maxNumberOfPlans: 1 })
→ 20ms
Query result is valid in all cases above, even with 1 plan max.
Second, after investigating as you suggested, it appears that the rule that leads to a huge number of plans is interchange-adjacent-enumerations (found in the "Optimization rules with highest execution times" section of the explain output). After disabling it, the number of plans drops to 16 and the query execution time to 70ms (explain below).
From the documentation:
interchange-adjacent-enumerations: Try out permutations of FOR statements in queries that contain multiple loops, which may enable further optimizations by other rules.
It makes sense that a higher number of loops leads to an exploding number of permutations.
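A quick back-of-the-envelope sketch of that explosion (a hypothetical model, assuming each group of n adjacent permutable loops yields n! orderings, independent groups multiply, and the total is capped by the maximum number of plans):

```javascript
// Hypothetical model: each group of n adjacent, independent FOR loops
// admits n! orderings; counts from independent groups multiply, and the
// optimizer caps the total at the configured maximum number of plans.
function factorial(n) {
  return n <= 1 ? 1 : n * factorial(n - 1);
}

function estimatePlans(groupSizes, maxPlans) {
  const total = groupSizes.reduce((acc, n) => acc * factorial(n), 1);
  return Math.min(total, maxPlans);
}

// Seven groups of just 2 permutable loops already give 2^7 = 128 plans,
// exactly the default cap:
console.log(estimatePlans([2, 2, 2, 2, 2, 2, 2], 128)); // → 128
console.log(estimatePlans([2, 2, 2, 2, 2, 2, 2], 16));  // → 16
```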
This is great 🥳 thank you very much !
Do you think that disabling interchange-adjacent-enumerations server-wide could have a negative impact on certain queries?
PS: indexHint issue is still present in 3.11.4, will try to report it properly soon.
127.0.0.1:8529@plantnetv2> db._explain(q1, {}, { optimizer: { rules: [ "-interchange-adjacent-enumerations" ] } });
Query String (2797 chars, cacheable: true):
LET project_pn_key = 'bdtgf_jfm_sync_v1' LET r = [] FOR o IN observations FILTER o._key ==
'1003316854' LET taxo = ( FOR t IN taxa FILTER t.name == o.computed.current_name FILTER
project_pn_key IN t.sources FOR sp IN t.species_list FILTER sp.nameAccordingTo == project_pn_key
RETURN { users_common_names: ( for ve in vernaculars filter ve.species == t.name filter ve.lang ==
'fr' collect term = ve.term with count into c filter c > 1 sort c desc return {term, c} ) } ) LET
syno = (LENGTH(taxo)) ? [] : ( FOR t in taxa FILTER o.computed.current_name IN t.synonyms FILTER
project_pn_key IN t.sources FOR sp IN t.species_list FILTER sp.nameAccordingTo == project_pn_key
FILTER o.computed.current_name IN sp.synonyms RETURN { users_common_names: ( for ve in vernaculars
filter ve.species == t.name filter ve.lang == 'fr' collect term = ve.term with count into c filter c
> 1 sort c desc return {term, c} ) } ) LET taxodata = (LENGTH(taxo) ? taxo[0] : syno[0]) LET
obs_votes = ( FOR v IN votes FILTER v.observ...
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
195 IndexNode 1 - FOR o IN observations /* primary index scan */
125 CalculationNode 1 - LET #155 = o.`computed`.`votes` /* attribute expression */ /* collections used: o : observations */
231 SubqueryStartNode 1 - LET partners_votes = ( /* subquery begin */
126 EnumerateListNode 100 - FOR v IN #155 /* list iteration */
233 SubqueryStartNode 100 - LET sp = ( /* subquery begin */
206 IndexNode 100 - FOR t IN taxa /* hash index scan */
191 LimitNode 1 - LIMIT 0, 1
132 CalculationNode 1 - LET #159 = t.`sources` /* attribute expression */ /* collections used: t : taxa */
237 SubqueryStartNode 1 - LET #56 = ( /* subquery begin */
133 EnumerateListNode 100 - FOR s IN #159 /* list iteration */
208 IndexNode 100 - FOR p IN projects /* hash index scan, projections: `_key` */
137 LimitNode 1 - LIMIT 0, 1
190 LimitNode 1 - LIMIT 0, 1
138 CalculationNode 1 - LET #163 = p.`_key` /* attribute expression */ /* collections used: p : projects */
238 SubqueryEndNode 1 - RETURN #163 ) /* subquery end */
235 SubqueryStartNode 1 - LET #61 = ( /* subquery begin */
207 IndexNode 2 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
147 CalculationNode 2 - LET #169 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
148 CollectNode 2 - COLLECT term = #169 AGGREGATE c = LENGTH() /* hash */
149 CalculationNode 2 - LET #171 = (c > 1) /* simple expression */
150 FilterNode 2 - FILTER #171
151 SortNode 2 - SORT c DESC /* sorting strategy: standard */
152 CalculationNode 2 - LET #173 = { "term" : term, "c" : c } /* simple expression */
236 SubqueryEndNode 1 - RETURN #173 ) /* subquery end */
155 CalculationNode 1 - LET #175 = { "species" : { "project" : #56[0] }, "users_common_names" : #61 } /* simple expression */
234 SubqueryEndNode 100 - RETURN #175 ) /* subquery end */
158 CalculationNode 100 - LET #177 = { "obs_id" : o.`_key`, "val" : v.`name`, "sp" : sp[0], "plus" : v.`plus` } /* simple expression */ /* collections used: o : observations */
232 SubqueryEndNode 1 - RETURN #177 ) /* subquery end */
162 CalculationNode 1 - LET #179 = o.`computed`.`tags` /* attribute expression */ /* collections used: o : observations */
227 SubqueryStartNode 1 - LET obs_tags = ( /* subquery begin */
163 EnumerateListNode 100 - FOR tag IN #179 /* list iteration */
205 IndexNode 100 - FOR tc IN tags_catalog /* persistent index scan, projections: `public_display`, `target`, `values` */ FILTER ((tc.`target` == "observation") && (tc.`public_display` == true)) /* early pruning */
172 CalculationNode 100 - LET #187 = tc.`values` /* attribute expression */ /* collections used: tc : tags_catalog */
229 SubqueryStartNode 100 - LET #72 = ( /* subquery begin */
173 FilterNode 100 - FILTER #187
175 EnumerateListNode 10000 - FOR tv IN #187 /* list iteration */
176 CalculationNode 10000 - LET #191 = (tv.`name` == tag.`value`) /* simple expression */
177 FilterNode 10000 - FILTER #191
192 LimitNode 1 - LIMIT 0, 1
178 CalculationNode 1 - LET #193 = { "key" : tv.`name`, "label" : (tv.`description`.`fr` || tv.`description`.`en`) } /* simple expression */
230 SubqueryEndNode 100 - RETURN #193 ) /* subquery end */
181 CalculationNode 100 - LET #195 = { "value" : #72[0] } /* simple expression */
228 SubqueryEndNode 1 - RETURN #195 ) /* subquery end */
217 SubqueryStartNode 1 - LET obs_votes = ( /* subquery begin */
200 IndexNode 4291240 - FOR v IN votes /* persistent index scan */
225 SubqueryStartNode 4291240 - LET #47 = ( /* subquery begin */
204 IndexNode 4291240 - FOR u IN users /* primary index scan, projections: `email` */
193 LimitNode 1 - LIMIT 0, 1
118 CalculationNode 1 - LET #151 = { "email" : u.`email` } /* simple expression */ /* collections used: u : users */
226 SubqueryEndNode 4291240 - RETURN #151 ) /* subquery end */
219 SubqueryStartNode 4291240 - LET #43 = ( /* subquery begin */
201 IndexNode 4291240 - FOR t IN taxa /* hash index scan */
82 CalculationNode 4291240 - LET #129 = t.`sources` /* attribute expression */ /* collections used: t : taxa */
223 SubqueryStartNode 4291240 - LET #36 = ( /* subquery begin */
83 EnumerateListNode 429124000 - FOR s IN #129 /* list iteration */
203 IndexNode 858248000 - FOR p IN projects /* hash index scan, projections: `_key`, `_pn_key`, `roles` */
85 CalculationNode 858248000 - LET #131 = ((p.`_pn_key` == s) && ((LENGTH(p.`roles`) == 0) || (LENGTH(p.`roles`[* FILTER false RETURN #32_]) > 0))) /* simple expression */ /* collections used: p : projects */
86 FilterNode 858248000 - FILTER #131
90 CalculationNode 858248000 - LET order = ((p.`_key` == "guyane") ? 0 : 1) /* simple expression */ /* collections used: p : projects */
91 SortNode 858248000 - SORT order ASC /* sorting strategy: constrained heap */
92 LimitNode 1 - LIMIT 0, 1
194 LimitNode 1 - LIMIT 0, 1
93 CalculationNode 1 - LET #135 = p.`_key` /* attribute expression */ /* collections used: p : projects */
224 SubqueryEndNode 4291240 - RETURN #135 ) /* subquery end */
221 SubqueryStartNode 4291240 - LET #41 = ( /* subquery begin */
202 IndexNode 8582480 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
102 CalculationNode 8582480 - LET #141 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
103 CollectNode 6865984 - COLLECT term = #141 AGGREGATE c = LENGTH() /* hash */
104 CalculationNode 6865984 - LET #143 = (c > 1) /* simple expression */
105 FilterNode 6865984 - FILTER #143
106 SortNode 6865984 - SORT c DESC /* sorting strategy: standard */
107 CalculationNode 6865984 - LET #145 = { "term" : term, "c" : c } /* simple expression */
222 SubqueryEndNode 4291240 - RETURN #145 ) /* subquery end */
110 CalculationNode 4291240 - LET #147 = { "species" : { "project" : #36[0] }, "users_common_names" : #41 } /* simple expression */
220 SubqueryEndNode 4291240 - RETURN #147 ) /* subquery end */
121 CalculationNode 4291240 - LET #153 = { "obs_id" : v.`observation_id`, "img_id" : v.`image_id`, "user_id" : v.`user_id`, "user" : #47[0], "type" : v.`type`, "val" : v.`value`, "sp" : ((v.`type` == "determination") ? #43 : null)[0] } /* simple expression */ /* collections used: v : votes */
218 SubqueryEndNode 1 - RETURN #153 ) /* subquery end */
213 SubqueryStartNode 1 - LET #20 = ( /* subquery begin */
198 IndexNode 1 - FOR t IN taxa /* hash index scan */ FILTER ("bdtgf_jfm_sync_v1" IN t.`sources`) /* early pruning */
215 SubqueryStartNode 1 - LET #18 = ( /* subquery begin */
199 IndexNode 2 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
52 CalculationNode 2 - LET #113 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
53 CollectNode 2 - COLLECT term = #113 AGGREGATE c = LENGTH() /* hash */
54 CalculationNode 2 - LET #115 = (c > 1) /* simple expression */
55 FilterNode 2 - FILTER #115
56 SortNode 2 - SORT c DESC /* sorting strategy: standard */
57 CalculationNode 2 - LET #117 = { "term" : term, "c" : c } /* simple expression */
216 SubqueryEndNode 1 - RETURN #117 ) /* subquery end */
40 CalculationNode 1 - LET #103 = t.`species_list` /* attribute expression */ /* collections used: t : taxa */
60 CalculationNode 1 - LET #119 = { "users_common_names" : #18 } /* simple expression */
41 EnumerateListNode 100 - FOR sp IN #103 /* list iteration */
42 CalculationNode 100 - LET #105 = ((sp.`nameAccordingTo` == "bdtgf_jfm_sync_v1") && (o.`computed`.`current_name` IN sp.`synonyms`)) /* simple expression */ /* collections used: o : observations */
43 FilterNode 100 - FILTER #105
214 SubqueryEndNode 1 - RETURN #119 ) /* subquery end */
209 SubqueryStartNode 1 - LET taxo = ( /* subquery begin */
196 IndexNode 1 - FOR t IN taxa /* hash index scan */ FILTER ("bdtgf_jfm_sync_v1" IN t.`sources`) /* early pruning */
211 SubqueryStartNode 1 - LET #8 = ( /* subquery begin */
197 IndexNode 2 - FOR ve IN vernaculars /* hash index scan, projections: `term` */
23 CalculationNode 2 - LET #91 = ve.`term` /* attribute expression */ /* collections used: ve : vernaculars */
24 CollectNode 2 - COLLECT term = #91 AGGREGATE c = LENGTH() /* hash */
25 CalculationNode 2 - LET #93 = (c > 1) /* simple expression */
26 FilterNode 2 - FILTER #93
27 SortNode 2 - SORT c DESC /* sorting strategy: standard */
28 CalculationNode 2 - LET #95 = { "term" : term, "c" : c } /* simple expression */
212 SubqueryEndNode 1 - RETURN #95 ) /* subquery end */
13 CalculationNode 1 - LET #83 = t.`species_list` /* attribute expression */ /* collections used: t : taxa */
31 CalculationNode 1 - LET #97 = { "users_common_names" : #8 } /* simple expression */
14 EnumerateListNode 100 - FOR sp IN #83 /* list iteration */
15 CalculationNode 100 - LET #85 = (sp.`nameAccordingTo` == "bdtgf_jfm_sync_v1") /* simple expression */
16 FilterNode 100 - FILTER #85
210 SubqueryEndNode 1 - RETURN #97 ) /* subquery end */
63 CalculationNode 1 - LET syno = (LENGTH(taxo) ? [ ] : #20) /* simple expression */
64 CalculationNode 1 - LET taxodata = (LENGTH(taxo) ? taxo[0] : syno[0]) /* simple expression */
184 CalculationNode 1 - LET #197 = { "taxo" : taxodata, "votes" : obs_votes, "partners_votes" : partners_votes, "obs_tags" : obs_tags } /* simple expression */
185 ReturnNode 1 - RETURN #197
Indexes used:
By Name Type Collection Unique Sparse Selectivity Fields Ranges
195 primary primary observations true false 100.00 % [ `_key` ] (o.`_key` == "1003316854")
206 idx_10219655942 hash taxa true false 100.00 % [ `name` ] (t.`name` == v.`name`)
208 idx_49132 hash projects false false 97.06 % [ `_pn_key` ] (p.`_pn_key` == s)
207 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
205 idx_1722945121587036160 persistent tags_catalog true false 100.00 % [ `name` ] (tc.`name` == tag.`name`)
200 idx_1723108440400723968 persistent votes true false 100.00 % [ `observation_id`, `user_id`, `image_id`, `type`, `tag` ] (v.`observation_id` == "1003316854")
204 primary primary users true false 100.00 % [ `_key` ] (u.`_key` == v.`user_id`)
201 idx_10219655942 hash taxa true false 100.00 % [ `name` ] (t.`name` == v.`value`)
203 idx_49132 hash projects false false 97.06 % [ `_pn_key` ] (p.`_pn_key` == s)
202 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
198 idx_49216 hash taxa false false 100.00 % [ `synonyms[*]` ] (o.`computed`.`current_name` IN t.`synonyms`)
199 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
196 idx_10219655942 hash taxa true false 100.00 % [ `name` ] (t.`name` == o.`computed`.`current_name`)
197 idx_10856933578 hash vernaculars false false 46.39 % [ `species`, `lang` ] ((ve.`species` == t.`name`) && (ve.`lang` == "fr"))
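The index table above shows that several candidate indexes exist on `taxa`, which is one of the situations in which the optimizer may enumerate extra plans. As discussed in the comments, an index hint can pin the optimizer to a single candidate. This is a hypothetical sketch only — the index name is taken from the table above, and the filter attribute is an assumption about the schema:

```aql
/* Sketch: force the optimizer to consider only one index on `taxa`.
   idx_10219655942 is the hash index on `name` listed above;
   forceIndexHint makes the query fail rather than silently fall back
   if the hinted index cannot be used. */
FOR t IN taxa OPTIONS { indexHint: "idx_10219655942", forceIndexHint: true }
  FILTER t.name == @name
  RETURN t
```

Whether this actually reduces the number of generated plans depends on which optimizer rule creates them; as noted above, in some cases the optimizer picks one index up front regardless of hints.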
Functions used:
Name Deterministic Cacheable Uses V8
LENGTH true true false
Optimization rules applied:
Id RuleName
1 move-calculations-up
2 move-filters-up
3 remove-redundant-calculations
4 remove-unnecessary-calculations
5 remove-redundant-sorts
6 optimize-subqueries
7 move-calculations-up-2
8 move-filters-up-2
9 propagate-constant-attributes
10 use-indexes
11 remove-filter-covered-by-index
12 remove-unnecessary-calculations-2
13 move-calculations-down
14 fuse-filters
15 sort-limit
16 reduce-extraction-to-projection
17 move-filters-into-enumerate
18 splice-subqueries
Optimization rules with highest execution times:
RuleName Duration [s]
specialize-collect 0.00893
use-indexes 0.00507
splice-subqueries 0.00465
move-calculations-up-2 0.00458
move-calculations-down 0.00396
539 rule(s) executed, 16 plan(s) created
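Given the 16 plans created here, one way to confirm that `interchange-adjacent-enumerations` is responsible — without disabling it globally — is to turn the rule off for this single query and compare the plan count. A sketch for arangosh (requires a running ArangoDB instance; the query string is a placeholder, not the original query from this issue):

```js
// Sketch: disable the rule per query via the optimizer.rules option,
// then compare stats.plansCreated against the unrestricted run.
var stmt = db._createStatement({
  query: "FOR o IN observations FILTER o._key == @key RETURN o", // placeholder
  bindVars: { key: "1003316854" },
  options: { optimizer: { rules: ["-interchange-adjacent-enumerations"] } }
});
print(stmt.explain().stats.plansCreated);
```

The same `optimizer.rules` option can be passed in the `options` object of the HTTP `/_api/explain` and `/_api/cursor` endpoints, so the experiment does not require changing any server-wide configuration.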
from arangodb.