Comments (16)
Here is a patch adding the interface, and amending FunctionMatchQuery and FunctionScoreQuery to delegate their getCacheHelper() methods to their wrapped DoubleValuesSource. It also moves the static helper methods for DV queries and combining multiple CacheHelpers away from Weight and onto SegmentCachable.
[Legacy Jira: Alan Woodward (@romseygeek) on Nov 07 2017]
from stargazers-migration-test.
Can we reconsider the latter?
This is a bit too much indirection and abstraction IMO for something that's essentially a boolean method returning fi.dvGen == -1:

    @Override
    public IndexReader.CacheHelper getCacheHelper(LeafReaderContext context) {
      return SegmentCachable.getDocValuesCacheHelper(field, context);
    }

Given that this is an abstract method (required) on Weight, and given that we can only cache per-segment, can we please simplify it?
[Legacy Jira: Robert Muir (@rmuir) on Nov 07 2017]
This patch should simplify things a bit for implementers. Instead of directly exposing IndexReader.CacheHelper, getCacheHelper(LeafReaderContext) is now moved to an inner CacheLevel class. SegmentCachable declares a single getCacheLevel() method, and implementers can return one of the following:
- CacheLevel.SEGMENT for stuff that's always cachable
- CacheLevel.DOCVALUES(field) for things using docvalues
- CacheLevel.NEVER for stuff that's never cachable
Retrieving CacheHelpers and comparing different levels of cache is all done within the CacheHelper class itself.
[Legacy Jira: Alan Woodward (@romseygeek) on Nov 07 2017]
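The three levels described in that comment could be modelled roughly as follows. This is only a sketch under stated assumptions: the nested `Kind` enum, the `docValues(...)` factory, and the `combine(...)` method are hypothetical names invented for illustration, the rule for combining levels is a guess at what "comparing different levels of cache" might mean, and none of the types are the ones from the patch or from Lucene.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: these types only mirror the names used in the comment
// above; they are not the classes from the patch or from Lucene.
final class CacheLevelSketch {

  static final class CacheLevel {
    enum Kind { SEGMENT, DOCVALUES, NEVER }

    // Always cachable per segment.
    static final CacheLevel SEGMENT =
        new CacheLevel(Kind.SEGMENT, Collections.<String>emptySet());
    // Never cachable.
    static final CacheLevel NEVER =
        new CacheLevel(Kind.NEVER, Collections.<String>emptySet());

    final Kind kind;
    final Set<String> fields; // doc-values fields this level depends on

    private CacheLevel(Kind kind, Set<String> fields) {
      this.kind = kind;
      this.fields = Collections.unmodifiableSet(fields);
    }

    // Cachable only while the named doc-values field is unchanged.
    static CacheLevel docValues(String field) {
      return new CacheLevel(Kind.DOCVALUES, Collections.singleton(field));
    }

    // Assumed combination rule: NEVER wins, SEGMENT is neutral,
    // and two DOCVALUES levels merge their field sets.
    CacheLevel combine(CacheLevel other) {
      if (kind == Kind.NEVER || other.kind == Kind.NEVER) {
        return NEVER;
      }
      if (kind == Kind.SEGMENT) {
        return other;
      }
      if (other.kind == Kind.SEGMENT) {
        return this;
      }
      Set<String> merged = new TreeSet<>(fields);
      merged.addAll(other.fields);
      return new CacheLevel(Kind.DOCVALUES, merged);
    }
  }

  // Implementers answer a single question about how they may be cached.
  interface SegmentCachable {
    CacheLevel getCacheLevel();
  }

  public static void main(String[] args) {
    SegmentCachable byPrice = () -> CacheLevel.docValues("price");
    SegmentCachable constant = () -> CacheLevel.SEGMENT;
    CacheLevel combined = byPrice.getCacheLevel().combine(constant.getCacheLevel());
    System.out.println(combined.kind + " " + combined.fields);
  }
}
```

Merging field sets is the part that nested doc-values sources would exercise, which is presumably where a wrapper around several sources has to be careful.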
For reference, the method quoted above would now look like this:

    @Override
    public CacheLevel getCacheLevel() {
      return SegmentCachable.DOCVALUES(field);
    }

[Legacy Jira: Alan Woodward (@romseygeek) on Nov 08 2017]
Here's an updated patch with better testing for handling multiple nested docvalues SegmentCachable objects. The multiple-DV stuff in the previous patch was fairly comprehensively broken.
[Legacy Jira: Alan Woodward (@romseygeek) on Nov 08 2017]
Some of the code in this patch still uses SegmentCacheable, so I have trouble reviewing it. I think these are just leftovers?
[Legacy Jira: Robert Muir (@rmuir) on Nov 08 2017]
Also I would still like to see if we can make this simply a boolean method, isCacheable.
I am concerned that API decisions are being made on broken assumptions (e.g. LUCENE-8017). Function queries that depend on the document's score can never be cached, ever, because users can override *statistics methods in IndexSearcher and implement distributed search, or feed numbers from a random number generator, or whatever the hell they want. So it is actually false that such queries depend on the whole index; they are simply unsafe to cache.
So, I'd like to put an end to the theoretical discussion of top-level caching here, right now, and make the api minimal and something we can live with.
[Legacy Jira: Robert Muir (@rmuir) on Nov 08 2017]
"Also I would still like to see if we can make this simply a boolean method, isCacheable"
Here's a patch that does just that, and it does indeed simplify a whole bunch of code. I moved the method checking whether or not DV fields are cacheable to DocValues.isCacheable() - maybe it should be called isUpdated() or something instead?
[Legacy Jira: Alan Woodward (@romseygeek) on Nov 08 2017]
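As a rough illustration of the boolean approach, the check reduces to asking whether any of the doc-values fields involved has been updated in the segment, using the "generation is -1" test mentioned earlier in the thread. The sketch below is self-contained and uses stand-in types: `FieldMeta`, `Segment`, and `docValuesCacheable` are hypothetical names, and treating a missing field as cacheable is an assumption rather than something the thread states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch with stand-in types; none of these are Lucene classes.
final class IsCacheableSketch {

  // Stand-in for per-field metadata within one segment.
  static final class FieldMeta {
    final long docValuesGen; // -1 means the field's doc values were never updated

    FieldMeta(long docValuesGen) {
      this.docValuesGen = docValuesGen;
    }
  }

  // Stand-in for a segment: maps field names to their metadata.
  static final class Segment {
    private final Map<String, FieldMeta> fields = new HashMap<>();

    Segment with(String name, long docValuesGen) {
      fields.put(name, new FieldMeta(docValuesGen));
      return this;
    }

    FieldMeta field(String name) {
      return fields.get(name);
    }
  }

  // The single yes/no question asked of anything that may be cached per segment.
  interface SegmentCacheable {
    boolean isCacheable(Segment segment);
  }

  // Returns false as soon as one named field has doc-values updates.
  // Assumption: a field that does not exist in the segment does not block caching.
  static boolean docValuesCacheable(Segment segment, String... fieldNames) {
    for (String name : fieldNames) {
      FieldMeta meta = segment.field(name);
      if (meta != null && meta.docValuesGen != -1) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Segment segment = new Segment().with("price", -1).with("stock", 3);
    SegmentCacheable byPrice = s -> docValuesCacheable(s, "price");
    SegmentCacheable byBoth = s -> docValuesCacheable(s, "price", "stock");
    System.out.println(byPrice.isCacheable(segment));
    System.out.println(byBoth.isCacheable(segment));
  }
}
```

With a boolean, a wrapper over several sources only needs to check that every one of them answers true, so no separate type is required to represent combined cache levels.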
@rcmuir are you happy with the last patch?
[Legacy Jira: Alan Woodward (@romseygeek) on Nov 10 2017]
+1, this really looks a lot better to me.
[Legacy Jira: Robert Muir (@rmuir) on Nov 10 2017]
Commit 6e4f9a62e7cc221dcb49788ab683c87f764f2f4a in lucene-solr's branch refs/heads/branch_7x from @romseygeek
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6e4f9a6
LUCENE-8042: Add SegmentCachable interface
[Legacy Jira: ASF subversion and git services on Nov 10 2017]
Commit 317c9f359f3779725324fdb546fbb2ebe7fcf54c in lucene-solr's branch refs/heads/branch_7x from @romseygeek
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=317c9f3
LUCENE-8042: Fix precommit and CHANGES
[Legacy Jira: ASF subversion and git services on Nov 10 2017]
Commit 276e317e9424252d89df7596851c7cd3559d79b1 in lucene-solr's branch refs/heads/master from @romseygeek
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=276e317
LUCENE-8042: Add SegmentCachable interface
[Legacy Jira: ASF subversion and git services on Nov 10 2017]
Commit b5571031cab9199d7a74370f69d821f4676e2caa in lucene-solr's branch refs/heads/master from @romseygeek
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b557103
LUCENE-8042: Fix precommit and CHANGES
[Legacy Jira: ASF subversion and git services on Nov 10 2017]
Thanks for the reviews Robert!
[Legacy Jira: Alan Woodward (@romseygeek) on Nov 10 2017]
Really nice!
[Legacy Jira: David Smiley (@dsmiley) on Nov 10 2017]