Giter VIP home page Giter VIP logo

cbelasticsearch's People

Contributors

blusol850 avatar davidsf avatar elpete avatar eppinger avatar jclausen avatar lmajano avatar michaelborn avatar sbawaney avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

cbelasticsearch's Issues

client.saveAll() does not properly handle errors if throwOnError=false

If one calls the saveAll() method without passing throwOnError=true, the HyperClient will actually cause an error on line 1121 because the expected items do not exist in the returned result.

https://github.com/coldbox-modules/cbox-elasticsearch/blob/development/modules/cbelasticsearch/models/io/HyperClient.cfc#L1121

So basically, if throwOnError=true is not passed, the Hyper client fails to handle errors whatsoever, and still attempts to loop over items even if they do not exist.

I wouldn't mind sending a small PR to fix this!

Suggested fix:

param saveResult.items = []

Add track_total_hits property

The total response of a query (I think since version 7), set a struct like:

"hits": { 
    "total": {
      "value": 10000,
      "relation": "gte"
    },
...

Sometimes we need the real total value, not the estimation.

The are a track_total_hits property to the query Elasticsearch doc that when set to true it tracks the correct total:

"total": 
      "value": 2734635,
      "relation": "eq"
    }

(it's posible to set to a number too, see the docs)

Cannot update index settings with CBElasticsearch

This code should update the index settings:

getIndexBuilder().patch( name = "reviews", settings = { "refresh_interval" : "1s" } );

Instead, nothing happens. ๐Ÿ›

This seems to be an omission in HyperBuilder.applyIndex()... i.e. there is no code path which deals with configuring index settings on existing indices.

Document.setId() does not persist??

The setId() method call on the Document component doesn't seem to actually set variables.id in the component.

application.wirebox.getInstance( "Document@CBElasticsearch" )
						.setIndex( "classes" )
						.setId( rc.id )
						.get()

Running this gives me the following error:

Screenshot from 2021-04-25 20-51-30

And yep, I confirmed that rc.id is a valid document ID. Changing to the following made the error "go away":

application.wirebox.getInstance( "Document@CBElasticsearch" )
						.get(
							index = "classes",
							id = rc.id
						)

Has something changed on this? I scanned the Document.cfc, but I'm not seeing any variables.id nonsense... unless somehow the reset is being called before get() runs. ๐Ÿค”

Properties population to new scopes reserved words to query

When using the following instantiation of the searchBuilder:

searchBuilder.new( "indexName", "typeName", { "maxrows" : 1000} )

The maxrows value in the properties arg is being scoped as the query and the maxrows value in the DSL provided to execution.

Add connection reaping strategy

Currently, when using the JEST client, connection reaping is disabled to prevent connection issues with the client singleton. This is seen in the log message:

INFO io.searchbox.client.JestClientFactory - Idle connection reaping disabled...

Implement a connection reaping strategy and pooling, similar to that used in CBMongoDB

Logstash appender fallback broken on ColdBox 7

See this partial stack trace:

lucee.runtime.exp.ExpressionException: The parameter [appenders] to function [getAppendersMap] is required but was not passed in.
	at lucee.runtime.type.UDFImpl.defineArguments(UDFImpl.java:143)
	at lucee.runtime.type.UDFImpl._call(UDFImpl.java:347)
	at lucee.runtime.type.UDFImpl.call(UDFImpl.java:223)
	at lucee.runtime.ComponentImpl._call(ComponentImpl.java:698)
	at lucee.runtime.ComponentImpl._call(ComponentImpl.java:586)
	at lucee.runtime.ComponentImpl.call(ComponentImpl.java:1933)
	at lucee.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:787)
	at lucee.runtime.PageContextImpl.getFunction(PageContextImpl.java:1775)
	at models.logging.logstashappender_cfc$cf.udfCall1(/cbElasticsearch/models/logging/LogstashAppender.cfc:160)
	at models.logging.logstashappender_cfc$cf.udfCall(/cbElasticsearch/models/logging/LogstashAppender.cfc)

Looks like the getAppendersMap() call in LogstashAppender.cfc is missing the required appenders parameter.

var appendersMap  = application.wirebox.getLogbox().getAppendersMap();

Should be:

var appendersMap  = application.wirebox.getLogbox().getAppendersMap(
	application.wirebox.getLogbox().getConfig().getRoot().appenders
);

Connect to various elasticsearch clusters

We have two elasticsearch clusters (one with user data and another with stats).

As I see, the module only can connect to one cluster, correct?

Can I use the module to connect to two clusters?

Support method chaining from filterTerm, filterTerms

Both of these search builder fields do not return anything, and thus contradict the standard SearchBuilder api.

  • filterTerm( name, value )
  • filterTerms( array name, array value )

I can push up a PR for this as well, provided we agree it's a good fix.

HyperClient.getAliases() is faulty

The return is only correct if only one "index" is assigned to an "alias".

However, it is possible that multiple "indexes" are assigned to an "alias".

Example Response from the ES API

{
     "s4_page_de_v1.3": {
        "aliases": {
            "s4_de": {}
        }
    },
    "s4_news_it_v1.3": {
        "aliases": {
            "s4_it": {}
        }
    },
    "s4_page_it": {
        "aliases": {
            "s4_it": {}
        }
    },
    "s4_page_it_v1.3": {
        "aliases": {
            "s4_it": {}
        }
    }
}

I fixed this locallly by changes in Line 523 in the HyperClient and use an array instead of a struct

if ( !aliasesMap.aliases.keyExists(alias) ) {
	aliasesMap.aliases[ alias ] = [];
}

aliasesMap.aliases[ alias ].append(indexName);

Should I open a pull request for this ?

filterterms() function is broken

since commit 4a2e943

Their is a difference between filterTerm() and filterTerms()

You change generates now:

"query": {
      "bool": {
        "filter": {
          "bool": {
            "must": [
              { "term": { "siteconstant_origin": "ch_sozjobs" } },
              { "term": { "siteconstant_origin": "ch_spitexjobs" } },
              { "term": { "type": "joboffer" } },
              { "term": { "trashstatus": "0" } }
            ]
          }
        }
   ....

and for this change

  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "siteconstant_origin": ["ch_sozjobs", "ch_spitexjobs"]
              }
            },
            { "term": { "type": "joboffer" } },
            { "term": { "trashstatus": "0" } }
          ]
        }
      }
  ....

Your change ignored the different handling of "terms" vs "term". For "terms" only one condition must be fulfilled

Add shouldMatchAnyTerm() method

Add a shouldMatchAnyTerm method, which allows a field and an array of values. This will prevent "hacky" workarounds to the shouldMatch method like the following:

for( var item in listToArray( listOfValues ) ){
	search.shouldMatch( "term", { "myfield" : item  } );
}

Investigate whether additional methods for non-term search patterns with an array of values may be implemented, as well.

Support method chaining from Document.setValue()?

If the Document.setValue() function returned true as opposed to nothing (void), we could chain multiple setValue() calls together:

var myDoc = getInstance( "Document@cbElasticSearch" )
    .new(
        index = "my_index"
        type = "_doc"
    )
    .setValue( "name", "Fred" )
    .setValue( "age", "89" )
    .setValue( "isMember", true )
    .save();

The populate() method returns the initialized Document, so it seems this is already an established precedent?

Need better error message returns for "all shards failed"

Occasionally an error message returned from the Jest client is pretty useless - i.e. "all shards failed". If we can dig deeper into the error response from Elasticsearch, we can provide a much more useful message to the user/developer.

I tried the following in SearchResult.cfc, however I am having trouble testing it - I don't know how to programmatically generate an "all shards failed" error.

// original "reason"
var reason = structKeyExists( arguments.properties, "error" ) ? " Reason: #arguments.properties.error.reason#" : "";

// new, more useful / deeper "reason"
if ( reason == "all shards failed" && structKeyExists( arguments.properties, "error" ) && structKeyExists( arguments.properties.error, "caused_by" ) ){
    reason = arguments.properties.error["caused_by"].reason;
}
// throw...

Add support for updating index settings

It would be really nice if we had the ability to programmatically update index settings from CFML (without using Postman). Right now the JestClient ApplyIndex() function definitely only supports creating new indices.

I believe this is the class we need to utilize:

https://github.com/searchbox-io/Jest/blob/fe19dc4396e08f0e03c4e749a60b8a0ed398f084/jest-common/src/main/java/io/searchbox/indices/settings/UpdateSettings.java

Then any of these settings could be supported using a key/value struct:

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html#indices-update-settings

I intend to take a crack at this at some point.

FYI, I'm highly interested in this to improve write performance during a mass-serialization on startup.

e.g. issue this call before mass-inserting documents

PUT /content/_settings
{
    "index" : {
        "refresh_interval" : "-1"
    }
}

And of course issue a reversal afterwards.

In the README.md:28 the wrong repository/module is linked

Is:

* **Code**: [https://github.com/coldbox-modules/cbox-cborm](https://github.com/coldbox-modules/cbox-cborm)

Should be:

* **Code**: [https://github.com/coldbox-modules/cbox-elasticsearch](https://github.com/coldbox-modules/cbox-elasticsearch)

Bug: Logstash Appender fails if tagContext "Template" file does not exist.

Try to run this code on ACF:

javaCast( "integer", 0 );

In ACF, that one-liner will create an exception where tagContext[1].TEMPLATE is actually "eg". This obviously makes no sense, but it seems we may need to safely handle cases where the referenced template file does not exist.

See screenshot:

Screenshot from 2021-08-12 13-39-53

I could try to add a workaround, but I'm not sure what should happen here. I would think downstream stuff like stachebox would expect the source code keys to be populated in the error log JSON.

[docs] Indexing Basics section out of date

We need to update the index section of the docs:

In short, indexes have a higher overhead and make the aggregation of search results between types very more expensive. If it is desired that your application search interfaces return multiple entity or domain types, then those should respresent distinctive types within a single index, allowing them to be aggregated, sorted, and ordered in search results.

This is no longer valid in ES 7.0.

Need indexbuilder `update()` method

There is no obvious way to update an index, and calling indexbuilder.new().save() is both counter-intuitive and actually returns a warning that the index already exists.

https://github.com/coldbox-modules/cbox-elasticsearch/blob/development/modules/cbelasticsearch/models/JestClient.cfc#L348

We can easily make this intuitive with the following changes:

  1. Adding an update() method in IndexBuilder which wraps .new().
  2. Removing the "Index NAME already exists" warning
  3. Adding docs for the .update() method - see Issue #57 .

elasticSearch Logbox Appender

this will break ES >6

instance.DEFAULTS = { "index" : "logbox", "type" : "logs-" & dateFormat( now(), "yyyy-mm-dd" ), "rotate" : true, "rotationDays" : 30, "rotationFrequency": 5, "ensureChecks" : true, "autoCreate" : true };

In ES > 6 is only one Type per Index allowed.

I think it would be better to set:
"index" : "logs-" & dateFormat( now(), "yyyy-mm-dd" )", "type" : "logbox"

returnvalue in jestClient newResult() ?

Should this not be "searchResult" instead of "Document" in the newResult() function

	/**
	* SearchResult provider
	**/
	Document function newResult() provider="SearchResult@cbElasticsearch"{}
	
	/**
	* Configure instance once DI is complete
	**/
	any function onDIComplete(){

		configure();

	}

Client.saveAll() does not detect document save error

If you attempt to save a document with an invalid field value (like "George" in an integer field), JestClient will throw an error at line 1072 because it expects item.index.result to exist:

for( var item in saveResult.items ){
    arrayAppend(
        results,
        {
            "_id"    : item.index[ "_id" ],
            "result" : item.index.result // <- throws var undefined error if document save resulted in error
        }
    );
}

Instead, we should detect item.index.error and either

  1. throw an error, or
  2. return the error struct back to the user.

I'd think option 2 makes more sense for the saveAll() method since it's a bulk API... however, most code probably wouldn't watch for that and could "save" 10,000 rows and never realize there was an error. So option 1 would also make a lot of sense. ๐Ÿคทโ€โ™‚

Option 1:

for( var item in saveResult.items ){
    if ( structKeyExists( item.index, "error" ) ){
        throw(
            type="cbElasticsearch.JestClient.PersistenceException",
            message="Document could not be saved.  The error returned was: #( isSimpleValue( saveResult.error ) ? item.index.error : item.index.error.reason )#",
            extendedInfo=serializeJSON( item.index.error, false, listFindNoCase( "Lucee", server.coldfusion.productname ) ? "utf-8" : false )
        );
    }
    arrayAppend(
        results,
        {
            "_id"    : item.index[ "_id" ],
            "result" : item.index.result
        }
    );
}

Option 2:

for( var item in saveResult.items ){
    var result = {
        "_id"    : item.index[ "_id" ]
    };
    if ( structKeyExists( item.index, "error" ) ){
        result["error"] = item.index.error;
        throw(
            type="cbElasticsearch.JestClient.PersistenceException",
            message="Document could not be saved.  The error returned was: #( isSimpleValue( saveResult.error ) ? item.index.error : item.index.error.reason )#",
            extendedInfo=serializeJSON( item.index.error, false, listFindNoCase( "Lucee", server.coldfusion.productname ) ? "utf-8" : false )
        );
    }
    if ( structKeyExists( item.index, "result" ) ){
        result["result"] = item.index.result;
    }
    arrayAppend(
        results,
        result
    );
}

Add fluency convenience methods for control of query item placement

Currently many of the query items are assumed to be at the root level and do not support placement nestings ( e.g. query.bool.must[].term or query.bool.must[].bool.should[]

Add methods for fluency ( or possibly separate objects ) which can allow for more precise placement of query operators, without resorting to code like the following:

            var q = search.getQuery();
            param q.bool = {};
            param q.bool.filter = {};
            param q.bool.filter.bool.must = [];
            arrayAppend( q.bool.filter.bool.must, {
                "wildcard" : {
                    "#arguments.field#" : {
                        "value" : arguments.query
                    }
                }
            } );

Aggregation only Querys doesn't support the setMaxrows() anymore

This is caused by the commit 3459d8d

If i only want to aggregate, i have no query. In my opinion "from" and "size" could take the defaults. and are not dependant to the query

       if ( !isNull( variables.query ) && !structIsEmpty( variables.query ) ) {
            dsl[ "query" ] = variables.query;
        }
        
        dsl[ "from"] = variables.startRow;
        dsl[ "size" ] = variables.maxRows;

[feat] Missing AliasBuilder.save() method

Seeing as we can run getInstance( "Document@cbElasticsearch" ).new(...).save(), and getInstance( "IndexBuilder@cbElasticsearch" ).new(...).save(), why can't we have getInstance( "AliasBuilder@cbElasticsearch" ).new(...).save()?

IMHO it's a little awkward to require passing the AliasBuilder instance in to client.applyAliases(), though I understand it would still be necessary for saving multiple aliases at once.

I would be happy to tackle this myself, by the way. :)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.