Comments (11)
@ErickTamayo I've discussed the issue with elasticsearch devs (both the php and application) in elastic/elasticsearch#25156 .
This issue still applies now as I think it's an issue with this package.
In short: As of now, whenever we save a model into the database or update it, it's always successful even when the indexing fails (I saw that models are updated using bulk update if this matters). The hooks that this package sets up, get run but the results get capsulated away from our application, which makes error-detection hard, and no exception is being thrown when indexing fails, which makes it impossible to catch such errors. Can you suggest steps to get access to the json response body when saving a model, as mentioned by @polyfractal in elastic/elasticsearch#25156 ?
Else I suggest that this package checks the response body when calling searchable()
and if any errors are detected in the response body, that it throws the response body as message.
We have updated everything to the newest versions that we can support:
"require": {
"laravel/framework": "5.4.*",
(...)
"tamayo/laravel-scout-elastic": "^3.0",
(...)
},
(as for the scout version, it is determined by the dependency list, as of now only this package uses scout in our application)
If it should throw on indexing when the bulk update response body contains errors, can you tell us where we could look / determine why we don't get exceptions when an invalid index operation is issued?
Edit:
Me and the other dev have tracked the code down to here
elasticsearch-php
does not throw on bulk update, and this package does nothing with the result from the bulk
operation.
I would suggest to rewrite the code to something that checks the result and lets the user know in any way, in my opinion with an exception, like:
public function update($models)
{
$params['body'] = [];
$models->each(function($model) use (&$params)
{
$params['body'][] = [
'update' => [
'_id' => $model->getKey(),
'_index' => $this->index,
'_type' => $model->searchableAs(),
]
];
$params['body'][] = [
'doc' => $model->toSearchableArray(),
'doc_as_upsert' => true
];
});
$result = $this->elastic->bulk($params);
if (isset($result['errors']) === true && $result['errors'] === true)
{
throw new Exception(json_encode($result));
}
}
or either expose an hasErrors()
/getErrors()
or similar function.
from laravel-scout-elastic.
Hey, sorry it took so long, we analysed the issue and it's a bug in Elasticsearch engine / application. (it took us a long time to update everything to newest versions)
See elastic/elasticsearch#25156
It seems that elasticsearch-php does not throw on a HTTP 200 OK status code, but does on everything else. What actually happens here is that elasticsearch returns a 200 OK code when it fails (ironically, in the JSON the status is 400).
We decided to take out the big guns and sniff the network and compare the traffic.
from laravel-scout-elastic.
Thats weird, I'm getting exceptions from ES just fine, I do not enclose anything in a try catch can you tell me which version of ES are you using?
from laravel-scout-elastic.
"tamayo/laravel-scout-elastic": "^1.0",
<- That's our composer.json require entry for your plugin. Is this fixed in 2.x and 3.x?
from laravel-scout-elastic.
Hmm not sure, as I don't recall doing changes on that at all.
What version of Laravel/Scout are you using?
from laravel-scout-elastic.
5.3 at the moment but it does come from a laravel 4 project (that has been manually upgraded / ported). That is was L4 shouldn't matter. I'll try to investigate further asap when I'm at work.
from laravel-scout-elastic.
Alright, If you can please update to the latest version as Scout have been updated too
from laravel-scout-elastic.
Just chiming in to say that I agree with @grasmanek94's suggested code change (or something similar). Bulk requests will only throw exceptions when something about the coordinating node goes wrong, like the node being unavailable or a gross parsing exception (like if the bulk is improperly formatted).
The rest of the errors a user may care about will be embedded in the bulk response. Ignoring bulk responses can be especially dangerous because some documents may return with EsRejectedExecution
exceptions, which is basically saying "the node is too busy and the bulk queue is full, you need to retry this document later". If you don't catch those, you can easily drop data on the floor and not realize it unless you do a comparison between source and Elasticsearch later.
from laravel-scout-elastic.
@grasmanek94 Elasticsearch not throwing any exceptions has caused me a lot of trouble. When some documents (of about 80000) were not indexed, it was pretty tiresome to find the reason. I see, you tracked the code down to there.
I went a little deeper and found this in the repo elastic/elasticsearch-php.
$future->promise()->then(
//onSuccess
function ($response) {
$this->retryAttempts = 0;
// Note, this could be a 4xx or 5xx error
},
//onFailure
function ($response) {
// Ignore 400 level errors, as that means the server responded just fine
if (!(isset($response['code']) && $response['code'] >=400 && $response['code'] < 500)) {
// Otherwise schedule a check
$this->connectionPool->scheduleCheck();
}
}
);
return $future;
So the devs are fully aware, that there could be errors, and ignore them. In the failure callback, all status codes in the range 400-499 are ignored. If the status code is outside this range, the implementation of scheduleCheck() of \Elasticsearch\ConnectionPool\AbstractConnectionPool is called. I looked through the implementations, but they do pretty much nothing, and definitely no error handling or exception throwing.
To put it in a nutshell: All requests fail silently. 🤐
But wait, it gets worse:
I added some logging into the aforementioned code to log all responses. It turned out, even when the requests were failing (I reproduced that via curl on the command line), they were logged with properties like successful: 1, failed: 0, status: 201
and so on. There wasn't a single status 400 in the log after trying to insert some 80000 documents one-by-one, of which some failed. Yet, when I tried the same via curl on the commandline, with the documents I identified as erroneous, I got a status 400, because of a mapping error.
It all totally made sense, and I was able to pinpoint the problems and fix all documents/mappings, although it was a chore. If only there was some form of error reporting!
from laravel-scout-elastic.
Hi @grasmanek94, I'm afraid you're interpreting that bit of code incorrectly. The onFailure
section in that snippet only deals with if the connection pool should schedule a liveness check. E.g. if there was a 5xx error (something on the server went wrong), it's likely that a node died or the master is down or something and we should allow connection pools a chance to rebuild their list of nodes. The sniffing pool will go fetch a new list of nodes, the static connection pool is just a no-op, etc.
The way it works is like this:
- We default to using RingPHP's
CurlMultiHandler
to handle the request - When the
Transport
object is built, we wrap the handler (CurlMultiHandler
or something else, if the user overrides it). - This wrapping does things like logging, handling network-level curl exceptions, non-network fatal errors, as well as HTTP status code errors. You'll notice that if we encounter a 4xx or 5xx HTTP status code error, those are turned into actual exceptions and thrown.
- Since all that happens in a wrapper around the handler, by time we get to the snippet you linked we've already executed the request, dealt with the appropriate logging/exception throwing, etc and we are merely chaining an additional execution to the promise, which does the connection pool rechecks if it's a non-4xx error.
I know the control flow here is a bit convoluted, mostly due to refactoring in async code at some point. I'm sure it could be cleaned up a bit.
What was happening in this ticket, and likely in your code, is what I described in #37 (comment). The status code of an Elasticsearch bulk request only relates to whether the bulk itself was successful, meaning acknowledged and processed by the server. All the bulk "action items" could have failed in the bulk (each with their own individual error code in the response body) and the API call itself will still have a successful HTTP code because the API call worked... just not the contents of the request.
You mentioned for example that you were indexing them one-by-one (e.g. not a bulk) so I suspect the bulk was slightly different from your single-indexing test, or your tests were not the same as the curl.
Having received many 4xx and 5xx errors myself while using the client, I can definitely say they get thrown :)
from laravel-scout-elastic.
Hi @polyfractal, you mentionend the wrong person, I guess. ;)
Thanks for taking the time to explain in such detail. In the meantime, I noticed the thing with the bulk requests. That's absolutely understandable.
But when indexing one-by-one via laravel/scout, e .g. $singleModel->searchable()
, it still makes a bulk request with a single item, because a single model is turned into a collection first. See https://github.com/laravel/scout/blob/6.0/src/Searchable.php#L135
And that's, why we never get notified about failed inserts/updates.
I second @grasmanek94 's proposal to throw an exception in the update-method or expose a method to get errors from the last bulk operation.
from laravel-scout-elastic.
Related Issues (20)
- lumen8 use paginate error, HOT 3
- How to perform advanced customisation? HOT 1
- Warning about type deprecation HOT 10
- All data doesn't get indexed and no error pops up when importing HOT 1
- index not found exception HOT 3
- When i fetch data from elastic its give error like this "no handler found for uri" HOT 1
- Laravel Scout 9. please? HOT 1
- Please add php 8 suport HOT 1
- Make it work with ES 7 HOT 3
- Make it work with ES 7 HOT 1
- ElasticClient namespace incompatible HOT 6
- Change namespace HOT 1
- _ide_helper.php should not be part of the packet HOT 1
- 小建议 HOT 1
- laravel9安装失败了 HOT 1
- '_index' => $model->searchableAs(), '_type' => get_class($model), HOT 1
- Class "ScoutEngines\Elasticsearch\ElasticsearchProvider" not found - Laravel 9 HOT 1
- ->forceDelete() on Model, don't delete on elasticsearch
- Please install the Elasticsearch PHP client: elasticsearch/elasticsearch but is installed HOT 1
- after 7.0 ,only 'index', delete 'Type'
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from laravel-scout-elastic.