Comments (13)
HTTP/2 is not really tested or supported (at least for now). We've been reporting similar issues to the Jetty team and they are steadily being fixed. An NPE usually means "use after free", which suggests that the request/response was already recycled but some parts of the code still try to access or write to it. This could happen if the Jetty client aborts the request, e.g. due to a timeout. There are a bunch of fixes for these scenarios that will be part of the next Jetty release, which we will update to shortly after it is out.
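To illustrate the "use after free" pattern described above, here is a minimal sketch. It is only an assumption about the general shape of the problem: `PooledResponse`, `recycle()` and `writeAfterAbort()` are hypothetical names, not Jetty or Airlift classes.

```java
public class RecycledExchangeDemo {
    // Hypothetical stand-in for a pooled request/response object; not a Jetty class.
    static final class PooledResponse {
        private StringBuilder buffer = new StringBuilder();

        void write(String chunk) {
            // Throws NullPointerException if recycle() has already run.
            buffer.append(chunk);
        }

        void recycle() {
            // Returning the object to the pool drops its internal state.
            buffer = null;
        }
    }

    // Describes what happens when a late writer touches a recycled response.
    static String writeAfterAbort() {
        PooledResponse response = new PooledResponse();
        response.write("first page");
        // The client aborts (for example on a timeout) and the response is recycled...
        response.recycle();
        try {
            // ...but another code path still holds a reference and writes to it.
            response.write("second page");
            return "write succeeded";
        } catch (NullPointerException e) {
            return "NPE after recycle";
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAfterAbort());
    }
}
```

In this sketch the second write fails because the object was recycled while a reference to it was still in use, which is the kind of race an abort on timeout could plausibly trigger.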
from trino.
We are not using FTE.
But, the issue you addressed in the PR could be the one we are struggling with.
I will try it out in our build and see how the cluster behaves.
Thanks!
from trino.
Quick update.
Good news! I built 446 with PR #21744 and I am not able to reproduce the issue mentioned above. The worker was restarted multiple times and so far it has had no trouble joining the cluster and taking on fresh queries. The coordinator is not treating that worker as a stepchild.
Testing is continuing, but so far everything looks good. Thanks a bunch @findepi. This is going to address a ton of issues we were facing in prod environment.
I will be upgrading the cluster to the latest (447) tonight. I am not anticipating any issues though.
from trino.
Thanks! Are you considering Jetty 12.0.9 or higher?
Jetty 12.0.9 is out and airlift 247-SNAPSHOT is still configured to use 12.0.8. I am using 12.0.8 and still seeing the above issue. Are we planning to upgrade to 12.0.9 or the next release?
from trino.
@sajjoseph next release
from trino.
@sajjoseph actually Jetty 12.0.9 is not yet out. It's tagged but artifacts are not yet published
from trino.
cc @losipiuk
from trino.
@wendigo - I tried the 12.0.9 snapshot version and it looks like the Jetty problems are fixed now. I don't see the above exceptions anymore.
But I still see that the restarted worker node is not doing any work: the task assigned to that worker stays in the "PLANNED" state. All other workers finished their work.
Eventually, the query fails with the following PAGE_TRANSPORT_TIMEOUT exception:
io.trino.operator.PageTransportTimeoutException: Encountered too many errors talking to a worker node. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes. (https://<worker_ip>:8443/v1/task/20240505_025223_07108_qk3nn.26.10.0/results/0/0 - 30 failures, failure duration 302.86s, total failed request time 312.86s)
    at io.trino.operator.HttpPageBufferClient$1.onFailure(HttpPageBufferClient.java:505)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1130)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException: Total timeout 10000 ms elapsed
    at io.airlift.http.client.ResponseHandlerUtils.propagate(ResponseHandlerUtils.java:25)
    at io.trino.operator.HttpPageBufferClient$PageResponseHandler.handleException(HttpPageBufferClient.java:665)
    at io.trino.operator.HttpPageBufferClient$PageResponseHandler.handleException(HttpPageBufferClient.java:652)
    at io.airlift.http.client.jetty.JettyResponseFuture.failed(JettyResponseFuture.java:152)
    at io.airlift.http.client.jetty.BufferingResponseListener.onComplete(BufferingResponseListener.java:84)
    at org.eclipse.jetty.client.transport.ResponseListeners.notifyComplete(ResponseListeners.java:350)
    at org.eclipse.jetty.client.transport.ResponseListeners.lambda$addCompleteListener$7(ResponseListeners.java:335)
    at org.eclipse.jetty.client.transport.ResponseListeners.notifyComplete(ResponseListeners.java:350)
    at org.eclipse.jetty.client.transport.ResponseListeners.notifyComplete(ResponseListeners.java:342)
    at org.eclipse.jetty.client.transport.HttpExchange.notifyFailureComplete(HttpExchange.java:307)
    at org.eclipse.jetty.client.transport.HttpExchange.abort(HttpExchange.java:278)
    at org.eclipse.jetty.client.transport.HttpConversation.abort(HttpConversation.java:162)
    at org.eclipse.jetty.client.transport.HttpRequest.abort(HttpRequest.java:795)
    at org.eclipse.jetty.client.transport.HttpDestination$RequestTimeouts.onExpired(HttpDestination.java:596)
    at org.eclipse.jetty.client.transport.HttpDestination$RequestTimeouts.onExpired(HttpDestination.java:579)
    at org.eclipse.jetty.io.CyclicTimeouts.onTimeoutExpired(CyclicTimeouts.java:113)
    at org.eclipse.jetty.io.CyclicTimeouts$Timeouts.onTimeoutExpired(CyclicTimeouts.java:206)
    at org.eclipse.jetty.io.CyclicTimeout$Wakeup.run(CyclicTimeout.java:294)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    ... 3 more
Caused by: java.util.concurrent.TimeoutException: Total timeout 10000 ms elapsed
    ... 11 more
@findepi / @losipiuk / @wendigo
If you can guide me to the code area where I can debug this further (to troubleshoot why the worker is stuck in the PLANNED state), I would really appreciate it. Is this a coordinator issue or a worker issue? Since the worker was restarted, it has no recollection of past failures. I suspect that the coordinator is not triggering something, and that this keeps the worker from starting the task.
Looking forward to your assistance.
Thanks!
from trino.
@sajjoseph are we talking about FTE or pipeline execution?
Can #21744 be related?
from trino.
@sajjoseph glad to hear positive feedback on the #21744 PR!
> Worker was restarted multiple times and so far it has no trouble in joining the cluster and taking on fresh queries

I am interested in this scenario. I was not able to reproduce this particular problem on current master (before #21744) and this problem wasn't the motivation for that PR.
Quite the contrary (#21744 (comment)): on current master, if I restart a worker, the coordinator can use it immediately (once it is fully initialized). After #21744, the coordinator can use it only after it's fully initialized and fully announced.
Now that master is updated to jetty 12.0.9 I will try to reproduce #21735 (comment) again with a different test query.
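
The difference in worker eligibility described above (before and after #21744) can be sketched roughly as follows. This is only an illustration of the stated behaviour, not the actual Trino code: `Worker`, `eligibleBefore` and `eligibleAfter` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkerEligibilityDemo {
    // Hypothetical view of a worker as seen by a coordinator; not a Trino type.
    static final class Worker {
        final String id;
        final boolean initialized;
        final boolean announced;

        Worker(String id, boolean initialized, boolean announced) {
            this.id = id;
            this.initialized = initialized;
            this.announced = announced;
        }
    }

    // Behaviour described for master before the PR: full initialization is enough.
    static List<String> eligibleBefore(List<Worker> workers) {
        List<String> ids = new ArrayList<>();
        for (Worker worker : workers) {
            if (worker.initialized) {
                ids.add(worker.id);
            }
        }
        return ids;
    }

    // Behaviour described after the PR: the worker must also be fully announced.
    static List<String> eligibleAfter(List<Worker> workers) {
        List<String> ids = new ArrayList<>();
        for (Worker worker : workers) {
            if (worker.initialized && worker.announced) {
                ids.add(worker.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Worker> workers = new ArrayList<>();
        workers.add(new Worker("w1", true, true));
        workers.add(new Worker("w2", true, false)); // restarted, not yet announced
        System.out.println(eligibleBefore(workers));
        System.out.println(eligibleAfter(workers));
    }
}
```

Under this sketch a freshly restarted worker that has initialized but not yet announced itself would be usable before the change and skipped after it.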
from trino.
> Now that master is updated to jetty 12.0.9 I will try to reproduce #21735 (comment) again with a different test query.

I wasn't able to repro with a query involving ORDER BY (from the issue description), with or without HTTP/2 for internal comms.
from trino.
I used a heavily modified version of a query replay tool to re-run production traffic in a separate cluster. I purposefully kept just one worker in that cluster so that I could make the worker node fail under load (I was able to do that consistently with version 444). After a restart, the worker node was getting into the state I described above.
After #21744, the worker is not failing even after seeing hundreds of concurrent queries (many queries were failing due to memory pressure as the worker can't handle the data volume, but the worker itself didn't go down this time!!! Kudos to Trino 447, JDK 22, Jetty 12.0.9, Airlift 248, HTTP/2 etc). I am testing it with even more queries to see if I can make it fail.
Even though the worker didn't fail, I manually restarted it multiple times (in the middle of validations) and it just joined the cluster and started processing fresh queries that came in. This is why I feel that #21744 is helping with the above issue.
I will keep you posted.
from trino.
Potentially related #18329.
from trino.