Giter VIP home page Giter VIP logo

Comments (6)

pimeys avatar pimeys commented on June 6, 2024

It would be nice to think some other ways.

How I see it, the connection error handling should happen in hyper's connection pool. And for now, the connections have been super stable, occasionally getting a TimeoutError or ConnectionError. We've been testing and monitoring to see if any problems happen, and how we've solved this was just monitoring the results and letting RAII to drop the problematic connection if errors are frequent enough, then creating a new one.

Again, I need to rethink the README now when we have more ideas how the library works in real life.

from a2.

pimeys avatar pimeys commented on June 6, 2024

So to answer all of your questions in one place, many of these gotchas were because we were not really sure how this all will work out with the experimental hyper and we did big precautions to not break things to much, and I was taking a big risk to allow testing the new library on production.

It all worked out quite well. I still don't really know what are those ConnectionErrors, and TimeoutErrors are quite random. Apple says if you get zero bytes back as a response, you should form a new connection, retry the notification and all notifications sent after the one having a zero-byte response. These might be deployments or just plain errors on Apple's side.

Here we don't go so low level. How I understand it, when we start getting ConnectionErrors for notifications, we should reopen a new connection. Right now the easiest choice was to just drop the notifier thread and create a new one, having a connection reset might work and might be a cleaner solution.

Just noting, that this kind of errors were more common when we first went to production, but after certain fixes for hyper and h2, I don't see them so much anymore.

Summarizing this that in the end, you should monitor the responses and you should have some kind of retry strategy for certain errors. If you re-use the same apns-id, you don't need to worry about duplicates anyways.

from a2.

pimeys avatar pimeys commented on June 6, 2024

Oh, one more thing:

These errors are a problem when you have a decent amount of traffic through your connection. We have a keep-alive timeout of 10 minutes in Hyper, so a stale connection with no traffic will get dropped and a new one opened for the next request.

from a2.

dbrgn avatar dbrgn commented on June 6, 2024

Yes, it seems that stale connections will be automatically re-established:

DEBUG 2018-04-05T12:56:35Z: hyper::client: client connection error: protocol error: not a result of an error
DEBUG 2018-04-05T12:56:35Z: hyper::client::dns: resolving host="api.development.push.apple.com", port=443
DEBUG 2018-04-05T12:56:35Z: hyper::client::connect: connecting to 17.188.138.73:443
TRACE 2018-04-05T12:56:35Z: apns2::alpn: AlpnConnector::call got TCP, trying TLS
DEBUG 2018-04-05T12:56:36Z: hyper::client::pool: pooling idle connection for ("https://api.development.push.apple.com", Http2)

However, when I kill the established connection with tcpkill, the first push fails:

DEBUG 2018-04-05T13:01:51Z: hyper::client::pool: reuse idle connection for "https://api.development.push.apple.com"
DEBUG 2018-04-05T13:01:51Z: hyper::client: client connection error: connection reset
TRACE 2018-04-05T13:01:51Z: apns2::client: Request error: an operation was canceled internally before starting
 WARN 2018-04-05T13:01:51Z: push_relay::server: Error: Push message could not be processed: Push was unsuccessful: Error connecting to APNs
DEBUG 2018-04-05T13:01:51Z: hyper::proto::h1::io: flushed 155 bytes

However, the next push works again:

DEBUG 2018-04-05T13:04:01Z: hyper::client::dns: resolving host="api.development.push.apple.com", port=443
DEBUG 2018-04-05T13:04:01Z: hyper::client::connect: connecting to 17.188.138.73:443
TRACE 2018-04-05T13:04:02Z: apns2::alpn: AlpnConnector::call got TCP, trying TLS
DEBUG 2018-04-05T13:04:02Z: hyper::client::pool: pooling idle connection for ("https://api.development.push.apple.com", Http2)

So the only case when the user needs to reestablish a connection is when the connection is still up but the apns2 client instance returns a ConnectionError or TimeoutError, right?

from a2.

pimeys avatar pimeys commented on June 6, 2024

I'd say so, yes. You get those time to time, and I hold a strategy now to track about 5 timeout errors per minute to kick a thread out from the supervisor and create a new one. For connection errors it goes out from the first error. Those are quite rare now, so I haven't had an urge to change those.

In the end, monitoring connections on this level is something that shouldn't be done, and I'd let hyper to deal with bigger problems. You don't really want to write all that boilerplate to your consumer/service/whatever.

from a2.

dbrgn avatar dbrgn commented on June 6, 2024

Ok, thanks for clarification :)

from a2.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.