Comments (13)

dtebbs avatar dtebbs commented on May 22, 2024 3

Yes, I believe the solution is a try/except block that keeps retrying the call until it either succeeds or a different exception is raised. I have code that looks something like this on the receive side (in this case I know in advance that we want to read exactly size_bytes bytes):

while len(self.buffer) < size_bytes:
    try:
        self._read()    # <==== Calls socket.recv() and updates self.buffer
    except socket.error as ex:
        if str(ex) == "[Errno 35] Resource temporarily unavailable":
            time.sleep(0)   # yield to other threads, then retry the read
            continue
        raise   # re-raise any other error with its original traceback

There may be a better way to detect the exception type rather than a string comparison.

from urllib3.

dtebbs avatar dtebbs commented on May 22, 2024 3

@GP89, here is what I did. I'm sure there are all kinds of things wrong with this, but it did the job for me.

import sys

if "darwin" == sys.platform:
    # Monkey patch socket.sendall to handle EAGAIN (Errno 35) on mac.
    import socket
    import time
    def socket_socket_sendall(self, data):
        # Send in a loop, retrying after a short sleep whenever EAGAIN is hit.
        while len(data) > 0:
            try:
                bytes_sent = self.send(data)
                data = data[bytes_sent:]
            except socket.error as e:
                if str(e) == "[Errno 35] Resource temporarily unavailable":
                    time.sleep(0.1)
                else:
                    raise
    # socket._socketobject is the Python 2 socket wrapper class.
    socket._socketobject.sendall = socket_socket_sendall

GP89 avatar GP89 commented on May 22, 2024 2

@dtebbs Awesome thanks for getting back to me! I will give this a whirl and see what happens.

btw rather than checking the str representation of the error it'd be a little bit more robust to import errno and check if e.errno == errno.EAGAIN :)
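
A minimal sketch of that suggestion, for illustration only: the same retry loop as the monkey patch, but comparing e.errno against errno.EAGAIN (and errno.EWOULDBLOCK, which has the same value on many platforms) instead of matching the message string. The function name send_all_retrying and the delay parameter are assumptions made for this example and do not come from the thread.

```python
import errno
import socket
import time


def send_all_retrying(sock, data, delay=0.01):
    """Send all of `data`, retrying when send() reports EAGAIN/EWOULDBLOCK.

    `send_all_retrying` and `delay` are hypothetical names for this sketch.
    """
    retryable = (errno.EAGAIN, errno.EWOULDBLOCK)
    while len(data) > 0:
        try:
            bytes_sent = sock.send(data)
            data = data[bytes_sent:]
        except socket.error as e:
            if e.errno not in retryable:
                raise  # anything else is a real failure
            time.sleep(delay)  # send buffer is full: back off briefly, then retry
```

Comparing error numbers avoids depending on the exact message text, which differs between platforms (EAGAIN is 35 on macOS but 11 on Linux).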

shazow avatar shazow commented on May 22, 2024

Hi @dtebbs, thanks for the bug report! Any thoughts on what the solution would be? We can put a try/except block to catch EAGAIN but what's the correct course of action once we catch it? Should we simply retry?
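
One possible answer, sketched here under assumptions rather than taken from urllib3 or httplib: instead of retrying immediately, wait until the operating system reports the socket as writable, then retry the send. The helper name send_when_writable and its timeout parameter are hypothetical.

```python
import errno
import select
import socket


def send_when_writable(sock, data, timeout=5.0):
    """Send all of `data`; on EAGAIN, wait for writability before retrying.

    `send_when_writable` and `timeout` are hypothetical names for this sketch.
    """
    while data:
        try:
            sent = sock.send(data)
            data = data[sent:]
        except socket.error as e:
            if e.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
            # Block until there is room in the send buffer, or give up.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise socket.timeout(
                    "socket not writable after %s seconds" % timeout)
```

Waiting on select.select avoids a busy loop, and the timeout gives the caller a definite failure if the peer never drains the data.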

shazow avatar shazow commented on May 22, 2024

Fun times, so this would need to happen on the httplib.HTTPConnection layer...

dtebbs avatar dtebbs commented on May 22, 2024

Ah. Yes, I guess so. I'll try and get a full call stack (as I should have done in the first place).

dtebbs avatar dtebbs commented on May 22, 2024

Looks like you are right. httplib.py needs to handle EAGAIN.

16:19:23,074 WARNI [urllib3.connectionpool] Traceback (most recent call last):
  File "/Users/dtebbs/turbulenz/env/lib/python2.7/site-packages/urllib3-1.1-py2.7.egg/urllib3/connectionpool.py", line 335, in urlopen
    body=body, headers=headers)
  File "/Users/dtebbs/turbulenz/env/lib/python2.7/site-packages/urllib3-1.1-py2.7.egg/urllib3/connectionpool.py", line 217, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 946, in request
    self._send_request(method, url, body, headers)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 987, in _send_request
    self.endheaders(body)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 940, in endheaders
    self._send_output(message_body)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 807, in _send_output
    self.send(message_body)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 772, in send
    self.sock.sendall(data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 222, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 35] Resource temporarily unavailable

I'm going to try changing connectionpool.py so it keeps retrying when it sees EAGAIN from _make_request().

dtebbs avatar dtebbs commented on May 22, 2024

Changing connectionpool.py to keep retrying didn't help. It seems specific large uploads always block at the same place, so httplib seems to be where EAGAIN should be handled. I'm looking at alternatives, like setting SO_SNDBUF on the httplib sockets.
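
For reference, a rough sketch of what requesting a larger send buffer looks like at the socket level. How such a socket would be handed to httplib is not shown and is not something this thread establishes; the helper name make_socket_with_send_buffer is hypothetical.

```python
import socket


def make_socket_with_send_buffer(size_bytes):
    """Create a TCP socket and request a send buffer of `size_bytes`.

    Hypothetical helper for illustration. The operating system may round,
    cap, or reject the requested size.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    return sock
```

Reading the value back with sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF) shows what the OS actually granted. A bigger buffer only raises the point at which the buffer fills, so it would not remove the need to handle EAGAIN.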

I'll close the bug for now and post any solutions I come across. I think this is only really likely to hit people doing large uploads through httplib.

piotr-dobrogost avatar piotr-dobrogost commented on May 22, 2024

It appears this is a known issue in python, when the caller of send must handle EAGAIN errors on BSD platforms.

I guess it's http://bugs.python.org/issue9090 where Antoine Pitrou writes

Here is an updated patch wrapping all variants of recv() and send(), except sendall() which already has its own retry loop.

but Eric Hohenstein states

As far as I can tell, sendall() will still fail with these recoverable errors in Python 3.2

From reading issue 9090 I gather that sendall() tries to handle EAGAIN/EWOULDBLOCK in basically the same way in Python 2 and 3. However, it does not handle it perfectly, hence the errors we see.

shazow avatar shazow commented on May 22, 2024

In the semi-distant future we'd like to get rid of our dependency on httplib altogether so I'll definitely keep this use case in mind if/when we do that.

Is there a semi-reliable way to reproduce this? How big are your uploads?

dtebbs avatar dtebbs commented on May 22, 2024

Is there a semi-reliable way to reproduce this? How big are your uploads?

In fact, I've seen it happen on uploads that are only about 200k in size, although we are uploading many files at once, which may have an effect. Reproducing it requires a Mac (or probably any FreeBSD machine) and ideally a slow-ish connection, although I'm fairly sure it will eventually happen over any remote connection.

I think this issue is related:

http://bugs.python.org/issue8493

although it's happening in ftplib in that case. The conclusion of that thread seems to be that this is expected behavior when the socket has a timeout set and is therefore internally non-blocking. (The last comment half implies that the EAGAIN is actually a timeout, but I'm not convinced of that. It doesn't happen on other platforms over the same connection.)

We are calling 'connection_from_url' with timeout=None, so the socket should be blocking. And even if it weren't, it seems the caller of httplib has no way to handle EAGAIN except retrying the same request, but there is no reason to think that it will work the second time (in fact, my experiment trying to handle EAGAIN in connectionpool.py implies it generally won't).

This macports ticket mentions a similar problem, although it's very old:

http://trac.macports.org/ticket/18376

Monkeypatching socket.sendall to just handle EAGAIN worked for me. As did monkeypatching httplib to not call sendall. I guess I'll have to live with that for now and create a python bug.

GP89 avatar GP89 commented on May 22, 2024

Hey,

I'm running into this EAGAIN problem on mac. I can reproduce it if I set up a very simple server:

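import socket

serversocket = socket.socket(
    socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(("localhost", 80))

# Listen but never accept() or recv(), so nothing the client sends is ever read.
serversocket.listen(5)

# Keep the process alive until Enter is pressed (Python 2).
raw_input()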

If I make a request with a timeout and send enough data to fill my network buffer, it raises EAGAIN as soon as it tries to send while the buffer is full, rather than eventually timing out because it's unable to write any more to the network buffer.
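
The buffer-full condition can also be illustrated without a server. This is a sketch under assumptions, not the reproduction described above: it uses socket.socketpair() and puts the writer in explicit non-blocking mode as a stand-in for the timeout case, so the error shows up deterministically. The names fill_send_buffer and demo are made up for the example.

```python
import errno
import socket


def fill_send_buffer(sock, chunk_size=65536, limit=256 * 1024 * 1024):
    """Write to `sock` until send() fails; return (bytes_sent, errno).

    Returns (bytes_sent, None) if `limit` bytes go out without an error.
    """
    chunk = b"x" * chunk_size
    total = 0
    while total < limit:
        try:
            total += sock.send(chunk)
        except socket.error as e:
            return total, e.errno
    return total, None


def demo():
    # The reader end never calls recv(), like the server that never accepts.
    writer, reader = socket.socketpair()
    try:
        writer.setblocking(False)
        return fill_send_buffer(writer)
    finally:
        writer.close()
        reader.close()
```

Calling demo() returns the number of bytes accepted before the buffer filled, together with the error number, which should be errno.EAGAIN (or errno.EWOULDBLOCK where the two differ).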

@dtebbs How did you monkey patch socket.sendall? I'd be really interested to know so I can get timeouts to work.

dtebbs avatar dtebbs commented on May 22, 2024

@GP89 Yeah, checking the errno is much better. I was wondering how to do that. Thanks.
