Giter VIP home page Giter VIP logo

Comments (16)

csernazs avatar csernazs commented on June 27, 2024 1

With a quick search I found this bug from 2015: pallets/werkzeug#33
Interestingly, it is the opposite as our case. :)

I think werkzeug should require an IP address, and not a hostname for bind (if it makes decisions on the hostname string, then it is an issue) or call getaddrinfo() but that's a more complicated story.

One idea which came up since my last comment is the following:

  1. server is bound to "localhost", so TLS will be intact, no default hostname will be changed
  2. first call to url_for obtains the hostname specified for the server. If it is "localhost", it tries to connect it with a simple connect() call
    1. if it is successful, then everything can go, it creates urls with "localhost" text. It assumes that name resolution will happen in the client software in the same way as the standard socket API's gethostbyname() (if not, we can't do anything)
    2. if it cannot connect to localhost, it can try with IP addresses "127.0.0.1" or "::1". If any of them successful, it will use that for URL. One "minor" issue is that the client being tested needs to support IPV6 if it returns "::1" which is not always the case. Port numbers should be specified differently in the case of IPv6 for example. In this case, it will break TLS but TLS would not work with "localhost" due to the previous step failed.

Alternatively, there would be just a check implemented in the url_for function and report the issue in a human friendly way, and do not do any heuristics. But I'm not sure if this should be responsibility of url_for.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024 1

I think I've found the solution 💡

In pytest-httpserver, I'll do a gethostbyname() for the provided hostname and pass the resulted IP address to werkzeug.

So, if gethostbyname() call...

  • ...fails: it would fail with the implicit gethostbyname() call at bind() also ➡️ it won't break anything which is working now
  • ...results IPv4 address (eg. 127.0.0.1) ➡️ werkzeug will bind to IPv4 (AF_INET)
  • ...results IPv6 address (eg. ::1) ➡️ werkzeug will bind to IPv6 (AF_INET6) as it will find the colon (:) in the host parameter

As the http client will also resolve the name with the same mechanism, it will use the same IP address the server bound so the connection will be successful.

If there's no objection about it, I would proceed with this.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024 1

I think I have a better understanding of this issue.

In your test, you are testing with a bogus TLS setup. In the test a ServerFingerprintMismatch exception is expected, which is a ServerConnectionError, defined in aiohttps's client_exceptions.py. This is important. :)

There's a retry logic in aiohttp which iterates on the resolved addresses (for localhost it contains 127.0.0.1 and ::1). If it cannot connect and catches ServerConnectionError, it tries with the next one (https://github.com/aio-libs/aiohttp/blob/d55728da7486e0295a11bf2378a2eceb8fe39c64/aiohttp/connector.py#L992). When all hosts have been tried and none of them where successful, the last exception is raised.

I think that in the second request of your test case aiohttp fails with 127.0.0.1, it raises ServerFingerprintMismatch, which is a ServerConnectionError, so it moves on to ::1 address. This also fails but not with ServerFingerprintMismatch, so it makes the test case failed. This is also reflected in the traceback where the last_exc is re-raised, which is the exception received for the last connection attempt.

By fixing the address to 127.0.0.1 as you did, the list of IP addresses is only one so the last_exc exception is the one the test is expecting.

I think that it is an edge case and it does not means that pytest-httpserver does not work on dual-stack systems. It means that if you want to test for a TLS error (or some other error which raises an exception derived from ServerConnectionError) that fails with a completely different error due to the re-trying implemented in aiohttp.

What can be done in pytest-httpserver:

  • listening on both ipv4 and ipv6 by setting IPV6_V6ONLY socket option to 0 (there's a python API for it in newer python interpreters but I think I can backport it). This doesn't work on my linux for some reason I couldn't find out.
  • listening on both ipv4 and ipv6 by two different servers: it adds complexity as handlers needs to be registered/cleaned up twice, server stop time doubles, two threads needs to be started instead of one
  • changing the default host to 127.0.0.1, it may cause issues with TLS as I described previously
  • changing the default host to 127.0.0.1 only when TLS is not enabled: this would not solve your issue
  • adding a documentation entry on this issue: this can be done easily

And probably more. I'll think about it.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024 1

Could you please look at the text I added to howto.rst? If there's some error or I missed something, I'm happy to fix it.

from pytest-httpserver.

WhyNotHugo avatar WhyNotHugo commented on June 27, 2024 1

I think that's clear enough.

It's a very curious caveat though, and I think pretty much all systems nowadays are dualstack (even if they have public IPv6 connectivity). But given that I can't think of a way to completely get rid of it, that sounds good to me.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024

Hi!

I've received your issue and will provide a fix very soon. I'm sorry for the delay, was on holiday for a couple of days and didn't have time to work on this.
Probably I can come up with a fix on this today or tomorrow latest.

from pytest-httpserver.

WhyNotHugo avatar WhyNotHugo commented on June 27, 2024

No worries, we all have lives.

I think a simple fix might be to just change the default host to 127.0.0.1. Listening on IPv6 would be cleanest though.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024

I wanted to understand this issue so I looked at werkzeug's code. It has a naive algorithm to detect the address family:
https://github.com/pallets/werkzeug/blob/main/src/werkzeug/serving.py#L606

So if the host specified does not contain : (colon) then it is assumed to be IPv4 only (with the special case for unix domain sockets), no address family lookup is done there by getaddrinfo...

As you suggested there are two possible fixes:

  1. I could specify the IPv6 loopback address as the default but that would break systems running with IPv6 disabled and I don't want to break any code. Adding auto-detection is a seems to be very hard to do properly on both Linux and Windows.
  2. Specifying 127.0.0.1 seems to be ok...
  3. Looking up the bound address in the server and returning that IP address in url_for, seems to be ok...

But the real issue with changing the default hostname is the following: if someone uses TLS the certificate may contain the hostname and if it is specified as "localhost", it won't work with "127.0.0.1" as the certificate contains the hostname. If it was made properly and contains 127.0.0.1 and localhost it will work but if only "localhost" is there, it will fail... For example test_ssl.py fails in pytest-httpserver's test suite.
The main issue here is that the same hostname is used in url_for which is used for the server to bind to...

Currently I have no good idea to fix this issue, as changing the default host may break TLS. I could release 2.0.0 but 1.0.0 has been released recently and releasing potentially breaking releases is not a good sign. I'm also thinking about writing a howto chapter about it or updating the documentation but that's also a half solution. Let me think on this for a while - or if you have ideas, please share it.

Last but not least, I couldn't reproduce it in github actions, even if I added test test running on windows.

For the record, I've just realized that this issue have been reported in PR #52 by @tenuki, but I wasn't able to understand it that time.

from pytest-httpserver.

WhyNotHugo avatar WhyNotHugo commented on June 27, 2024

from pytest-httpserver.

WhyNotHugo avatar WhyNotHugo commented on June 27, 2024

Huh, that's actually a really smart and simple approach. I like it.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024

There's one little issue with my "solution": gethostbyname() is for IPv4 only. I forgot that...
There's getaddrinfo() which does name resolution for both IPv4 and IPv6 but it returns both addresses, and I cannot decide which one the client will connect to, eg. which one the server should bind. According to its documentation a client wanting to connect to a remote host should try the addresses one by one, it seems that in this issue this is not happening as it would find the server on 127.0.0.1.

Could you share a bit more information about the issue? What is the environment and which HTTP client are you using? Having a reproduction on my side would be very helpful.

from pytest-httpserver.

WhyNotHugo avatar WhyNotHugo commented on June 27, 2024

If you run this unit test with httpserver.host instead of 127.0.0.1 it fails:

https://github.com/pimutils/vdirsyncer/blob/1a1f6f078882a8b7d1bd768a4a6544da8eec6cb1/tests/system/utils/test_main.py#L51-L69

I'm using aiohttp for the request. I think just using aiohttp with nothing special should reproduce the failure. My setup is dualstack.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024

Thanks, I have a reproduction! Looking at it.
Interestingly a very minimal setup worked for me (create server with socket api for AF_INET, and use aiohttp with "localhost"), but your case triggers the issue.

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024

Interestingly, only the second request made in line 69 (https://github.com/pimutils/vdirsyncer/blob/1a1f6f078882a8b7d1bd768a4a6544da8eec6cb1/tests/system/utils/test_main.py#L69 ) fails with the connection error, the first passes.
If I remove that 2nd request by commenting that line and the pytest.raises line from the code, the test passes and the first request completes.

In aiohttp's resolver, it resolves both addresses (ipv4 and ipv6) for localhost:
https://github.com/aio-libs/aiohttp/blob/d55728da7486e0295a11bf2378a2eceb8fe39c64/aiohttp/resolver.py#L27
And my guess is that it tries with the other address when there's some error, and as the second request tests an invalid http request (with an invalid TLS, if I'm not mistaken) then it passes on the ipv6 address.

But I'm still looking.

from pytest-httpserver.

WhyNotHugo avatar WhyNotHugo commented on June 27, 2024

Ah, thanks for the research here, that was quite a tricky one.

I think listening in dualstack is always a good choice if it doesn't introduce complexity or other issues. But it sounds like in my use case, it's best to specify 127.0.0.1 (and probably leave a comment to this explanation!).

from pytest-httpserver.

csernazs avatar csernazs commented on June 27, 2024

Thanks!

I want to note that pytest-httpserver does work on dual stack system (mine is dual stack, I presume that the github CI is also dual-stack), as long as the client implements retrying logic which falls back to 127.0.0.1 (requests does, aiohttp does).
This issue appears when you want to test for errors as in such case it is not possible to distinguish between the retry caused errors and the "error I want to test", which is in your code is a TLS error - this is what I documented.

As discussed above, changing the default host to 127.0.0.1 is likely to cause errors in clients which use TLS connections whose certificate is issued for the "localhost" and does not issued for "127.0.0.1".
I think I tried various ways to solve this issue, but unfortunately none of them seems to be feasible.

from pytest-httpserver.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.