Comments (24)
Do you only see this for public domain names or also when container names are resolved?
First thing is to check in the journal for any logged errors by aardvark-dns.
from aardvark-dns.
This is only for public domain names, I don't use any container names in my tests.
I managed to set up journald logging.
As far as errors go, there were two:
aardvark-dns[129509]: [25183] fail response: ProtoError { kind: Msg("mpsc::SendError send failed because receiver is gone") }
aardvark-dns[215345]: 14284 dns request got empty response
Apart from that, there are a lot of `Received SIGHUP will refresh servers: 1` messages. That is probably because I run 40 containers that start and stop quickly (my tests last between 20 s and 2 min).
Sometimes I also see:
aardvark-dns[129509]: No configuration found stopping the sever
systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Example of logs:
Sep 29 12:11:26 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:27 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:27 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:28 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:28 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:28 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:29 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:30 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:32 host.example.tld aardvark-dns[72066]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:32 host.example.tld aardvark-dns[72066]: No configuration found stopping the sever
Sep 29 12:11:33 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:34 host.example.tld aardvark-dns[84196]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:34 host.example.tld aardvark-dns[84196]: No configuration found stopping the sever
Sep 29 12:11:39 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:40 host.example.tld aardvark-dns[84336]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:41 host.example.tld aardvark-dns[84336]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:41 host.example.tld aardvark-dns[84336]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:41 host.example.tld aardvark-dns[84336]: No configuration found stopping the sever
Sep 29 12:11:41 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:41 host.example.tld aardvark-dns[84649]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:42 host.example.tld aardvark-dns[84649]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:42 host.example.tld aardvark-dns[84649]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:43 host.example.tld aardvark-dns[84649]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:43 host.example.tld aardvark-dns[84649]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:43 host.example.tld aardvark-dns[84649]: No configuration found stopping the sever
Sep 29 12:11:44 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:45 host.example.tld aardvark-dns[85160]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:45 host.example.tld aardvark-dns[85160]: No configuration found stopping the sever
Sep 29 12:11:46 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:48 host.example.tld aardvark-dns[85294]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:48 host.example.tld aardvark-dns[85294]: No configuration found stopping the sever
Sep 29 12:11:49 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:49 host.example.tld aardvark-dns[85439]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:50 host.example.tld aardvark-dns[85439]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:50 host.example.tld aardvark-dns[85439]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:51 host.example.tld aardvark-dns[85439]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:51 host.example.tld aardvark-dns[85439]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:51 host.example.tld aardvark-dns[85439]: No configuration found stopping the sever
Sep 29 12:11:51 host.example.tld systemd[1084]: Started /usr/libexec/podman/aardvark-dns --config /run/user/988/containers/networks/aardvark-dns -p 53 run.
Sep 29 12:11:51 host.example.tld aardvark-dns[85951]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:52 host.example.tld aardvark-dns[85951]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:52 host.example.tld aardvark-dns[85951]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:52 host.example.tld aardvark-dns[85951]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:53 host.example.tld aardvark-dns[85951]: Received SIGHUP will refresh servers: 1
Sep 29 12:11:53 host.example.tld aardvark-dns[85951]: Received SIGHUP will refresh servers: 1
from aardvark-dns.
> No configuration found stopping the sever
This is normal assuming all containers are stopped at that moment. The next container start would respawn the process.
> As far as errors goes, there were 2:
>
> * `aardvark-dns[129509]: [25183] fail response: ProtoError { kind: Msg("mpsc::SendError send failed because receiver is gone") }`
> * `aardvark-dns[215345]: 14284 dns request got empty response`

These definitely look relevant. @flouthoc, any idea?
from aardvark-dns.
The problem is that out of 150 containers, between 10 and 20 fail due to DNS errors. Could the SIGHUPs be the reason, if a request arrives while the server is reloading?
from aardvark-dns.
That is a possibility. Are these containers all on the same network, or are there multiple networks in use?
from aardvark-dns.
Every container is in its own network (I'm using the `FF_NETWORK_PER_BUILD` flag with gitlab-runner).
from aardvark-dns.
Would it be possible to run `aardvark-dns` in debug mode and share the logs?
from aardvark-dns.
I think I could, yes. How do I run aardvark in debug mode?
from aardvark-dns.
@Luap99 Does `--log-level` get propagated to `aardvark-dns` and `netavark`?
from aardvark-dns.
Yes, `podman --log-level debug ...` gets passed down to netavark and aardvark. Keep in mind, though, that this has to happen on the command that starts the aardvark-dns server, so it must be set on the first podman command that starts a container with DNS.
from aardvark-dns.
Anyway, the code looks pretty clear to me: we tear down all sockets on each SIGHUP and then set them up again. That definitely looks wrong and is likely responsible for the packet loss. We must keep the sockets active and only add or remove the ones affected by the changed configs. Looking at it, this whole section would need to be rewritten to handle this in a much better way.
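The idea of keeping existing sockets open and only adding or removing the ones whose configuration changed amounts to a set difference between the active and the desired listeners. Below is a minimal sketch in Python; it is not the aardvark-dns code (which is written in Rust), and the function name and the `(ip, port)` tuples are hypothetical, chosen only for illustration:

```python
def plan_listener_changes(active, desired):
    """Decide which listeners to open, close, or leave untouched on reload.

    `active` and `desired` are iterables of hypothetical (ip, port) tuples.
    """
    active, desired = set(active), set(desired)
    to_open = sorted(desired - active)   # new in the config: bind these
    to_close = sorted(active - desired)  # gone from the config: close these
    to_keep = sorted(active & desired)   # unchanged: left alone, so they keep serving
    return to_open, to_close, to_keep


# Example addresses are made up for illustration.
active = [("10.89.0.1", 53), ("10.89.1.1", 53)]
desired = [("10.89.1.1", 53), ("10.89.2.1", 53)]
print(plan_listener_changes(active, desired))
```

With this approach a reload never closes a socket that is still wanted, so there is no window in which queries to an unchanged listener can be lost.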
from aardvark-dns.
I have to figure out a way to do that with gitlab-runner, if it's even possible. I wasn't able to replicate this issue without gitlab-runner jobs.
from aardvark-dns.
I would assume that just spamming the aardvark-dns process with SIGHUP signals should work as a reproducer and cause some packet loss, since the sockets are closed and reopened each time. So it is just a question of hitting that window.
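A rough sketch of such a reproducer in Python, assuming a Unix host and that the PID of the running aardvark-dns process is already known. The function name, the default count and interval, and the injectable `send` parameter are illustrative assumptions, not part of any existing tool:

```python
import os
import signal
import time


def spam_sighup(pid, count=100, interval=0.01, send=os.kill):
    """Send `count` SIGHUP signals to `pid`, pausing `interval` seconds between them.

    `send` is injectable so the loop can be exercised without a real process.
    Returns the number of signals sent.
    """
    sent = 0
    for _ in range(count):
        send(pid, signal.SIGHUP)
        sent += 1
        time.sleep(interval)
    return sent


# Usage (hypothetical PID; look up the real one first, e.g. with `pgrep aardvark-dns`):
# spam_sighup(12345, count=500, interval=0.005)
```

Running DNS lookups from a container on the affected network while this loop is active should show whether queries are dropped during reloads.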
from aardvark-dns.
Hi,
I am experiencing the same symptoms described by @matejzero under similar conditions: I use one GitLab Runner with Docker executor (which is configured to use a rootless and unprivileged Podman socket). I experience many DNS resolution failures where an estimated half of all CI jobs fail due to this issue.
Troubleshooting steps:
- First I tried to mitigate this by running a local caching DNS on the host, which did not improve the issue.
- By running a special GitLab CI job that performs hundreds of unique DNS lookups I could confirm that whenever there is a DNS lookup failure, the caching DNS on the host did not even receive the DNS query.
- Then I assumed that packet loss due to high network load between the host and the containers could be at fault, but even when I limited the incoming network bandwidth of the host, the issue did not improve.
This pretty much leaves only the container network stack as the potential cause.
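The lookup job from the second troubleshooting step could look roughly like the following Python sketch. It is not the actual CI job; the function name and the injectable `resolve` parameter are assumptions made for illustration, and the caller is expected to supply a list of distinct, real hostnames so that caches are not hit:

```python
import socket


def count_lookup_failures(names, resolve=socket.getaddrinfo):
    """Resolve each name once and return the list of names that failed."""
    failed = []
    for name in names:
        try:
            resolve(name, None)
        except socket.gaierror:
            # Resolution failed (e.g. a temporary failure or a timeout).
            failed.append(name)
    return failed


# Usage (placeholder names; replace with hundreds of distinct real hostnames):
# failed = count_lookup_failures(["example.com", "example.org"])
# print(f"{len(failed)} lookups failed: {failed}")
```

Comparing the failed names against the query log of the caching DNS on the host shows whether the failed queries ever left the container network.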
When GitLab Runner jobs are started and stopped, I can see bursts of the following log lines in journald:
Feb 19 08:29:18 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:19 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:20 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:21 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:21 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:21 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:21 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:22 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:22 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:25 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
Feb 19 08:29:25 myhostname aardvark-dns[11348]: Received SIGHUP will refresh servers: 1
I can reproduce these messages using a custom test job. However, I found that the presence of one of these messages is not sufficient to cause a DNS resolution failure.
from aardvark-dns.
I was hoping to have some more time to debug this, but unfortunately I had to switch back to Docker for performance reasons, so there is no new info from my side :/
from aardvark-dns.
#389 (comment) is still valid and must be fixed, and seeing all the `Received SIGHUP will refresh servers` messages makes me confident that this is the problem you are seeing.
from aardvark-dns.