mesos-dns service unstable for external hosts

Description

1. I configured the `resolvers` parameter when I first installed DC/OS, like this:
"resolvers": ["10.10.137.243", "8.8.8.8", "8.8.4.4"],

2. Slave nodes can get an external host's A record from mesos-dns, like this:

3. I get weird ping results on slave nodes, like this:

4. Sometimes the slave nodes fail to get the external host's A record from mesos-dns:

Activity

Cody Maloney
September 7, 2016, 9:25 PM

In DC/OS we run a DNS proxy on every agent to make Mesos-DNS highly available within the cluster, since customers typically didn't have hardware load balancers. The localhost Spartan DNS load balancer listens on a special internal interface on those `198.51.100.*` addresses.

As for why delegation from Mesos-DNS to the upstreams isn't working right, can you SSH to a master, run `journalctl -u dcos-mesos-dns`, and share the output along with what's happening around those forwarded requests?

Seamon Zhao
September 8, 2016, 6:50 AM

There is no error info...

Sargun Dhillon
September 8, 2016, 7:41 AM

If you do `dig $INTERNALHOSTNAME`, what's the result you get? If you repeat this command several times, do you get different results? If you reconfigure your DNS to only have ["10.10.137.243"], does it work?
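
The repeated-lookup check suggested above can be scripted. This is a minimal sketch, not part of DC/OS: `resolve_a` and `tally_lookups` are hypothetical helpers that resolve a name several times through the system resolver (the same path `ping` uses, which may also consult `/etc/hosts` or a local cache) and count each distinct outcome.

```python
import socket
from collections import Counter


def resolve_a(hostname):
    """Return the sorted tuple of IPv4 addresses the system resolver gives."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return tuple(sorted({info[4][0] for info in infos}))


def tally_lookups(hostname, attempts=10, resolver=resolve_a):
    """Resolve `hostname` repeatedly and count each distinct outcome.

    `resolver` is injectable so the tally logic can be exercised offline.
    """
    outcomes = Counter()
    for _ in range(attempts):
        try:
            outcomes[resolver(hostname)] += 1
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            outcomes[("error", type(exc).__name__)] += 1
    return outcomes
```

Calling `tally_lookups("host.example.internal", attempts=20)` (the hostname is a placeholder) returns a counter; more than one key, or a mix of addresses and errors, would suggest that successive lookups are being answered inconsistently.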

Seamon Zhao
September 8, 2016, 2:58 PM


I tried nslookup and dig, and both of them show different results.
From this picture (mesos-dns-tcpdump) we can see that the slave asks all of the `resolvers` for the host's A record.

I realized that if I want to use the external DNS (outside of the DC/OS environment), I have to set 10.10.137.243 as the only entry in `resolvers` in config.json.

Following your suggestion, I have to re-install the whole DC/OS...
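
One way to see whether the configured resolvers disagree about a name is to ask each of them directly. The following is a rough sketch using only the Python standard library; `build_query`, `parse_header`, and `ask` are hypothetical helpers, and the code only reads the response code and answer count from the reply header rather than decoding the records.

```python
import secrets
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(hostname, query_id):
    """Build a minimal DNS query for the A record of `hostname` (RFC 1035)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.rstrip(".").split(".")
    )
    # Question: encoded name, root label, QTYPE=A (1), QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def parse_header(response):
    """Return (query_id, rcode_name, answer_count) from a DNS response."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    return query_id, RCODES.get(rcode, str(rcode)), ancount


def ask(resolver_ip, hostname, timeout=2.0):
    """Send one A query to `resolver_ip` over UDP and summarize the reply."""
    query_id = secrets.randbelow(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname, query_id), (resolver_ip, 53))
        try:
            response, _ = sock.recvfrom(512)
        except socket.timeout:
            return "TIMEOUT", 0
    _id, rcode, ancount = parse_header(response)
    return rcode, ancount
```

For example, `for ip in ["10.10.137.243", "8.8.8.8", "8.8.4.4"]: print(ip, ask(ip, "host.example.internal"))` queries the three resolvers from the description (the hostname is a placeholder). If the internal resolver answers with records while the public ones return `NXDOMAIN`, that would be consistent with the intermittent failures reported here.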

https://dcosjira.atlassian.net/browse/DCOS-367

Assignee

Sargun Dhillon

Labels

None
