Navstar crashes after installing marathon-lb (DC/OS 1.8.7)

Description

Hi,

We are using DC/OS 1.8.7 on CentOS 7.3.1611, with 3 masters and 1 slave.

Everything works fine, but after adding marathon-lb we start receiving crash reports from navstar-env (attached as journalctl-navstar-marathonlb.txt). The log was obtained with journalctl -flu dcos-navstar.service

The errors keep appearing at 30-second intervals for a variable length of time (30 seconds to 5 minutes), then stop. If we start or stop another service while marathon-lb is running, the crashes appear again for the new instance (log attached as journalctl-navstar-api.txt).

We have also seen this issue on a configuration with 1 master and 2 slaves.

If we stop marathon-lb and wait for the corresponding crashes to end, then we are able to start/stop other services without errors.

It seems that the DNS configuration update is delayed until the errors stop. Following the previous example: if we stop the service and run dig api.marathon.autoip.dcos.thisdcos.directory while the errors are still appearing, the "old" IP is still returned in the ANSWER SECTION.
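
For anyone trying to reproduce the stale answer, here is a minimal sketch of a polling script that records each time the resolved address set changes. It is not one of the attached files. The names resolve_ipv4 and watch_dns are hypothetical, and it uses the system resolver through Python's socket module instead of dig, so it assumes the host resolves thisdcos.directory names through the cluster's DNS.

```python
import socket
import time
from typing import Callable, List, Optional, Tuple


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the sorted IPv4 addresses the system resolver gives for hostname."""
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:
        # Covers socket.gaierror / socket.herror: the name did not resolve.
        return []
    return sorted(addresses)


def watch_dns(
    hostname: str,
    samples: int,
    interval_s: float = 1.0,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[float, List[str]]]:
    """Poll the resolver and return (timestamp, addresses) for every change.

    The first sample is always recorded, so the result shows the initial
    answer followed by each later transition.
    """
    changes: List[Tuple[float, List[str]]] = []
    previous: Optional[List[str]] = None
    for i in range(samples):
        current = resolver(hostname)
        if current != previous:
            changes.append((clock(), current))
            previous = current
        if i < samples - 1:
            sleep(interval_s)
    return changes


if __name__ == "__main__":
    # Hostname taken from the report above; sample count and interval are arbitrary.
    name = "api.marathon.autoip.dcos.thisdcos.directory"
    for stamp, addrs in watch_dns(name, samples=300, interval_s=1.0):
        print(time.strftime("%H:%M:%S", time.localtime(stamp)), addrs or "no answer")
```

Starting it just before stopping the service and comparing the printed timestamps with the crash timestamps in the journalctl output should show whether the record only changes once the navstar errors stop.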

I've also attached the marathon.json file for marathon-lb.

What could be the issue?
We're available to provide further information if needed.

Thanks in advance,
Marco

EDIT: I've also attached the relevant crash.log from /opt/mesosphere/packages/navstar--...../navstar/log/

Assignee

Nicholas Sun

Labels

None

Components

Configure