Turning off `docker_bridge` on master-agent module breaks upgrade from 1.8 to 1.9


This is my hypothesis for the root cause. For the overlays, we store each agent's overlay allocation in the Mesos overlay replicated log. In 1.8.7, `docker_bridge` is enabled for the agent module, so the overlay master allocates a docker subnet to the agent module running on the Master, and that agent module configures `docker_bridge` on the Master. In 1.9 we disable `docker_bridge` and don't expect the agent module on the Master to configure it. However, because the Master retrieves the allocated overlay configuration from the replicated log, it sees `docker_bridge` enabled there and erroneously asks the agent to configure `docker_bridge`.
The fix should be for the Master to update the overlay configuration it retrieved from the replicated log for the agent, based on the agent registration it receives, and only then send the overlay configuration command to the agent. This would allow the upgrade from 1.8.7 to 1.9 and keep the overlay configuration sane.
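
To make the proposed fix concrete, here is a minimal sketch in Python of the reconciliation step. It is not the overlay module code: `OverlayAllocation`, `AgentRegistration`, `reconcile`, and the subnet values are hypothetical names and data introduced only for illustration. It assumes the agent's registration says whether `docker_bridge` is enabled.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class OverlayAllocation:
    """Hypothetical stand-in for the per-agent state kept in the replicated log."""
    overlay_subnet: str
    docker_bridge_subnet: Optional[str]  # None means no docker bridge is allocated


@dataclass(frozen=True)
class AgentRegistration:
    """Hypothetical stand-in for what the agent module reports when it registers."""
    docker_bridge_enabled: bool


def reconcile(stored: OverlayAllocation,
              registration: AgentRegistration) -> OverlayAllocation:
    """Return the configuration the Master should send to the agent.

    The stored allocation is adjusted to match the registration, so a
    bridge recorded by an older version is not pushed to an agent that
    has since disabled it.
    """
    if not registration.docker_bridge_enabled and stored.docker_bridge_subnet is not None:
        # Drop the stale bridge allocation instead of replaying it.
        return replace(stored, docker_bridge_subnet=None)
    # Allocating a new bridge subnet for an agent that newly enables
    # docker_bridge is outside the scope of this sketch.
    return stored


# State as a 1.8.7 master might have recorded it (illustrative subnets).
stored = OverlayAllocation(overlay_subnet="9.0.0.0/24",
                           docker_bridge_subnet="9.0.1.0/25")

# After the upgrade, the agent module on the Master registers with the bridge disabled.
to_send = reconcile(stored, AgentRegistration(docker_bridge_enabled=False))
```

In this sketch the reconciled value is what gets both written back to the log and sent to the agent, so the stored state and the agent's actual configuration cannot drift apart again.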


Avinash Sridharan
December 12, 2016, 6:47 PM
Avinash Sridharan
December 16, 2016, 7:24 PM

Test case is fixed. Waiting for a bump of `dcos-mesos-modules` to pull in the changes for the overlay modules.


Avinash Sridharan