In a cluster spun up from the provided AWS CloudFormation templates, running Docker 1.7.1.
When pulling from an internal Docker registry (2.4) on Marathon through a VIP, the pull fails with image not found, even when the image was just pushed. Logs show that this is caused by a connection reset when using the VIP, which causes Docker to fall back to the v1 protocol, which then 404s.
If the image is pulled via a direct connection or through marathon-lb, the pull works fine. Also interestingly enough, if TLS is disabled on the registry, it also works.
I can supply a tcpdump if requested.
There is a pull request to fix this:
You can manually run whichever of the commands suits your system on all your agents to work around it for now:
/sbin/sysctl -w net.netfilter.ip_conntrack_tcp_be_liberal=1
/sbin/sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1
/sbin/sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_be_liberal=1
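The three commands above differ only in which name the running kernel exposes for the same conntrack knob. The script below is an added example, not code from the thread or from DC/OS: it probes which of the three keys exists, enables it, and verifies the write took effect, so it can be dropped onto every agent without guessing the right variant. The SYSCTL_ROOT override and the file names are assumptions for this example; on a real host it reads and writes the same /proc/sys files that sysctl -w does.

```shell
#!/bin/sh
# Added example (not from the original thread): enable conntrack's
# tcp_be_liberal under whichever key this kernel exposes, then verify it.
# tcp_be_liberal=1 stops conntrack from marking out-of-window TCP segments
# INVALID, which is what was resetting the VIP connection to the registry.
# SYSCTL_ROOT is overridable so the logic can be exercised against a
# constructed directory tree without root (assumption for this example).
SYSCTL_ROOT="${SYSCTL_ROOT:-/proc/sys}"

# The three key names from the thread, one per line.
CANDIDATE_KEYS="net.netfilter.ip_conntrack_tcp_be_liberal
net.netfilter.nf_conntrack_tcp_be_liberal
net.ipv4.netfilter.ip_conntrack_tcp_be_liberal"

# key_to_path ROOT KEY: translate a dotted sysctl key into its file path.
key_to_path() {
    printf '%s/%s\n' "$1" "$(printf '%s' "$2" | tr . /)"
}

# pick_conntrack_key ROOT: print the first candidate key present under ROOT.
# Returns 1 if none exists (e.g. the conntrack module is not loaded).
pick_conntrack_key() {
    root="$1"
    for key in $CANDIDATE_KEYS; do
        if [ -f "$(key_to_path "$root" "$key")" ]; then
            printf '%s\n' "$key"
            return 0
        fi
    done
    return 1
}

# read_key ROOT KEY: print the current value of KEY.
read_key() {
    cat "$(key_to_path "$1" "$2")"
}

# apply_conntrack_liberal ROOT: set the detected key to 1 and confirm the
# kernel (or the constructed tree) now reports 1. Prints the key used.
apply_conntrack_liberal() {
    root="$1"
    key="$(pick_conntrack_key "$root")" || {
        echo "no conntrack tcp_be_liberal key found under $root" >&2
        return 1
    }
    printf '1\n' > "$(key_to_path "$root" "$key")"
    [ "$(read_key "$root" "$key")" = "1" ] || return 1
    printf '%s\n' "$key"
}

# Apply to the live kernel only when running as root; otherwise just define
# the functions so the script can be sourced or inspected safely.
if [ "$(id -u)" = "0" ]; then
    apply_conntrack_liberal "$SYSCTL_ROOT" || echo "could not apply workaround" >&2
fi
```

Run it as root on each agent (for example with sudo); it prints the key it set, and a missing key means the conntrack module is not loaded on that host. Note that this setting persists only until reboot, so it also belongs in a sysctl.d fragment if the agents are long-lived.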