Route53 Health Checks - unhealthy sets still being served

Question

I have a set of 4 load balancers, each with its own A record for mydomain.bla.

Each record is something like:

* Alias: No
* TTL: 60s
* Value: an IP address of a specific server
* Routing policy: Weighted
* Weight: 10
* Set ID: a unique name referencing the specific server, e.g. `edge-01`
* Assoc w/ Health Check: Yes
* Health Check: a health check that monitors this specific server.

It was my understanding that a failed health check would result in that particular IP address being removed from the answers Amazon's nameservers return.
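To make that expectation concrete, here is a minimal sketch of the behavior I assumed, not Route53's actual implementation. The set IDs other than `edge-01` and the `192.0.2.x` addresses are placeholders. The one rule taken from the Route53 documentation is the fallback: if every record in the group is unhealthy, all of them are treated as healthy.

```python
import random

def eligible(records):
    """Records that should be candidates for an answer under this model."""
    healthy = [r for r in records if r["healthy"]]
    # Documented fallback: with no healthy record, all records are considered healthy.
    return healthy if healthy else list(records)

def answer(records, rng=random):
    """Pick one IP from the eligible records, proportionally to weight."""
    pool = eligible(records)
    weights = [r["weight"] for r in pool]
    return rng.choices(pool, weights=weights, k=1)[0]["ip"]

# Placeholder data: four weighted records, one failing its health check.
records = [
    {"set_id": "edge-01", "ip": "192.0.2.1", "weight": 10, "healthy": False},
    {"set_id": "edge-02", "ip": "192.0.2.2", "weight": 10, "healthy": True},
    {"set_id": "edge-03", "ip": "192.0.2.3", "weight": 10, "healthy": True},
    {"set_id": "edge-04", "ip": "192.0.2.4", "weight": 10, "healthy": True},
]
```

Under this model, `answer(records)` should never return the address of `edge-01` while at least one other record is healthy, which is what my observations contradict.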

However, I'm observing that even when the health check shows a server as unhealthy (because I shut down nginx on that server), Route53 still occasionally serves that server's IP address.

I'm verifying this using the following:

    dig mydomain.bla @ns-137.awsdns-17.com +short

So it's hopefully not an issue of some intermediate nameserver caching.
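Since a weighted record only shows up occasionally, a single `dig` can miss it. The sketch below repeats the same query against every authoritative nameserver and tallies the answers. It assumes `dig` is on the PATH and that `mydomain.bla` is the zone apex, so that an NS lookup returns the Route53 delegation set. The helper names (`parse_short`, `dig_short`, `survey`, `hits_for`) are my own, not part of any tool.

```python
import subprocess
from collections import Counter

def parse_short(output):
    """Split `dig +short` output into non-empty answer lines."""
    return [line.strip() for line in output.splitlines() if line.strip()]

def dig_short(name, rtype="A", server=None):
    """Run `dig +short` for name/rtype, optionally against one nameserver."""
    cmd = ["dig", "+short", name, rtype]
    if server:
        cmd.append("@" + server)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_short(result.stdout)

def survey(domain, rounds=20):
    """Query each authoritative NS `rounds` times and count the returned IPs."""
    results = {}
    for ns in dig_short(domain, "NS"):
        ns = ns.rstrip(".")  # dig prints NS names with a trailing dot
        counts = Counter()
        for _ in range(rounds):
            counts.update(dig_short(domain, "A", server=ns))
        results[ns] = counts
    return results

def hits_for(results, ip):
    """How often each nameserver returned the given IP."""
    return {ns: counts[ip] for ns, counts in results.items()}
```

Calling `hits_for(survey("mydomain.bla"), unhealthy_ip)`, with `unhealthy_ip` set to the address of the stopped server, gives a per-nameserver count of how often the unhealthy address was returned.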

My health checks work fine, and show the correct status.

Why is the IP of an unhealthy server still being served by Route53? How can I debug this?

There shouldn't be any intermediate caching since you're digging directly @ one of the Route 53 NS. Do all 4 authoritative NS behave the same? Is the health check hard-down (not flapping)? — Michael - sqlbot, Mar 22 '19 at 22:25
It doesn't flap - I shut down nginx for 20m at one point (resulting in 20m of health check fails) and still see the IP in the dig commands. Tried the other NS and same output - the unhealthy IP is there. — Danielle M., Mar 24 '19 at 03:57
