Bugzilla – Bug 4007
Hang on DNS query with dead-end CNAME
Last modified: 2018-06-04 06:57:16 UTC
Given an host with a CNAME pointing to nothing (see examples below), the connection hang forever with the following in the log (debug_options 14,9): 2014/01/15 14:46:59.366 kid1| ipcache.cc(497) ipcacheParse: ipcacheParse: 2 answers for 'cdn1.adbard.net' 2014/01/15 14:46:59.366 kid1| ipcacheParse: No Address records in response to 'cdn1.adbard.net' Note: this has already been reported on ML: http://www.squid-cache.org/mail-archive/squid-users/201204/0251.html Example hosts : - cdn1.adbard.net - d1.openx.org - 51c25aed8fef700bb500053e.csrum.cleverscale.com - anycast.cedexis.com - ea.tablette-tactile.net
We have also had this in 3.4.3 and 3.2.6. The problem seems to be the following lines in ipcacheHandleReply in ipcache.cc /* If we have not produced either IPs or Error immediately, wait for recursion to finish. */ if (done != 0 || error_message != NULL) From the comment and from a little bit of Software archaeology, this looks like this if condition was added as part of work that was done to make the Squid DNS resolver recursive. When this work was abandoned, it seems this change did not get reverted. The result is if the DNS server returns a success but only a CNAME record then this code is still expecting the Squid resolver to be recursing, but that is no longer the case. The quick fix is to comment out or remove just the if line, so that the block of code that comes after it now executes unconditionally, as it did in Squid 3.0 // if (done != 0 || error_message != NULL) A better fix would also tidy up the no longer used done and error_message values. More recent ADs seem the be the main DNS servers that do this when a domain resolves to a CNAME that does not resolve further. A couple more examples: fbcdn-sphotos-p-a.akamaihd.net mediaplayer.yahoo.com
Squid Cache: Version 3.5.21 debian 8 2016/10/10 18:01:01 kid1| ipcacheParse: No Address records in response to 'sdk.open.phone.igexin.com' 2016/10/10 18:05:15 kid1| ipcacheParse: No Address records in response to 'sdk.open.phone.igexin.com' 2016/10/10 18:05:35 kid1| ipcacheParse: No Address records in response to 'sdk.open.phone.igexin.com'
I did encounter the same behaviour using the following versions of squid: squid 3.5.12-1ubuntu7.2 on ubuntu xenial squid 3.4.8-6+deb8u3 on debian jessie One of the affected Domains is cdn2.libute.de, resolving to CNAME d3e1otjey2rrj.cloudfront.net - which does not resolve to an A or AAAA record.
Created attachment 3469 [details] Proposed fix for bug 4007 The attached patch prepared for Squid-5 (r14956) solves the issue for me. @Developers: Please, review the fix. Thanks in advance.
(In reply to Garri Djavadyan from comment #4) > Created attachment 3469 [details] > Proposed fix for bug 4007 > > The attached patch prepared for Squid-5 (r14956) solves the issue for me. > > @Developers: Please, review the fix. Thanks in advance. Below is explanation related to the code change. One of the tasks of function ipcacheParse() is to assign DNS resolution errors to ipcache entries. Some errors are related to DNS lookup process itself (explicit error) like: if (nr < 0) { debugs(14, 3, "ipcacheParse: Lookup failed '" << error_message << "' for '" << (const char *)i->hash.key << "'"); i->error_message = xstrdup(error_message); return -1; } Other errors are not related to lookup process (implicit error) and have not associated 'error_message', so Squid assigns own gathered errors to the ipcache entries: if (nr == 0) { debugs(14, 3, "ipcacheParse: No DNS records in response to '" << name << "'"); i->error_message = xstrdup("No DNS records"); return -1; } if (na == 0) { debugs(14, DBG_IMPORTANT, "ipcacheParse: No Address records in response to '" << name << "'"); i->error_message = xstrdup("No Address records"); if (cname_found) ++IpcacheStats.cname_only; return 0; } The function ipcacheHandleReply(), only considers DNS lookup errors (error_message), but should check any assigned error (i->error_message) to an ipcache entry.
(In reply to Garri Djavadyan from comment #4) > Created attachment 3469 [details] > Proposed fix for bug 4007 > > The attached patch prepared for Squid-5 (r14956) solves the issue for me. > > @Developers: Please, review the fix. Thanks in advance. Sorry, the change produces same effect as following patch, proposed in comment 1: === modified file 'src/ipcache.cc' --- src/ipcache.cc 2016-04-22 11:39:23 +0000 +++ src/ipcache.cc 2016-11-27 11:26:27 +0000 @@ -474,11 +474,8 @@ int done = ipcacheParse(i, answers, na, error_message); - /* If we have not produced either IPs or Error immediately, wait for recursion to finish. */ - if (done != 0 || error_message != NULL) { - ipcacheAddEntry(i); - ipcacheCallback(i, age); - } + ipcacheAddEntry(i); + ipcacheCallback(i, age); } /** --------
Yes, Stephen was right. The results of ipcacheParse() are irrelevant now that recursion is not happening. I have applied a cleanup patch to ipcacheParse() and ipcacheHandleReply()
Will this fix be backported to 3.4 (which is what debian stable ships right now...)?
(In reply to bugzillamailbox from comment #8) > Will this fix be backported to 3.4 (which is what debian stable ships right > now...)? I believe it should be done by binary packet maintainers of the OS distribution. The maintainers of the packet decide which commits (usually bug and performance fixes) from current upstream versions should be backported to maintained local version. It is a main reason to maintain local version. It makes local version outdated but more stable that any current upstream version. IIRC, only critical security fixes could be applied to unsupported version by Squid team.
Applied to v4 and v3.5. Older versions are up to the downstream distributors.
*** Bug 4564 has been marked as a duplicate of this bug. ***
*** Bug 4858 has been marked as a duplicate of this bug. ***