I got several assertion errors tonight:

# grep -1 "Starting Squid" cache.log
2007/12/18 22:49:09| Starting Squid Cache version 3.0.STABLE1 for i486-pc-linux-gnu...
2007/12/18 22:49:09| Process ID 4005
--
2007/12/19 00:14:52| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2007/12/19 00:15:03| Starting Squid Cache version 3.0.STABLE1 for i486-pc-linux-gnu...
2007/12/19 00:15:03| Process ID 10614
--
2007/12/19 00:15:54| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2007/12/19 00:16:00| Starting Squid Cache version 3.0.STABLE1 for i486-pc-linux-gnu...
2007/12/19 00:16:00| Process ID 10915
--
2007/12/19 00:16:18| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2007/12/19 00:16:24| Starting Squid Cache version 3.0.STABLE1 for i486-pc-linux-gnu...
2007/12/19 00:16:24| Process ID 10944
--
2007/12/19 00:16:52| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2007/12/19 00:16:58| Starting Squid Cache version 3.0.STABLE1 for i486-pc-linux-gnu...
2007/12/19 00:16:58| Process ID 10967

After that, things went back to normal.
If you can get a stack trace from the core dump, or cache.log output with full debugging enabled, please email it or post it here. Thank you.
Bumping this to UNCONFIRMED until a backtrace or other relevant details can be provided. I don't doubt that your Squid crashed; there is just too little information to process this report any further.
Spotted again in 3.0.STABLE4 (bug 2325) but still no trace information.
*** Bug 2353 has been marked as a duplicate of this bug. ***
Finally a good trace; see bug 2353.
also have the same problem with squid

from cache.log:

2008/09/22 14:02:49| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2008/09/22 14:05:16| Starting Squid Cache version 3.0.STABLE7 for amd64-portbld-freebsd7.0...
2008/09/22 14:05:16| Process ID 86406
2008/09/22 14:05:16| With 11095 file descriptors available

and backtrace from core:

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `squid'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libcrypt.so.4...(no debugging symbols found)...done.
Loaded symbols for /lib/libcrypt.so.4
Reading symbols from /usr/lib/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/librt.so.1
Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/lib/libstdc++.so.6...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x0000000800f2162c in thr_kill () from /lib/libc.so.7

(gdb) bt
#0  0x0000000800f2162c in thr_kill () from /lib/libc.so.7
#1  0x0000000800fb028b in abort () from /lib/libc.so.7
#2  0x0000000000443acc in xassert ()
#3  0x00000000004b1082 in store_client::scheduleDiskRead ()
#4  0x00000000004b1519 in storeClientCopy2 ()
#5  0x00000000004b190c in store_client::copy ()
#6  0x000000000043a3e5 in clientGetMoreData ()
#7  0x00000000004309b5 in ClientSocketContext::pullData ()
#8  0x00000000004cd104 in commio_call_callback ()
#9  0x00000000004cd198 in commio_call_callbacks ()
#10 0x00000000004cd1ca in CommDispatcher::dispatch ()
#11 0x000000000044bec8 in EventLoop::runOnce ()
#12 0x000000000044c018 in EventLoop::run ()
#13 0x00000000004869e0 in main ()
Please verify whether this problem still exists in the current Squid 3.1. If not, the bug will be considered closed against that version.
Can you provide a patch against Squid 3.0? We don't use 3.1 in our production environment.
I have the same problem with Squid versions 3.0.STABLE8, STABLE9, STABLE10 and even 3.1.0.3 on my 3 systems. They are quite busy servers, each handling about 300-600 req/sec during working hours. Each server crashes about 5 to 10 times a day, but rarely or never on weekends. There is definitely less load on weekends...

Typical messages in cache.log:

2009/01/06 10:06:56| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2009/01/06 10:09:33| Starting Squid Cache version 3.0.STABLE9 for x86_64-unknown-linux-gnu...

The time difference between the assertion and the restart is over 2 minutes because squid had to write the 10 GB core file.

# uname -a
Linux rbgs240x 2.6.18-92.1.10.0.1.el5 #1 SMP Mon Aug 4 17:11:38 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

# squid -v
Squid Cache: Version 3.0.STABLE9
configure options: '--prefix=/appl' '--localstate=/var' '--with-filedescriptors=16384' '--enable-storeio=ufs,null'

Cache configuration:

cache_mem 8192 MB
cache_dir null /tmp
maximum_object_size 1024 KB

I have about 10 core files, all from 3.0.STABLE9, and can supply more information. The config contains src IP and dstdomain ACLs, each with over 200 entries. I do not use a disk cache because it is too slow on these servers (HP DL380G5); the squid process was blocked on I/O most of the time and performance was poor until I disabled the disk cache and used only the memory cache.

Backtrace:

Core was generated by `(squid) -D'.
Program terminated with signal 6, Aborted.

(gdb) bt
#0  0x000000323f230155 in raise () from /lib64/libc.so.6
#1  0x000000323f231bf0 in abort () from /lib64/libc.so.6
#2  0x000000000043d441 in xassert (msg=0x4fddc8 "STORE_DISK_CLIENT == getType()", file=0x4fd95b "store_client.cc", line=430) at debug.cc:574
#3  0x000000000049e0e2 in store_client::scheduleDiskRead (this=0x27a52d4f8) at store_client.cc:430
#4  0x000000000049e508 in storeClientCopy2 (e=0xd94615c0, sc=0x27a52d4f8) at store_client.cc:331
#5  0x000000000049e874 in store_client::copy (this=0x27a52d4f8, anEntry=<value optimized out>, copyRequest={flags = {error = 0}, length = 4096, offset = -18480, data = 0x27ce41a10 "HTTP/1.1 200 OK\r\nDate: Tue, 06 Jan 2009 09:06:56 GMT\r\nServer: Apache\r\nAccept-Ranges: bytes\r\nCache-Control: public, must-revalidate, max-age=0\r\nConnection: close\r\nContent-Disposition: inline; filename="...}, callback_fn=0x432ec0 <clientReplyContext::SendMoreData(void*, StoreIOBuffer)>, data=0x27afcdfb8) at store_client.cc:264
#6  0x0000000000434e25 in clientGetMoreData (aNode=0x714393a8, http=0x27d38eb68) at client_side_reply.cc:1492
#7  0x000000000042b790 in ClientSocketContext::pullData (this=0x27ce419f8) at client_side.cc:1478
#8  0x00000000004b7b70 in commio_call_callback (ccb=0x2b332e5ec1a0) at comm.cc:207
#9  0x00000000004b7b98 in commio_call_callbacks () at comm.cc:217
#10 0x00000000004b7bca in CommDispatcher::dispatch (this=<value optimized out>) at comm.cc:2535
#11 0x0000000000444e28 in EventLoop::runOnce (this=0x7fff7cfb7810) at EventLoop.cc:131
#12 0x0000000000444f18 in EventLoop::run (this=0x7fff7cfb7810) at EventLoop.cc:100
#13 0x000000000047a356 in main (argc=<value optimized out>, argv=<value optimized out>) at main.cc:1335
(gdb) quit

As mentioned above, using the beta 3.1.0.3 did not help; it crashed with the same error, again with 8 GB cache_mem and no disk cache:

2008/12/26 05:08:25| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
2008/12/26 05:08:28| Starting Squid Cache version 3.1.0.3 for x86_64-unknown-linux-gnu...

Any help to make squid more stable is appreciated. Thank you.
Same problem with Squid Cache version 3.0.STABLE11 for i686-pc-linux-gnu

Squid Cache: Version 3.0.STABLE11
configure options: '--enable-cache-digests' '--enable-storeio=aufs,diskd,ufs,null' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/sbin' '--sbindir=/usr/sbin' '--sysconfdir=/etc/squid' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/lib/squid' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--enable-icap-support' '--enable-poll' '--enable-async-io' '--with-maxfd=48000' '--enable-delay-pools' '--with-large-files' '--enable-icap-client' '--disable-dlmalloc' '--with-pthreads' '--enable-epoll' '--enable-arp-acl' '--enable-useragent-log' '--enable-referer-log' '--disable-ident-lookups'

GDB:

(gdb) where
#0  0xffffe410 in __kernel_vsyscall ()
#1  0xb7d4c051 in raise () from /lib/tls/libc.so.6
#2  0xb7d4da3b in abort () from /lib/tls/libc.so.6
#3  0x0808cab7 in xassert (msg=0x81818f8 "STORE_DISK_CLIENT == getType()", file=0x818156f "store_client.cc", line=Variable "line" is not available.
) at debug.cc:574
#4  0x080feb19 in store_client::scheduleDiskRead (this=0xe979888) at store_client.cc:430
#5  0x080ffa3a in storeClientCopy2 (e=0x25adee38, sc=0xe979888) at store_client.cc:331
#6  0x080ffd34 in store_client::copy (this=0xe979888, anEntry=0x25adee38, copyRequest={flags = {error = 0}, length = 4096, offset = -209141, data = 0x110ded68 "HTTP/1.1 200 OK\r\nServer: nginx/0.5.35\r\nDate: Tue, 06 Jan 2009 13:27:19 GMT\r\nContent-Type: application/octet-stream\r\nAccept-Ranges: bytes\r\nCache-Control: no-store, no-cache, must-revalidate, post-check"...}, callback_fn=0x80823f0 <clientReplyContext::SendMoreData(void*, StoreIOBuffer)>, data=0x112933b8) at store_client.cc:264
#7  0x0808103e in clientGetMoreData (aNode=0x855c7e8, http=0x15af16e4) at client_side_reply.cc:1492
#8  0x08078eb8 in ClientSocketContext::pullData (this=0x110ded5c) at client_side.cc:1478
#9  0x0811dc8d in commio_call_callback (ccb=0xb707ecd8) at comm.cc:207
#10 0x0811dd0b in commio_call_callbacks () at comm.cc:217
#11 0x0811dd41 in CommDispatcher::dispatch (this=0xbfbe2a54) at comm.cc:2535
#12 0x0809fa17 in EventLoop::runOnce (this=0xbfbe2a10) at EventLoop.cc:131
#13 0x0809fb29 in EventLoop::run (this=0xbfbe2a10) at EventLoop.cc:100
#14 0x080d7862 in main (argc=2, argv=0xbfbe2ae4) at main.cc:1335
My squid dies two or three times a day:

# squid -v
Squid Cache: Version 3.0.STABLE12
configure options: '--prefix=/usr' '--sysconfdir=/etc/squid' '--bindir=/usr/sbin' '--sbindir=/usr/sbin' '--localstatedir=/var' '--libexecdir=/usr/sbin' '--datadir=/usr/share/squid' '--disable-carp' '--disable-htcp' '--disable-icap-client' '--disable-ident-lookups' '--disable-wccp' '--disable-wccpv2' '--enable-async-io=128' '--enable-auth=basic digest' '--enable-basic-auth-helpers=LDAP' '--enable-digest-auth-helpers=ldap' '--enable-default-err-language=German_Datev' '--enable-epoll' '--enable-err-languages=German_Datev' '--enable-snmp' '--enable-storeio=aufs,ufs,diskd,null' '--enable-referer-log' '--enable-useragent-log' '--enable-large-cache-files' '--enable-removal-policies=lru,heap' '--mandir=/usr/share/man' '--with-default-user=squid' '--with-filedescriptors=8192' '--with-large-files'

(gdb) bt
#0  0xffffe410 in ?? ()
#1  0xbffff808 in ?? ()
#2  0x00000006 in ?? ()
#3  0x0000151b in ?? ()
#4  0x40193581 in raise () from /lib/tls/libc.so.6
#5  0x40194e65 in abort () from /lib/tls/libc.so.6
#6  0x08088983 in xassert (msg=0x814c140 "STORE_DISK_CLIENT == getType()", file=0x814be51 "store_client.cc", line=Variable "line" is not available.
) at debug.cc:574
#7  0x080eb75d in store_client::scheduleDiskRead (this=0x89245c4) at store_client.cc:430
#8  0x080ec413 in store_client::doCopy (this=0x89245c4, anEntry=0x40e91028) at store_client.cc:377
#9  0x080ec635 in storeClientCopy2 (e=0x40e91028, sc=0x89245c4) at store_client.cc:331
#10 0x080ec9da in store_client::copy (this=0x89245c4, anEntry=0x40e91028, copyRequest={flags = {error = 0}, length = 4096, offset = -18463, data = 0x42ee5954 "HTTP/1.1 200 OK\r\nDate: Thu, 05 Feb 2009 09:58:00 GMT\r\nServer: Apache/2.2.10 (Unix)\r\nX-Powered-By: PHP/5.2.6\r\nCache-Control: public, must-revalidate, max-age=0\r\nPragma: no-cache\r\nContent-Disposition: i"...}, callback_fn=0x807d9f0 <clientReplyContext::SendMoreData(void*, StoreIOBuffer)>, data=0x42f07da8) at store_client.cc:264
#11 0x0807ec65 in clientGetMoreData (aNode=0x871da38, http=0x9500340) at client_side_request.cci:39
#12 0x080860b9 in clientStreamRead (thisObject=0x871ede8, http=0x9500340, readBuffer={flags = {error = 0}, length = 4096, offset = -18899, data = 0x42ee5954 "HTTP/1.1 200 OK\r\nDate: Thu, 05 Feb 2009 09:58:00 GMT\r\nServer: Apache/2.2.10 (Unix)\r\nX-Powered-By: PHP/5.2.6\r\nCache-Control: public, must-revalidate, max-age=0\r\nPragma: no-cache\r\nContent-Disposition: i"...}) at clientStream.cc:199
#13 0x0807236e in ClientSocketContext::pullData (this=0x42ee5948) at client_side.cc:1479
#14 0x08105663 in commio_call_callback (ccb=0x405263d0) at comm.cc:207
#15 0x081056bb in commio_call_callbacks () at comm.cc:217
#16 0x081056f1 in CommDispatcher::dispatch (this=0xbffffca0) at comm.cc:2535
#17 0x08091965 in EventLoop::runOnce (this=0xbffffcd0) at Array.h:57
#18 0x08091a8c in EventLoop::run (this=0xbffffcd0) at EventLoop.cc:100
#19 0x080c8796 in main (argc=4, argv=0xbffffeb4) at main.cc:1334
More info: the core reason is that when NTLM is used, left->authUser->username() in DelayUserCmp can be NULL. Changing DelayUser::id(CompositePoolNode::CompositeSelectionDetails &details) to also include the check

if (NULL == details.user->user()->username())
    return new NullDelayId;

fixes this particular issue, but it causes a SIGPIPE in commHandleWrite in the case of an aborted connection, and I can't really understand why. I'll leave that to be figured out by someone who understands the comm code a bit better than I do.
another backtrace:

Squid Cache: Version 3.0.STABLE13
configure options: '--with-default-user=squid' '--bindir=/usr/local/sbin' '--sbindir=/usr/local/sbin' '--datadir=/usr/local/etc/squid' '--libexecdir=/usr/local/libexec/squid' '--localstatedir=/usr/local/squid' '--sysconfdir=/usr/local/etc/squid' '--enable-removal-policies=lru heap' '--disable-linux-netfilter' '--disable-linux-tproxy' '--disable-epoll' '--enable-auth=basic digest negotiate ntlm' '--enable-basic-auth-helpers=DB NCSA PAM MSNT SMB squid_radius_auth YP' '--enable-digest-auth-helpers=password' '--enable-external-acl-helpers=ip_user session unix_group wbinfo_group' '--enable-ntlm-auth-helpers=SMB' '--enable-negotiate-auth-helpers=squid_kerb_auth' '--with-pthreads' '--enable-storeio=ufs diskd null aufs' '--disable-carp' '--enable-cache-digests' '--disable-wccp' '--disable-ident-lookups' '--enable-kqueue' '--with-large-files' '--enable-stacktraces' '--enable-err-languages=Armenian Azerbaijani Bulgarian Catalan Czech Danish Dutch English Estonian Finnish French German Greek Hebrew Hungarian Italian Japanese Korean Lithuanian Polish Portuguese Romanian Russian-1251 Russian-koi8-r Serbian Simplify_Chinese Slovak Spanish Swedish Traditional_Chinese Turkish Ukrainian-1251 Ukrainian-koi8-u Ukrainian-utf8' '--enable-default-err-language=templates' '--disable-unlinkd' '--disable-optimizations' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/info/' '--build=amd64-portbld-freebsd7.1' 'build_alias=amd64-portbld-freebsd7.1' 'CC=cc' 'CFLAGS=-pipe -g -g' 'LDFLAGS=' 'CPPFLAGS=' 'CXX=c++' 'CXXFLAGS=-pipe -g -g'

gdb /usr/local/sbin/squid squid.core
Core was generated by `squid'.
Program terminated with signal 6, Aborted.
#0  0x0000000800ee9dcc in kill () from /lib/libc.so.7

(gdb) bt
#0  0x0000000800ee9dcc in kill () from /lib/libc.so.7
#1  0x0000000800ee8c3b in abort () from /lib/libc.so.7
#2  0x0000000000454072 in xassert (msg=0x560ed8 "STORE_DISK_CLIENT == getType()", file=0x560c5d "store_client.cc", line=430) at debug.cc:574
#3  0x00000000004d740e in store_client::scheduleDiskRead (this=0x8049d4678) at store_client.cc:430
#4  0x00000000004d800e in store_client::scheduleRead (this=0x8049d4678) at store_client.cc:423
#5  0x00000000004d83b4 in store_client::doCopy (this=0x8049d4678, anEntry=0x804a3d7c0) at store_client.cc:377
#6  0x00000000004d8567 in storeClientCopy2 (e=0x804a3d7c0, sc=0x8049d4678) at store_client.cc:331
#7  0x00000000004d892d in store_client::copy (this=0x8049d4678, anEntry=0x804a3d7c0, copyRequest={flags = {error = 0}, length = 4096, offset = -16273927, data = 0x804a42030 "HTTP/1.1 200 OK\r\nServer: nginx/0.5.35\r\nDate: Mon, 06 Apr 2009 11:28:34 GMT\r\nContent-Type: application/octet-stream\r\nConnection: keep-alive\r\nKeep-Alive: timeout=20\r\nAccept-Ranges: bytes\r\nContent-Dispos"...}, callback_fn=0x446ef0 <clientReplyContext::SendMoreData(void*, StoreIOBuffer)>, data=0x804a64018) at store_client.cc:264
#8  0x00000000004d89ab in storeClientCopy (sc=0x8049d4678, e=0x804a3d7c0, copyInto={flags = {error = 0}, length = 4096, offset = -16273927, data = 0x804a42030 "HTTP/1.1 200 OK\r\nServer: nginx/0.5.35\r\nDate: Mon, 06 Apr 2009 11:28:34 GMT\r\nContent-Type: application/octet-stream\r\nConnection: keep-alive\r\nKeep-Alive: timeout=20\r\nAccept-Ranges: bytes\r\nContent-Dispos"...}, callback=0x446ef0 <clientReplyContext::SendMoreData(void*, StoreIOBuffer)>, data=0x804a64018) at store_client.cc:218
#9  0x0000000000448316 in clientGetMoreData (aNode=0x8049ec018, http=0x80480aa38) at client_side_reply.cc:1492
#10 0x0000000000450a88 in clientStreamRead (thisObject=0x8049ec098, http=0x80480aa38, readBuffer={flags = {error = 0}, length = 4096, offset = -16274319, data = 0x804a42030 "HTTP/1.1 200 OK\r\nServer: nginx/0.5.35\r\nDate: Mon, 06 Apr 2009 11:28:34 GMT\r\nContent-Type: application/octet-stream\r\nConnection: keep-alive\r\nKeep-Alive: timeout=20\r\nAccept-Ranges: bytes\r\nContent-Dispos"...}) at clientStream.cc:199
#11 0x000000000043c281 in ClientSocketContext::pullData (this=0x804a42018) at client_side.cc:1479
#12 0x0000000000442270 in ClientSocketContext::writeComplete (this=0x804a42018, fd=465, bufnotused=0x0, size=3001, errflag=COMM_OK) at client_side.cc:1626
#13 0x0000000000442330 in clientWriteComplete (fd=465, bufnotused=0x0, size=3001, errflag=COMM_OK, xerrno=0, data=0x804a42018) at client_side.cc:1553
#14 0x00000000004fe32c in commio_call_callback (ccb=0x802014e98) at comm.cc:207
#15 0x00000000004fe361 in commio_call_callbacks () at comm.cc:217
#16 0x00000000004fe393 in CommDispatcher::dispatch (this=0x7fffffffeb20) at comm.cc:2535
#17 0x000000000045eac4 in EventLoop::runOnce (this=0x7fffffffeab0) at EventLoop.cc:131
#18 0x000000000045ebee in EventLoop::run (this=0x7fffffffeab0) at EventLoop.cc:100
#19 0x00000000004a2fa5 in main (argc=2, argv=0x7fffffffebc0) at main.cc:1334
Same problem here with NTLM authentication. Squid is:

Squid Cache: Version 3.0.STABLE13
configure options: '--build=i386-redhat-linux-gnu' '--host=i386-redhat-linux-gnu' '--target=i386-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--exec_prefix=/usr' '--bindir=/usr/sbin' '--libexecdir=/usr/lib/squid' '--localstatedir=/var' '--datadir=/usr/share' '--sysconfdir=/etc/squid' '--disable-dependency-tracking' '--enable-arp-acl' '--enable-auth=basic,digest,ntlm' '--enable-basic-auth-helpers=LDAP,MSNT,NCSA,PAM,SMB,YP,getpwnam,multi-domain-NTLM,SASL' '--enable-cache-digests' '--enable-cachemgr-hostname=localhost' '--enable-delay-pools' '--enable-digest-auth-helpers=password' '--enable-epoll' '--enable-external-acl-helpers=ip_user,ldap_group,unix_group,wbinfo_group' '--enable-icap-client' '--enable-ident-lookups' '--with-large-files' '--enable-linux-netfilter' '--enable-ntlm-auth-helpers=SMB,fakeauth' '--enable-referer-log' '--enable-removal-policies=heap,lru' '--enable-snmp' '--enable-ssl' '--enable-storeio=aufs,diskd,null,ufs' '--enable-useragent-log' '--enable-wccpv2' '--with-aio' '--with-default-user=squid' '--with-filedescriptors=16384' '--with-dl' '--with-openssl=/usr/kerberos' '--with-pthreads' 'build_alias=i386-redhat-linux-gnu' 'host_alias=i386-redhat-linux-gnu' 'target_alias=i386-redhat-linux-gnu' 'CFLAGS=-fPIE -Os -g -pipe -fsigned-char -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables' 'LDFLAGS=-pie' 'CXXFLAGS=-fPIE -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables' 'FFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables'

Is there a chance that this is fixed in Squid 3.0.STABLE17? Thank you so much!
Going by comment 12, this looks like another face of bug 2127, which is now resolved. Closing as FIXED. The comm write error is a separate issue.
Saw this error on our proxy today. The Squid version is 3.1.0.12 with the patch against the DoS vulnerability applied. Was this fixed in 3.1.0.13? I cannot find any information in the changelog.
'twas supposed to be fixed from 3.1.0.10. But seems not :(
Is there anything you need to reproduce/debug this for squid 3.1.0.13?
*** Bug 2754 has been marked as a duplicate of this bug. ***
This bug still exists in STABLE19 -- our servers also crash every couple of days, interrupting some important Web applications hosted elsewhere.

Some thoughts:

The bug in bug 2353 is not the same as this bug, as implied by comment #5.

The bug has nothing to do with NTLM, as comment #12 suggests. The information already provided in other comments (./configure --etc.) makes that clear, and I can verify it from our environment: we do not use NTLM, and we do not compile it in.

The problem has to do with the type of the object to be read, as is apparent from the assertion. The assertion comes in store_client::scheduleDiskRead(), and can only be reached from store_client::scheduleRead() if copyInto.offset is outside of the range reserved for memory.

What I notice from all of the stack traces already attached to this bug is that copyRequest.offset is always negative in the calls to store_client::copy(). I don't know the Squid source code well enough to know if this is normal, but it doesn't seem like it, and it would certainly explain how a memory read could be misinterpreted as a disk read in store_client::scheduleRead().
(In reply to comment #20)
> The bug has nothing to do with NTLM, as comment #12 suggests. The information
> already provided in other comments (./configure --etc.) makes that clear, and i
> can verify it from our environment: we do not use NTLM, and we do not compile
> it in.

Looks to me like comment #12 belongs to another bug report.

> The bug in bug 2353 is not the same as this bug, as implied by comment #5.

Yes and no. Two issues are discussed there, one of which is the same as this one.

> from the assertion. The assertion comes in store_client::scheduleDiskRead(),
> and can only be reached from store_client::scheduleRead() if copyInto.offset is
> outside of the range reserved for memory.

Yes, that is very obvious from the message alone.

> What i notice from all of the
> stacktraces already attached to this bug is that copyRequest.offset is always
> negative in the calls to store_client::copy(). I don't know the Squid source
> code well enough to know if this is normal, but it doesn't seem like it, and
> would certainly explain how a memory read could be misinterpreted as a disk
> read in store_client::scheduleRead().

Thanks! That is very valuable info.

Regards
Henrik
Created attachment 2062 [details]
Still working to fix this bug?

We have 4 squids and we sometimes get the error up to 5 times a day; on some days nothing happens.

Error:
assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"
Squid Parent: child process 16640 exited due to signal 6
Squid Parent: child process 19281 started

This bug report is 2 years old. Are you still working on fixing this problem? It seems we users cannot solve it alone, but can we do something to avoid this error?
I may be able to help with this bug, but I need more information to proceed. Which Squid version are you running? Can you get a stack trace? If you can print a few variables from the live stack (Squid started from gdb) or from a core file (post-mortem analysis), it may help fix this much quicker.
(In reply to comment #23)
> I may be able to help with this bug but I need more information to proceed.

I can try to provide any helpful information about this issue if it helps you solve this problem. My squid server crashes every 2-5 days with the same error.

I'm running squid-3.0.19 under FreeBSD 7.2-STABLE amd64. The server performs about 100-150 req/sec during work hours.

Cache configuration is:

cache_mem 24 GB
cache_dir aufs /opt/squid/cache 12288 16 256
cache_dir aufs /mnt/str/squid-cache 12288 32 256

Both cache_dirs reside on the same SAS mirror but use different partitions mounted with the "noatime" option. Disk load is less than average, so I'm not suffering from I/O starvation.

What I noticed reviewing the logs is that the squid crash arises soon after the kernel message

Oct 28 10:01:14 ti_server kernel: /mnt/str: optimization changed from TIME to SPACE

Both cache_dirs are balancing on the edge between TIME and SPACE optimization, and the kernel toggles it from time to time. That's why I use two cache dirs instead of one on the same disks: one cache dir can't stand that huge amount of small files on one partition.

If you can tell me how to get a stack trace or core file analysis, I can give you more info.
(In reply to comment #24) > If you can tell me how to get a stack trace or core file analysis - I can give > you more info. The following link has a few coredump-related hints: http://wiki.squid-cache.org/SquidFaq/BugReporting
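Beyond the FAQ above, a quick way to pull a backtrace out of an existing core file is gdb's batch mode. The binary and core paths below are hypothetical examples; substitute your own:

```shell
# Hypothetical example paths; adjust for your installation.
SQUID=/usr/local/sbin/squid
CORE=/var/squid/squid.core

if command -v gdb >/dev/null 2>&1 && [ -f "$CORE" ]; then
    # -batch exits after running the commands; -ex bt prints the backtrace.
    result=$(gdb -batch -ex bt "$SQUID" "$CORE")
else
    result="gdb or core file not available"
fi
echo "$result"
```

Building Squid with debug symbols (e.g. CFLAGS/CXXFLAGS containing -g, as in several configure lines above) makes the resulting frames far more useful.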
Hello all. I use squid 3.1.0.14 without a disk cache, and sometimes (1-2 times per day) I received the same error messages. On the [squid-users] mailing list I found that the call to scheduleDiskRead() can be avoided, and I applied this patch:

====patch====
--- squid-3.1.0.14/src/store_client.cc.orig	2009-10-23 03:47:17.000000000 +0300
+++ squid-3.1.0.14/src/store_client.cc	2009-10-23 14:06:07.000000000 +0300
@@ -420,7 +420,12 @@
     if (copyInto.offset >= mem->inmem_lo && copyInto.offset < mem->endOffset())
         scheduleMemRead();
     else
-        scheduleDiskRead();
+    {
+        /*scheduleDiskRead();*/
+        debugs(90,1,HERE << "WORKAROUND - averted calling scheduleDiskRead()");
+        fail();
+        flags.store_copying = 0;
+    }
 }
 
 void
====patch====

Squid's uptime is now more than 4.5 days without problems.
Here I got the backtrace.

#squid -v
Squid Cache: Version 3.0.STABLE19
configure options: '--with-default-user=squid' '--bindir=/usr/local/sbin' '--sbindir=/usr/local/sbin' '--datadir=/usr/local/etc/squid' '--libexecdir=/usr/local/libexec/squid' '--localstatedir=/usr/local/squid' '--sysconfdir=/usr/local/etc/squid' '--enable-removal-policies=lru heap' '--disable-linux-netfilter' '--disable-linux-tproxy' '--disable-epoll' '--enable-auth=basic digest negotiate ntlm' '--enable-basic-auth-helpers=DB NCSA PAM MSNT SMB squid_radius_auth LDAP YP' '--enable-digest-auth-helpers=password ldap' '--enable-external-acl-helpers=ip_user session unix_group wbinfo_group ldap_group' '--enable-ntlm-auth-helpers=SMB' '--enable-negotiate-auth-helpers=squid_kerb_auth' '--with-pthreads' '--enable-storeio=ufs diskd null aufs' '--enable-delay-pools' '--disable-wccp' '--enable-kqueue' '--with-large-files' '--enable-stacktraces' '--enable-err-languages=Armenian Azerbaijani Bulgarian Catalan Czech Danish Dutch English Estonian Finnish French German Greek Hebrew Hungarian Italian Japanese Korean Lithuanian Polish Portuguese Romanian Russian-1251 Russian-koi8-r Serbian Simplify_Chinese Slovak Spanish Swedish Traditional_Chinese Turkish Ukrainian-1251 Ukrainian-koi8-u Ukrainian-utf8' '--enable-default-err-language=templates' '--prefix=/usr/local' '--mandir=/usr/local/man' '--infodir=/usr/local/info/' '--build=amd64-portbld-freebsd7.2' 'build_alias=amd64-portbld-freebsd7.2' 'CC=cc' 'CFLAGS=-O2 -fno-strict-aliasing -pipe -march=nocona -I/usr/local/include -g -DLDAP_DEPRECATED' 'LDFLAGS= -L/usr/local/lib' 'CPPFLAGS=' 'CXX=c++' 'CXXFLAGS=-O2 -fno-strict-aliasing -pipe -march=nocona -I/usr/local/include -g -DLDAP_DEPRECATED'

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `squid'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libcrypt.so.4...done.
Loaded symbols for /lib/libcrypt.so.4
Reading symbols from /lib/libthr.so.3...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x0000000800e069cc in thr_kill () from /lib/libc.so.7
[New Thread 0x801005440 (LWP 100507)]
[New Thread 0x8010052b0 (LWP 100506)]
[New Thread 0x801005120 (LWP 100505)]
[New Thread 0x801004f90 (LWP 100504)]
[New Thread 0x801004e00 (LWP 100503)]
[New Thread 0x801004c70 (LWP 100502)]
[New Thread 0x801004ae0 (LWP 100501)]
[New Thread 0x801004950 (LWP 100500)]
[New Thread 0x8010047c0 (LWP 100499)]
[New Thread 0x801004630 (LWP 100498)]
[New Thread 0x8010044a0 (LWP 100497)]
[New Thread 0x801004310 (LWP 100496)]
[New Thread 0x801004180 (LWP 100495)]
[New Thread 0x801003ff0 (LWP 100494)]
[New Thread 0x801003e60 (LWP 100493)]
[New Thread 0x801003cd0 (LWP 100492)]
[New Thread 0x801003b40 (LWP 100410)]
[New Thread 0x8010039b0 (LWP 100232)]
[New Thread 0x801003820 (LWP 100231)]
[New Thread 0x801003690 (LWP 100230)]
[New Thread 0x801003500 (LWP 100229)]
[New Thread 0x801003370 (LWP 100228)]
[New Thread 0x8010031e0 (LWP 100227)]
[New Thread 0x801003050 (LWP 100226)]
[New Thread 0x801002ec0 (LWP 100225)]
[New Thread 0x801002d30 (LWP 100224)]
[New Thread 0x801002ba0 (LWP 100223)]
[New Thread 0x801002a10 (LWP 100222)]
[New Thread 0x801002880 (LWP 100221)]
[New Thread 0x8010026f0 (LWP 100220)]
[New Thread 0x801002560 (LWP 100219)]
[New Thread 0x8010023d0 (LWP 100218)]
[New Thread 0x8010020b0 (LWP 100275)]
(gdb) backtrace
#0  0x0000000800e069cc in thr_kill () from /lib/libc.so.7
#1  0x0000000800e8eaf3 in abort () from /lib/libc.so.7
#2  0x0000000000440e14 in xassert (msg=Variable "msg" is not available.) at debug.cc:574
#3  0x00000000004b02ea in store_client::scheduleDiskRead (this=0xb2249cfb8) at store_client.cc:430
#4  0x00000000004b06ee in storeClientCopy2 (e=0x80ffb5fc0, sc=0xb2249cfb8) at store_client.cc:331
#5  0x00000000004b0a99 in store_client::copy (this=0xb2249cfb8, anEntry=Variable "anEntry" is not available.) at store_client.cc:264
#6  0x00000000004380c5 in clientGetMoreData (aNode=0x802efd418, http=0x801ef3838) at client_side_reply.cc:1491
#7  0x000000000042ebe8 in ClientSocketContext::pullData (this=0x802ec9ac8) at client_side.cc:1480
#8  0x00000000004c87aa in commio_call_callback (ccb=0x80174e3b8) at comm.cc:207
#9  0x00000000004c8830 in commio_call_callbacks () at comm.cc:217
#10 0x00000000004c8859 in CommDispatcher::dispatch (this=Variable "this" is not available.) at comm.cc:2537
#11 0x000000000045242a in EventLoop::runOnce (this=0x7fffffffeb10) at EventLoop.cc:131
#12 0x0000000000452565 in EventLoop::run (this=0x7fffffffeb10) at EventLoop.cc:100
#13 0x000000000048adb6 in main (argc=Variable "argc" is not available.) at main.cc:1334
(gdb)

Is there anything else I can do to help?
Please run the following commands in gdb while looking at the core file:

set print pretty
frame 3
print *this
frame 4
print *e
print *e->mem_obj
frame 6
print *http
frame 7
print *this

Please attach the output to this bug report (it will be relatively long). For the future, if you can ./configure your Squid with --disable-optimizations, then the backtrace may make more sense and contain more information. We will try to deal with whatever you've got, of course.
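For convenience, the commands above can be collected into a command file and run non-interactively. A rough sketch (the squid binary and core file paths are only placeholders; adjust them to your installation):

```shell
# Write the requested gdb commands to a file.
cat > gdb.cmds <<'EOF'
set print pretty
frame 3
print *this
frame 4
print *e
print *e->mem_obj
frame 6
print *http
frame 7
print *this
EOF

# Then run them in batch mode against the binary and the core file,
# e.g. (example paths, uncomment and adapt):
# gdb -batch -x gdb.cmds /usr/local/sbin/squid /usr/local/squid/squid.core > core-debug.txt
```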
Created attachment 2068 [details] core debug

Hi, Alex. Here is the output you requested. I can't recompile squid right now, because it's in heavy production use, but I'll try to do it if I get the chance.
The bug is fully reproducible. Versions tested: 3.0.STABLE9 and STABLE19. All you need is to try to download any file from: http://www.kmgep.kz/eng/investor_relations/financial_information/ Here is an extract from cache.log with full debugging:

2009/11/12 10:37:15.186| The reply for GET http://www.kmgep.kz/details/d.php?idf=113&lang=eng is ALLOWED, because it matched 'all'
2009/11/12 10:37:15.186| StoreEntry::lock: key '2F4E0A7C9398885B7A764476C2A384EF' count=4
2009/11/12 10:37:15.186| clientReplyContext::sendMoreData: Appending 3633 bytes after 463 bytes of headers
2009/11/12 10:37:15.186| clientStreamCallback: Calling 1 with cbdata 0xb7b17054 from node 0x85f42d8
2009/11/12 10:37:15.186| cbdataReferenceValid: 0x85f4320
2009/11/12 10:37:15.186| packing sline 0x85a7ddc using 0xbffacd40:
2009/11/12 10:37:15.186| HTTP/1.0 200 OK
2009/11/12 10:37:15.186| packing hdr: (0x85a7d7c)
2009/11/12 10:37:15.186| client_side.cc(1210) sendStartOfMessage schedules clientWriteComplete
2009/11/12 10:37:15.186| comm_write: FD 23: sz 4275: hndl 1: data 0xb7b16018.
2009/11/12 10:37:15.186| cbdataLock: 0xb7b16018=1
2009/11/12 10:37:15.186| commSetSelect(FD 23,type=2,handler=1,client_data=0x837fa68,timeout=0)
2009/11/12 10:37:15.186| cbdataFree: 0x85a9cc4
2009/11/12 10:37:15.186| cbdataFree: Freeing 0x85a9cc4
2009/11/12 10:37:15.186| ACLChecklist::~ACLChecklist: destroyed 0x85fc588
2009/11/12 10:37:15.186| cbdataFree: 0x85fc588
2009/11/12 10:37:15.186| cbdataFree: Freeing 0x85fc588
2009/11/12 10:37:15.186| cbdataUnlock: 0xb7af4018=0
2009/11/12 10:37:15.186| cbdataUnlock: 0x8621464=0
2009/11/12 10:37:15.186| BodyPipe.cc(231) consumed 8000 bytes [8000<=8000<=? 0+8191 pipe0x8636b0c prod0x860ce98 cons0x8631a58]
2009/11/12 10:37:15.186| BodyPipe.cc(233) will call BodyPipe::tellMoreBodySpaceAvailable(0x8636b0c)
2009/11/12 10:37:15.186| cbdataLock: 0x8636b0c=1
2009/11/12 10:37:15.186| event.cc(315) schedule: Adding 'BodyPipe::tellMoreBodySpaceAvailable', in 0.00 seconds
2009/11/12 10:37:15.186| exiting BodyPipe::tellMoreBodyDataAvailable(0x8636b0c)
2009/11/12 10:37:15.186| comm_iocallbackpending: 0
2009/11/12 10:37:15.186| comm_calliocallback: 0
2009/11/12 10:37:15.186| comm_calliocallback: 0
2009/11/12 10:37:15.186| event.cc(225) checkEvents
2009/11/12 10:37:15.186| cbdataReferenceValid: 0x8636b0c
2009/11/12 10:37:15.186| Engine 0xbffad440 is idle.
2009/11/12 10:37:15.186| comm_select(): got FD 23 events=4 monitoring=1d F->read_handler=1 F->write_handler=1
2009/11/12 10:37:15.186| comm_select(): Calling write handler on FD 23
2009/11/12 10:37:15.186| commHandleWrite: FD 23: off 0, sz 4275.
2009/11/12 10:37:15.186| commHandleWrite: write() returns 4275
2009/11/12 10:37:15.186| commio_complete_callback: called for 23 (0, 0)
2009/11/12 10:37:15.186| cbdataReferenceValid: 0x8636b0c
2009/11/12 10:37:15.186| cbdataUnlock: 0x8636b0c=0
2009/11/12 10:37:15.186| EventDispatcher::dispatch: Running 'BodyPipe::tellMoreBodySpaceAvailable'
2009/11/12 10:37:15.186| entering BodyPipe::tellMoreBodySpaceAvailable(0x8636b0c)
2009/11/12 10:37:15.186| ICAPModXact::noteMoreBodySpaceAvailable called [FD 32r;Rrw(5)p(2)S(2)/P icapx159]
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(574) have 0 bytes to parse [FD 32r;Rrw(5)p(2)S(2)/P icapx159]
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(575)
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(937) have 0 body bytes to parse
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(945) have 0 body bytes after parse; parsed all: 0
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(957) 0x860ce40 needsMoreData = 1
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(960) 0x860ce40
2009/11/12 10:37:15.186| ICAP/ICAPModXact.cc(474) returning from readMore because reader or doneReading()
2009/11/12 10:37:15.186| ICAPModXact::noteMoreBodySpaceAvailable ended [FD 32r;Rrw(5)p(2)S(2)/P icapx159]
2009/11/12 10:37:15.186| exiting BodyPipe::tellMoreBodySpaceAvailable(0x8636b0c)
2009/11/12 10:37:15.186| comm_iocallbackpending: 0
2009/11/12 10:37:15.186| comm_calliocallback: 0
2009/11/12 10:37:15.186| comm_calliocallback: 0
2009/11/12 10:37:15.186| commio_call_callback: called for 23
2009/11/12 10:37:15.186| cbdataReferenceValid: 0xb7b16018
2009/11/12 10:37:15.186| cbdataUnlock: 0xb7b16018=0
2009/11/12 10:37:15.186| clientWriteComplete: FD 23, sz 4275, err 0, off 4275, len 0xb79f8e38
2009/11/12 10:37:15.186| client_side.cc(1509) body bytes sent vs. expected: 3633 ? 3321885 (+-1660942)
2009/11/12 10:37:15.186| ClientSocketContext::pullData: FD 23
2009/11/12 10:37:15.186| clientStreamRead: Calling 1 with cbdata 0xb7af5080 from node 0x85f4320
2009/11/12 10:37:15.187| cbdataReferenceValid: 0x85f42d8
2009/11/12 10:37:15.187| store_client::copy: 2F4E0A7C9398885B7A764476C2A384EF, from -1656846, for length 4096, cb 1, cbdata 0xb7af4018
2009/11/12 10:37:15.187| cbdataLock: 0xb7af4018=1
2009/11/12 10:37:15.187| storeClientCopy2: 2F4E0A7C9398885B7A764476C2A384EF
2009/11/12 10:37:15.187| cbdataLock: 0x8621464=1
2009/11/12 10:37:15.187| store_client::doCopy: co: -1656846, hi: 8463
2009/11/12 10:37:15.187| assertion failed: store_client.cc:430: "STORE_DISK_CLIENT == getType()"

Hope this will help.
Here are the HTTP headers from http://www.kmgep.kz/details/d.php?idf=113&lang=eng

GET /details/d.php?idf=113&lang=eng HTTP/1.1
Host: www.kmgep.kz
User-Agent: Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.9.0.14) Gecko/2009091010 Iceweasel/3.0.6 (Debian-3.0.6-3)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: ru,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: windows-1251,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: PHPSESSID=a01b8e5b57c781244cc8b6e4fd976fa2; hotlog=1

HTTP/1.x 200 OK
Date: Mon, 16 Nov 2009 14:23:36 GMT
Server: Apache
X-Powered-By: PHP/5.2.11
Content-Disposition: attachment; filename=IFRS Financial Statements for 2008.pdf
Last-Modified: Wed, 11 Mar 2009 18:29:32 ALMT
Accept-Ranges: bytes
Content-Length: 1660943
Content-Range: bytes -1660942/1660943
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/pdf

Is it possible that the assertion is caused by the negative Content-Range? (Content-Range: bytes -1660942/1660943)
It seems that the assertion is really caused by the wrong values that result from incorrect parsing of a malformed Content-Range header in HttpHdrContRange.cc. See the function:

static int httpHdrRangeRespSpecParseInit(HttpHdrRangeSpec * spec, const char *field, int flen)

Case 1: we have the correct header
Content-Range: bytes 0-49/50
httpHdrRangeRespSpecParseInit got the results:
offset = 0
last_pos = 49
length = 49

Case 2: we have the wrong header
Content-Range: bytes -49/50
httpHdrRangeRespSpecParseInit got the results:
offset = -49
last_pos = 49
length = 98
which are obviously wrong.

Here is the patch to fix this (see the attachment too):

diff -u -r squid3-3.0.STABLE8/src/HttpHdrContRange.cc squid3-3.0.STABLE8-PATCHED/src/HttpHdrContRange.cc
--- squid3-3.0.STABLE8/src/HttpHdrContRange.cc	2008-07-18 14:02:53.000000000 +0400
+++ squid3-3.0.STABLE8-PATCHED/src/HttpHdrContRange.cc	2009-11-17 12:06:56.000000000 +0300
@@ -80,6 +80,12 @@
         return 0;
     }
 
+    /* Additional check for BUG2155 - there MUST BE first-byte-pos and it MUST be positive*/
+    if (p - field == 0) {
+        debugs(68, 2, "invalid (no first-byte-pos or it is negative) resp-range-spec near: '" << field << "'");
+        return 0;
+    }
+
     /* parse offset */
     if (!httpHeaderParseOffset(field, &spec->offset))
         return 0;
@@ -93,6 +99,12 @@
     if (!httpHeaderParseOffset(p, &last_pos))
         return 0;
 
+    /* Additional paranoidal check for BUG2155 - last-byte-pos MUST be positive*/
+    if (last_pos < 0) {
+        debugs(68, 2, "invalid (last-byte-pos is negative) resp-range-spec near: '" << field << "'");
+        return 0;
+    }
+
     spec->length = size_diff(last_pos + 1, spec->offset);
     /* Ensure typecast is safe */
     assert (spec->length >= 0);
@@ -177,6 +189,12 @@
     else if (!httpHeaderParseOffset(p, &range->elength))
         return 0;
 
+    /* Additional paranoidal check for BUG2155 - entity-length MUST be positive*/
+    if ((range->elength != range_spec_unknown) && (range->elength < 0)) {
+        debugs(68, 2, "invalid (entity-length is negative) content-range-spec near: '" << str << "'");
+        return 0;
+    }
+
     debugs(68, 8, "parsed content-range field: " << (long int) range->spec.offset << "-" << (long int) range->spec.offset + range->spec.length - 1 << " / " <<
Created attachment 2072 [details] Patch to fix the incorrect parsing of a malformed Content-Range header, which results in the assertion.
I can confirm this bug with squid 3.0.STABLE18 and 3.1.0.14 with the URLs given by Andrey Motoshkov. With kind regards, William
Hmm.. there are more sharks in this same piece of code..
Additionally, the question is what the meaning of a Content-Range header in a 200 response really is; those are only supposed to be seen in 206 / 416 responses.
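For reference, RFC 2616 defines Content-Range for 206 and 416 responses, and its first-byte-pos is a plain digit string, so it can never be negative. A well-formed partial response would look like this (byte values chosen only for illustration):

```
HTTP/1.1 206 Partial Content
Content-Range: bytes 0-1023/1660943
Content-Length: 1024
```

The server in this report instead sent "Content-Range: bytes -1660942/1660943" on a 200 response, which violates both the status-code expectation and the grammar.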
We get quite confused if the response has a Content-Range. We seem to process it as if it were a request header.
Created attachment 2076 [details] Proposed patch v2 Patch version 2, making Content-Range parser pretty strict.
The other oddness seen is filed as bug #2812.
Applied to trunk. Targeting 3.0 as well, since the same bug most likely exists there.
(In reply to comment #40) > The other oddness seen filed in bug #2812 Bug #2821 I meant.
I confirm that patch v1 by Vladimir is doing its job! About a week without crashes on my heavy production server. Good job, Vladimir, thanks. I'll test patch v2 when a squid reload happens.
The patch by Vladimir is fine; it's just that there are additional issues in the same code not covered by his patch, which would crash Squid on other kinds of malformed Content-Range headers.
Applied to 3.1 and 3.0.
*** Bug 2841 has been marked as a duplicate of this bug. ***
*** Bug 2847 has been marked as a duplicate of this bug. ***