Bug 3664 - ssl_crtd fails to build on OpenSolaris/OpenIndiana/Solaris 11
ssl_crtd fails to build on OpenSolaris/OpenIndiana/Solaris 11
Status: RESOLVED FIXED
Product: Squid
Classification: Unclassified
Component: other
3.3
All Solaris: OpenIndiana
: P2 major
: 3.4
Assigned To: SQUID BUGS ALIAS
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-10-06 12:00 UTC by Andrew Evdokimov
Modified: 2017-04-18 01:23 UTC (History)
3 users (show)

See Also:
Browser: ---
Fixed Versions:
Needs:


Attachments
changes to certificate_db.cc (848 bytes, patch)
2012-10-06 13:42 UTC, Andrew Evdokimov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Evdokimov 2012-10-06 12:00:38 UTC
Passing '--enable-ssl-crtd' to configure on OpenSolaris/OpenIndiana/Solaris 11 causes failed build due to absence of flock() in any Solaris version and unavailability of UCB compatibility software since OpenSolaris/OpenIndiana/Solaris 11

g++ -DHAVE_CONFIG_H  -I../.. -I../../include -I../../lib -I../../src -I../../include  -I/usr/include/gssapi -I/usr/include/kerberosv5   -I/opt/smbsys/include -I/usr/include/libxml2 -m64 -O3 -I/opt/smbsys/include -I/opt/smbbld/include -I/usr/include/gssapi -I/usr/include/kerberosv5 -I/usr/include/libxml2 -Wall -Wpointer-arith -Wwrite-strings -Wcomments -Werror -pipe -D_REENTRANT -pthreads -m64 -m64 -O3 -I/opt/smbsys/include -I/opt/smbbld/include -Wno-error=maybe-uninitialized -MT certificate_db.o -MD -MP -MF .deps/certificate_db.Tpo -c -o certificate_db.o certificate_db.cc
certificate_db.cc: In member function 'void Ssl::Lock::lock()':
certificate_db.cc:63:19: error: 'LOCK_EX' was not declared in this scope
certificate_db.cc: In member function 'void Ssl::Lock::unlock()':
certificate_db.cc:78:19: error: 'LOCK_UN' was not declared in this scope
make[3]: *** [certificate_db.o] Error 1
make[3]: Leaving directory `/src/smbsys/squid-3.2.1/src/ssl'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/src/smbsys/squid-3.2.1/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/src/smbsys/squid-3.2.1/src'
make: *** [all-recursive] Error 1
Comment 1 Andrew Evdokimov 2012-10-06 12:06:37 UTC
Already answered at http://www.squid-cache.org/mail-archive/squid-users/201203/0300.html


But what about replacing flock/funlock with fcntl(fd, F_SETLK, F_WRLCK) and fcntl(fd, F_SETLK, F_UNLCK)? It at least works :)
Comment 2 Andrew Evdokimov 2012-10-06 13:42:41 UTC
Created attachment 2752 [details]
changes to certificate_db.cc

Or even simpler solution using lockf(fd, F_LOCK, 0) and lockf(fd, F_ULOCK, 0). See attached diff. Working fine for me.
Comment 3 Andrew Evdokimov 2013-10-21 16:01:32 UTC
Reproduces with 3.3.9
Comment 4 Alex Rousskov 2013-10-21 21:23:26 UTC
Using lockf() may be indeed the right approach because, according to manual pages, lockf() is "more standard" than flock() and, hence, it may be more portable. If lockf() is not supported on some platforms we care about, we would need to add one more #ifdef clause to Ssl::Lock::lock/unlock() methods to cover all three known Unix and Windows variants: flock(), lockf(), and LockFile().

As for the attachment 2752 [details] patch you have proposed, could you please comment on the change to the open() call? What does that change do? Is it required? Can you use symbolic flag constant(s) rather than the magical "2"?
Comment 5 Andrew Evdokimov 2013-10-22 15:52:06 UTC
"2" at open() stands for O_RDWR flag (while original "0" stands for O_RDONLY). I've just used the same style as in original code here, symbolic name will be definitely better. This is needed because lockf(3C) man at Solaris box says:

     #include <unistd.h>
     int lockf(int fildes, int function, off_t size);
...
     The fildes argument is an open  file  descriptor.  The  file
     descriptor  must have O_WRONLY or O_RDWR permission in order
     to establish locks with this function call.
...

So, I've used O_RDWR flag to allow reading from this file descriptor (I've not followed code path but it is logical that we'll be at least reading from it at some time).
Comment 6 Alex Rousskov 2013-10-23 14:30:31 UTC
Sounds good. The final version should use O_RDWR then.
Comment 7 Amos Jeffries 2014-11-12 10:13:10 UTC
Any update on this bug?
Comment 8 Alex Rousskov 2014-11-13 18:06:08 UTC
Comment #4 reflects the current status AFAICT. We now have an explanation regarding the magic constant "2", but somebody would still need to make the change to O_RDWR and verify lockf() portability to produce the final patch.
Comment 9 Yuri Voinov 2014-12-27 09:22:23 UTC
(In reply to comment #8)
> Comment #4 reflects the current status AFAICT. We now have an explanation
> regarding the magic constant "2", but somebody would still need to make the
> change to O_RDWR and verify lockf() portability to produce the final patch.

Alex,

I confirmed this patch works perfectly.

I've tested it with 3.4.10 on Solaris 10 u11 (kernel 150401-16) and all ok, works like charm.

Build options are:

root @ ktulhu / # /usr/local/squid/sbin/squid -v
Squid Cache: Version 3.4.10
configure options:  '--prefix=/usr/local/squid' '--enable-translation' '--enable-external-acl-helpers=file_userip' '--enable-icap-client' '--enable-ipf-transparent' '--enable-storeio=diskd' '--enable-removal-policies=lru,heap' '--enable-devpoll' '--disable-wccp' '--enable-wccpv2' '--enable-http-violations' '--enable-follow-x-forwarded-for' '--enable-arp-acl' '--enable-htcp' '--enable-cache-digests' '--with-dl' '--enable-auth-negotiate=none' '--disable-auth-digest' '--disable-auth-ntlm' '--disable-auth-basic' '--enable-storeid-rewrite-helpers=file' '--enable-log-daemon-helpers=file' '--enable-ssl' '--enable-ssl-crtd' '--with-filedescriptors=131072' '--with-build-environment=POSIX_V6_LP64_OFF64' 'CFLAGS=-O3 -m64 -fPIE -fstack-protector -mtune=core2 --param=ssp-buffer-size=4 -pipe' 'CXXFLAGS=-O3 -m64 -fPIE -fstack-protector -mtune=core2 --param=ssp-buffer-size=4 -pipe' 'CPPFLAGS=-I/opt/csw/include' 'LDFLAGS=-fPIE -pie -Wl,-z,now' --enable-ltdl-convenience
Comment 10 Amos Jeffries 2015-01-02 13:21:29 UTC
Patch applied with O_RDWR adjustment to Squid-3.
Comment 11 Amos Jeffries 2015-01-15 05:14:24 UTC
Applied to 3.5 and 3.4
Comment 12 Alex Rousskov 2015-04-09 15:27:25 UTC
FYI: The committed fix corrupts certificate databases on FreeBSD and possibly other POSIX-compliant platforms. I do not know whether it actually works correctly on Solaris (beyond compilation). See bug 4212 for a script you can use to test your environment.
Comment 13 Amos Jeffries 2015-04-13 03:53:04 UTC
I'm taking the #ifdef approach mentioned in comment 4 as the fix for bug 4212, making the use of lockf() only apply to Solaris and reverting all other OS to flock() for now. That way we should not have to re-open this bug.
Comment 14 Alex Rousskov 2015-05-01 21:12:45 UTC
I wonder whether bug 4212 problem exists on Solaris. This bug was about making Squid compile on Solaris, and we know it does now. However, it is possible that Solaris lockf() is as broken as FreeBSD and Linux lockf(). If that is the case, Squid on Solaris will sooner or later corrupt its certificate database.

If you are running Squid on Solaris, I recommend that you test your setup for locking failures using Guy's script posted in bug 4212.
Comment 15 Yuri Voinov 2015-05-03 20:14:58 UTC
(In reply to Alex Rousskov from comment #14)
> I wonder whether bug 4212 problem exists on Solaris. This bug was about
> making Squid compile on Solaris, and we know it does now. However, it is
> possible that Solaris lockf() is as broken as FreeBSD and Linux lockf(). If
> that is the case, Squid on Solaris will sooner or later corrupt its
> certificate database.
> 
> If you are running Squid on Solaris, I recommend that you test your setup
> for locking failures using Guy's script posted in bug 4212.

Sure.

wrong number of fields on line 6 (looking for field 6, got 1, '' left)
/usr/local/squid/libexec/ssl_crtd: The SSL certificate database /var/lib/ssl_db is corrupted. Please rebuild
wrong number of fields on line 6 (looking for field 6, got 1, '' left)
Invalid response from ssl_crtd: '' length=0
wrong number of fields on line 6 (looking for field /usr/local/squid/libexec/ssl_crtd: 6The SSL certificate database /var/lib/ssl_db is corrupted. Please rebuild, got 1, '' left)

Invalid response from ssl_crtd: '' length=0
/usr/local/squid/libexec/ssl_crtd: The SSL certificate database /var/lib/ssl_db is corrupted. Please rebuild
Invalid response from ssl_crtd: '' length=0
wrong number of fields on line 6 (looking for field 6, got 1, '' left)
/usr/local/squid/libexec/ssl_crtd: The SSL certificate database /var/lib/ssl_db is corrupted. Please rebuild
Invalid response from ssl_crtd: '' length=0

So, seems broken.

But what is alternative?
Comment 16 Alex Rousskov 2017-04-18 01:23:56 UTC
(In reply to Yuri Voinov from comment #15)

> But what is alternative?

Good question. Solaris is not my area of expertise, but after hitting this problem again in a different context, I can offer the following summary:

Long term, your best option is to convince the Solaris/Oracle team to add proper support for flock() and/or "Open file description" locks (i.e., fcntl(F_OFD_SETLK...)). I am not saying this to shovel this issue under the carpet and/or upset you. I genuinely believe this is the best long-term solution for this growing problem.

Meanwhile, as a temporary workaround, it is possible to substitute flock() with custom locking based on file creation. It would be an ugly (on several levels) Solaris-only solution, but a quality patch implementing such a workaround should be accepted IMO. I cannot volunteer such an implementation, unfortunately.

FWIW, googling shows that other projects step on this Solaris landmine as well, iteratively developing an understanding that there is no good alternative for a proper flock()-like system call. For example: https://github.com/golang/go/issues/11113