-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doStartTLS() occasionally fails due to closed connection #126
Comments
Thank you for reporting this. I've been able to reproduce the problem in my own test, and I've found the underlying cause. When you're using a connection in non-synchronous mode (which is the mode it uses by default), then it sets a short SO_TIMEOUT on the connection when beginning StartTLS processing, and then resets it after StartTLS processing completes. The shorter SO_TIMEOUT is set in a try block with all of the other processing, and the SO_TIMEOUT is reset in a finally block that is attached to that try block. Because you had the hostname verifier in place, and because the hostname you're using doesn't match the expected value, then TLS negotiation is failing, and there's an exception being thrown in that try block. The failed negotiation also initiates the process of closing the connection, and in some cases, the closure has happened before the finally block gets around to trying to resetting the SO_TIMEOUT, which causes that reset attempt to fail, and that throws another exception that hides the exception that was originally thrown as a result of the TLS negotiation. I've committed a change to address the problem by not attempting to reset the SO_TIMEOUT if there was a failure in StartTLS processing. Before this change, my test program ran into the problem you described after only a few attempts. With the fix in place, I've successfully completed a couple of 10,000-pass runs without the SO_TIMEOUT issue raising its head. The fix will be included in the next release of the LDAP SDK, which is currently scheduled for sometime around June. You can get the fix sooner than that if you check out and build the LDAP SDK for yourself. As a workaround, you can try using the LDAP SDK in synchronous mode, by using LDAPConnectionOptions.setUseSynchronousMode(true) before establishing the connection. Connections that operate in synchronous mode send requests and wait for responses on the same thread, which eliminates the need for background reader threads. Connections in synchronous mode tend to be slightly faster and have a little less overhead, but you can't use asynchronous operations or abandon/cancel operations in progress. If you don't need support for asynchronous operations or abandons/cancels, then synchronous mode may be a better option than the default asynchronous mode. |
Nice! Many thanks for the details and suggestions, I'm going to check if synchronous mode works for us, but I think it might :) |
Summary
When trying to connect to an instance of
InMemoryDirectoryServer
, using STARTTLS with host name verification, sometimes an exception is thrown indicating thatSO_TIMEOUT
could not be set because the connection is closed.The circumstances seem a bit special, since this came from a test that checks if the connection via STARTTLS fails with a host name verification error. It did not seem to happen with TLS. If the host name verifier is not present, the problem also isn't present.
The error message reads as follows:
It seems to stem from the second call in the
finally
of theelse
block ofLDAPConnectionReader.doStartTLS()
, which tries to set the socket timeout back to the original value. That is as far as I could check, I didn't see the first call (InternalSDKHelper.setSoTimeout(connection, 50)
) failing, but my sample size wasn't that big.Affected Version
SDK version: 6.0.4
Tests were done with OpenJDK 11.0.14.1 (2022-02-08).
Expected outcome
I'd expect that, no matter how often the test is run, it always results in the message
Hostname verification failed because the expected hostname 'localhost' was not found in peer certificate 'subject='CN=foo.example.com
.How to reproduce
The test depends on a certificate that is valid in itself, but presents the wrong host. The connection is made to "localhost". The following Bash script can be used to generate the key store for this test in the current folder (on Linux).
There are multiple resulting files matching
foo.example.com.*
, of whichfoo.example.com.jks
needs to be placed in/tmp
(or the test needs to be amended).Currently the test runs this 25 times in rapid succession. It usually fails within the first ten tries, but since this is probably some race condition, it may vary with machine speed. I hope it is reproducible for others!
The text was updated successfully, but these errors were encountered: