-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8326498: java.net.http.HttpClient connection leak using http/2 #28233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…rrored or completed
|
👋 Welcome back jpai! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
Webrevs
|
| /** | ||
| * Closes the connection normally (with a NO_ERROR termination cause), if not already closed. | ||
| */ | ||
| final void close() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should mark the Http2Connection as Closeable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's reasonable. I've updated the PR to include this change.
| if (!isOpen()) { | ||
| final Http2TerminationCause tc = this.connTerminator.getTerminationCause(); | ||
| assert tc != null : "termination cause is null for a closed connection"; | ||
| return Optional.of(tc.getCloseCause()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this code block. The comment seems to imply that it fixes the race, but it doesn't.
I assume you verified that all callers of getTerminationException are properly synchronized, so that we don't leak resources if the method returns an empty optional while the connection is closed in another thread.
| * {@linkplain NetworkChannel#isOpen() channel is open}. false otherwise. | ||
| */ | ||
| final boolean isOpen() { | ||
| return this.connTerminator.terminationCause.get() == null && connection.channel().isOpen(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we ever observe a situation where channel is not open but termination cause is not set?
As far as I could tell, channel.isOpen only returns false after close() is called, and close() is only called from doTerminate after the termination cause is set. What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A channel can be closed asynchronously by the peer. So it may be closed even if close() has not been called.
| * is not considered as an erroneous termination and this method returns {@code false} for | ||
| * such cases. | ||
| */ | ||
| public abstract boolean isErroneousClose(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we use a different word here? "Erroneous close" feels vague here; would "is(Non)Graceful", "isAbrupt" or "hasErrorCode" capture the intent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That that erroneous close has been used in several other contexts; in code, in comments, etc. If this gets updated, I'd appreciate other relevant occurrences get updated too.
vy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally liked the clean-up of the state changes in Http2Connection, yet the change is non-trivial. I leave the judgement of that to reviewers more versed in Http2Connection.
| if (!cached) | ||
| c.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We remove the if (!cached) c.close() logic. Where do we restore this functionality? If not, why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the terminationException is not null the connection is already closed (or being closed by another thread) so there's no need to call close() again.
| if (debug.on()) { | ||
| debug.log("Unexpected state %s, skipping idle connection shutdown", | ||
| describeClosedState(closedState)); | ||
| debug.log("Not initiating idle connection close"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| debug.log("Not initiating idle connection close"); | |
| debug.log("Connection is already cancelled, skipping idle connection close"); |
| * being idle. | ||
| */ | ||
| public static Http2TerminationCause idleTimedOut() { | ||
| return new NoError("HTTP/2 connection idle timed out", "idle timed out"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason this is not cached in NoError.IDLE_TIMED_OUT in a similar manner to NoError.INSTANCE? I could not see a place where its stack trace would be of value.
| import jdk.internal.net.http.frame.ErrorFrame; | ||
|
|
||
| /** | ||
| * Termination cause for a {@linkplain Http2Connection HTTP/2 connection} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * Termination cause for a {@linkplain Http2Connection HTTP/2 connection} | |
| * Termination cause for an {@linkplain Http2Connection HTTP/2 connection} |
| * Returns the IOException that is considered the cause of the connection termination. | ||
| * Even a normal {@linkplain #isErroneousClose() non-erroneous} termination will have | ||
| * a IOException associated with it, so this method will always return a non-null instance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * Returns the IOException that is considered the cause of the connection termination. | |
| * Even a normal {@linkplain #isErroneousClose() non-erroneous} termination will have | |
| * a IOException associated with it, so this method will always return a non-null instance. | |
| * Returns the {@link IOException} that is considered the cause of the connection termination. | |
| * Even a normal {@linkplain #isErroneousClose() non-erroneous} termination will have | |
| * an {@code IOException} associated with it, so this method will always return a non-null instance. |
| // if the underlying SocketChannel isn't open, then terminate the connection. | ||
| // that way when Http2Connection.isOpen() returns false in that situation, then this | ||
| // getTerminationCause() will return a termination cause. | ||
| terminate(Http2TerminationCause.forException(new IOException("channel is not open"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Terminating the connection in a getter doesn't feel all right. Would you mind elaborating on the rationale, the absence of a better alternative, please?
|
|
||
| private static Field openConnections; // Set<> jdk.internal.net.http.HttpClientImpl#openedConnections | ||
|
|
||
| private static SSLContext sslContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is SSL a necessity for this test? Put another way, does SSL play a role in the connection leakage?
| } | ||
|
|
||
| /* | ||
| * Issues a burst of 100 HTTP/2 requests to the same server (host/port) and expects all of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numRequests is actually 20, not 100:
| * Issues a burst of 100 HTTP/2 requests to the same server (host/port) and expects all of | |
| * Issues a burst of HTTP/2 requests to the same server (host/port) and expects all of |
|
|
||
| // using reflection, return the jdk.internal.net.http.HttpClientImpl instance held | ||
| // by the given client | ||
| private static Object reflectHttpClientImplInstance(final HttpClient client) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, you can use Http3ConnectionAccess::impl too.
| * HttpClientImpl to verify that the client is holding on to at most 1 connection. | ||
| */ | ||
| @Test | ||
| void testOpenConnections() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we also introduce following tests?
- Verify secondary request bursts always reuse the pooled connection, and don't change the state of the pool. (Remember that, as reported in the associated ticket, the secondary bursts were causing the orphan connection pile up.)
- Repeat all tests with a small idle timeout, say, 5s, and ensure that after 5s pool is completely emptied.
Can I please get a review for this fix which addresses a connection leak in HttpClient when dealing with HTTP/2 requests?
I have added a comment in https://bugs.openjdk.org/browse/JDK-8326498 which explains what the issue is. The fix here addresses the issue by cleaning up the
Http2Connectionclosing logic and centralizing it to a connection terminator. The terminator then ensures that the right resources are closed (including the underlying SocketChannel) when the termination happens.A new jtreg test has been introduced which reproduces the issue and verifies the fix.
Progress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28233/head:pull/28233$ git checkout pull/28233Update a local copy of the PR:
$ git checkout pull/28233$ git pull https://git.openjdk.org/jdk.git pull/28233/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28233View PR using the GUI difftool:
$ git pr show -t 28233Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28233.diff
Using Webrev
Link to Webrev Comment