otel: subchannel metrics #12202
base: master
@@ -415,7 +415,7 @@ void exitIdleMode() {
 LbHelperImpl lbHelper = new LbHelperImpl();
 lbHelper.lb = loadBalancerFactory.newLoadBalancer(lbHelper);
 // Delay setting lbHelper until fully initialized, since loadBalancerFactory is user code and
-// may throw. We don't want to confuse our state, even if we will enter panic mode.
+// may throw. We don't want to confuse our state, even if we enter panic mode.
I think the comment is more accurate as it is. After entering panic mode there is not much we can do about state; the comment implies delaying entry into panic mode and keeping the channel in a sane state for as long as possible before bringing in potential user code.
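The ordering that the code comment describes can be sketched as below. This is a minimal illustration under stated assumptions, not the actual channel code: `Channel`, `Helper`, and the `Function`-based factory are hypothetical stand-ins, and the sketch assumes the field is assigned only after the factory call returns.

```java
import java.util.function.Function;

public class DelayedHelperAssignment {
  /** Hypothetical stand-in for the helper handed to the load balancer. */
  static final class Helper {
    Object lb;
  }

  /** Hypothetical stand-in for the channel that owns the helper. */
  static final class Channel {
    Helper lbHelper; // remains null until a helper is fully initialized

    void exitIdleMode(Function<Helper, Object> factory) {
      Helper helper = new Helper();
      // The factory is user code and may throw; the field is untouched if it does.
      helper.lb = factory.apply(helper);
      // Publish the helper only after initialization succeeded.
      this.lbHelper = helper;
    }
  }

  public static void main(String[] args) {
    Channel channel = new Channel();
    try {
      channel.exitIdleMode(h -> {
        throw new IllegalStateException("factory failed");
      });
    } catch (IllegalStateException expected) {
      // The failed attempt left no half-initialized helper behind.
    }
    System.out.println("helper unset after failure: " + (channel.lbHelper == null));
    channel.exitIdleMode(h -> "lb");
    System.out.println("helper set after success: " + (channel.lbHelper != null));
  }
}
```

Because the field is written last, a throwing factory leaves the channel state exactly as it was before the call.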
addressIndex.getCurrentEagAttributes(), NameResolver.ATTR_BACKEND_SERVICE),
getAttributeOrDefault(
addressIndex.getCurrentEagAttributes(), LoadBalancer.ATTR_LOCALITY_NAME),
"Peer Pressure",
This is not in the gRFC? Instead there is a "List of allowed values for grpc.disconnect_error".
For Phase 1 we won't be plumbing disconnect_error; I will raise another PR for that, with this one as the base branch.
So? We can use a valid value now: "unknown". Using "unknown" means we are technically implementing the gRFC with this change, and it is just an optimization to reduce the amount of unknowns.
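A rough sketch of what the suggestion could look like: the third label value becomes the constant "unknown" instead of a placeholder string. The class, the helper names, and the empty-string default for a missing attribute are all assumptions made for illustration; only the label order (backend service, locality, disconnect error) is taken from the diff above.

```java
import java.util.Arrays;
import java.util.List;

public class DisconnectLabelSketch {
  // "unknown" is the value the reviewer cites as allowed for grpc.disconnect_error.
  static final String DISCONNECT_ERROR_UNKNOWN = "unknown";

  /** Hypothetical helper: substitutes an empty string (assumed default) for a missing attribute. */
  static String valueOrDefault(String value) {
    return value == null ? "" : value;
  }

  /** Builds the label values in the order shown in the diff, ending with the disconnect error. */
  static List<String> disconnectLabelValues(String backendService, String locality) {
    return Arrays.asList(
        valueOrDefault(backendService),
        valueOrDefault(locality),
        DISCONNECT_ERROR_UNKNOWN);
  }

  public static void main(String[] args) {
    System.out.println(disconnectLabelValues("example-service", null));
  }
}
```

A later change that plumbs the real disconnect reason would only need to replace the constant, leaving the label set unchanged.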
core/src/main/java/io/grpc/internal/PickFirstLeafLoadBalancer.java (outdated, resolved)
core/src/main/java/io/grpc/internal/DelayedClientTransport.java (outdated, resolved)
* @param optionalLabelValues the optional label values for the metric.
*/
@Override
public void addLongUpDownCounter(LongUpDownCounterMetricInstrument metricInstrument, long value,
Add unit tests.
These are just wrappers on the underlying OTel API for UpDownCounter...
Right, but I do see logic here and there are still things to check. See that we have tests in MetricRecorderImplTest for addLongCounter(), which is very similar.
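To illustrate the kind of check being asked for, here is a self-contained sketch that uses a hand-rolled recording fake rather than the project's test setup. `UpDownCounter`, `RecordingCounter`, and this simplified `addLongUpDownCounter` signature are hypothetical stand-ins, and the label-count validation is an assumed example of "logic" worth testing, not a description of the real wrapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UpDownCounterWrapperSketch {
  /** Hypothetical stand-in for the underlying up-down counter. */
  interface UpDownCounter {
    void add(long value, List<String> labelValues);
  }

  /** Fake that records every call so a test can inspect what was delegated. */
  static final class RecordingCounter implements UpDownCounter {
    final List<Long> values = new ArrayList<>();
    final List<List<String>> labels = new ArrayList<>();

    @Override
    public void add(long value, List<String> labelValues) {
      values.add(value);
      labels.add(labelValues);
    }
  }

  /** Hypothetical wrapper: validates the label count, then delegates to the counter. */
  static void addLongUpDownCounter(
      UpDownCounter counter, long value, List<String> labelKeys, List<String> labelValues) {
    if (labelKeys.size() != labelValues.size()) {
      throw new IllegalArgumentException("label keys and values differ in length");
    }
    counter.add(value, labelValues);
  }

  public static void main(String[] args) {
    RecordingCounter fake = new RecordingCounter();
    List<String> keys = Arrays.asList("example.label"); // illustrative key
    addLongUpDownCounter(fake, 1, keys, Arrays.asList("a"));  // increment
    addLongUpDownCounter(fake, -1, keys, Arrays.asList("a")); // decrement
    System.out.println("recorded values: " + fake.values);
  }
}
```

Unlike a plain counter, an up-down counter accepts negative deltas, so a test would reasonably cover both signs as well as the mismatched-label case.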
@@ -117,6 +119,22 @@ public void addLongCounter(LongCounterMetricInstrument metricInstrument, long va
counter.add(value, attributes);
}

@Override
public void addLongUpDownCounter(LongUpDownCounterMetricInstrument metricInstrument, long value,
Add unit test.
* @return the newly created LongUpDownCounterMetricInstrument
* @throws IllegalStateException if a metric with the same name already exists
*/
public LongUpDownCounterMetricInstrument registerLongUpDownCounter(String name,
Add unit test.
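One behavior such a test could pin down is the duplicate-name contract stated in the Javadoc. The sketch below is a simplified, hypothetical registry (it tracks names only and returns the name instead of an instrument object); it is meant to show the contract, not the real registry's implementation.

```java
import java.util.HashSet;
import java.util.Set;

public class InstrumentRegistrySketch {
  private final Set<String> names = new HashSet<>();

  /** Registers a metric name; rejects a duplicate with IllegalStateException. */
  public String registerLongUpDownCounter(String name) {
    // Set.add returns false when the name is already present.
    if (!names.add(name)) {
      throw new IllegalStateException("Metric with name " + name + " already exists");
    }
    return name;
  }

  public static void main(String[] args) {
    InstrumentRegistrySketch registry = new InstrumentRegistrySketch();
    registry.registerLongUpDownCounter("example.metric"); // illustrative name
    try {
      registry.registerLongUpDownCounter("example.metric");
    } catch (IllegalStateException e) {
      System.out.println("duplicate rejected: " + e.getMessage());
    }
  }
}
```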
Implements A94