[feat] [broker] Add broker health check status into prometheus metrics#10
vineeth1995 wants to merge 1372 commits into master
d4fea23 to
a86f44f
Compare
rdhabalia
left a comment
Let's create an issue first, before opening this PR.
We want at least 10 entries in your contribution list before we can start a committer vote for you.
https://github.com/apache/pulsar/pulls?q=is%3Apr+assignee%3Avineeth1995
conf/broker.conf
Outdated
includeHealthCheckInMetrics=true
healthCheckFrequencyInSeconds=60
healthCheckInitialDelayInSecs=60

Instead of all 3 config variables, keep just one: healthCheckMetricsUpdateTimeInSeconds=-1, where -1 means disabled. Also, please add documentation for this setting in the config.
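A minimal sketch of how a single setting could carry both the on/off switch and the interval, as suggested in the comment above. Only the property name healthCheckMetricsUpdateTimeInSeconds and the meaning of -1 come from the review; the class HealthCheckSettings, its methods, and the choice to treat a missing property as disabled are assumptions made for illustration and are not Pulsar code.

```java
import java.util.Properties;

// Hypothetical helper (not part of Pulsar): one property replaces the three original ones.
final class HealthCheckSettings {
    static final String KEY = "healthCheckMetricsUpdateTimeInSeconds";
    static final int DISABLED = -1;

    private final int updateTimeInSeconds;

    HealthCheckSettings(int updateTimeInSeconds) {
        if (updateTimeInSeconds < DISABLED) {
            throw new IllegalArgumentException(KEY + " must be >= -1");
        }
        this.updateTimeInSeconds = updateTimeInSeconds;
    }

    // Assumption of this sketch: a missing property means the check is disabled.
    static HealthCheckSettings fromProperties(Properties props) {
        String raw = props.getProperty(KEY, String.valueOf(DISABLED));
        return new HealthCheckSettings(Integer.parseInt(raw.trim()));
    }

    // The periodic task should only run for a strictly positive interval.
    boolean isEnabled() {
        return updateTimeInSeconds > 0;
    }

    int getUpdateTimeInSeconds() {
        return updateTimeInSeconds;
    }
}
```

With this shape, the same value serves as both the initial delay and the period, so no separate enable flag or delay setting is needed.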
import org.apache.pulsar.common.naming.SystemTopicNames;
import org.apache.pulsar.common.naming.TopicDomain;
import org.apache.pulsar.common.naming.TopicName;
import org.apache.pulsar.common.naming.*;

A wildcard (*) import is not good. Please replace it with only the required imports.
.numThreads(1)
.build();

this.healthChecker = OrderedScheduler.newSchedulerBuilder()

Initialize the executor only if healthCheckMetricsUpdateTimeInSeconds > 0. So just call a method such as:
void initializeHealthChecker() {
    if (healthCheckMetricsUpdateTimeInSeconds > 0) {
        healthChecker = new ...
        healthChecker.scheduleAtFixedRate(this::checkHealth,
                healthCheckMetricsUpdateTimeInSeconds, healthCheckMetricsUpdateTimeInSeconds, TimeUnit.SECONDS);
    }
}
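The suggestion above can be sketched in a self-contained form. This version swaps in a plain JDK ScheduledExecutorService as a stand-in for Pulsar's OrderedScheduler, and the class name HealthCheckScheduler and its constructor are hypothetical; it only illustrates creating the executor lazily and reusing one value for both the initial delay and the period.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Stand-in sketch: uses a JDK scheduler, not Pulsar's OrderedScheduler.
final class HealthCheckScheduler implements AutoCloseable {
    private final int updateTimeInSeconds;
    private final Runnable checkHealth;
    private ScheduledExecutorService healthChecker; // stays null when the check is disabled

    HealthCheckScheduler(int updateTimeInSeconds, Runnable checkHealth) {
        this.updateTimeInSeconds = updateTimeInSeconds;
        this.checkHealth = checkHealth;
    }

    // Creates the executor only when the interval is positive; returns true if a task was scheduled.
    boolean initializeHealthChecker() {
        if (updateTimeInSeconds <= 0) {
            return false;
        }
        healthChecker = Executors.newSingleThreadScheduledExecutor();
        healthChecker.scheduleAtFixedRate(() -> {
            try {
                checkHealth.run();
            } catch (RuntimeException e) {
                // scheduleAtFixedRate cancels future runs if the task throws, so swallow here.
            }
        }, updateTimeInSeconds, updateTimeInSeconds, TimeUnit.SECONDS);
        return true;
    }

    @Override
    public void close() {
        if (healthChecker != null) {
            healthChecker.shutdownNow();
        }
    }
}
```

When the setting is -1 (or 0), no thread is created at all, which is the point of moving the executor construction inside the guard.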
updateRates();
}

protected void startHealthChecker() {

This method is replaced by initializeHealthChecker(..).
private final LongAdder connectionCreateFailCount;
private final LongAdder connectionTotalClosedCount;
private final LongAdder connectionActive;
private volatile Long healthCheckStatus;

The default value should be -1, meaning unknown:
-1 = unknown
1 = success
0 = fail
So if no one enables this task, the value should stay at -1 (unknown).
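A minimal sketch of the three-state value described above. The class HealthCheckStatus and its method names are hypothetical; only the encoding (-1 unknown, 1 success, 0 fail) comes from the review. It uses a primitive long rather than the boxed Long in the diff, which is a choice of this sketch so the field can never be null.

```java
// Hypothetical holder for the health check status exposed in metrics.
final class HealthCheckStatus {
    static final long UNKNOWN = -1L; // task never ran or is disabled
    static final long FAILURE = 0L;
    static final long SUCCESS = 1L;

    // volatile so the metrics thread sees updates made by the health check thread
    private volatile long status = UNKNOWN;

    void recordSuccess() {
        status = SUCCESS;
    }

    void recordFailure() {
        status = FAILURE;
    }

    long get() {
        return status;
    }
}
```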
89dcf3c to
3f9b152
Compare
minValue = -1,
doc = "HealthCheck update frequency in seconds. Disable health check with value -1 (Default value 60)"
)
private int healthCheckMetricsUpdateTimeInSeconds = 60;
.numThreads(1)
.build();
int healthCheckFreqInSecs = config.getHealthCheckMetricsUpdateTimeInSeconds();
int healthCheckInitialDelayInSecs = config.getHealthCheckMetricsUpdateTimeInSeconds();

healthCheckFreqInSecs and healthCheckInitialDelayInSecs hold the same value. We should use only one variable, healthCheckFreqInSecs. Please remove the duplicate.
930f240 to
e929a12
Compare
}

public void checkHealth() {
BrokersBase.internalRunHealthCheck(TopicVersion.V2, pulsar(), null).thenAccept(__ -> {

Use a static import:
import static org.apache.pulsar.broker.admin.impl.BrokersBase.internalRunHealthCheck;
and then call:
internalRunHealthCheck(TopicVersion.V2, pulsar(), null)...
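To illustrate how the asynchronous result of the health check could be turned into the status value, here is a hedged sketch. It does not call Pulsar's internalRunHealthCheck; instead a Supplier of CompletableFuture<Void> stands in for it, and the class HealthCheckRecorder is hypothetical. The failure branch via exceptionally is an assumption about how the truncated lambda in the diff continues.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical recorder: maps an async health check outcome to 1 (success) or 0 (failure).
final class HealthCheckRecorder {
    private final AtomicLong status = new AtomicLong(-1L); // -1 = unknown until the first run
    private final Supplier<CompletableFuture<Void>> healthCheck; // stand-in for the real check

    HealthCheckRecorder(Supplier<CompletableFuture<Void>> healthCheck) {
        this.healthCheck = healthCheck;
    }

    // Runs one check; the returned future completes after the status has been updated.
    CompletableFuture<Void> checkHealth() {
        return healthCheck.get()
                .thenAccept(__ -> status.set(1L))
                .exceptionally(ex -> {
                    status.set(0L);
                    return null;
                });
    }

    long status() {
        return status.get();
    }
}
```

Because exceptionally recovers the future, a failed check does not propagate an exception to the scheduler, so later runs still happen.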
}

public void recordHealthCheckStatusSuccess() {
System.out.println("recording health check success");
private final LongAdder connectionCreateFailCount;
private final LongAdder connectionTotalClosedCount;
private final LongAdder connectionActive;
private volatile Long healthCheckStatus;

Can we please add a comment:
// 1=success, 0=failure, -1=unknown
f74cf57 to
8216576
Compare
The PR had no activity for 30 days; marked with the Stale label.
…, LeastLongTermMessageRate, ModularLoadManagerImpl. (apache#22889)
Implementation PR: apache#22888
### Motivation
Initially, we introduced `loadBalancerCPUResourceWeight`, `loadBalancerBandwidthInResourceWeight`, `loadBalancerBandwidthOutResourceWeight`, `loadBalancerMemoryResourceWeight`, and `loadBalancerDirectMemoryResourceWeight` in `ThresholdShedder` to control the weight of each resource when calculating the load of the broker. Then we let them work for `LeastResourceUsageWithWeight` for a better bundle placement policy. But apache#19559 and apache#21168 pointed out that the actual load of the broker is not related to memory usage or direct memory usage, so we changed the default values of `loadBalancerMemoryResourceWeight` and `loadBalancerDirectMemoryResourceWeight` to 0.0. There are still some places where memory usage and direct memory usage are used to calculate the load of the broker, such as `OverloadShedder`, `LeastLongTermMessageRate`, and `ModularLoadManagerImpl`. We should let the resource weights work in these places so that setting a resource weight to 0.0 avoids the impact of memory usage and direct memory usage on the load of the broker.
### Modifications
- Let resource weight work for `OverloadShedder`, `LeastLongTermMessageRate`, `ModularLoadManagerImpl`.
### Motivation
Bundles that are filtered out when trying to unload them should not be counted in the indicator.
### Modifications
Increment the metric only when the bundle is unloaded.
…r Function runtimes (apache#22910)
…calling getPartitionedTopicMetadata (apache#22838)
…vice doesn't get closed (apache#22858)
…tensibleLoadManagerImpl only) (apache#22930)
…, LeastLongTermMessageRate, ModularLoadManagerImpl. (apache#22888)
…rect add-opens parameters (apache#22927)
…th Apache Pulsar Helm chart (apache#23362)
…m topic (ExtensibleLoadManagerImpl only) (apache#23381)
…hared implementation (apache#23219)
…egistry (ExtensibleLoadManagerImpl only) (apache#23382)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Matteo Merli <mmerli@apache.org>
50bb381 to
29f77e9
Compare
29f77e9 to
84c4a3e
Compare
Motivation
Add the broker health check status to the Prometheus metrics.
Modifications
Schedule a job at a 1-minute interval that calls the healthCheck API on the broker and updates the Pulsar stats based on the broker's health.
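As a rough illustration of what the resulting metric could look like to a scraper, the sketch below renders a status value as a gauge in the Prometheus text exposition format. The metric name example_broker_health_check_status, the cluster label, and the class HealthCheckMetricWriter are all hypothetical; the PR's actual metric name and labels are not shown in this conversation.

```java
// Hypothetical renderer: writes one gauge sample in the Prometheus text exposition format.
final class HealthCheckMetricWriter {
    // Placeholder name chosen for this sketch, not the name used by the PR.
    static final String METRIC_NAME = "example_broker_health_check_status";

    // Assumes the cluster name contains no characters that need escaping in a label value.
    static String render(String cluster, long status) {
        StringBuilder sb = new StringBuilder();
        sb.append("# TYPE ").append(METRIC_NAME).append(" gauge\n");
        sb.append(METRIC_NAME)
          .append("{cluster=\"").append(cluster).append("\"} ")
          .append(status)
          .append('\n');
        return sb.toString();
    }
}
```

A gauge fits here because the value can move in either direction between scrapes (1, 0, or -1) rather than only increasing.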
Verifying this change
Unit test cases were added to verify this change.
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc, doc-required, doc-not-needed, doc-complete