Skip to content

Conversation

@fmotrifork
Copy link
Contributor

When having many probes on a single board instance, the current implementation results in probes waiting for each other and not for the actual services.
This results in many probes that are marked as "slow" even though the problem stems from to many open probes at once.

This PR batches the probes in 8 at a time.
It also splits the http probe time into two:

  • Time from request is created and until first byte is received from the server
  • Time from request is created and until the server finishes sending bytes.

This allows us to better understand if a slow reading is from infrastructure / loadbalancers or from a slow http service.

Copy link
Owner

@Lesterpig Lesterpig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this pull request!

It actually changes a lot of things in the project (Dockerfile, Makefile, logging, prometheus...).
I would only consider changing the manager.go and probe/http.go files in this PR.
If needed, please open new pull requests for the proposed changes.

I also have put a few comments in the modified files.
If you have some time, could you please update your pull request? Otherwise I can do it for you 😃

for category, services := range manager.Services {
for _, service := range services {
// Batching the probes to spread the workload
if math.Mod(i, 8) == 0 {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is there a reason to use a float for i?
We could use integers and check with if i % 8 == 0.

Moreover, the i = 0 line might not be necessary (or the condition can just be i >= 8).

TLSClientConfig: &tls.Config{InsecureSkipVerify: !opts.VerifyCertificate},
}
tr := http.DefaultTransport.(*http.Transport).Clone()
tr.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the opts.VerifyCertificate option has been forgotten?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants