vr: fix backup router health check #4171

Merged
yadvr merged 1 commit into apache:4.14 from shapeblue:fix-backup-vr-health-check-4163
Jun 25, 2020

Conversation

Contributor

@shwstppr commented Jun 24, 2020

Description

Fixes #4163

Adds code to exclude certain health check tests when the router is a BACKUP router.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

BACKUP Router,
Screenshot from 2020-06-24 17-47-02

Before fix,
Screenshot from 2020-06-24 17-32-04

After fix and excluding gateways_check.py for BACKUP router,
Screenshot from 2020-06-24 17-45-21

How Has This Been Tested?

Executing health checks for BACKUP router

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr shwstppr marked this pull request as ready for review June 24, 2020 12:21
@shwstppr
Contributor Author

@Doni7722 @rhtyd please have a look

@yadvr yadvr added this to the 4.14.1.0 milestone Jun 24, 2020
@yadvr
Member

yadvr commented Jun 24, 2020

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos7 ✔debian. JID-1448

@Doni7722

@shwstppr LGTM - Does it check the state of the Router on every health check (as the state could change anytime)?

@shwstppr
Contributor Author

@Doni7722 createMonitorServiceCommand, where the exclusion has been added, is the method that creates the command sent to the host for running health checks. It is called every time health checks run for a router. The state is read from DomainRouterVO, which should hold the correct state even if it has changed.

private SetMonitorServiceCommand createMonitorServiceCommand(DomainRouterVO router, List<MonitorServiceTO> services,

private boolean updateRouterHealthChecksConfig(DomainRouterVO router) {
    if (!RouterHealthChecksEnabled.value()) {
        return false;
    }
    SetMonitorServiceCommand command = createMonitorServiceCommand(router, null, true, true);
    String controlIP = getRouterControlIP(router);
    if (StringUtils.isBlank(controlIP) || controlIP.equals("0.0.0.0")) {
        s_logger.debug("Skipping update data on router " + router.getUuid() + " because controlIp is not correct.");
        return false;
    }
    s_logger.info("Updating data for router health checks for router " + router.getUuid());
    Answer origAnswer = null;
    try {
        origAnswer = _agentMgr.easySend(router.getHostId(), command);
    } catch (final Exception e) {
        s_logger.error("Error while sending update data for health check to router: " + router.getInstanceName(), e);
        return false;
    }
    if (origAnswer == null) {
        s_logger.error("Unable to update health checks data to router " + router.getHostName());
        return false;
    }
    GroupAnswer answer = null;
    if (origAnswer instanceof GroupAnswer) {
        answer = (GroupAnswer) origAnswer;
    } else {
        s_logger.error("Unable to update health checks data to router " + router.getHostName() + " Received answer " + origAnswer.getDetails());
        return false;
    }
    if (!answer.getResult()) {
        s_logger.error("Unable to update health checks data to router " + router.getHostName() + ", details : " + answer.getDetails());
    }
    return answer.getResult();
}
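
For readers following along, the idea of recomputing the exclusion on every run can be sketched roughly as below. This is not the code from this PR: the class, the enum, the constant, and the `userExcluded` parameter are all hypothetical stand-ins, and the only detail taken from this thread is that gateways_check.py is excluded for a BACKUP router. The point of the sketch is that the list is derived from whatever state is passed in at call time, so a state change is picked up on the next run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: none of these names come from the CloudStack code base.
class BackupRouterExclusionSketch {

    // Hypothetical stand-in for the redundant state the real code reads from DomainRouterVO.
    enum RedundantState { PRIMARY, BACKUP, UNKNOWN }

    // Assumption: checks skipped on a BACKUP router. Only gateways_check.py is named in this PR.
    static final List<String> BACKUP_ROUTER_EXCLUDED_CHECKS =
            Collections.singletonList("gateways_check.py");

    // Builds the excluded-check list from the state supplied at call time.
    // userExcluded is a hypothetical comma-separated list of extra checks to skip; it may be null.
    static List<String> excludedChecksFor(RedundantState state, String userExcluded) {
        List<String> excluded = new ArrayList<>();
        if (userExcluded != null) {
            for (String name : userExcluded.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty() && !excluded.contains(trimmed)) {
                    excluded.add(trimmed);
                }
            }
        }
        // The state is evaluated on each call, so nothing is cached between runs.
        if (state == RedundantState.BACKUP) {
            for (String check : BACKUP_ROUTER_EXCLUDED_CHECKS) {
                if (!excluded.contains(check)) {
                    excluded.add(check);
                }
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        // Simulate the same router being checked before and after a state change.
        System.out.println(excludedChecksFor(RedundantState.PRIMARY, ""));
        System.out.println(excludedChecksFor(RedundantState.BACKUP, ""));
    }
}
```

Because the list is rebuilt from the state argument each time, a router that switches from BACKUP to another state would stop having gateways_check.py excluded on its next health check run, assuming the state held by the caller is current.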

@blueorangutan test

@blueorangutan

@shwstppr a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-1840)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35412 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4171-t1840-kvm-centos7.zip
Smoke tests completed. 83 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

Member

@yadvr left a comment

Minor issue; the rest LGTM.

@yadvr yadvr merged commit b534d2b into apache:4.14 Jun 25, 2020