Skip to content

Conversation

@kevinrr888
Copy link
Member

This PR addresses some follow on changes from #4524 (comment):

  • Add a toString() to FateKey
  • Move MetaFateStore to org.apache.accumulo.core.fate.zookeeper
  • Periodic clean up of dead reservations should be increased from every 30 seconds to every few minutes
  • Add additional fate test case suggested in Fate reservations moved out of memory #4524 (comment)
    • From the new test case, discovered that verifyReserved() should also be checking if the tx has been deleted from the persisted store. This immediately prevents write ops from being attempted on deleted transactions. This caused UserFateStoreIT.delete() test to fail since delete() was called several times on the same tx store. Fixed this and cleaned up some code in the rest of this class (TestUserFateStore wasn't doing anything).

A couple questions:

  • I noticed that required status checks are only present in UserFateStore not MetaFateStore. Should this be added to MetaFateStore?
  • With the increased time of the dead reservation cleanup, testDeadReservationsCleanup() now takes a lot longer (6 mins for me locally). Will this be an issue when the test runs on Jenkins?

- Add a toString() to FateKey
- Move MetaFateStore to org.apache.accumulo.core.fate.zookeeper
- Periodic clean up of dead reservations increased from every 30 seconds to every few minutes
- New fate test case added to FateIT that ensures no write ops can be performed on a transaction after it has been deleted
	- Added new check to verifyReserved() that checks whether the transaction is deleted
	- Fixed UserFateStoreIT to work with new change and misc cleanup to the class
@kevinrr888 kevinrr888 self-assigned this Sep 20, 2024
@kevinrr888 kevinrr888 added this to the 4.0.0 milestone Sep 20, 2024
@keith-turner
Copy link
Contributor

With the increased time of the dead reservation cleanup, testDeadReservationsCleanup() now takes a lot longer (6 mins for me locally). Will this be an issue when the test runs on Jenkins?

To improve the test time could use a strategy similar to FlakyFateManager and a test like ComprehensiveFlakyFateIT. So could create a FastFateCleanupManager that creates fate with a much lower cycle time and then have the test use that manager class.

.allMatch(entry -> FateStore.FateReservation.locksAreEqual(entry.getLockID(), lock2));
return store2Reservations.keySet().equals(allIds) && allReservedWithLock2;
}, 60_000);
}, 60_000 * 4);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is needed because the manager cleans up dead reservations less frequently?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes
9599bd8 changed this to 30 seconds with use of FastFate

- created new class FastFate which performs the dead reservation cleanup
  more often
@kevinrr888
Copy link
Member Author

To improve the test time could use a strategy similar to FlakyFateManager and a test like ComprehensiveFlakyFateIT. So could create a FastFateCleanupManager that creates fate with a much lower cycle time and then have the test use that manager class.

Yeah good idea. I created FastFate in 9599bd8

Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinrr888 how long does the test take to run now?

@kevinrr888
Copy link
Member Author

kevinrr888 commented Sep 23, 2024

@keith-turner just 30-40 seconds

@kevinrr888 kevinrr888 merged commit f9d8afe into apache:main Oct 3, 2024
@kevinrr888 kevinrr888 deleted the pr-4524-follow-on branch November 1, 2024 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants