Skip to content

Conversation

@kevinrr888
Copy link
Member

@kevinrr888 kevinrr888 commented Apr 15, 2025

Code did not consider ROOT table, and was always returning 0 for the disk usage as it would scan the ROOT table if the input was the METADATA table and would scan the METADATA table otherwise. Should have been scanning ZK if the input was ROOT. Changed to use Ample instead, which handles this issue nicely.

Discovered when working on #5474

Code did not consider ROOT table, and was always returning 0 for the
disk usage as it would scan the ROOT table if the input was the METADATA
table and would scan the METADATA table otherwise. Should have been
scanning ZK if the input was ROOT. Changed to use Ample instead, which
handles this issue nicely.
@kevinrr888 kevinrr888 added this to the 2.1.4 milestone Apr 15, 2025
@kevinrr888 kevinrr888 self-assigned this Apr 15, 2025
@kevinrr888
Copy link
Member Author

Working on fixing the unit test which was mocking based on old impl

@dlmarion
Copy link
Contributor

I'm guessing it reported zero for the disk usage because the ROOT table is stored in ZooKeeper, not HDFS. Not suggesting your fix is not applicable, just providing possible background.

Copy link
Contributor

@dlmarion dlmarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kevinrr888
Copy link
Member Author

kevinrr888 commented Apr 15, 2025

I'm guessing it reported zero for the disk usage because the ROOT table is stored in ZooKeeper, not HDFS. Not suggesting your fix is not applicable, just providing possible background.

Isn't it just the metadata for the files that's stored in ZK, not the files themselves? I think the files are written to HDFS even for ROOT, so "disk usage" still makes sense in this case

@kevinrr888
Copy link
Member Author

The failing unit test had to essentially be completely rewritten. Up to date now

final ServerContext client = EasyMock.createMock(ServerContext.class);
final Scanner scanner = EasyMock.createMock(Scanner.class);
mockScan(client, scanner, 1);
final ClientContext client = EasyMock.createMock(ClientContext.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more levels of mocking there is in test code, the harder it is to refactor code the cod under test. Sometimes its better to change the code to make it easy to test and to minimize the need for mocking. For example maybe could have changed :

 public static Map<SortedSet<String>,Long> getDiskUsage(Set<TableId> tableIds,
      AccumuloClient client) 

to :

 public static Map<SortedSet<String>,Long> getDiskUsage(Set<TableId> tableIds,
      Function<TableId, Iterable<Map<StoredTabletFile,DataFileValue>>> tableFilesProvider) 

Maybe this would make the code easier to test. Not a change for this PR as the changes w/ easy mock were already made. just something to consider for future changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree the mocking that's done for this test now is definitely not ideal. I did not consider doing something like that, but you're right it would have made things a lot simpler for the test here. I'll be keeping this in mind for future changes, thanks!

@kevinrr888 kevinrr888 merged commit aa75da3 into apache:2.1 Apr 18, 2025
8 checks passed
@kevinrr888 kevinrr888 deleted the GetDiskUsageBugFix branch April 18, 2025 14:11
ddanielr pushed a commit to keith-turner/accumulo that referenced this pull request Apr 24, 2025
* Fixes bug with getting disk usage of the ROOT table

Code did not consider ROOT table, and was always returning 0 for the
disk usage as it would scan the ROOT table if the input was the METADATA
table and would scan the METADATA table otherwise. Should have been
scanning ZK if the input was ROOT. Changed to use Ample instead, which
handles this issue nicely.

* rewrote TableDiskUsageTest
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants