1st draft of schema
… script and will be referenced from the Hydrotables table now
…f HAND versions back in
Load script
Metrics table is now linked to hydrotable through benchmark ROI and catchment geometries instead of HUC8s. This simplifies things.
…e and discharge are now arrays, which should save even more space since it's a big reduction in rows
…l file with table schemas
…out not finding files
…g because uuids weren't unique per row
…3 or local since only the portion of the path after the hand version is used to generate the uuid
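To illustrate the idea in the commit message above, here is a minimal sketch of deriving a deterministic UUID from only the part of a path that follows the HAND version directory. It is not the repository's code: the function name `path_uuid`, the `HAND_NAMESPACE` value, and the example paths are all hypothetical, and it assumes the HAND version appears as a single directory name in the path.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical namespace; any fixed UUID works as long as it never changes.
HAND_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "hand-index-example")


def path_uuid(path: str, hand_version: str) -> uuid.UUID:
    """Derive a UUID from the path components after the HAND version directory."""
    # Drop a URI scheme such as "s3://" so local and remote paths split the same way.
    without_scheme = path.split("://", 1)[-1]
    parts = PurePosixPath(without_scheme).parts
    if hand_version not in parts:
        raise ValueError(f"{hand_version!r} not found in {path!r}")
    idx = parts.index(hand_version)
    relative = "/".join(parts[idx + 1:])
    # uuid5 is a name-based UUID: the same relative path always gives the same value.
    return uuid.uuid5(HAND_NAMESPACE, relative)


# Hypothetical example paths; only the prefix before the version differs.
local = path_uuid("/data/hand/fim100/12090301/branches/0/hydroTable_0.csv", "fim100")
remote = path_uuid("s3://bucket/outputs/fim100/12090301/branches/0/hydroTable_0.csv", "fim100")
other = path_uuid("/data/hand/fim100/12090301/branches/1/hydroTable_1.csv", "fim100")
```

Because the prefix before the version directory is discarded, `local` and `remote` are equal, while a different file under the same version gets a different UUID.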
…f strict data contract for hydrotable schema. We will need to meet with Fernando, Brad, and Rob about this
…anch, use pandas for more table handling
…n script and reduce repetition
Refactor batch insert and data communication with the .ddb file. Simplified hydrotable handling. Code readability improvements. Improved memory management.
The h3 extension function that looks up the cells covering a polygon is buggy, so get rid of the lookup table. Spatially partitioning catchments and hydrotables by h3 id still seems to achieve speedups.
Revised load.py so that it passes through the schema defined in the schema directory for a given version of HAND. It aggregates rows into Hydrotable by hydroid depending on whether the column in the DuckDB Hydrotable schema has an array type. Updated README. Created a script, analyze_hydrotable_columns.py, that should indicate whether a column should become an array type.
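As a rough illustration of why partitioning by a spatial cell id can speed up queries, the sketch below groups records by cell and then reads only the cells that overlap a query box. It deliberately swaps in a square latitude/longitude grid as a stand-in for H3 (H3 cells are hexagons and need the h3 library), and every name here (`CELL_SIZE_DEG`, `cell_id`, `build_partitions`, `candidates`, the sample records) is hypothetical.

```python
import math
from collections import defaultdict

# Assumed cell size in degrees; a square grid stands in for H3 hexagons here.
CELL_SIZE_DEG = 1.0


def cell_id(lon, lat):
    """Return the grid cell (column, row) containing a point."""
    return (math.floor(lon / CELL_SIZE_DEG), math.floor(lat / CELL_SIZE_DEG))


def build_partitions(catchments):
    """Group catchment records by the cell that contains their centroid."""
    partitions = defaultdict(list)
    for c in catchments:
        partitions[cell_id(c["lon"], c["lat"])].append(c)
    return partitions


def candidates(partitions, min_lon, min_lat, max_lon, max_lat):
    """Collect records from cells overlapping the query box, skipping all other cells."""
    x0, y0 = cell_id(min_lon, min_lat)
    x1, y1 = cell_id(max_lon, max_lat)
    found = []
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            found.extend(partitions.get((x, y), []))
    return found


# Hypothetical centroids for three catchments.
catchments = [
    {"id": 1, "lon": -97.5, "lat": 30.2},
    {"id": 2, "lon": -97.1, "lat": 30.9},
    {"id": 3, "lon": -80.3, "lat": 35.0},
]
partitions = build_partitions(catchments)
hits = candidates(partitions, -98.0, 30.0, -97.0, 31.0)
```

This only narrows the search to candidate records. A real query would still need an exact geometry test, and a catchment whose centroid falls in a neighbouring cell could be missed by centroid-only partitioning.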
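The schema-driven aggregation described above could look roughly like the following sketch. It is an assumption-laden illustration in plain Python, not the logic in load.py: the `SCHEMA` mapping, the column names (`branch_id`, `stage`, `discharge`), and the rule "a type string ending in `[]` means array" are all hypothetical.

```python
from collections import defaultdict

# Hypothetical schema: column name -> DuckDB-style type string.
SCHEMA = {
    "hydroid": "BIGINT",
    "branch_id": "VARCHAR",
    "stage": "DOUBLE[]",
    "discharge": "DOUBLE[]",
}


def aggregate_by_hydroid(rows, schema):
    """Collapse rows sharing a hydroid into one record per hydroid."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["hydroid"]].append(row)

    records = []
    for group in grouped.values():
        record = {}
        for column, column_type in schema.items():
            if column_type.endswith("[]"):
                # Array-typed column: keep every row's value, in input order.
                record[column] = [r[column] for r in group]
            else:
                # Scalar column: assumed constant within a hydroid, so take the first.
                record[column] = group[0][column]
        records.append(record)
    return records


rows = [
    {"hydroid": 1, "branch_id": "0", "stage": 0.0, "discharge": 0.0},
    {"hydroid": 1, "branch_id": "0", "stage": 0.5, "discharge": 12.3},
    {"hydroid": 2, "branch_id": "0", "stage": 0.0, "discharge": 0.0},
]
result = aggregate_by_hydroid(rows, SCHEMA)
```

Here three input rows become two records, which is the row reduction that array columns are meant to provide. Taking the first value for scalar columns is only safe if those columns really are constant per hydroid.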
* Update code to not leak memory
  load.py has been updated to use duckdb for catchment gpkg processing and for hydrotable aggregation. The query and schema have also been updated where necessary to work with minor differences in the parquet files produced this way compared to the previous parquet files.
* Remove example-hydrotable.csv
* Optimize centroid calc
* Merge geometries
* Remove insertion order setting since closing connection now
* Reduce default batch size and document sizing guidelines
* Fix tolerance to be 100m
* Update gitignore
* Update load.py comments
* Add staging for hydrotables. Remove json columns.
* Add staging for catchments and modify raster table schema
* Replace WKT writing with WKB writing for catchment geometries
* Add more explanatory comments
* Remove broken batch insertion print statements and estimates
* Fix broken row insertion counts printout after table creation
* Add branch lookup table to improve performance
  Create a branch lookup table that is used during hydrotable creation. This table is indexed by branch, so it should be faster than using the original catchment table.
* Aggregating before joining in hydrotables
* Flush staging hydrotable to avoid running out of memory
* Fix parquet index geometry handling to deal with wkb format
  Fixed the geometry loading in the test query script and in the visualization script.
* Fix new partition_tables_to_parquet in load.py
  partition_tables_to_parquet now partitions files one H3 index at a time to avoid running out of memory. It was previously not actually writing a file, so the query wasn't running successfully.
* Squash merge feature/simplify-hydrotable into fix/mem-leak.
  This gets rid of complex hydrotable handling in favor of just passing refs to the existing hydrotable CSVs. This change reduced code complexity a lot in this repo while requiring only minor changes to the autoeval coordinator.
* Update README, remove unnecessary arguments from load.py
* Proofread README
* Update README.md
  Minor syntax updates
* Update README.md
* Update README to clarify .env creation
* Remove .env from tracking

---------

Co-authored-by: Brad <bradford.bates@ertcorp.com>
Co-authored-by: Parallel Works app-run user <dylan.lee@mgmt-dylanlee-oefimbenchmarkstac-00067.optimizationuseast1-5.pw.local>
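The "one H3 index at a time" partitioning mentioned above can be sketched as follows. This is a simplified stand-in, not partition_tables_to_parquet itself: it uses the stdlib `csv` module instead of DuckDB and parquet, and the function name, the `h3_id` column, and the placeholder ids `A` and `B` are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def partition_one_key_at_a_time(source_csv, out_dir, key_column):
    """Write one output file per distinct key, holding only one row in memory at a time."""
    source_csv, out_dir = Path(source_csv), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # First pass: collect only the distinct key values, not the rows themselves.
    with source_csv.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        keys = sorted({row[key_column] for row in reader})

    written = {}
    for key in keys:
        # One streaming pass per key: trade extra reads for a bounded memory footprint.
        target = out_dir / f"{key_column}={key}.csv"
        count = 0
        with source_csv.open(newline="") as src, target.open("w", newline="") as dst:
            writer = csv.DictWriter(dst, fieldnames=fieldnames)
            writer.writeheader()
            for row in csv.DictReader(src):
                if row[key_column] == key:
                    writer.writerow(row)
                    count += 1
        written[key] = count
    return written


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "hydrotable.csv"
    src.write_text("h3_id,hydroid\nA,1\nB,2\nA,3\n")
    counts = partition_one_key_at_a_time(src, Path(tmp) / "parts", "h3_id")
    files = sorted(p.name for p in (Path(tmp) / "parts").iterdir())
```

Returning the per-key row counts gives a cheap way to confirm that each partition file was actually written, which is the failure the commit message describes.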
Merge staging main with hand index commit history into OWP main branch with template files
|
SWCM witness approval; release concurrence. |
This PR adds the code from NGWPC's hand-index repository to OWP. A HAND index indexes a directory of HAND outputs so that they can be queried spatially and efficiently. The repository contains a script to create HAND indexes, an example query, and a schema describing the files necessary to inundate HAND REMs for the FIM100 version of HAND.