Skip to content

Fix to_yaml module and enhance converter with auto-detection and unified path handling (#566)#581

Merged
sunt05 merged 67 commits intomasterfrom
fix/566-to-yaml-module
Aug 11, 2025
Merged

Fix to_yaml module and enhance converter with auto-detection and unified path handling (#566)#581
sunt05 merged 67 commits intomasterfrom
fix/566-to-yaml-module

Conversation

@sunt05
Copy link

@sunt05 sunt05 commented Aug 6, 2025

Fix missing to_yaml module and comprehensive converter improvements (#566)

Summary

What started as a simple fix for ModuleNotFoundError when running python -m supy.cmd.to_yaml evolved into a comprehensive improvement of the SUEWS table converter, including refactoring, extensive testing, and documentation updates.

Key Improvements

1. Fixed Missing Module Error

  • Added proper module exposure in src/supy/cmd/__init__.py
  • Implemented lazy imports to avoid circular dependencies
  • Added suews-to-yaml console script entry point

2. Table Converter Refactoring

  • Unified path handling: All SUEWS versions (2016a-2025a) now consistently use RunControl.nml paths
  • Removed special cases: Eliminated version-specific path logic for cleaner, maintainable code
  • Improved robustness: Better file resolution with automatic fallback mechanisms

3. Comprehensive Test Suite

  • Added extensive tests covering 2016a (oldest format) to latest YAML conversion
  • Tests for auto-detection across all SUEWS versions (2016a-2025a)
  • File cleaning/sanitisation tests for messy legacy files
  • Validation of complete conversion chain from oldest to newest formats
  • All tests pass successfully, confirming reliable migration path for long-time users

4. Documentation Reorganisation

  • Moved converter docs from tables/ to inputs/ directory (better reflects dual table/YAML purpose)
  • Updated all examples to show correct directory structures (input dir contains RunControl.nml)
  • Clarified features: Specific details on file cleaning (removes inline comments, fixes whitespace, standardises delimiters)
  • British English: Consistent spelling throughout documentation
  • Verified accuracy: All documented features tested and confirmed working

Changes Made

Code Changes

  1. src/supy/cmd/__init__.py - Added missing import
  2. src/supy/cmd/to_yaml.py - Implemented lazy loading
  3. src/supy/util/converter/table/ - Refactored for consistent path handling
  4. test/core/test_cmd_to_yaml.py - New comprehensive test suite
  5. pyproject.toml - Added console script entry point
  6. src/supy/meson.build - Updated build configuration

Documentation Changes

  1. docs/source/inputs/converter.rst - Comprehensive converter documentation (moved from tables/)
  2. docs/source/inputs/transition_guide.rst - Updated with accurate examples
  3. docs/source/inputs/tables/input_converter.rst - Removed (content moved)
  4. Various RST files - Updated references to new location

Testing Coverage

  • ✅ 2016a → YAML (oldest format, complete chain)
  • ✅ 2024a → YAML (latest table format)
  • ✅ Auto-detection for all versions
  • ✅ File cleaning and sanitisation
  • ✅ Path resolution with RunControl.nml
  • ✅ Profile validation
  • ✅ All benchmark versions from SUEWS-Benchmark repository

Migration Path for Users

This PR ensures a smooth migration path for all SUEWS users:

  • Users with 2016a files (oldest) can convert directly to YAML
  • Auto-detection means users don't need to know their version
  • File cleaning handles messy legacy files automatically
  • Consistent path handling works with various directory structures

Test Plan

  • Verify suews-to-yaml --help works after installation
  • Test conversion of 2016a (oldest) files to YAML format
  • Test auto-detection across all SUEWS versions
  • Test file cleaning with messy legacy files
  • Verify all documentation examples work correctly
  • Confirm path resolution with RunControl.nml

Breaking Changes

None - all changes are backwards compatible.

Notes for Reviewers

  • The refactoring makes the converter more maintainable by removing special cases
  • Test suite provides confidence for future modifications
  • Documentation has been thoroughly checked against actual behaviour
  • This comprehensive improvement ensures smooth transition to YAML format for all users

Fixes #566

sunt05 and others added 2 commits August 6, 2025 08:05
- Added missing import of to_yaml function in supy.cmd.__init__.py
- Added suews-to-yaml console script entry point in pyproject.toml
- Implemented lazy imports in to_yaml.py to avoid circular import issues
- Note: python -m supy.cmd.to_yaml requires full supy installation

This fix ensures the documented command works correctly after installation,
and provides an alternative console script entry point.
Co-authored-by: sunt05 <1802656+sunt05@users.noreply.github.com>
@github-actions
Copy link

github-actions bot commented Aug 6, 2025

🤖 I've automatically formatted the code in this PR using:

  • Python: ruff v0.8.6
  • Fortran: fprettify v0.3.7

Please pull the latest changes before making further edits.

@sunt05
Copy link
Author

sunt05 commented Aug 6, 2025

already fixed in #582

@sunt05 sunt05 closed this Aug 6, 2025
@sunt05 sunt05 reopened this Aug 6, 2025
@sunt05
Copy link
Author

sunt05 commented Aug 6, 2025

@MatthewPaskin - this is what I mentioned in #583

@sunt05
Copy link
Author

sunt05 commented Aug 6, 2025

already fixed in #582

not working - needs further work

@MatthewPaskin
Copy link

Thanks @sunt05, this seems to be working for me! I can access the command within python now.

@sunt05
Copy link
Author

sunt05 commented Aug 6, 2025

Great @MatthewPaskin – I’ll finalise this shortly by adding some tests, then ask you for a review.

@MatthewPaskin
Copy link

MatthewPaskin commented Aug 6, 2025

There are some further issues once using the to_yaml function depending on the version provided. Should we tackle these in this PR or with a new issue and new PR?
The main one being a bad delimiter for pandas to_csv() which I have come across before and thought I patched. This was because the same delimiter was used for quotes and separation

@sunt05
Copy link
Author

sunt05 commented Aug 6, 2025

Let's do that in this PR – I'll tag you for testing once it's in shape @MatthewPaskin

- Remove redundant 'yaml' option since 2025+ only supports YAML
- Automatically determine conversion type based on target version:
  - Pre-2025: table-to-table conversion
  - 2025+: convert to YAML format
- Fix all conversion issues preventing supy-to-yaml from working:
  - Handle empty DataFrame when STEBBS disabled
  - Add graceful handling of missing columns in legacy format
  - Fix water surface soilstore validation (allow 0 for water)
  - Provide defaults for missing config/description columns
- Update all documentation to reflect simplified interface
- Update tests to match new command structure

Fixes #581
github-actions bot and others added 9 commits August 6, 2025 11:42
Co-authored-by: sunt05 <1802656+sunt05@users.noreply.github.com>
- Remove duplicate suews_convert_guide.rst
- Update existing input_converter.rst with YAML conversion info
- Consolidate all converter documentation in one place
- Replace deprecated delim_whitespace=True with sep=r'\s+'
- Replace invalid quotechar=' ' with quoting=3 (QUOTE_NONE)
- Fixes CI failures on Python 3.13 Linux builds
Co-authored-by: sunt05 <1802656+sunt05@users.noreply.github.com>
…interface

Simplify suews-convert interface and fix remaining YAML conversion issues (builds on #581)
- Removed commented code with conflicting delimiter settings (quotechar and sep both as space)
- Added comprehensive delimiter handling tests to prevent CSV parsing issues
- Ensures table conversion maintains consistent space-separated format
- Tests verify no quote/delimiter conflicts in output files
The issue was only in commented-out code with bad delimiter settings.
No need for extensive tests since the active code already works correctly.
- Keep test/fixtures/legacy_format for legacy input files (not sample_data/Input)
- Use sample_data for only essential runtime files (config.yml and forcing data)
- Keep lazy imports in to_yaml.py to avoid circular dependencies
- Integrate CRU temperature data from master
- Update meson.build to include ext_data files
@sunt05 sunt05 self-assigned this Aug 6, 2025
sunt05 and others added 3 commits August 7, 2025 01:33
- Introduced new input files for the 2024 and 2025 legacy formats, including RunControl, InitialConditions, and various SUEWS-related data files.
- These additions support ongoing development and testing of the legacy input handling in the system.
- Ensured consistency in file structure and naming conventions across the new input files.
- Fixed debug directory (-d) option in suews-convert command
  - Debug directory is now properly created and preserved
  - Intermediate conversion steps are saved for debugging
  - Added snapshot functionality to track each conversion step

- Fixed legacy footer lines issue
  - Removed code that adds -9 footer lines during conversion
  - Footer lines with -999 values were being incorrectly parsed as data

- Partially fixed non-existent profile references
  - Updated some conversion rules to use profile 999 (placeholder) instead of non-existent codes
  - Changed profile codes 801-805, 44, 45, etc. to 999 in rules.csv

- Added .gitignore entries for debug files

Remaining issue: More comprehensive fix needed for all profile references in conversion rules

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
- Created ProfileManager class to handle missing profile references
- Integrated automatic profile creation during conversion
- Added --no-profile-validation flag for control
- Improved placeholder file creation for BiogenCO2 and ESTM files
- Added comprehensive tests for profile management
- Enhanced debug mode to preserve intermediate conversion steps
- Fixed missing profile references (codes 31, 999, etc.)

This addresses issues where conversion rules reference profile codes
that don't exist, preventing successful conversion from legacy formats.
sunt05 and others added 8 commits August 8, 2025 08:44
- Add explicit encoding='utf-8' to all file open operations
- Replace bare except clauses with specific Exception handling
- Fix variable shadowing in loops (renamed line to raw_line)
- Remove duplicate f90nml import (already imported at module level)
- Fix ambiguous variable names (l,r,c renamed to src,dst,weight)
- Add 'from e' to exception re-raising for proper error context
- Use augmented assignment operator (|=) where appropriate
- Fix unused function parameters with underscore prefix
- Auto-fix import sorting and formatting issues
- Remove unnecessary list comprehensions and redundant code

All critical linting issues resolved. Remaining warnings are complexity
metrics (too many branches/statements/variables) that would require
significant refactoring.
- Extract helper functions from detect_table_version():
  - _check_required_files()
  - _check_specific_files()
  - _check_columns_in_file()
  - _check_columns()
  - _check_negative_columns()
  - _check_nml_parameters()

- Refactor clean_legacy_table() with helpers:
  - _should_skip_line()
  - _process_line()
  - _ensure_consistent_columns()

- Refactor SUEWS_Converter_single() with helpers:
  - _copy_and_clean_files()
  - _handle_same_version_copy()
  - _build_file_list()

This reduces ruff complexity warnings from 17 to 9, making the code more maintainable while preserving all functionality. All tests pass.
- Move table.py, profile_manager.py, and rules.csv into table/ subdirectory
- Create table/__init__.py to maintain backward compatibility
- Update import paths to reflect new directory structure
- Update meson.build to include new file locations

This reorganization groups related table conversion functionality together
while maintaining all existing functionality. All 19 tests pass.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove temporary markdown documentation and test YAML files that were
created during development but are no longer needed.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add missing resample_output import to _supy_module.py
- Fix exception chaining with 'from e' in yaml.py
- Remove unused variable 'writeoutoption'
- Refactor convert_to_yaml() to reduce complexity:
  - Extract _prepare_processing_dir() helper function
  - Extract _configure_forcing_and_output() helper function
  - Reduce local variables from 18 to under 15
  - Reduce statements from 52 to under 50
- Fix docstring formatting and remove trailing whitespace
- Apply ruff formatting for consistent code style

All tests pass after refactoring.
- Remove special case handling for 2016a version
- All versions now read file paths from RunControl.nml
- Support both absolute and relative paths in RunControl.nml
- Update detection functions to use RunControl.nml paths
- Maintain backward compatibility with fallback to root/Input dirs

This ensures consistent path handling across all SUEWS versions and
respects user-specified paths in RunControl.nml instead of hardcoding
directory structures.
- Move converter docs from tables/ to inputs/ directory (more appropriate location)
- Update documentation to accurately reflect converter capabilities:
  - Auto-detection of source version
  - Support for 'latest' target version
  - Debug directory option
  - Profile validation control
- Add specific examples for all conversion scenarios
- Clarify what 'same-version conversion' does (reformatting)
- Fix path handling documentation to match actual behaviour
- Update all cross-references to new converter location
- Use British English spelling throughout
Co-authored-by: sunt05 <1802656+sunt05@users.noreply.github.com>
@github-actions
Copy link

github-actions bot commented Aug 8, 2025

🤖 I've automatically formatted the code in this PR using:

  • Python: ruff v0.8.6
  • Fortran: fprettify v0.3.7

Please pull the latest changes before making further edits.

@sunt05 sunt05 changed the title Fix missing to_yaml module (#566) Fix to_yaml module and refactor converter for consistent path handling (#566) Aug 8, 2025
@sunt05 sunt05 changed the title Fix to_yaml module and refactor converter for consistent path handling (#566) Fix to_yaml module and enhance converter with auto-detection and unified path handling (#566) Aug 8, 2025
sunt05 and others added 3 commits August 8, 2025 13:36
- Merged test_2016a_to_latest_yaml and test_2024a_to_latest_yaml into single comprehensive test
- Renamed test_benchmark_versions_to_yaml to test_all_versions_to_yaml_with_auto_detection
- Simplified test_messy_file_cleaning to test_same_version_cleaning
- Updated test to include all versions (2024a and 2025a added to parameterized test)
- Improved docstrings to clearly describe what each test covers
- Removed unused legacy_2024a_dir fixture
- Fixed Site object validation to be more flexible with data model changes
Co-authored-by: sunt05 <1802656+sunt05@users.noreply.github.com>
@github-actions
Copy link

github-actions bot commented Aug 8, 2025

🤖 I've automatically formatted the code in this PR using:

  • Python: ruff v0.8.6
  • Fortran: fprettify v0.3.7

Please pull the latest changes before making further edits.

sunt05 added 2 commits August 8, 2025 13:56
- Updated the test command in the Makefile to include the `--durations=10` option for both `uv` and Python pytest executions.
- This change allows for better performance insights by reporting the duration of the slowest 10 tests, aiding in identifying potential bottlenecks.
@sunt05 sunt05 marked this pull request as ready for review August 8, 2025 13:05
@sunt05 sunt05 requested a review from MatthewPaskin August 8, 2025 13:05
@sunt05
Copy link
Author

sunt05 commented Aug 8, 2025

Hi @MatthewPaskin, please try this version. I've also added a feature to auto-detect file versions for conversion and tested it with several previous files. The documentation has been updated to reflect these enhancements—please double-check in case I missed anything.

sunt05 and others added 3 commits August 8, 2025 14:11
- Keep only Kc_2012_data_60.txt in 2025a directory for tests
- Remove all other Kc forcing data files (36 files)
- Remove associated Kc output format files
- Reduces test fixture size from ~200MB to ~2MB
- All tests pass successfully with minimal fixtures
sunt05 added a commit that referenced this pull request Aug 10, 2025
…kflow

- Removed 5 stale UMEP HTML files from docs/source/images/
- Fixed outdated YAML API comment in workflow.rst (it's fully functional)
- Added reference to issue #581 for pending table-to-YAML conversion
- Removed deprecated warning from README.md developer section
- Improved error handling in generate_datamodel_rst.py with helpful messages
- Added setup-doc-env.sh for minimal documentation environment setup
- Added check_rst_staleness.py for incremental RST generation
- Updated Makefile with generate-rst-if-needed target for efficient builds

These changes improve documentation maintainability and developer experience
while removing outdated content that could confuse users.
sunt05 added a commit that referenced this pull request Aug 10, 2025
- Removed 5 stale UMEP HTML files from docs/source/images/
- Fixed outdated YAML API comment in workflow.rst (fully functional, not in development)
- Added reference to issue #581 for pending table-to-YAML conversion
- Removed deprecated warning from README.md developer section
MatthewPaskin
MatthewPaskin previously approved these changes Aug 11, 2025
Copy link

@MatthewPaskin MatthewPaskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, and I have tested for my case that was failing before. thanks for this!

Resolved conflicts:
- docs/source/tutorials/python/tutorial.rst: Use master's suews-convert to-yaml format
- docs/source/workflow.rst: Use master's suews-convert to-yaml format
@sunt05 sunt05 merged commit 32807fa into master Aug 11, 2025
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

No module named supy.cmd.to_yaml

2 participants