@Sudhendra
Owner

Summary

  • add Hugging Face (HF) downloaders for code and conversation datasets to expand raw natural-language (NL) coverage
  • add adapter-driven CLI scripts for synthetic data generation and validation
  • add tests for the corpus merge and dataset extraction helpers

Test Plan

  • pytest
  • pytest tests/test_generate_synthetic.py tests/test_validate_synthetic.py tests/test_adapter_generator.py -v

@Sudhendra merged commit a72448c into main on Jan 31, 2026
3 checks passed