Problem or motivation
internal/fetcher/ has test files only for ModelAPIFetcher and ModelReadmeFetcher. The following have no tests at all:
DatasetAPIFetcher — fetches dataset metadata from the HF Hub API
DatasetReadmeFetcher — fetches and parses dataset README/model-card YAML
DatasetTreeFetcher — fetches the dataset file tree with security metadata (cursor-based pagination)
ModelTreeFetcher — fetches the model file tree (same pagination logic, different endpoint)
ModelSearcher — searches for models on the HF Hub
markdown_extraction.go — shared front-matter YAML splitting and string extraction helpers used by all readme fetchers
ModelTreeFetcher and DatasetTreeFetcher in particular implement cursor-based pagination (maxTreePages = 10), which carries a cursor from one request to the next. A pagination bug would silently truncate the file tree and, with it, the security scan data.
Proposed solution
Add test files using httptest.NewServer (the same pattern already used in model_api_fetcher_test.go) for each untested fetcher. Cover: successful fetch, 404 response (HFError wrapping), pagination across two pages for tree fetchers, and empty result sets. For markdown_extraction.go, add table-driven tests for splitFrontMatter with missing delimiters, invalid YAML, and empty input.
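A rough sketch of what the table-driven test for splitFrontMatter could look like. The signature and the implementation below are assumptions made so the example compiles on its own; the real helper in markdown_extraction.go may return different types or handle edge cases differently, and the expected values would need to be adjusted to match it. The invalid-YAML case is left out because it depends on the YAML library the package uses.

```go
package main

import (
	"strings"
	"testing"
)

// splitFrontMatter is a hypothetical stand-in for the real helper.
// It splits "---\n<yaml>\n---\n<body>" into its YAML and body parts and
// reports whether a complete front-matter block was found.
func splitFrontMatter(content string) (string, string, bool) {
	lines := strings.Split(content, "\n")
	if lines[0] != "---" {
		return "", content, false // no opening delimiter
	}
	for i := 1; i < len(lines); i++ {
		if lines[i] == "---" {
			yamlPart := strings.Join(lines[1:i], "\n")
			body := strings.Join(lines[i+1:], "\n")
			return yamlPart, body, true
		}
	}
	return "", content, false // opening delimiter never closed
}

// frontMatterCases lists the inputs named in the proposal plus one happy path.
var frontMatterCases = []struct {
	name     string
	input    string
	wantYAML string
	wantBody string
	wantOK   bool
}{
	{"empty input", "", "", "", false},
	{"no delimiters", "# Title\n", "", "# Title\n", false},
	{"missing closing delimiter", "---\nlicense: mit\n# Title", "", "---\nlicense: mit\n# Title", false},
	{"valid front matter", "---\nlicense: mit\n---\n# Title\n", "license: mit", "# Title\n", true},
}

func TestSplitFrontMatter(t *testing.T) {
	for _, tc := range frontMatterCases {
		t.Run(tc.name, func(t *testing.T) {
			gotYAML, gotBody, gotOK := splitFrontMatter(tc.input)
			if gotYAML != tc.wantYAML || gotBody != tc.wantBody || gotOK != tc.wantOK {
				t.Errorf("splitFrontMatter(%q) = (%q, %q, %v), want (%q, %q, %v)",
					tc.input, gotYAML, gotBody, gotOK, tc.wantYAML, tc.wantBody, tc.wantOK)
			}
		})
	}
}
```

Keeping the cases in one table makes it cheap to add further inputs later, for example Windows line endings or a delimiter with trailing whitespace, once the real helper's intended behavior for them is confirmed.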
Alternatives considered
None — httptest-based tests are already established in this package and straightforward to extend.
Additional context
Affected files: dataset_api_fetcher.go, dataset_readme_fetcher.go, dataset_tree_fetcher.go, model_tree_fetcher.go, model_search.go, markdown_extraction.go.