Implement Vale for Organization-Wide Documentation Linting #94

@vedantvakharia

Summary

Implement Vale, a prose linter, as a reusable workflow in the .github repository to ensure consistent, high-quality documentation across all Open Climate Fix repositories.

Motivation

Currently, OCF has no automated documentation linting in place. While we have excellent code linting (Ruff, Mypy), our documentation quality relies solely on manual review. This creates:

  • Inconsistent style across different repositories
  • Higher review burden on maintainers to catch documentation issues
  • Accessibility concerns with complex or unclear prose
  • Missed typos and grammatical errors that slip through reviews

Vale would provide automated checks for:

  • Grammatical correctness
  • Consistent terminology and style
  • Readability and accessibility
  • Project-specific documentation standards

Current State

I've audited the OCF organization and confirmed:

  • No .vale.ini configuration files in any repository
  • No Vale workflows in .github/workflows/
  • No Vale pre-commit hooks
  • No documentation linting of any kind

Documentation-heavy repositories that would benefit most:

  1. climatehackai - Extensive ML template documentation
  2. graph_weather - Research implementation guides
  3. skillful_nowcasting - DeepMind DGMR implementation
  4. quartz-frontend - Frontend project with 90+ issues
  5. .github - Organization-wide community health files

Proposed Solution

Implement Vale as a reusable GitHub Actions workflow in the .github repository, similar to existing workflows like branch_ci.yml. This allows any OCF repository to adopt Vale with minimal configuration.

Implementation Components

  1. Vale Configuration (.vale.ini)
  • Base configuration with style rules
  • Path-specific configuration so that only documentation is checked:

        # Location of the style rules and custom vocabulary
        StylesPath = styles

        # Apply the base styles to documentation formats only
        [*.{md,mdx,rst}]
        BasedOnStyles = Vale, Google

  • Code directories (src/, test/, node_modules/) stay out of scope because no section matches source files. Sections without settings have no effect, so any documentation-format files under those directories are excluded through the paths or --glob pattern passed to Vale rather than through extra sections.
  • Custom vocabulary for OCF-specific terms (e.g., "nowcasting", "GSP", "PV")
  2. Scope Limitations
  • Only checks documentation files, not code
  • Typical paths to check:
      • README.md, CONTRIBUTING.md, CHANGELOG.md
      • docs/ directory
      • *.md and *.rst files in root
  • Excludes:
      • Source code directories (src/, lib/, etc.)
      • Test files
      • Build artifacts and dependencies
      • Configuration files (JSON, YAML, etc.)
  • Configurable per repository, based on each repository's structure
  3. Style Rules (styles/ directory)
  • Google Developer Documentation Style Guide
  • Custom OCF rules for project terminology
  4. Reusable Workflow (.github/workflows/documentation_lint.yml)
  • Runs Vale only on documentation files in PRs
  • Posts inline comments on documentation issues
  • Configurable file patterns via workflow inputs:

        inputs:
          files:
            description: 'File patterns to check'
            type: string
            default: '*.md docs/**/*.md'

  5. Documentation
  • README section explaining how to use the workflow
  • Guide for repositories to adopt Vale
  • Instructions for customizing paths per repository
  • Instructions for adding custom vocabulary
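
To make the reusable workflow component more concrete, here is a rough sketch of what documentation_lint.yml could look like. It is not the final implementation. It assumes the community errata-ai/vale-action action (with its files, reporter, and fail_on_error inputs) as the linting step, an illustrative version pin, and that each calling repository provides its own .vale.ini. How a multi-pattern files value is split and expanded depends on the linting step and would need to be verified during the pilot.

```yaml
# .github/workflows/documentation_lint.yml (sketch, not the final workflow)
name: Documentation lint

on:
  workflow_call:
    inputs:
      files:
        description: 'File patterns to check'
        type: string        # workflow_call inputs must declare a type
        required: false
        default: '*.md docs/**/*.md'

jobs:
  vale:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the calling repository
        uses: actions/checkout@v4

      # Assumption: errata-ai/vale-action is the linting step, and the
      # calling repository contains its own .vale.ini.
      - name: Run Vale on documentation files
        uses: errata-ai/vale-action@v2.1.1   # version pin is illustrative
        with:
          files: ${{ inputs.files }}
          reporter: github-pr-review         # inline comments on the PR
          fail_on_error: false               # report only during the pilot
```

Keeping fail_on_error off at first would let the pilot surface false positives without blocking PRs. It could be switched on once the rules have settled.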

Prior Experience

I have already implemented Vale for the Mesa organization during GSoC, including:

  • Complete Vale configuration with custom styles
  • Properly scoped to documentation folders only (not code)
  • GitHub Actions workflow integration
  • Pre-commit hooks setup
  • Documentation for contributors

This means minimal work is required to adapt my existing implementation for OCF. I can:

  1. Port the configuration files with proper path scoping
  2. Customize the vocabulary for OCF-specific terminology
  3. Configure exclusions for code directories
  4. Test on documentation-heavy repos
  5. Create the reusable workflow

Deliverables

  1. Configuration Files
  • .vale.ini - Main configuration
  • styles/ - Style guide rules and custom vocabulary
  2. Reusable Workflow
  • .github/workflows/documentation_lint.yml
  • Can be used by other repos via:

        uses: openclimatefix/.github/.github/workflows/documentation_lint.yml@main

  3. Documentation
  • README section on using Vale
  • Migration guide for repositories
  • Vocabulary contribution guidelines
  4. Pilot Implementation
  • Test on 1-2 documentation-heavy repos (e.g., climatehackai)
  • Demonstrate value with before/after examples
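
For illustration, a repository adopting the workflow might add a small caller file along these lines. This is a sketch under assumptions: the file name, trigger paths, and the docs value are hypothetical, and the reference includes the .yml extension because reusable workflows are addressed by their full file name.

```yaml
# .github/workflows/docs.yml in an adopting repository (hypothetical name)
name: Docs

on:
  pull_request:
    paths:
      - '**.md'
      - 'docs/**'

jobs:
  vale:
    # Inline review comments need write access to pull requests; the
    # called workflow cannot raise permissions above what is granted here.
    permissions:
      contents: read
      pull-requests: write
    uses: openclimatefix/.github/.github/workflows/documentation_lint.yml@main
    with:
      files: 'docs'   # illustrative override of the default patterns
```

Pinning to @main keeps adopters on the latest rules automatically. A tag or commit SHA would trade that convenience for stability.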

Benefits

  • Consistency: Uniform documentation style across all repositories
  • Quality: Catch grammatical errors and improve readability
  • Efficiency: Reduce reviewer burden on documentation PRs
  • Accessibility: Ensure documentation is clear and inclusive
  • Scalability: Easy for new repositories to adopt via reusable workflow
  • Customizable: Add OCF-specific terminology and rules

Implementation Plan

Phase 1: Setup

  • Create Vale configuration (.vale.ini)
  • Set up styles directory with base rules
  • Define OCF-specific vocabulary and terms
  • Create reusable GitHub Actions workflow
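
As an example of what an OCF-specific terminology rule could look like, the sketch below uses Vale's substitution rule type. The file path, the style name OCF, and the discouraged spellings are all hypothetical placeholders. The real list would come out of the pilot.

```yaml
# styles/OCF/Terminology.yml (hypothetical path and rule name)
extends: substitution
message: "Use '%s' instead of '%s'."
level: warning
ignorecase: true
# Keys are discouraged spellings (illustrative only); values are the
# preferred OCF terms.
swap:
  now-casting: nowcasting
  photo-voltaic: photovoltaic
```

A style like this would be enabled by adding its directory name to BasedOnStyles in .vale.ini (for example, BasedOnStyles = Vale, Google, OCF, assuming the directory is called OCF).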

Phase 2: Pilot

  • Test on climatehackai repository
  • Gather feedback and refine rules
  • Fix any false positives
  • Document common issues and solutions
    
Phase 3: Rollout

  • Create PR to .github repository
  • Update README with Vale documentation
  • Create migration guide for other repos
  • Announce to OCF organization
    
Phase 4: Adoption

  • Help interested repositories adopt the workflow
  • Maintain and update style rules based on feedback
  • Expand custom vocabulary as needed

References

I'm happy to implement this and would appreciate assignment to this issue. Given my prior Vale experience with Mesa, I can deliver this in 2-3 days with minimal effort.
