tasks.org.ai: Refactor parser to use graphdl correctly and all 217 verbs #50

@nathanclevenger

Problem

The O*NET task parser in ai/packages/tasks.org.ai/src/parser-graphdl.ts has several architectural issues:

1. Duplicate Casing Logic

  • tasks.org.ai/src/casing.ts duplicates toPascalCase and toCamelCase functions
  • graphdl/src/casing.ts already exports these same functions
  • Should: Import from graphdl instead of duplicating

2. Hardcoded Verbs (63 instead of 217)

  • isLikelyVerb() function has only 63 hardcoded verbs (lines 62-129)
  • All 217 verbs exist in ai/sources/verbs/*.mdx with complete conjugation data
  • Should: Import verb list from a verbs.org.ai package or loader module

3. Hardcoded Prepositions

  • Parser has 23 hardcoded prepositions (lines 17-41)
  • Should: Import from prepositions.org.ai package (if it exists)

4. Parser-Specific Logic (Correct)

  • O*NET task parsing logic (extracting components, cleaning phrases, handling alternatives) is appropriate in tasks.org.ai
  • This is package-specific transformation, not general-purpose logic
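To illustrate the split between shared data and package-specific logic, here is a minimal sketch of a parser step that takes its verb check and preposition set as parameters instead of hardcoding them. The `TaskParts` shape and the `splitTask` name are hypothetical and are not taken from parser-graphdl.ts; the real parser likely does considerably more.

```typescript
// Hypothetical output shape; the real parser's types are not shown in this issue.
interface TaskParts {
  verb: string
  object: string
  preposition?: string
  complement?: string
}

// Splits a task statement into verb, object, and an optional prepositional
// complement. The verb check and preposition set are injected so the shared
// packages can supply them.
function splitTask(
  statement: string,
  isVerb: (word: string) => boolean,
  prepositions: ReadonlySet<string>,
): TaskParts | null {
  // Drop one trailing period, then split on whitespace.
  const words = statement.trim().replace(/\.$/, '').split(/\s+/)
  const [first, ...rest] = words
  if (!first || !isVerb(first)) return null

  // The first preposition after the verb ends the object phrase.
  const prepIndex = rest.findIndex((w) => prepositions.has(w.toLowerCase()))
  if (prepIndex === -1) {
    return { verb: first.toLowerCase(), object: rest.join(' ') }
  }
  return {
    verb: first.toLowerCase(),
    object: rest.slice(0, prepIndex).join(' '),
    preposition: rest[prepIndex].toLowerCase(),
    complement: rest.slice(prepIndex + 1).join(' '),
  }
}
```

With this shape, swapping the 63-verb list for the full 217 only changes what is passed in, not the parsing code itself.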

Recommended Architecture

// graphdl (shared core) - ALREADY HAS:
- toCamelCase, toPascalCase, toKebabCase
- toThirdPerson, toEvent, toActivity, toSubjectNoun, toObjectNoun, toInverse

// verbs.org.ai package (NEW) - SHOULD HAVE:
- export const VERBS: Set<string> = new Set([...217 verb names])
- export const VERB_CONJUGATIONS: Map<string, VerbConjugation>
- export function isVerb(word: string): boolean

// prepositions.org.ai package (if it exists) - SHOULD HAVE:
- export const PREPOSITIONS: Set<string>

// tasks.org.ai (package-specific) - KEEP:
- parseGraphDL() - O*NET-specific parsing logic
- cleanObjectPhrase() - O*NET-specific cleanup
- expandAlternatives() - O*NET-specific handling
- BUT IMPORT: casing from graphdl, verbs from verbs.org.ai, prepositions from prepositions.org.ai
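As a rough sketch of what the proposed verbs.org.ai exports could look like: the `VerbConjugation` fields below are an assumption (loosely mirroring graphdl's `toThirdPerson`, `toActivity`, and `toEvent`), and the three sample entries with their conjugated forms are placeholders standing in for the 217 verbs in ai/sources/verbs/.

```typescript
// Hypothetical sketch of ai/packages/verbs.org.ai/src/index.ts.
// Field names and sample values are assumptions for illustration only.
export interface VerbConjugation {
  thirdPerson: string
  activity: string
  event: string
}

// Placeholder data; the real package would load every verb from the MDX sources.
const SAMPLE_ENTRIES: [string, VerbConjugation][] = [
  ['create', { thirdPerson: 'creates', activity: 'creating', event: 'created' }],
  ['analyze', { thirdPerson: 'analyzes', activity: 'analyzing', event: 'analyzed' }],
  ['repair', { thirdPerson: 'repairs', activity: 'repairing', event: 'repaired' }],
]

export const VERB_CONJUGATIONS: Map<string, VerbConjugation> = new Map(SAMPLE_ENTRIES)

// Derive the name set from the map so the two can never disagree.
export const VERBS: Set<string> = new Set(Array.from(VERB_CONJUGATIONS.keys()))

// Case-insensitive membership check against the base verb forms.
export function isVerb(word: string): boolean {
  return VERBS.has(word.trim().toLowerCase())
}
```

Deriving `VERBS` from `VERB_CONJUGATIONS` keeps a single source of truth inside the package, which matches the goal of the refactor.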

Action Plan

  1. Create verb loader module to read all 217 verbs from ai/sources/verbs/*.mdx
  2. Export verb utilities from verbs.org.ai package
  3. Remove tasks.org.ai/src/casing.ts and import from graphdl instead
  4. Update isLikelyVerb() to use complete verb ontology (217 verbs)
  5. Check if prepositions.org.ai exists or create loader for it
  6. Keep O*NET-specific parsing logic in tasks.org.ai
  7. Test parser with full verb list
  8. Commit to ai/ submodule
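Step 1 could start from something like the sketch below. It assumes each verb lives in its own file named after the verb (for example `create.mdx`), which this issue does not confirm, and it only collects names; reading the conjugation data out of each file is left out. `loadVerbNames` and `assertVerbCount` are hypothetical names.

```typescript
import { readdirSync } from 'node:fs'
import { basename, extname } from 'node:path'

// Collects verb names from a directory of .mdx files.
// Assumption: the file name (minus extension) is the verb's base form.
// The directory lister is injectable so the function can be tested without disk access.
export function loadVerbNames(
  dir: string,
  listFiles: (dir: string) => string[] = (d) => readdirSync(d),
): Set<string> {
  const names = listFiles(dir)
    .filter((file) => extname(file) === '.mdx')
    .map((file) => basename(file, '.mdx').toLowerCase())
  return new Set(names)
}

// Fails loudly if the loaded set drifts from the expected size.
export function assertVerbCount(verbs: Set<string>, expected: number): void {
  if (verbs.size !== expected) {
    throw new Error(`Expected ${expected} verbs, found ${verbs.size}`)
  }
}
```

Calling `assertVerbCount(loadVerbNames('ai/sources/verbs'), 217)` in a test would catch the parser silently falling back to a partial list.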

Benefits

  • DRY: Eliminates duplicate casing logic
  • Complete: Uses all 217 verbs instead of 63
  • Maintainable: Single source of truth for verbs in ai/sources/verbs/
  • Correct Architecture: Parser uses graphdl utilities, packages don't reinvent the wheel

Files to Modify

  • ai/packages/tasks.org.ai/src/parser-graphdl.ts - Import casing from graphdl, import verbs from verbs.org.ai
  • ai/packages/tasks.org.ai/src/casing.ts - DELETE (use graphdl instead)
  • ai/packages/verbs.org.ai/src/index.ts - ADD verb loader and utilities
  • ai/packages/tasks.org.ai/package.json - Add dependency on verbs.org.ai

Context

This came up while investigating whether O*NET tasks are properly semantically parsed. The parser works, but it only uses 63 hardcoded verbs when 217 fully conjugated verbs are available in ai/sources/verbs/.
