Problem
The prompt gives no type-specific guidance for fields, which leads to inconsistent extraction:
- Boolean: type added in PR #29 (Add boolean type support and comprehensive model benchmark), but the prompt has no guidance for it
- Date: No format specification (ISO 8601 expected but not stated)
- Semantic types: No guidance for email, URL, phone formats
- Array/Object: not yet supported; will need clear instructions once added
Proposed Solution
Add type-specific instructions in the schema section:
Field Types:
- string: text values
- number: numeric values (integer or decimal)
- boolean: true or false only
- date: ISO 8601 format (YYYY-MM-DD)
- enum: one of the allowed values listed
- email: valid email format
- url: valid URL format
- phone: E.164 format preferred
For each field, show the type with clear expectations.
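A minimal sketch of how the schema section could be rendered, assuming fields arrive as plain dicts with `name`, `type`, and (for enums) `values` keys. The names `TYPE_GUIDANCE` and `render_field_types` are hypothetical and not taken from the codebase; the guidance strings are copied from the list above.

```python
# Hypothetical mapping from field type to the guidance line shown in the prompt.
# The wording mirrors the "Field Types" list proposed in this issue.
TYPE_GUIDANCE = {
    "string": "text values",
    "number": "numeric values (integer or decimal)",
    "boolean": "true or false only",
    "date": "ISO 8601 format (YYYY-MM-DD)",
    "enum": "one of the allowed values listed",
    "email": "valid email format",
    "url": "valid URL format",
    "phone": "E.164 format preferred",
}


def render_field_types(fields):
    """Render the general type guidance plus one expectation line per field.

    `fields` is assumed to be a list of dicts such as
    {"name": "is_active", "type": "boolean"}.
    """
    lines = ["Field Types:"]
    for type_name, guidance in TYPE_GUIDANCE.items():
        lines.append(f"- {type_name}: {guidance}")

    lines.append("")
    lines.append("Fields:")
    for field in fields:
        field_type = field["type"]
        if field_type not in TYPE_GUIDANCE:
            # Fail loudly so a new type cannot ship without prompt guidance.
            raise ValueError(f"no prompt guidance for type: {field_type}")
        expectation = TYPE_GUIDANCE[field_type]
        if field_type == "enum":
            # For enums, spell out the allowed values for this field.
            expectation = "one of: " + ", ".join(field["values"])
        lines.append(f"- {field['name']} ({field_type}): {expectation}")
    return "\n".join(lines)
```

Raising on an unknown type is one possible design choice: it would force future array/object support to add guidance before those types can appear in a prompt.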
Acceptance Criteria
- Boolean type extraction guidance added
- Date format (ISO 8601) explicitly stated
- All current types have clear format expectations
- Prompt structure leaves room for future array/object support
- Tests verify type guidance is included
- All 5 benchmark examples work correctly
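As a rough illustration of how tests or benchmark checks might verify that extracted values meet the stated format expectations, here is a small self-contained checker. The function name `conforms` is hypothetical, and the sketch assumes extracted values have already been parsed from JSON into Python objects. It covers only boolean, number, date, and phone; the E.164 pattern encodes the usual rule of a leading `+` followed by at most 15 digits with a non-zero first digit.

```python
import re
from datetime import date

# E.164: "+" followed by up to 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")
# Pin dates to the YYYY-MM-DD shape before checking calendar validity.
ISO_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def conforms(field_type, value):
    """Return True if `value` meets the format expectation for `field_type`."""
    if field_type == "boolean":
        # Only real booleans count; the strings "true"/"yes" do not.
        return isinstance(value, bool)
    if field_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if field_type == "date":
        if not isinstance(value, str) or not ISO_DATE_PATTERN.fullmatch(value):
            return False
        try:
            date.fromisoformat(value)  # rejects impossible dates like 2023-02-29
        except ValueError:
            return False
        return True
    if field_type == "phone":
        return isinstance(value, str) and E164_PATTERN.fullmatch(value) is not None
    raise ValueError(f"no check defined for type: {field_type}")
```

A benchmark assertion could then look like `assert conforms("date", result["signup"])`, where `result` stands for whatever structure the extraction returns.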
Related
- Parent: #30 (Improve LLM prompt engineering and clarity)
- Issue #26 (Add support for additional data types): type system expansion
- PR #29 (Add boolean type support and comprehensive model benchmark): boolean type support