feat: Ecological Navigation API - situated cognition integration #36
Description
Summary
The navigation-based workflow engine (#34, #35) provides deterministic workflow traversal through opaque state tokens and server-enforced transitions. While the design aligns with situated cognition theory (Gibson, Turvey), this alignment was identified post-hoc rather than used to drive design decisions.
This issue proposes enhancing the navigation API to more genuinely integrate ecological psychology principles, moving from "theory validates pattern" to "theory shapes design."
Critical Question: Paradigm Translation
To what extent is the organism-in-environment paradigm directly translatable to the agent-uses-tool-surface paradigm?
Key Tensions
| Ecological Psychology | LLM Agent Context | Translation Problem |
|---|---|---|
| Organism is embodied | Agent is disembodied | No physical substrate for "effectivities" |
| Environment is rich, continuous | Tool surface is discrete JSON | Loss of perceptual richness |
| Perception is direct | Perception is mediated (tokens) | Gibson was explicitly anti-representationalist |
| Affordances are discovered | Actions are enumerated | No exploration, just parsing |
| Time is continuous | Request-response cycles | Perception-action coupling breaks down |
| Organism acts in environment | Agent calls tools | Causal relationship is inverted |
Fundamental Mismatches
- Gibson's Anti-Representationalism: Gibson argued against internal representations - organisms perceive affordances directly without constructing internal models. LLMs are fundamentally representational systems. Every "perception" is a token representation. This breaks the core Gibsonian claim.
- Embodiment: Affordances in ecological psychology are body-relative. "Climbable" depends on leg length. What is an LLM's "body"? Its context window? Its tool permissions? The metaphor becomes strained.
- Discovery vs Specification: In the real world, affordances are discovered through active exploration. In our API, we tell the agent what actions are available. This is the opposite of ecological discovery.
- Causal Direction: An organism acts in its environment. An agent calls a tool which returns information. The environment doesn't "afford" - the API specifies. The relationship is inverted.
What Might Actually Transfer
Despite these tensions, some concepts may genuinely translate:
| Concept | Potential Translation |
|---|---|
| Affordance salience | Actions should be immediately interpretable, not require inference |
| Perceptual restriction | Invalid options should be absent, not blocked |
| Invariants | Some state properties should remain stable across calls |
| Effectivities | Agent skills that match workflow requirements |
What Needs New Concepts
Some aspects may require new theoretical frameworks:
- Mediated Agency: A theory of action where perception is always through representations
- Discrete Situation: How does "situation" work in request-response cycles?
- Tool-Surface Ecology: What is the "environment" for a disembodied agent?
- Representational Affordances: Can affordances exist in symbolic systems?
Research Questions
Before implementing, we should consider:
- Is situated cognition the right theoretical lens, or are we forcing a metaphor?
- What would a genuinely agent-native theory of tool navigation look like?
- Are there better frameworks from HCI, cognitive systems, or distributed cognition?
- Should we develop an adapted framework rather than borrow ecological psychology?
Skill Restructuring: Effectivities as Workflow-Agnostic Capabilities
Current State
Current skills conflate two distinct concerns:
- Workflow execution skills (e.g., workflow-execution, activity-resolution) - manage traversal
- Capability skills (e.g., code-review, test-review) - domain expertise
Proposed Separation
With the navigation engine handling workflow traversal, skills should be restructured:
| Category | Location | Role | Examples |
|---|---|---|---|
| Workflow execution | Subsumed by engine | No longer needed as skills | workflow-execution, activity-resolution |
| Agent effectivities | Standalone, workflow-agnostic | Agent capabilities that match affordances | code-review, test-review, security-audit |
Effectivity Matching
┌─────────────────────┐ ┌─────────────────────┐
│ WORKFLOW (Env) │ │ AGENT │
│ │ │ │
│ Step: "code-review"│ │ Effectivities: │
│ requiredSkills: │◄────┤ - code-review ✓ │
│ - code-review │ │ - test-review ✓ │
│ - security-audit │ │ - security-audit ✓ │
│ │ │ │
│ Affordance: │ │ Actualized when │
│ "Complete review" │────►│ effectivities match│
└─────────────────────┘ └─────────────────────┘
Implementation
Workflow step specifies required effectivities:
[step.code-review]
name = "Code Review"
requiredEffectivities = ["code-review"]
optionalEffectivities = ["security-audit"]
Navigation response indicates fit:
"affordances": [{
"action": "complete_step",
"step": "code-review",
"requiredEffectivities": ["code-review"],
"fit": "full",
"skillResources": ["skills/code-review.toon"]
}]
Agent loads skill only when needed to actualize the affordance:
1. Engine presents affordance with requiredEffectivities
2. Agent checks: "Do I have code-review skill?"
3. If yes: loads skill, performs review, completes step
4. If no: reports inability (effectivity mismatch)
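The matching steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the engine's implementation: the function names `effectivity_fit` and `build_affordance` are hypothetical, the proposal only shows `"fit": "full"` (the `"partial"` and `"none"` values are assumed here), and deriving `skillResources` paths from effectivity names is a guess based on the `skills/code-review.toon` example.

```python
from typing import Iterable


def effectivity_fit(
    required: Iterable[str],
    optional: Iterable[str],
    agent_effectivities: Iterable[str],
) -> str:
    """Classify how well an agent's effectivities match a step.

    Assumed semantics (only "full" appears in the proposal):
      - "none":    at least one required effectivity is missing
      - "partial": all required present, some optional missing
      - "full":    all required and optional present
    """
    have = set(agent_effectivities)
    if set(required) - have:
        return "none"
    if set(optional) - have:
        return "partial"
    return "full"


def build_affordance(step_id: str, step: dict, agent_effectivities: Iterable[str]) -> dict:
    """Build one affordance entry for a workflow step (hypothetical helper)."""
    required = list(step.get("requiredEffectivities", []))
    optional = list(step.get("optionalEffectivities", []))
    return {
        "action": "complete_step",
        "step": step_id,
        "requiredEffectivities": required,
        "fit": effectivity_fit(required, optional, agent_effectivities),
        # Assumption: one skill file per required effectivity, named after it.
        "skillResources": [f"skills/{name}.toon" for name in required],
    }
```

With the `code-review` step defined above and an agent holding `code-review`, `test-review`, and `security-audit`, `build_affordance` would report `"fit": "full"`. An agent lacking `code-review` would get `"none"`, which corresponds to the "reports inability" branch in step 4.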
Benefits
- Cleaner separation: Workflow traversal ≠ domain capability
- Workflow-agnostic skills: code-review works in any workflow
- True effectivities: Skills represent what the agent can do
- Match semantics: Affordance requires effectivity → actualized action
- Skill adjacency optional: Skills may reference workflow data or be fully independent
Background
Situated cognition theory emphasizes:
- Affordances: Action possibilities offered by the environment
- Effectivities: Agent capabilities that match affordances
- Direct perception: Information specifies action without inference
- Perceptual restriction: Invalid options are not perceived, not just blocked
- Invariants: Stable patterns across transformations
Current State
The current navigation response:
{
"position": { "activity": "implement", "step": "analyze" },
"availableActions": {
"required": [{ "action": "complete_step", "step": "analyze" }],
"optional": [{ "action": "get_resource" }],
"blocked": [{ "action": "transition", "reason": "Activity not complete" }]
},
"state": "v1.gzB64...."
}
Issues from an ecological perspective:
- Actions listed without meaning/context
- Agent must infer what to do from action names
- Blocked actions are perceived (violates perceptual restriction)
- No agent capability modeling
- State changes completely on each action (no invariants)
Proposed Enhancements
1. Affordance Salience
Embed action meaning directly:
"affordances": [{
"action": "complete_step",
"step": "code-review",
"specifies": "Complete code review for current task",
"enables": ["test step becomes available"],
"requiredEffectivities": ["code-review"],
"skillResources": ["skills/code-review.toon"]
}]
2. Effectivities Modeling
Skills become workflow-agnostic agent capabilities:
"agentEffectivities": ["code-review", "test-review", "security-audit"],
"affordances": [{
"action": "complete_step",
"requiredEffectivities": ["code-review"],
"fit": "full" // Agent has required capability
}]
3. True Perceptual Restriction
Remove blocked actions from response entirely. If an action is invalid, the agent should not perceive it as an option at all.
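As a rough sketch of what this could look like on the server side, the following Python function drops the `blocked` list from a response shaped like the current navigation response. The function name `restrict_perception` is hypothetical, and the sketch assumes blocked actions live only under `availableActions.blocked`, as in the current format.

```python
import copy


def restrict_perception(response: dict) -> dict:
    """Return a copy of a navigation response with blocked actions removed.

    The input is left untouched so the engine can still log or inspect
    blocked actions internally; only the agent-facing copy omits them.
    """
    restricted = copy.deepcopy(response)
    # Assumption: blocked actions appear only under availableActions.blocked.
    restricted.get("availableActions", {}).pop("blocked", None)
    return restricted
```

Filtering at the response boundary keeps transition enforcement in the engine unchanged: an invalid action is still rejected if attempted, it is simply never presented as an option.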
4. Ecological Information / Direct Specification
Provide narrative and next-action guidance:
"situation": {
"narrative": "Task 2 implementation. Code review step awaits completion.",
"nextAction": {
"action": "complete_step",
"instruction": "Load code-review skill and perform review."
}
}
5. Invariants
Expose stable patterns across state changes:
"invariants": {
"workflowId": "work-package",
"workflowPhase": "implementation",
"progressRatio": 0.4,
"tasksRemaining": 5
}
6. Skill Restructuring
Subsume workflow execution into engine; skills become pure effectivities:
| Current | Proposed |
|---|---|
| meta/skills/workflow-execution.toon | Subsumed by navigation engine |
| meta/skills/activity-resolution.toon | Subsumed by navigation engine |
| work-package/skills/code-review.toon | Standalone effectivity skill |
| work-package/skills/test-review.toon | Standalone effectivity skill |
Tasks
Phase 1: Theoretical Grounding
- Research: Is situated cognition applicable to disembodied agents?
- Research: Alternative frameworks (distributed cognition, activity theory, HCI)
- Document: Paradigm translation analysis - what transfers, what doesn't
- Decide: Adopt, adapt, or develop new theoretical framework
Phase 2: Skill Restructuring
- Identify skills to subsume into engine (workflow execution)
- Identify skills that are agent effectivities (code-review, etc.)
- Define effectivity schema for workflow steps
- Implement effectivity matching in navigation response
Phase 3: Navigation API Enhancement
- Design ecological response format specification
- Implement affordance salience (descriptions, enables/requires)
- Add effectivities to navigation response
- Implement true perceptual restriction
- Add situation narrative generation
- Add invariants extraction
Phase 4: Validation
- Test whether changes improve agent behavior
- Measure: Does "ecological" design reduce workflow violations?
- Update documentation with theoretical rationale
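One possible way to operationalize the "workflow violations" measurement from Phase 4 is sketched below. It assumes a violation means the agent attempted an action that was not among those offered in the preceding navigation response; both that definition and the `violation_rate` name are assumptions for illustration, not something the issue specifies.

```python
from typing import Iterable, Set, Tuple


def violation_rate(trace: Iterable[Tuple[Set[str], str]]) -> float:
    """Fraction of attempted actions that were not offered to the agent.

    Each trace entry pairs the set of action names offered in a navigation
    response with the action the agent attempted next. An empty trace
    yields 0.0.
    """
    attempts = 0
    violations = 0
    for offered, attempted in trace:
        attempts += 1
        if attempted not in offered:
            violations += 1
    return violations / attempts if attempts else 0.0
```

Running the same workflows against the current and the enhanced response formats and comparing the two rates would give a first, coarse answer to whether the "ecological" design changes agent behavior.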
Success Criteria
- Theoretical foundation is explicit and justified
- Paradigm translation problems are acknowledged and addressed
- Workflow execution skills subsumed by engine
- Remaining skills are workflow-agnostic effectivities
- Effectivity matching implemented in navigation API
- Response format directly specifies actions without agent inference
- Design decisions traceable to chosen theoretical framework
- Measurable improvement in agent workflow fidelity
References
From Knowledge Base
- Robbins, P. & Aydede, M. (Eds.) (2008). The Cambridge Handbook of Situated Cognition
- Clancey, W.J. (1997). Situated Cognition: On Human Knowledge and Computer Representations
- Gibson and Turvey citations derived from Cambridge Handbook content
Additional (not from KB)
- Hutchins, E. (1995). Cognition in the Wild (distributed cognition)
- Suchman, L. (1987). Plans and Situated Actions (HCI perspective)
Implementation
- PR #35 (feat: Navigation-based workflow engine for deterministic traversal): Navigation-based workflow engine implementation
Priority
Low - Enhancement to existing navigation API. Current implementation is functional. This work requires theoretical grounding before implementation.