feat: Ecological Navigation API - situated cognition integration #36

@m2ux

Summary

The navigation-based workflow engine (#34, #35) provides deterministic workflow traversal through opaque state tokens and server-enforced transitions. While the design aligns with situated cognition theory (Gibson, Turvey), this alignment was identified post-hoc rather than used to drive design decisions.

This issue proposes enhancing the navigation API to more genuinely integrate ecological psychology principles, moving from "theory validates pattern" to "theory shapes design."


Critical Question: Paradigm Translation

To what extent is the organism-in-environment paradigm directly translatable to the agent-uses-tool-surface paradigm?

Key Tensions

| Ecological Psychology | LLM Agent Context | Translation Problem |
| --- | --- | --- |
| Organism is embodied | Agent is disembodied | No physical substrate for "effectivities" |
| Environment is rich, continuous | Tool surface is discrete JSON | Loss of perceptual richness |
| Perception is direct | Perception is mediated (tokens) | Gibson was explicitly anti-representationalist |
| Affordances are discovered | Actions are enumerated | No exploration, just parsing |
| Time is continuous | Request-response cycles | Perception-action coupling breaks down |
| Organism acts in environment | Agent calls tools | Causal relationship is inverted |

Fundamental Mismatches

  1. Gibson's Anti-Representationalism: Gibson argued against internal representations - organisms perceive affordances directly without constructing internal models. LLMs are fundamentally representational systems. Every "perception" is a token representation. This breaks the core Gibsonian claim.

  2. Embodiment: Affordances in ecological psychology are body-relative. "Climbable" depends on leg length. What is an LLM's "body"? Its context window? Its tool permissions? The metaphor becomes strained.

  3. Discovery vs Specification: In the real world, affordances are discovered through active exploration. In our API, we tell the agent what actions are available. This is the opposite of ecological discovery.

  4. Causal Direction: An organism acts in its environment. An agent calls a tool which returns information. The environment doesn't "afford" - the API specifies. The relationship is inverted.

What Might Actually Transfer

Despite these tensions, some concepts may genuinely translate:

| Concept | Potential Translation |
| --- | --- |
| Affordance salience | Actions should be immediately interpretable, not require inference |
| Perceptual restriction | Invalid options should be absent, not blocked |
| Invariants | Some state properties should remain stable across calls |
| Effectivities | Agent skills that match workflow requirements |

What Needs New Concepts

Some aspects may require new theoretical frameworks:

  1. Mediated Agency: A theory of action where perception is always through representations
  2. Discrete Situation: How does "situation" work in request-response cycles?
  3. Tool-Surface Ecology: What is the "environment" for a disembodied agent?
  4. Representational Affordances: Can affordances exist in symbolic systems?

Research Questions

Before implementing, we should consider:

  1. Is situated cognition the right theoretical lens, or are we forcing a metaphor?
  2. What would a genuinely agent-native theory of tool navigation look like?
  3. Are there better frameworks from HCI, cognitive systems, or distributed cognition?
  4. Should we develop an adapted framework rather than borrow ecological psychology?

Skill Restructuring: Effectivities as Workflow-Agnostic Capabilities

Current State

Current skills conflate two distinct concerns:

  1. Workflow execution skills (e.g., workflow-execution, activity-resolution) - manage traversal
  2. Capability skills (e.g., code-review, test-review) - domain expertise

Proposed Separation

With the navigation engine handling workflow traversal, skills should be restructured:

| Category | Location | Role | Examples |
| --- | --- | --- | --- |
| Workflow execution | Subsumed by engine | No longer needed as skills | workflow-execution, activity-resolution |
| Agent effectivities | Standalone, workflow-agnostic | Agent capabilities that match affordances | code-review, test-review, security-audit |

Effectivity Matching

┌─────────────────────┐     ┌─────────────────────┐
│   WORKFLOW (Env)    │     │   AGENT             │
│                     │     │                     │
│  Step: "code-review"│     │  Effectivities:     │
│  requiredSkills:    │◄────┤  - code-review ✓    │
│    - code-review    │     │  - test-review ✓    │
│    - security-audit │     │  - security-audit ✓ │
│                     │     │                     │
│  Affordance:        │     │  Actualized when    │
│  "Complete review"  │────►│  effectivities match│
└─────────────────────┘     └─────────────────────┘

Implementation

Workflow step specifies required effectivities:

[step.code-review]
name = "Code Review"
requiredEffectivities = ["code-review"]
optionalEffectivities = ["security-audit"]

Navigation response indicates fit:

"affordances": [{
  "action": "complete_step",
  "step": "code-review",
  "requiredEffectivities": ["code-review"],
  "fit": "full",
  "skillResources": ["skills/code-review.toon"]
}]

Agent loads skill only when needed to actualize the affordance:

1. Engine presents affordance with requiredEffectivities
2. Agent checks: "Do I have code-review skill?"
3. If yes: loads skill, performs review, completes step
4. If no: reports inability (effectivity mismatch)
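
The four steps above can be sketched as a small matching function. This is a minimal sketch under stated assumptions, not the engine's implementation: the function name `compute_fit`, the fit values `"partial"` and `"none"`, and the `missing` field are hypothetical. Only `"full"` appears in the proposal.

```python
from typing import Iterable


def compute_fit(required: Iterable[str], optional: Iterable[str],
                agent_effectivities: Iterable[str]) -> dict:
    """Compare a step's effectivities against what the agent declares."""
    have = set(agent_effectivities)
    missing_required = sorted(set(required) - have)
    missing_optional = sorted(set(optional) - have)

    if missing_required:
        fit = "none"      # hypothetical value: effectivity mismatch
    elif missing_optional:
        fit = "partial"   # hypothetical value: required met, optional missing
    else:
        fit = "full"      # the only value shown in the proposal
    return {"fit": fit, "missing": missing_required + missing_optional}


# Step definition mirroring the TOML example above.
step = {
    "requiredEffectivities": ["code-review"],
    "optionalEffectivities": ["security-audit"],
}
result = compute_fit(step["requiredEffectivities"],
                     step["optionalEffectivities"],
                     ["code-review", "test-review"])
print(result)  # {'fit': 'partial', 'missing': ['security-audit']}
```

Whether the server or the agent performs this check is left open by the proposal; the sketch works for either side.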

Benefits

  1. Cleaner separation: Workflow traversal ≠ domain capability
  2. Workflow-agnostic skills: code-review works in any workflow
  3. True effectivities: Skills represent what the agent can do
  4. Match semantics: Affordance requires effectivity → actualized action
  5. Skill adjacency optional: Skills may reference workflow data or be fully independent

Background

Situated cognition theory, in the ecological psychology strand associated with Gibson and Turvey, emphasizes:

  • Affordances: Action possibilities offered by the environment
  • Effectivities: Agent capabilities that match affordances
  • Direct perception: Information specifies action without inference
  • Perceptual restriction: Invalid options are not perceived, not just blocked
  • Invariants: Stable patterns across transformations

Current State

The current navigation response:

{
  "position": { "activity": "implement", "step": "analyze" },
  "availableActions": {
    "required": [{ "action": "complete_step", "step": "analyze" }],
    "optional": [{ "action": "get_resource" }],
    "blocked": [{ "action": "transition", "reason": "Activity not complete" }]
  },
  "state": "v1.gzB64...."
}

Issues from an ecological perspective:

  1. Actions are listed without meaning or context
  2. The agent must infer what to do from action names alone
  3. Blocked actions are still presented to the agent (violates perceptual restriction)
  4. Agent capabilities are not modeled
  5. The opaque state token changes completely on each action, so the response exposes no stable properties (no invariants)

Proposed Enhancements

1. Affordance Salience

Embed action meaning directly:

"affordances": [{
  "action": "complete_step",
  "step": "code-review",
  "specifies": "Complete code review for current task",
  "enables": ["test step becomes available"],
  "requiredEffectivities": ["code-review"],
  "skillResources": ["skills/code-review.toon"]
}]
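
An affordance like the one above could be derived mechanically from a step definition. The following is a minimal sketch, assuming the step definition carries hypothetical `description` and `enables` fields and that skill resources follow a `skills/<name>.toon` naming pattern inferred from the example; none of these are confirmed by the current engine.

```python
import json


def to_affordance(step_id: str, step: dict) -> dict:
    """Build a self-describing affordance entry from a step definition."""
    required = list(step.get("requiredEffectivities", []))
    return {
        "action": "complete_step",
        "step": step_id,
        # Assumed field: a human-readable statement of what the action does.
        "specifies": step["description"],
        # Assumed field: what completing this step makes available next.
        "enables": list(step.get("enables", [])),
        "requiredEffectivities": required,
        # Assumed naming pattern, inferred from "skills/code-review.toon".
        "skillResources": [f"skills/{name}.toon" for name in required],
    }


step_definition = {
    "description": "Complete code review for current task",
    "enables": ["test step becomes available"],
    "requiredEffectivities": ["code-review"],
}
affordance = to_affordance("code-review", step_definition)
print(json.dumps(affordance, indent=2))
```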

2. Effectivities Modeling

Skills become workflow-agnostic agent capabilities. A "fit" of "full" means the agent has every required capability:

"agentEffectivities": ["code-review", "test-review", "security-audit"],
"affordances": [{
  "action": "complete_step",
  "requiredEffectivities": ["code-review"],
  "fit": "full"
}]

3. True Perceptual Restriction

Remove blocked actions from response entirely. If an action is invalid, the agent should not perceive it as an option at all.
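
Applied to the current response format, perceptual restriction amounts to dropping the `blocked` list before the response reaches the agent. This is a minimal sketch; the function name `restrict_perception` is hypothetical, and the input mirrors the current navigation response shown earlier.

```python
import copy


def restrict_perception(response: dict) -> dict:
    """Return a copy of a navigation response with blocked actions removed."""
    restricted = copy.deepcopy(response)  # leave the caller's object untouched
    restricted.get("availableActions", {}).pop("blocked", None)
    return restricted


current = {
    "position": {"activity": "implement", "step": "analyze"},
    "availableActions": {
        "required": [{"action": "complete_step", "step": "analyze"}],
        "optional": [{"action": "get_resource"}],
        "blocked": [{"action": "transition", "reason": "Activity not complete"}],
    },
    "state": "v1.gzB64....",
}
restricted = restrict_perception(current)
print(sorted(restricted["availableActions"]))  # ['optional', 'required']
```

The server would still need to reject a blocked action if the agent attempts it anyway; removing it from the response only changes what the agent sees.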

4. Ecological Information / Direct Specification

Provide narrative and next-action guidance:

"situation": {
  "narrative": "Task 2 implementation. Code review step awaits completion.",
  "nextAction": {
    "action": "complete_step",
    "instruction": "Load code-review skill and perform review."
  }
}

5. Invariants

Expose stable patterns across state changes:

"invariants": {
  "workflowId": "work-package",
  "workflowPhase": "implementation",
  "progressRatio": 0.4,
  "tasksRemaining": 5
}
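
One way to decide which fields belong in `invariants` is to compare successive responses and keep only the values that never change. This is a sketch of that idea, not the planned extraction logic; the function name `extract_invariants` and the flat snapshot shape are assumptions.

```python
def extract_invariants(snapshots: list[dict]) -> dict:
    """Keep only the keys whose value is identical in every snapshot."""
    if not snapshots:
        return {}
    first = snapshots[0]
    return {
        key: value
        for key, value in first.items()
        if all(key in snap and snap[key] == value for snap in snapshots[1:])
    }


# Two hypothetical snapshots taken before and after completing a task.
snapshots = [
    {"workflowId": "work-package", "workflowPhase": "implementation", "tasksRemaining": 5},
    {"workflowId": "work-package", "workflowPhase": "implementation", "tasksRemaining": 4},
]
invariants = extract_invariants(snapshots)
print(invariants)  # {'workflowId': 'work-package', 'workflowPhase': 'implementation'}
```

Under this strict definition, counters that change between calls drop out, so the final schema may need to distinguish strictly stable fields from slowly changing progress indicators.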

6. Skill Restructuring

Subsume workflow execution into engine; skills become pure effectivities:

| Current | Proposed |
| --- | --- |
| meta/skills/workflow-execution.toon | Subsumed by navigation engine |
| meta/skills/activity-resolution.toon | Subsumed by navigation engine |
| work-package/skills/code-review.toon | Standalone effectivity skill |
| work-package/skills/test-review.toon | Standalone effectivity skill |

Tasks

Phase 1: Theoretical Grounding

  • Research: Is situated cognition applicable to disembodied agents?
  • Research: Alternative frameworks (distributed cognition, activity theory, HCI)
  • Document: Paradigm translation analysis - what transfers, what doesn't
  • Decide: Adopt, adapt, or develop new theoretical framework

Phase 2: Skill Restructuring

  • Identify skills to subsume into engine (workflow execution)
  • Identify skills that are agent effectivities (code-review, etc.)
  • Define effectivity schema for workflow steps
  • Implement effectivity matching in navigation response

Phase 3: Navigation API Enhancement

  • Design ecological response format specification
  • Implement affordance salience (descriptions, enables/requires)
  • Add effectivities to navigation response
  • Implement true perceptual restriction
  • Add situation narrative generation
  • Add invariants extraction

Phase 4: Validation

  • Test whether changes improve agent behavior
  • Measure: Does "ecological" design reduce workflow violations?
  • Update documentation with theoretical rationale
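
The measurement task above could be made concrete as a violation rate compared across response formats. This is a minimal sketch, assuming a hypothetical attempt log in which each entry records whether the server accepted the agent's action; the log shape and the sample data are illustrative only, not results.

```python
def violation_rate(attempts: list[dict]) -> float:
    """Fraction of attempted actions that the server rejected."""
    if not attempts:
        return 0.0
    violations = sum(1 for attempt in attempts if not attempt["accepted"])
    return violations / len(attempts)


# Illustrative data only: one run per response format.
baseline_run = [
    {"action": "transition", "accepted": False},
    {"action": "complete_step", "accepted": True},
    {"action": "transition", "accepted": False},
    {"action": "complete_step", "accepted": True},
]
ecological_run = [
    {"action": "complete_step", "accepted": True},
    {"action": "complete_step", "accepted": True},
    {"action": "transition", "accepted": False},
    {"action": "transition", "accepted": True},
]
print(violation_rate(baseline_run))    # 0.5
print(violation_rate(ecological_run))  # 0.25
```

A real comparison would need many runs per format and the same workflows and agent configuration on both sides.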

Success Criteria

  • Theoretical foundation is explicit and justified
  • Paradigm translation problems are acknowledged and addressed
  • Workflow execution skills subsumed by engine
  • Remaining skills are workflow-agnostic effectivities
  • Effectivity matching implemented in navigation API
  • Response format directly specifies actions without agent inference
  • Design decisions traceable to chosen theoretical framework
  • Measurable improvement in agent workflow fidelity

References

From Knowledge Base

  • Robbins, P. & Aydede, M. (Eds.) (2008). The Cambridge Handbook of Situated Cognition
  • Clancey, W.J. (1997). Situated Cognition: On Human Knowledge and Computer Representations
  • Gibson and Turvey citations derived from Cambridge Handbook content

Additional (not from KB)

  • Hutchins, E. (1995). Cognition in the Wild (distributed cognition)
  • Suchman, L. (1987). Plans and Situated Actions (HCI perspective)

Implementation

Priority

Low. This is an enhancement to the existing navigation API, and the current implementation is functional. The work requires theoretical grounding before implementation.
