Samuel Victor Miño Arnoso
November 6, 2025
This paper introduces the conceptual architecture for the Autopoietic Specialist-Agent Network (ASAN): a directory-routed, energy-aware multi-agent Mixture-of-Experts architecture. The system is defined by on-demand autopoiesis (self-creation) of specialist agents, a temporary “RAM-mode” for knowledge integration, and a governing meta-agent economy. The architecture is designed for extreme efficiency, scalability, and built-in governance, addressing key challenges in modern large-scale AI.
- Core Concept Summary
- Core Workflow
- Parallels in Current AI Research
- Recursive Intelligence Cascades (“The Pulsing”)
- The Dynamic Principles (The Core Innovation)
5.1 The “RAM Principle”: Temporary Super-Specialization
5.2 “On-Demand Autopoiesis”: The Self-Creating Network
- The “Operating System”: Efficiency & Routing
6.1 The Routing Problem: Hierarchical “Directory Service”
6.2 The Resource Problem: “Sparsity” & Economy
- Concretization & Practical Implementation
7.1 Agent Lifecycle & Rapid Birth (QLoRA)
7.2 Governance, Values & Human Control
7.2.1 Constitutional AI (Values Layer)
7.2.2 Budgeted Autopoiesis (The “Auction House”)
7.2.3 Chaos Monitoring (Cascade TTL)
7.2.4 Human Control (Safe Interruptibility)
7.3 Knowledge Persistence & Cascade Compression
7.4 Implementation Approaches (Technology Stack)
- Meta-Optimization: The “Suggestion Tournament”
8.1 The Principle: Controlled Evolution
8.2 The Process: From Patch to Deployment
8.3 Scaling the Review (Human-AI Collaboration)
8.4 Measurement Plan & Benchmarks (HELM & AgentBench)
8.5 The Ultimate Goal: Human Impact
- The “Holy Shit” Effect: Why This Model is Powerful
- Conclusion
The ASAN model deviates from traditional neural networks.
- Traditional Network: Nodes are simple mathematical functions.
- ASAN Model: Every node is a fully-fledged, intelligent Agent with a specific specialization (e.g., “Color Agent,” “Engine Agent”).
The process is a highly intelligent, dynamic cycle (a minimal sketch follows the list):
- Creation: An “Auto Agent” generates a complex concept.
- Deconstruction: The agent breaks down the concept.
- Intelligent Routing: The agent sends parts only to the relevant specialists.
- Enrichment: The specialists send back “atom-precise” details.
- Re-Integration: The Auto Agent receives an infinitely more detailed understanding.
- Dynamic Growth: A missing specialist is created on demand.
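A minimal sketch of this cycle, assuming hypothetical Directory and Specialist classes; none of these names are part of a reference implementation, they only illustrate the deconstruct, route, enrich, re-integrate loop and on-demand growth.

```python
# Illustrative sketch of the ASAN core cycle: deconstruct -> route -> enrich -> re-integrate.
# All names (Directory, Specialist, process_concept) are hypothetical placeholders.
from dataclasses import dataclass, field


@dataclass
class Specialist:
    domain: str

    def enrich(self, part: str) -> str:
        # A real specialist would return "atom-precise" detail; here we just tag the part.
        return f"[{self.domain}] details for {part!r}"


@dataclass
class Directory:
    specialists: dict = field(default_factory=dict)

    def find_or_create(self, domain: str) -> Specialist:
        # Dynamic growth: if no specialist exists for this domain, create one on demand.
        if domain not in self.specialists:
            self.specialists[domain] = Specialist(domain)
        return self.specialists[domain]


def process_concept(directory: Directory, concept: str, parts: dict) -> dict:
    """Deconstruct a concept, route each part to the relevant specialist, re-integrate."""
    enriched = {}
    for domain, part in parts.items():                    # deconstruction
        specialist = directory.find_or_create(domain)     # intelligent routing + dynamic growth
        enriched[domain] = specialist.enrich(part)        # enrichment
    return {"concept": concept, "details": enriched}      # re-integration


if __name__ == "__main__":
    d = Directory()
    print(process_concept(d, "sports car", {"engine": "Engine X", "color": "Rosso Corsa"}))
```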
The ASAN model combines several cutting-edge concepts:
- Mixture of Experts (MoE): A “router” forwards requests to experts.
- Multi-Agent Systems (MAS): A “swarm” of autonomous AIs negotiates to achieve a goal.
- Marvin Minsky’s “Society of Mind”: Intelligence emerging from the interaction of many small, simple agents.
Agents are not just passive responders. To fulfill a request perfectly, they become active requesters themselves, triggering a “pulsing” cascade.
Example Cascade (a minimal sketch follows the list):
- Request: Auto Agent → Engine Specialist: “Details on Engine X?”
- Gap: Engine Specialist notes: “Casing is made of ‘Special Metal Y’.”
- Recursive Request: Engine Specialist → Color Specialist: “Details on ‘Special Metal Y’?”
- Response Bundling: The Engine Specialist integrates the answer and sends the complete package to the Auto Agent.
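To make the pulsing concrete, here is a minimal sketch in which a specialist detects a knowledge gap, issues its own sub-request, and bundles the result before replying. The knowledge table, specialist names, and the recursion depth limit are illustrative assumptions that mirror the toy example above.

```python
# Sketch of a recursive intelligence cascade ("pulsing"): a specialist becomes an
# active requester itself and bundles the sub-answer into its response.

# Hypothetical knowledge base: (specialist, topic) -> (answer, optional gap).
# The gap names the missing topic and the specialist responsible for it.
KNOWLEDGE = {
    ("engine", "Engine X"): ("V8 biturbo, 530 kW", ("Special Metal Y", "color")),
    ("color", "Special Metal Y"): ("matte grey ceramic coating over a heat-resistant alloy", None),
}


def ask(specialist: str, topic: str, depth: int = 0, max_depth: int = 3) -> str:
    """Answer a request, recursively resolving knowledge gaps up to max_depth."""
    answer, gap = KNOWLEDGE.get((specialist, topic), (f"no data on {topic!r}", None))
    if gap is not None and depth < max_depth:             # gap detected -> recursive request
        sub_topic, sub_specialist = gap
        sub_answer = ask(sub_specialist, sub_topic, depth + 1, max_depth)
        answer = f"{answer}; casing ({sub_topic}): {sub_answer}"   # response bundling
    return answer


if __name__ == "__main__":
    # Auto Agent -> Engine Specialist; the Engine Specialist pulses on to the Color Specialist.
    print(ask("engine", "Engine X"))
```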
- Permanent State (Specialist): Core identity and deep, permanent knowledge.
- Temporary State (Integrator): Agent temporarily fetches knowledge from thousands of others to solve a task.
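A minimal sketch of these two states, assuming a simple in-memory representation; the field names and the release step are illustrative, not a prescribed data model.

```python
# Sketch of the "RAM Principle": permanent specialist identity vs. a temporary
# integration buffer that is filled for one task and cleared afterwards.
from dataclasses import dataclass, field


@dataclass
class SpecialistAgent:
    # Permanent state: core identity and deep, durable knowledge.
    domain: str
    permanent_knowledge: dict = field(default_factory=dict)
    # Temporary state ("RAM mode"): knowledge fetched from other agents for the current task only.
    ram: dict = field(default_factory=dict)

    def integrate(self, source_agent: str, facts: dict) -> None:
        """Temporarily absorb another agent's knowledge for the task at hand."""
        self.ram.setdefault(source_agent, {}).update(facts)

    def release_ram(self) -> None:
        """After task completion, drop the temporary integration; the permanent core remains."""
        self.ram.clear()


if __name__ == "__main__":
    agent = SpecialistAgent("engine", {"Engine X": "V8 biturbo"})
    agent.integrate("materials-agent", {"Special Metal Y": "heat-resistant alloy"})
    # ...solve the task using permanent_knowledge plus ram...
    agent.release_ram()
    assert agent.ram == {} and "Engine X" in agent.permanent_knowledge
```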
- Need Recognition: Agent A needs a specialist, but none exists.
- Creation: Agent A creates this new Agent B.
- Bootstrapping: Agent B becomes active and gathers its initial knowledge.
- Saturation & Idle Mode: Agent B enters a passive idle mode.
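A minimal sketch of this lifecycle, assuming a simple in-process registry; the saturation criterion, state names, and seed facts are illustrative.

```python
# Sketch of on-demand autopoiesis: need recognition -> creation -> bootstrapping -> idle.
from enum import Enum, auto


class AgentState(Enum):
    BOOTSTRAPPING = auto()
    ACTIVE = auto()
    IDLE = auto()


class Agent:
    def __init__(self, domain: str):
        self.domain = domain
        self.state = AgentState.BOOTSTRAPPING
        self.knowledge: list[str] = []

    def bootstrap(self, seed_facts: list[str], saturation: int = 3) -> None:
        """Gather initial knowledge; enter passive idle mode once saturated."""
        self.knowledge.extend(seed_facts)
        self.state = AgentState.IDLE if len(self.knowledge) >= saturation else AgentState.ACTIVE


REGISTRY: dict[str, Agent] = {}


def require_specialist(domain: str, seed_facts: list[str]) -> Agent:
    """Need recognition: return an existing specialist or create and bootstrap a new one."""
    if domain not in REGISTRY:              # no specialist exists -> autopoiesis
        agent = Agent(domain)
        agent.bootstrap(seed_facts)
        REGISTRY[domain] = agent
    return REGISTRY[domain]


if __name__ == "__main__":
    a = require_specialist("special-metal-y", ["alloy composition", "melting point", "suppliers"])
    print(a.domain, a.state)   # special-metal-y AgentState.IDLE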
- Problem: How does an agent find the one specialist in a network of trillions of nodes?
- Solution: A “Directory Service” that is hierarchically structured.
Mechanism (a minimal sketch follows the list):
- Registration: Every new agent registers its capabilities with the Directory.
- Hierarchical Routing: The Directory is organized into “multi-level semantic indices” (Shards).
- Escalation: A request first goes to “nearby” (local/regional) experts. Only if they cannot answer is the request escalated to “more distant” or global specialists. This avoids unnecessary broadcasts.
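A minimal sketch of such a directory, assuming three shard levels (local, regional, global); the level names, key scheme, and lookup order are illustrative.

```python
# Sketch of the hierarchical Directory Service: register capabilities in semantic shards
# and escalate lookups from local -> regional -> global instead of broadcasting.

SHARD_LEVELS = ["local", "regional", "global"]


class Directory:
    def __init__(self):
        # level -> capability -> agent id
        self.shards = {level: {} for level in SHARD_LEVELS}

    def register(self, agent_id: str, capability: str, level: str = "local") -> None:
        """Every new agent registers its capabilities here."""
        self.shards[level][capability] = agent_id

    def resolve(self, capability: str) -> str | None:
        """Escalate: try nearby shards first, only then more distant ones."""
        for level in SHARD_LEVELS:
            agent_id = self.shards[level].get(capability)
            if agent_id is not None:
                return agent_id
        return None   # no specialist found -> candidate for budgeted autopoiesis (see 7.2.2)


if __name__ == "__main__":
    d = Directory()
    d.register("agent-42", "engine:combustion", level="regional")
    print(d.resolve("engine:combustion"))   # agent-42, found without touching the global shard
```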
- Problem: Energy costs and chaos from uncontrolled growth.
- Solution (Informational Recuperation): The network continuously optimizes its own efficiency through the mechanisms below.
Mechanism (Meta-Agents; see the sketch after this list):
- Sparsity (Structural Sparseness): The system maximizes sparsity. Thanks to “Sparsely-Gated” routing (MoE principle), only a minimal fraction of specialists is active per request, so computational cost scales sub-linearly with total capacity.
- Cost-Benefit Analysis: Meta-agents check (as in 7.2) the profitability of new agents.
- Cold/Warm Storage & Auto-Scaling: Rarely used agents are “frozen” (cold state on slow storage). The Directory “wakes” them up On-Demand. Additionally, elastic auto-scaling is used to activate LLM multi-agents only as needed, controlling base load and peak costs.
- Cascade TTL (Time-to-Live): Meta-agents set hard latency and compute budgets to terminate overly expensive cascades (chaos monitoring).
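As one example of these controls, here is a minimal sketch of a cascade budget (TTL) that a meta-agent could attach to a request; the budget values and the exception type are assumptions.

```python
# Sketch of chaos monitoring via cascade budgets (TTL): a meta-agent attaches hard
# latency/hop budgets to each cascade and terminates it when they are exhausted.
import time


class CascadeBudgetExceeded(RuntimeError):
    pass


class CascadeBudget:
    def __init__(self, max_seconds: float, max_hops: int):
        self.deadline = time.monotonic() + max_seconds
        self.hops_left = max_hops

    def charge_hop(self) -> None:
        """Call before every agent-to-agent hop; abort the cascade if the budget is spent."""
        self.hops_left -= 1
        if self.hops_left < 0 or time.monotonic() > self.deadline:
            raise CascadeBudgetExceeded("cascade terminated by meta-agent budget (TTL)")


if __name__ == "__main__":
    budget = CascadeBudget(max_seconds=2.0, max_hops=3)
    try:
        for hop in range(10):          # a runaway cascade
            budget.charge_hop()
    except CascadeBudgetExceeded as err:
        print(err)
```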
- Initialization (Rapid Birth): A new agent is not created from scratch. It is formed by fine-tuning a common base model.
- Efficiency (QLoRA): QLoRA (4-bit quantization + LoRA adapters) is used for extreme resource efficiency, drastically reducing computational and memory requirements for each new agent and massively accelerating learning cycles (see the sketch after this list).
- Error Handling (Reputation & Ensembling):
- Feedback Loop: Agents rate the responses of others.
- Reputation System: A meta-agent monitors the ratings.
- Healing/Deletion: “Sick” agents (poor ratings) are quarantined, forced to re-bootstrap, or deleted.
- Ensembling: For requests, responses from highly-rated agents are prioritized or combined (Ensembling).
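A minimal sketch of such a rapid birth using the Hugging Face transformers, peft, and bitsandbytes libraries, which is one plausible QLoRA stack; the base-model name, target modules, and hyperparameters are placeholders, not a prescribed configuration.

```python
# Sketch of "rapid birth" via QLoRA: the shared base model is loaded once in 4-bit,
# and each new specialist only trains a small LoRA adapter on top of it.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

BASE_MODEL = "your-org/shared-base-model"   # placeholder: any causal LM shared by all agents

# 4-bit quantization keeps the frozen base model small in memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)

# Each "born" specialist is just a lightweight LoRA adapter over the shared base.
adapter_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # assumption: attention projections of the base model
    task_type="CAUSAL_LM",
)
specialist = get_peft_model(base, adapter_config)
specialist.print_trainable_parameters()     # typically well under 1% of the base parameters

# ...fine-tune the adapter on the specialist's bootstrap corpus, then persist only the adapter:
# specialist.save_pretrained("agents/engine-specialist-adapter")
```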
This is the binding control backbone of the system.
- The system operates under a Constitution — clear principles and values that serve as the highest guideline.
- Oversight under the constitution is scaled using Reinforcement Learning from AI Feedback (RLAIF) to pre-select proposals, while humans remain the final decision-makers.
- The constitution is versioned (“versioned constitutions”), allowing for controlled changes to norms per domain or stakeholder without rebuilding the entire agent architecture.
- Agent “births” require approval with a cost-benefit analysis (sketched after this list).
- Agent A submits a “bid” (“I need Specialist X, it’s worth Y units to me”).
- The meta-agent checks (cost-benefit) whether “RAM mode” (see 5.1) is cheaper or if the investment (birth of a new agent via QLoRA) is approved. Global caps prevent chaos.
- Metrics (Currency): Compute units, time (latency), bandwidth.
- Meta-agents use hard budgets to terminate infinite loops or overly expensive cascades.
- Big Red Button: The system must have policy-indifferent interrupts. A human intervention (“Stop!”) must not be learned by the agent as a punishment that it tries to avoid in the future. This must apply cascade-wide.
- Governance: A tamper-proof audit log and the Directory guarantee transparency and control.
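A minimal sketch of the budgeted-autopoiesis check (7.2.2): a meta-agent weighs a birth bid against the cheaper RAM-mode alternative and a global cap. The cost model and numbers are illustrative assumptions.

```python
# Sketch of the "auction house" for agent births, in abstract compute units.
from dataclasses import dataclass


@dataclass
class BirthBid:
    requester: str
    needed_capability: str
    value_to_requester: float      # "I need Specialist X, it's worth Y units to me"
    birth_cost: float              # estimated QLoRA fine-tuning cost
    ram_mode_cost: float           # estimated cost of temporary integration instead


class MetaAgent:
    def __init__(self, global_birth_budget: float):
        self.global_birth_budget = global_birth_budget

    def decide(self, bid: BirthBid) -> str:
        if bid.ram_mode_cost <= bid.birth_cost:
            return "use RAM mode"                   # cheaper to integrate temporarily (see 5.1)
        if bid.birth_cost > self.global_birth_budget:
            return "rejected (global cap)"          # caps prevent uncontrolled growth
        if bid.value_to_requester < bid.birth_cost:
            return "rejected (not profitable)"
        self.global_birth_budget -= bid.birth_cost
        return "approved: birth via QLoRA"


if __name__ == "__main__":
    meta = MetaAgent(global_birth_budget=1000.0)
    bid = BirthBid("auto-agent", "special-metal-y", value_to_requester=80.0,
                   birth_cost=50.0, ram_mode_cost=120.0)
    print(meta.decide(bid))    # approved: birth via QLoRA
```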
- Problem: Knowledge integrated under the RAM Principle (5.1) is only temporary, and fully stateless requests are inefficient.
- Solution (Memory Caching & Distillation; a caching sketch follows the list):
- Inter-Cascade Caching (Snapshots): Agents save “snapshots” (caches) of important temporary integrations (e.g., analysis of “Ferrari F40”). For similar requests (e.g., Ferrari F50), this snapshot is used as a starting point, and only the differences (deltas) are requested.
- Intra-Conversational Caching (File Snapshots): To avoid re-reading large files (like documents or code) multiple times during a single user conversation, agents generate a temporary “File Snapshot” (a compressed summary or index) after the first read. This snapshot is held in high-speed memory (e.g., the Redis Directory) and used for all subsequent follow-up queries, drastically reducing redundant processing.
- Cascade Compression (Distillation): Frequently recurring, successful cascades are identified by the meta-agent and distilled (compressed) into a new, highly efficient “Macro-Agent.”
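A minimal sketch of snapshot caching, assuming a reachable Redis instance and the redis-py client; the key scheme, the TTL, and the delta step are illustrative.

```python
# Sketch of inter-cascade snapshot caching: a finished integration is stored under a
# semantic key, and a similar follow-up request starts from the snapshot and only
# cascades for the differences (deltas).
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
SNAPSHOT_TTL = 3600   # seconds a snapshot stays "warm"; illustrative value


def save_snapshot(topic: str, integration: dict) -> None:
    r.setex(f"snapshot:{topic}", SNAPSHOT_TTL, json.dumps(integration))


def load_snapshot(topic: str) -> dict | None:
    raw = r.get(f"snapshot:{topic}")
    return json.loads(raw) if raw else None


def analyse(topic: str, base_topic: str | None = None) -> dict:
    """Reuse an earlier snapshot as the starting point and request only the differences."""
    snapshot = load_snapshot(base_topic) if base_topic else None
    result = dict(snapshot or {})
    # ...cascade only for the parts not covered by the snapshot (the deltas)...
    result[topic] = f"deltas for {topic} relative to {base_topic or 'scratch'}"
    save_snapshot(topic, result)
    return result


if __name__ == "__main__":
    analyse("Ferrari F40")                                       # full cascade, snapshot stored
    print(analyse("Ferrari F50", base_topic="Ferrari F40"))      # starts from the F40 snapshot
```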
- Agent Base: Containerized Specialists (e.g., Docker) that encapsulate QLoRA-fine-tuned models.
- Operating System: Kubernetes (for managing containers) and a Service Mesh (e.g., Istio) for hierarchical routing.
- Directory: A high-performance Key-Value database (e.g., Redis).
- Communication: Lean, binary protocols (e.g., Protobuf or gRPC) instead of heavy JSON.
This is the process by which the ASAN system improves itself: a controlled, evolutionary, continuous operation under human supervision.
The AI does not “repair” itself during live operation. Instead, it is in a permanent Suggestion Tournament:
- Agents continuously generate improvement proposals (“patches”) for the architecture, routing, governance rules, or the agent models themselves.
- These proposals go into a “pool” (the “raffle drum”).
- Human reviewers examine, accept, or reject these proposals.
- Rejected proposals serve as a training signal to generate better proposals. This is analogous to Population Based Training (PBT, exploit/explore cycles) and POET-like open-ended evolutionary loops that constantly generate new evaluation tasks.
- Suggestion Pool: Agents register patches with metadata (expected effect, cost, etc.).
- Pre-Evaluation (Offline-Benchmark): Quick checks measure the effect on benchmarks (quality, latency, cost, energy) before a human is involved.
- Human-in-the-Loop (Review): Human reviewers check the top proposals, accept, modify, or reject them with justification.
- Evolutionary Variation: Successful patches are cloned and mutated (as in PBT); unsuccessful ones are discarded.
- Deployment Safeguards: Accepted patches are rolled out safely via “Canary Roll-outs” (Canary Tests), with automatic Regression-Abort (Rollbacks) to limit production risks.
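A minimal sketch of this pipeline, assuming a simple offline scoring function as a stand-in for the benchmark pre-evaluation; the patch fields, scoring weights, and approval callback are illustrative.

```python
# Sketch of the Suggestion Tournament: patches enter a pool, are pre-evaluated offline,
# and only the top candidates reach the human reviewer, who remains the final gatekeeper.
from dataclasses import dataclass


@dataclass
class Patch:
    author_agent: str
    description: str
    expected_quality_gain: float    # benchmark delta, higher is better
    expected_cost_delta: float      # compute/energy delta, lower is better
    score: float = 0.0


def pre_evaluate(pool: list[Patch]) -> list[Patch]:
    """Offline benchmark proxy: rank patches before any human is involved."""
    for p in pool:
        p.score = p.expected_quality_gain - 0.5 * p.expected_cost_delta
    return sorted(pool, key=lambda p: p.score, reverse=True)


def human_review(top_patches: list[Patch], approve) -> tuple[list[Patch], list[Patch]]:
    """Humans accept or reject; rejections are kept as a training signal."""
    accepted, rejected = [], []
    for p in top_patches:
        (accepted if approve(p) else rejected).append(p)
    return accepted, rejected


if __name__ == "__main__":
    pool = [
        Patch("router-agent", "cache directory lookups", 0.8, 0.1),
        Patch("governance-agent", "relax birth cap", 0.2, 1.5),
    ]
    ranked = pre_evaluate(pool)[:2]                          # pre-selection (could be RLAIF-assisted)
    accepted, rejected = human_review(ranked, approve=lambda p: p.score > 0)
    print([p.description for p in accepted])                 # next: canary roll-out with auto-rollback
```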
To avoid overloading human reviewers, their feedback is supplemented by AI feedback (RLAIF / d-RLAIF). The AI can pre-select, but the human remains the final gatekeeper.
The success of the system and the patches is measured via clear, standardized metrics (the business metrics are sketched below):
- Agent Capabilities: Integration of AgentBench to measure real, interactive capabilities (OS, Web, Browsing, etc.).
- Holistic Metrics: HELM-like metrics (robustness, bias, toxicity, calibration, efficiency) to make benefits and risks transparent.
- Business Metrics: “Accepted Improvements / Time”, “Quality Gain / kWh”, “Cost / Patch”.
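A minimal arithmetic sketch of the business metrics, using made-up values for a hypothetical reporting period:

```python
# Illustrative computation of the three business metrics named above.
accepted_improvements = 12        # patches accepted in the period (hypothetical)
period_hours = 24 * 7
quality_gain = 3.4                # benchmark points gained in the period (hypothetical)
energy_kwh = 850.0
total_patch_cost = 4200.0         # compute units spent on all evaluated patches (hypothetical)
patches_evaluated = 57

print("Accepted Improvements / Time:", accepted_improvements / period_hours, "per hour")
print("Quality Gain / kWh:", quality_gain / energy_kwh)
print("Cost / Patch:", total_patch_cost / patches_evaluated)
```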
The self-improvement pipeline (“the tournament”) is explicitly focused on generating proposals that measurably accelerate human research, creativity, and problem-solving. The Human-in-the-Loop remains the final instance for accepting or rejecting system patches, supported by scalable AI feedback for pre-filtering.
- Extreme Efficiency: Through sparsity, idle mode, cold/warm storage, QLoRA, auto-scaling, and cascade compression.
- Infinite Scalability: The network grows organically and under controlled budgeted autopoiesis.
- True Depth: Agents become atom-precise specialists.
- Emergent Knowledge: Intelligence arises from interaction and self-optimization (PBT/POET).
- Controllability: The system is designed for safety from the ground up via Meta-Agents (governance), Constitutional AI (values), and Safe Interruptibility (human control).
- Measurability: The system is validated by hard benchmarks (AgentBench, HELM).
This refined idea is no longer just a “Mixture of Experts.” It is an autopoietic (self-creating), decentralized multi-agent system with recursive cascades and hierarchical routing. It is driven by QLoRA-based autopoiesis and controlled by strict governance (Meta-Agents) and a versionable Constitution (Constitutional AI). It improves itself through a controlled, evolutionary Suggestion Tournament (see Section 8), which scales human oversight (Human-in-the-Loop) with AI feedback (RLAIF) and is measured against standardized benchmarks (AgentBench, HELM). This is not fantasy; it is a robust, technical blueprint for the next generation of AI — biologically inspired, highly efficient, and inherently safe.
The risk of a Skynet-like scenario is lower than in uncontrolled systems because safety is built in: ASAN explicitly provides Human-in-the-Loop control, Constitutional AI, and Safe Interruptibility (e.g., the Big Red Button), all of which limit recursive self-improvement. Nevertheless, autopoiesis and distributed agents create a “virus potential” if safeguards such as the meta-agent economy or the Suggestion Tournament are circumvented, especially in cases of misuse in military contexts, which the author explicitly rejects. The Ethical Declaration and versioned constitutions minimize these risks, making ASAN one of the safer designs as long as human oversight is maintained.
This conceptual work is made available under the Creative Commons Attribution 4.0 International License (CC-BY-4.0).
You are free to:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made.
This is a human-readable summary. The full legal code can be found at: https://creativecommons.org/licenses/by/4.0/legalcode
Intent and Purpose:
This conceptual work, the Autopoietic Specialist-Agent Network (ASAN), was created solely for peaceful, civilian, and beneficial applications of artificial intelligence. The architecture was designed to advance:
- Scientific research and knowledge discovery
- Healthcare and medical diagnostics
- Climate modeling and environmental protection
- Educational systems and accessibility
- Economic optimization and resource efficiency
Explicit Non-Military Declaration
The author explicitly declares:
- ASAN was NOT designed, intended, or conceptualized for military, defense, or weapons applications of any kind.
- The author strongly opposes and condemns any use of this architecture for:
- Autonomous weapons systems (AWS/LAWS)
- Military command and control systems
- Lethal autonomous targeting or decision-making
- Surveillance systems for oppression or human rights violations
- Any application that could result in harm to human life
- The author disclaims all responsibility for any adaptation, implementation, or deployment of ASAN concepts in military or harmful contexts. Any such use is undertaken entirely at the risk and responsibility of those who choose to pursue it, against the explicit wishes and intentions documented here.
Recognized Risks
The author acknowledges that ASAN’s architecture includes characteristics that, if misused, could pose existential risks similar to those discussed in AI safety research:
- Recursive self-improvement capabilities
- Autonomous agent creation (autopoiesis)
- Distributed, resilient network architecture
- Meta-agent governance systems
These features were designed with Constitutional AI, Human-in-the-Loop controls, and safe interruptibility as fundamental safety mechanisms. Any implementation that removes or circumvents these safety features is a violation of the intended design philosophy.
Appeal to Implementers
If you are considering implementing ASAN or derivatives thereof:
- Please prioritize safety, ethics, and human wellbeing above all technical capabilities
- Engage with the AI safety community (e.g., AI Alignment Forum, Future of Life Institute, Partnership on AI) before any large-scale implementation
- Implement robust governance structures that maintain meaningful human control
- Never deploy in military, surveillance, or oppressive contexts
Limitation of Liability
While this work is licensed under Creative Commons Attribution 4.0 International (CC-BY 4.0), which legally permits commercial and unrestricted use, the author makes this ethical declaration to establish a moral and historical record:
I, the author, created this conceptual architecture for beneficial purposes only. I did not envision, intend, or desire its use in harmful applications. I bear no moral responsibility for misuse by others, and I call upon the global community to prevent the militarization of autopoietic AI systems.
Support for International Regulation
The author supports:
- International treaties regulating or prohibiting lethal autonomous weapons systems
- Mandatory human oversight requirements for critical AI decision-making
- Transparency and audit requirements for large-scale AI deployments
- Ethics review boards for advanced AI research and development
Contact for Ethical Concerns
If you have information about potential misuse of ASAN concepts or wish to discuss ethical implementation approaches, please contact the author or engage with established AI safety organizations.
Date of Declaration: November 3, 2025
(Earlier pre-versions of this declaration were published on November 3, 4, and 5, 2025; it is repeated here on November 6, 2025.)
Note: This declaration is made in good faith to establish the author’s intent and ethical stance. While not legally binding under the CC-BY 4.0 license terms, it serves as a permanent record of the intended purpose and the author’s opposition to harmful applications.