9 changes: 8 additions & 1 deletion .vitepress/config.mts
@@ -20,7 +20,14 @@ export default defineConfig({
  {
    text: 'OP SIG',
    items: [
      { text: 'README', link: '/'},
      { text: 'Initiatives', link: '/initiatives/'},
    ]
  },
  {
    text: 'Active Initiatives',
    items: [
      { text: 'OP-001: Initial Setup', link: '/initiatives/op-001'},
      { text: 'OP-002: Nostr-Based Network Coordination', link: '/initiatives/op-002'},
    ]
  }
],
76 changes: 65 additions & 11 deletions README.md
@@ -1,27 +1,81 @@
---
title: OP SIG (Operations & Performance)
title: OP SIG
outline: deep
lastUpdated: true
---

# OP SIG — HyperCore One
*Operations and DevOps for Zenon Network*

## Contact & Meetings
- **Matrix:** `#sig-op:hc1.chat`

- **Matrix:** https://matrix.to/#/#sig-op:hc1.chat
- **Chair:** @deeznnutz:zenon.chat
- **Meetings:** Ad-hoc
- **Meetings:** Ad-hoc (announced in communication channels)

All contributors are welcome to join discussions and participate in meetings. Notes and decisions will be published in this repository.

## Overview

The OP SIG operates under the **HyperCore One (HC1)** team as part of the broader effort to strengthen the **Zenon Network** ecosystem through structured open collaboration.

The **Operations Special Interest Group (OP SIG)** focuses on operational tooling, deployment automation, and monitoring infrastructure for Zenon Network validators and associated infrastructure.

This SIG provides a structured environment for open collaboration, design discussions, issue triage, and roadmap alignment for operational initiatives supporting network reliability and observability.

## Mission

To provide operational infrastructure and DevOps tooling for the **Zenon Network** ecosystem, ensuring reliable validator operations, network monitoring, and community infrastructure through transparent development and documented best practices.

## Scope

The OP SIG focuses on **operational tooling and DevOps infrastructure** for the Zenon Network.

## Scope & Mandate
**In scope:**
- Pillar (Validator) deployment automation and configuration management
- Monitoring and alerting infrastructure for network health
- Bridge supporting tooling (health monitoring, observability)
- Community infrastructure (forums, bots, Telegram integrations)
- Infrastructure-as-code and automation scripts
- Operational documentation, runbooks, and deployment guides
- Network health dashboards and metrics collection

### TODO
**Out of scope:**
- Core Zenon protocol modifications (handled by other SIGs)
- Wallet UI/UX changes (handled by Syrius SIG)
- Application-level features and smart contract development

## Owned Repos
## Governance

The OP SIG is an **open working group** under HC1's governance structure.
All contributors may propose, review, and discuss changes through GitHub issues or pull requests.

Governance follows the guidelines and decision making framework established by the [SDLC SIG](https://github.com/hypercore-one/sig-sdlc).

Copilot AI Nov 15, 2025

The compound adjective "decision making" should be hyphenated when modifying "framework". It should be "decision-making framework".

Suggested change
Governance follows the guidelines and decision making framework established by the [SDLC SIG](https://github.com/hypercore-one/sig-sdlc).
Governance follows the guidelines and decision-making framework established by the [SDLC SIG](https://github.com/hypercore-one/sig-sdlc).

These include expectations for participation, proposal lifecycle, and consensus-driven approvals.

- **Decision Process:** Defined according to [SDLC SIG Governance](https://github.com/hypercore-one/sig-sdlc).
- **Code of Conduct:** Inherits from the [HyperCore One community guidelines](https://github.com/hypercore-one/.github/blob/master/CODE_OF_CONDUCT.md).

## Owned Repositories

| Repository | Description |
|-------------|--------------|
| [`hypercore-one/sig-op`](https://github.com/hypercore-one/sig-op) | OP SIG documentation and coordination hub. |
| [`hypercore-one/qube-manager`](https://github.com/hypercore-one/qube-manager) | Nostr-based deployment manager for hyperqube nodes. |
| [`hypercore-one/bridge-health`](https://github.com/hypercore-one/bridge-health) | Bridge health monitoring and observability tooling. |
| [`hypercore-one/zenon-node-monitor`](https://github.com/hypercore-one/zenon-node-monitor) | Zenon node monitoring and alerting infrastructure. |


## Related Repositories

| Repository | Description |
|-------------|-------------|
| [`zenon-network/go-zenon`](https://github.com/zenon-network/go-zenon) | Zenon Network core implementation. |
| [`coinselor/qubestr`](https://github.com/coinselor/qubestr/) | Hyperqube-related tooling. |

| Repo | Purpose |
|---|---|
| [`hypercore-one/qube-manager`](https://github.com/hypercore-one/qube-manager) | Nostr based deployment manager for hyperqube nodes |

## Active Initiatives

| ID | Title | Description |
|---:|---|---|
These are the current working streams or major efforts under the OP SIG. They provide focus, allow contributors to pick up tasks with clear orientation, and can evolve over time.

**[View all active initiatives →](/initiatives)**
27 changes: 27 additions & 0 deletions initiatives/README.md
@@ -0,0 +1,27 @@
# Active Initiatives

This section tracks the current working streams and major efforts under the OP SIG.

## Current Initiatives

| ID | Title | Status | Description |
|---:|---|---|---|
| [OP-001](/initiatives/op-001) | Initial Setup | In progress | Establish the foundational structure, documentation, and workflows for the OP SIG repository to enable collaborative development and transparent governance. |
| [OP-002](/initiatives/op-002) | Nostr-Based Network Coordination | In progress | Develop a trustless, Nostr-based coordination system for decentralized validator operations, testnet deployment, and sidechain management using qube-manager and qubestr. |

## Initiative Lifecycle

Initiatives follow the HC1 governance process:

1. **Proposed** - Initiative is drafted and under discussion
2. **In Progress** - Active development and implementation
3. **Completed** - Deliverables achieved and documented
4. **On Hold** - Temporarily paused, with clear resumption criteria

## Contributing

To propose a new initiative:
1. Open a GitHub issue with the initiative template
2. Discuss in Matrix (#sig-op:hc1.chat)
3. Present at a SIG meeting if applicable
4. Follow the SDLC SIG governance process for approval
45 changes: 45 additions & 0 deletions initiatives/op-001.md
@@ -0,0 +1,45 @@
# OP-001: Initial Setup

**Status:** In progress

## Overview

Establish the foundational structure, documentation, and workflows for the OP SIG repository to enable collaborative development and transparent governance.

## Goals

Create a complete organizational framework that allows contributors to:
- Understand the SIG's mission, scope, and governance
- Discover active initiatives and participate in discussions
- Follow consistent documentation patterns aligned with other HC1 SIGs
- Access clear contribution guidelines and communication channels

## Main Deliverables

1. **Documentation Structure**
   - Complete README.md with Overview, Mission, Scope, and Governance sections
   - CLAUDE.md file for AI-assisted development guidance
   - Initiative tracking framework with dedicated pages

2. **VitePress Configuration**
   - Configured site with proper navigation and theming
   - GitHub Pages deployment pipeline
   - Rewrite rules for clean URL routing
   - Local search for easy access

3. **Repository Organization**
   - Clear folder structure for initiatives and documentation
   - Consistent formatting aligned with Syrius SIG patterns
   - Edit links to GitHub source for collaborative editing

4. **Communication Channels**
   - Matrix channel setup and documentation
   - Meeting schedule and participation guidelines
   - Link to HC1 governance frameworks and Code of Conduct
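
As a rough illustration of the VitePress Configuration deliverable, the options below show how clean URLs, README-to-index rewrites, local search, and edit links are typically expressed in a VitePress config. This is a sketch, not the repository's actual `.vitepress/config.mts`: the rewrite mapping, the `master` branch in the edit-link pattern, and the link text are assumptions, and in a real config the object would be passed to `defineConfig(...)` and default-exported.

```typescript
// Sketch only: options that would sit inside defineConfig({...}).
// The rewrite targets and the "master" branch name are assumptions.
const siteOptions = {
  // Serve /initiatives/op-001 instead of /initiatives/op-001.html.
  cleanUrls: true,
  lastUpdated: true,
  // Let README.md files act as directory index pages.
  rewrites: {
    "README.md": "index.md",
    "initiatives/README.md": "initiatives/index.md",
  },
  themeConfig: {
    // Built-in client-side search, no external service required.
    search: { provider: "local" },
    // ":path" is replaced by VitePress with the page's source path.
    editLink: {
      pattern: "https://github.com/hypercore-one/sig-op/edit/master/:path",
      text: "Edit this page on GitHub",
    },
  },
};
```

Mapping `README.md` to `index.md` keeps the files readable when browsing the repository on GitHub while still producing `/` and `/initiatives/` routes on the published site.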

## Success Criteria

- Documentation is complete, accurate, and follows HC1 SIG patterns
- Site successfully deploys to GitHub Pages
- Contributors can easily discover how to participate
- Foundation is ready for operational initiative work
199 changes: 199 additions & 0 deletions initiatives/op-002.md
@@ -0,0 +1,199 @@
# OP-002: Nostr-Based Network Coordination for Testnet & Sidechain Deployment

**Status:** In progress

## Overview

Develop and deploy a trustless, Nostr-based coordination system that enables decentralized management of validator operations across testnets and sidechains. The system uses multi-signature quorum voting to coordinate network-wide actions such as upgrades, reboots, and genesis deployments without requiring centralized command-and-control infrastructure.

This initiative encompasses two core components:
- **qube-manager**: A Go client that runs on each Pillar as a service to receive, validate, and execute coordinated actions
- **qubestr**: A specialized Nostr relay that manages custom event types for infrastructure coordination

## Goals

1. **Production-Ready Coordination Infrastructure**
   - Complete feature development for qube-manager and qubestr with error handling and edge case coverage
   - Implement comprehensive testing (unit, integration, and end-to-end tests) with >80% code coverage
   - Add monitoring and observability (metrics collection, logging, alerting) for both relay and client components
   - Create comprehensive documentation including deployment guides, API specifications, and troubleshooting runbooks

2. **Automated Testnet Deployment**
   - Create workflows for spinning up new testnets with coordinated genesis deployment
   - Enable synchronized network launches across distributed validators
   - Support multiple concurrent testnet environments

3. **Trustless Multi-Signature Governance**
   - Enable quorum-based decision-making without central authority
   - Provide cryptographically verifiable command chains
   - Create transparent, auditable governance processes

## Technical Architecture

### Nostr Event-Based Coordination

The system leverages custom Nostr event kinds for validator coordination:

**Kind 33321 (HyperSignal)**: Action directives published by authorized HC1 developers
- Tags specify action type (upgrade/reboot), version, hash, network, and genesis URL
- Only authorized pubkeys can publish HyperSignal events (enforced at relay level)
- Addressable events ensure that the latest directive supersedes earlier ones for the same address

**Kind 3333 (QubeManager)**: Action acknowledgements published by validator nodes
- Reports execution status (success/failure) with timestamps
- Links back to originating HyperSignal event for auditability
- Creates a complete execution trail across the validator fleet
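
To make the two event kinds more concrete, here is a minimal TypeScript sketch of how a HyperSignal directive could be read from its tags and how an acknowledgement body could point back to it. The event envelope follows the standard Nostr event fields, but the tag names (`action`, `version`, `hash`, `network`, `genesis_url`, `status`) and the use of an `e` tag for the back-link are assumptions made for illustration; the authoritative schema is whatever qubestr validates.

```typescript
// Illustrative only: tag names are hypothetical, not the qubestr schema.
type Tag = string[];

interface NostrEvent {
  id: string;
  pubkey: string;
  created_at: number; // Unix time in seconds
  kind: number;
  tags: Tag[];
  content: string;
  sig: string;
}

const KIND_HYPERSIGNAL = 33321; // directive from an authorized developer
const KIND_QUBEMANAGER = 3333;  // acknowledgement from a validator node

interface Directive {
  action: string;
  version: string | undefined;
  hash: string | undefined;
  network: string | undefined;
  genesisUrl: string | undefined;
}

// Return the first value of the named tag, or undefined if absent.
function tagValue(tags: Tag[], name: string): string | undefined {
  const tag = tags.find((t) => t[0] === name);
  return tag?.[1];
}

// Extract a directive from a HyperSignal event; null if it is not one.
function parseDirective(ev: NostrEvent): Directive | null {
  if (ev.kind !== KIND_HYPERSIGNAL) return null;
  const action = tagValue(ev.tags, "action");
  if (action !== "upgrade" && action !== "reboot") return null;
  return {
    action,
    version: tagValue(ev.tags, "version"),
    hash: tagValue(ev.tags, "hash"),
    network: tagValue(ev.tags, "network"),
    genesisUrl: tagValue(ev.tags, "genesis_url"),
  };
}

// Build the unsigned body of an acknowledgement that links to the signal.
function buildAck(
  signal: NostrEvent,
  nodePubkey: string,
  ok: boolean,
  now: number,
): Omit<NostrEvent, "id" | "sig"> {
  return {
    pubkey: nodePubkey,
    created_at: now,
    kind: KIND_QUBEMANAGER,
    tags: [
      ["e", signal.id],
      ["status", ok ? "success" : "failure"],
    ],
    content: "",
  };
}
```

Computing the event `id` and signing are left out on purpose; a Nostr library would normally handle both.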

### Multi-Signature Verification

Validators independently:
1. Subscribe to HyperSignal events from authorized HC1 developer pubkeys
2. Aggregate votes across multiple developers for each proposed action
3. Verify quorum threshold is met (configurable, e.g., 3-of-5)
4. Execute approved actions only when consensus is reached
5. Publish acknowledgement events to create an auditable trail
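
The aggregation and threshold steps above can be sketched as a small tally function. This is a simplified illustration rather than qube-manager's actual decision engine: it assumes event signatures have already been verified, and the `Vote` shape and the idea of an `actionKey` that identifies identical proposals are hypothetical.

```typescript
// Sketch under assumptions: signatures are verified before this point,
// and actionKey uniquely identifies a proposed action
// (for example "upgrade|v1.2.3|<hash>|testnet").
interface Vote {
  pubkey: string;    // signer of the HyperSignal event
  actionKey: string; // identical proposals share the same key
}

function approvedActions(
  votes: Vote[],
  authorized: Set<string>,
  threshold: number,
): string[] {
  // actionKey -> distinct authorized signers backing it
  const signers = new Map<string, Set<string>>();
  for (const vote of votes) {
    if (!authorized.has(vote.pubkey)) continue; // ignore unknown keys
    let backing = signers.get(vote.actionKey);
    if (backing === undefined) {
      backing = new Set<string>();
      signers.set(vote.actionKey, backing);
    }
    backing.add(vote.pubkey); // a Set counts each signer only once
  }
  const approved: string[] = [];
  signers.forEach((backing, actionKey) => {
    if (backing.size >= threshold) approved.push(actionKey);
  });
  return approved;
}
```

Counting distinct signers rather than events means a developer who republishes the same directive cannot push a proposal over the threshold alone. With a 3-of-5 configuration, an action backed by three different authorized keys is approved, while one backed by two authorized keys plus an unknown key is not.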

### Key Components

**qube-manager** (https://github.com/hypercore-one/qube-manager):
- Multi-relay subscription manager
- Cryptographic signature verification
- Quorum-based decision engine
- Idempotent action execution with history tracking
- Parallel relay publishing for fault tolerance
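
Idempotent execution with history tracking can be pictured as "run each approved directive at most once, keyed by its event id". The class below is a minimal in-memory sketch of that idea; the names are hypothetical, and a real client would need to persist the history across restarts.

```typescript
// Sketch only: in-memory history; names are illustrative.
type ActionResult = "executed" | "skipped";

class ActionHistory {
  private done = new Set<string>();

  // Run the action unless this signal id has already been handled.
  runOnce(signalId: string, action: () => void): ActionResult {
    if (this.done.has(signalId)) return "skipped";
    action();
    // Record only after the action returns, so a thrown error leaves
    // the id unrecorded and the action can be retried later.
    this.done.add(signalId);
    return "executed";
  }

  has(signalId: string): boolean {
    return this.done.has(signalId);
  }
}
```

Because relays may deliver the same event more than once, and a client may be subscribed to several relays, keying on the event id keeps repeated deliveries from triggering repeated upgrades or reboots.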

**qubestr** (https://github.com/coinselor/qubestr):
- Custom Nostr relay built on the Khatru framework
- NIP-42 authentication enforcement
- Authorization via pubkey whitelist
- PostgreSQL event persistence
- Strict event validation
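
The relay-side rules listed above (authentication, whitelist authorization, strict validation) amount to a publish policy. The function below is a language-neutral sketch of such a policy written in TypeScript; it is not qubestr's code, which is built on Khatru in Go. The rule that any authenticated client may publish Kind 3333 acknowledgements is an assumption.

```typescript
// Sketch only: not the qubestr implementation. Who may publish
// acknowledgements (kind 3333) is an assumption made for illustration.
const HYPERSIGNAL_KIND = 33321;
const ACK_KIND = 3333;

interface IncomingEvent {
  kind: number;
  pubkey: string;
}

interface PublishDecision {
  accept: boolean;
  reason: string; // machine-readable prefix, as used in relay OK messages
}

function checkPublish(
  ev: IncomingEvent,
  authedPubkey: string | null, // pubkey proven via NIP-42, if any
  whitelist: Set<string>,
): PublishDecision {
  if (authedPubkey === null) {
    return { accept: false, reason: "auth-required: authenticate before publishing" };
  }
  if (ev.kind !== HYPERSIGNAL_KIND && ev.kind !== ACK_KIND) {
    return { accept: false, reason: "blocked: unsupported event kind" };
  }
  if (ev.kind === HYPERSIGNAL_KIND && !whitelist.has(ev.pubkey)) {
    return { accept: false, reason: "restricted: pubkey may not publish directives" };
  }
  return { accept: true, reason: "" };
}
```

Enforcing the whitelist at the relay keeps unauthorized directives out of storage, while validators still check signers independently, so a compromised relay alone cannot trigger an action.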

## Main Deliverables

### 1. Core System Hardening

**Testing & Quality Assurance:**
- Integration testing between qube-manager and qubestr
- Multi-node coordination testing with various quorum configurations
- Failure scenario testing (network partitions, relay failures, etc.)

**Monitoring & Observability:**
- Metrics collection for event processing and action execution
- Fleet-wide status dashboard showing validator coordination state
- Alerting for failed actions or quorum failures
- Event history queries and reporting tools
- Performance monitoring for relay and client components

**Documentation:**
- Technical architecture documentation
- API specifications for custom Nostr event kinds
- Security model and threat analysis
- Deployment guides for both components

### 2. Testnet Deployment Workflows

**Orchestration Tooling:**
- Automated genesis file generation and distribution
- Coordinated network launch scripts using reboot actions
- Multi-testnet management (parallel testnets with different configs)
- Validator onboarding automation
- Network health verification post-deployment

**Genesis Coordination:**
- Reboot action workflows with genesis URL distribution
- Deadline-based synchronized network launches
- Genesis parameter templates for common testnet scenarios
- Rollback procedures for failed deployments

**Deployment Runbooks:**
- Step-by-step testnet launch procedures
- Sidechain deployment guides
- Emergency procedures for network coordination failures
- Validator setup and configuration guides

### 3. Multi-Signature Coordination Workflows

**Upgrade Coordination:**
- Binary hash verification implementation
- Version comparison and upgrade path validation
- Rollback capabilities on upgrade failures
- Staged rollout support (partial fleet upgrades)
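
As one possible shape for the version comparison item, the sketch below compares two version strings and reports whether a target is newer than what is currently installed. It assumes versions look like `v1.2.3` or `1.2.3` (three numeric parts, optional `v` prefix); the actual versioning scheme and function names are not specified in this document.

```typescript
// Sketch only: assumes MAJOR.MINOR.PATCH versions with an optional "v".
function parseVersion(v: string): number[] | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (m === null) return null;
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Returns -1 if a < b, 0 if equal, 1 if a > b.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  if (pa === null || pb === null) {
    throw new Error("invalid version string");
  }
  for (let i = 0; i < 3; i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}

// An upgrade directive is only meaningful if the target is newer.
function isUpgrade(current: string, target: string): boolean {
  return compareVersions(current, target) < 0;
}
```

Comparing numerically matters: a plain string comparison would rank `v0.0.9` above `v0.0.10`. Rejecting malformed versions outright, rather than guessing, keeps a typo in a directive from being treated as a valid upgrade path.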

**Governance Documentation:**
- Quorum configuration best practices
- Developer key management procedures
- Authorization policy documentation
- Incident response procedures

**Operator Tools:**
- CLI tools for monitoring fleet status
- Event query utilities for audit trails
- Validator acknowledgement tracking
- Action history analysis tools

### 4. Production Deployment

**Infrastructure Setup:**
- Production relay deployment with redundancy
- PostgreSQL clustering for event persistence
- TLS/SSL configuration for secure WebSocket connections
- Backup and disaster recovery procedures

**Validator Integration:**
- qube-manager deployment automation
- Configuration management for multi-network support
- Key generation and secure storage
- Integration with existing validator infrastructure

## Timeline & Phases

### Phase 1: Core System Completion (Current)
- Complete qube-manager Kind 33321/3333 event support
- Production-harden qubestr relay
- Implement monitoring and metrics collection
- Complete integration testing

### Phase 2: Testnet Automation
- Build genesis deployment workflows
- Document deployment procedures

### Phase 3: Production Deployment & HC1 Internal Testing
- Deploy production relay infrastructure (qubestr) with redundancy and monitoring
- Deploy qube-manager as a service on HC1 HQZ Pillars
- Configure authorization with HC1 developer pubkeys
- Conduct internal testing with HC1 team:
  - Test upgrade coordination workflows
  - Test reboot/genesis deployment coordination
  - Validate multi-signature quorum functionality
  - Verify monitoring and alerting systems
- Refine operational procedures based on HC1 testing feedback

### Phase 4: Controlled External Rollout
- Publish comprehensive operator documentation and setup guides
- Open beta testing to a select group of external Pillar operators
- Provide onboarding support and technical assistance
- Monitor coordination performance across diverse validator setups
- Gather feedback on usability and operational challenges
- Iterate on documentation and tooling based on external feedback
- Establish community governance procedures

### Phase 5: General Availability & Ecosystem Expansion
- Announce general availability to all Pillar operators
- Scale relay infrastructure to support a growing validator set
- Support multiple concurrent testnets and sidechains
- Implement advanced features (staged rollouts, conditional actions, etc.)
- Continuous monitoring, optimization, and incident response

## Dependencies

- **go-zenon**: Core Zenon implementation for validator nodes
- **Nostr ecosystem**: NIP specifications and go-nostr library
- **PostgreSQL**: Event persistence for relay
- **Docker**: Containerized deployment infrastructure

## Related Repositories

- [`hypercore-one/qube-manager`](https://github.com/hypercore-one/qube-manager)
- [`coinselor/qubestr`](https://github.com/coinselor/qubestr/)
- [`zenon-network/go-zenon`](https://github.com/zenon-network/go-zenon)