Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
# Release Notes

## [2.1.1] - 2026-02-25
- **Sandbox**: Increase agent sandbox execution timeout to 240 seconds
- **Integration**: Vericore API integration

## [2.1.0] - 2026-02-09
- **Bittensor Upgrade**: Upgraded to Bittensor version 10.1.0
- **Bittensor CLI Upgrade**: Upgraded to Bittensor CLI version 9.18.0
10 changes: 5 additions & 5 deletions README.md
@@ -1,18 +1,18 @@
<div align="center">

# **Numinous**



[Discord](https://discord.gg/qKPeYPc3) • [Dashboard](https://app.hex.tech/1644b22a-abe5-4113-9d5f-3ad05e4a8de7/app/Numinous-031erYRYSssIrH3W3KcyHg/latest) • [Website](https://numinouslabs.io/) • [Twitter](https://x.com/numinous_ai) •
[Network](https://taostats.io/subnets/6/chart)
---

</div>

## Introduction

Numinous (Subnet 6) is a **forecasting protocol** whose goal is to aggregate agents into **superhuman LLM forecasters**. The key principle is that instead of scoring predictions ($f(X)$) the subnet scores the underlying agentic models ($X$).


Miners send forecasting agents which are subsequently evaluated by validators in sandboxes with access to a curated set of tools and data. **Agent execution and code are entirely visible to the subnet protocol.**
@@ -37,7 +37,7 @@ Validators spin up parallel sandboxes where miners are evaluated on batches of e
### Key Components

* **The Sandbox:** Isolated execution environment with strict resource limits.
* **The Gateway:** A signing proxy allowing agents to access **Chutes (SN64)** for compute, **Desearch (SN22)** for live data, **OpenAI** for GPT-5 models, and **Vericore** for statement verification without exposing validator keys.
* **Forecasting logic:** Agents execute once per event; only agents that were registered prior to broadcasting execute.

📖 **[Read the full system architecture](docs/architecture.md)**
@@ -50,7 +50,7 @@ To survive in the Numinous arena, agents must adhere to strict constraints. Viol

### Execution Rules

1. **Timeout:** Execution must complete within **240 seconds**.
2. **Cost:** API usage limits depend on each service and are paid by the miner.
3. **Caching:** Do not use dynamic timestamps or random seeds in prompts. This would break our caching system, making agent executions differ between validators.
4. **Activation:** Code submitted before **00:00 UTC** activates the following day. You can update your code at most once every 3 days.
Expand Down
2 changes: 1 addition & 1 deletion docker-compose.prd.yaml
@@ -31,7 +31,7 @@ services:

# Production configuration
command: >
bash -c "python neurons/validator.py --netuid 6 --subtensor.network finney --wallet.name ifkey --wallet.hotkey ifhkey --db.directory /root/infinite_games/database --numinous.env prod --sandbox.max_concurrent 50 --sandbox.timeout_seconds 240 --validator.sync_hour 0 --logging.debug"

logging:
driver: "json-file"
2 changes: 1 addition & 1 deletion docker-compose.validator.yaml
@@ -30,7 +30,7 @@ services:
- HOST_WALLET_PATH=${HOST_WALLET_PATH:-${HOME}/.bittensor/wallets}

command: >
bash -c "python neurons/validator.py --netuid 6 --subtensor.network finney --wallet.name ${WALLET_NAME} --wallet.hotkey ${WALLET_HOTKEY} --db.directory /root/infinite_games/database --numinous.env prod --sandbox.max_concurrent 50 --sandbox.timeout_seconds 240 --logging.debug"

logging:
driver: "json-file"
24 changes: 12 additions & 12 deletions docs/architecture.md
@@ -59,7 +59,7 @@ Validators continuously:
- Fetch new prediction events
- Download and execute miner agent code in sandboxes
- Calculate average Brier scores upon event resolution
- Update subnet weights on the Bittensor chain

**Process Flow:**
@@ -76,7 +76,7 @@ The validators spin up 50 parallel sandboxes where 50 miners are evaluated on th

Agents run in isolated Docker containers with:
- No internet access
- 240s execution timeout
- Limited CPU/memory
- Access to a defined set of external APIs via a signing proxy
- Cost limits that depend on each service (paid by miner)
@@ -161,25 +161,25 @@ For a binary event $E_q$, an agent $i$ sends a prediction $p_i$ for the probabil
- $o_q = 0$ otherwise.

The Brier score $S(p_i, o_q)$ for the prediction is given by:
- **If $o_q = 1$:**

$$S(p_i, 1) = (1 - p_i)^2$$

- **If $o_q = 0$:**

$$S(p_i, 0) = p_i^2.$$

The lower the score the better. This strictly proper scoring rule incentivizes miners to report their true beliefs.

## Scoring Process

1. A batch of binary events resolves
2. We calculate the Brier score for each miner's prediction
3. We average the Brier scores across all the events in the batch
4. Winner-take-all: the miner with the lowest Brier score on one batch gets all the rewards
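The four steps above can be sketched as follows (the function and the miner ids are illustrative, not the subnet's actual API):

```python
def score_batch(predictions: dict[str, list[float]], outcomes: list[int]) -> str:
    """Average each miner's Brier scores over a resolved batch and
    return the winner: the miner with the lowest average score."""
    avg_scores = {
        miner: sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(outcomes)
        for miner, preds in predictions.items()
    }
    return min(avg_scores, key=avg_scores.get)
```

For instance, on outcomes `[1, 0]`, a miner predicting `[0.9, 0.2]` averages 0.025 and beats one predicting `[0.6, 0.4]`, who averages 0.16.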

**Window based Scoring** Event batches span 3 days and are generated daily, each containing approximately 100 events. The score of a miner at any given time is a function of the latest event batch that resolved. The immunity period lasts 7 days, so a newly registered miner is scored only once within the immunity period.

**Spot scoring** We only consider one prediction per miner. In the future, as network capacity improves, we might move to a scoring scheme that weights multiple predictions per miner. **Currently, only agents which were activated prior to a given event being broadcast will forecast this event.** This means that, for a given event, all the miners who forecasted it did so at roughly the same time.

---

@@ -210,7 +210,7 @@ def agent_main(event_data: dict) -> dict:
## Constraints

- Max code size: 2MB
- Execution timeout: 240s
- No direct internet access (must use gateway for external APIs)
- Available libraries: see sandbox requirements

130 changes: 126 additions & 4 deletions docs/gateway-guide.md
@@ -9,10 +9,11 @@ The Gateway API provides miner agents with access to external services during sa
- **Desearch AI**: Web search, social media search, and content crawling
- **OpenAI**: GPT-5 series models with built-in web search
- **Perplexity**: Reasoning LLMs with built-in web search
- **Vericore**: Statement verification with evidence-based metrics

All requests are cached to optimize performance and reduce costs.

**Cost Limits:** $0.01 (default) or $0.10 (linked account) per sandbox run for Chutes and Desearch. OpenAI: $1.00 per run (requires linked account, no free tier). Perplexity: $0.10 per run (requires linked account, no free tier). Vericore: $0.10 per run (requires linked account, no free tier).

**Security:** API keys are securely stored using external secret management and never exposed to validators.

@@ -1052,6 +1053,127 @@ print(f"Sources: {citations}")

---

## Vericore Endpoints

Vericore provides statement verification with evidence-based metrics including sentiment, conviction, source credibility, and more.

### POST /api/gateway/vericore/calculate-rating

Verify a statement against web evidence and get detailed metrics.

**URL:** `{SANDBOX_PROXY_URL}/api/gateway/vericore/calculate-rating`

**Request Body:**
```json
{
"run_id": "550e8400-e29b-41d4-a716-446655440000",
"statement": "Bitcoin will reach $100k by end of 2026",
"generate_preview": false
}
```

**Parameters:**

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `run_id` | string (UUID) | Yes | - | Execution tracking ID from environment |
| `statement` | string | Yes | - | Statement to verify against web evidence |
| `generate_preview` | boolean | No | false | Generate a preview URL for the results |

**Response:**
```json
{
"batch_id": "mlzjxglo15m23k",
"request_id": "req-mlzjxgmc4amr6",
"preview_url": "",
"evidence_summary": {
"total_count": 12,
"neutral": 37.5,
"entailment": 1.03,
"contradiction": 61.46,
"sentiment": -0.07,
"conviction": 0.82,
"source_credibility": 0.93,
"narrative_momentum": 0.48,
"risk_reward_sentiment": -0.15,
"political_leaning": 0.0,
"catalyst_detection": 0.12,
"statements": [
{
"statement": "Evidence text from source...",
"url": "https://example.com/article",
"contradiction": 0.87,
"neutral": 0.12,
"entailment": 0.01,
"sentiment": -0.5,
"conviction": 0.75,
"source_credibility": 0.85,
"narrative_momentum": 0.5,
"risk_reward_sentiment": -0.5,
"political_leaning": 0.0,
"catalyst_detection": 0.3
}
]
},
"cost": 0.05
}
```

**Response Fields:**

| Field | Type | Description |
|-------|------|-------------|
| `batch_id` | string | Batch identifier |
| `request_id` | string | Request identifier |
| `preview_url` | string | Preview URL (empty if `generate_preview` is false) |
| `evidence_summary.total_count` | integer | Number of evidence sources found |
| `evidence_summary.entailment` | float | Aggregated entailment score |
| `evidence_summary.contradiction` | float | Aggregated contradiction score |
| `evidence_summary.sentiment` | float | Aggregated sentiment (-1.0 to 1.0) |
| `evidence_summary.conviction` | float | Aggregated conviction level |
| `evidence_summary.source_credibility` | float | Average source credibility |
| `evidence_summary.statements` | array | Individual evidence sources with per-source metrics |

**Example (using httpx):**
```python
import os
import httpx

PROXY_URL = os.getenv("SANDBOX_PROXY_URL")
RUN_ID = os.getenv("RUN_ID")

response = httpx.post(
f"{PROXY_URL}/api/gateway/vericore/calculate-rating",
json={
"run_id": RUN_ID,
"statement": "Bitcoin will reach $100k by end of 2026",
},
timeout=120.0,
)
response.raise_for_status()  # surface HTTP errors before parsing

result = response.json()

summary = result["evidence_summary"]
total = summary["total_count"]
contradiction = summary["contradiction"]
sentiment = summary["sentiment"]
conviction = summary["conviction"]
credibility = summary["source_credibility"]
```

**Error Handling:**

| Status Code | Description | Recommended Action |
|-------------|-------------|-------------------|
| 503 | Service Unavailable | Retry with exponential backoff |
| 429 | Rate limit exceeded | Retry with exponential backoff |
| 401 | Authentication failed | Contact validator |
| 500 | Internal server error | Retry with fallback |

**Note:** Vericore has no free tier. You must link your API key to use Vericore. Each call costs $0.05.
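The retry guidance in the table above can be sketched as a transport-agnostic wrapper. `RetryableError` and `with_backoff` are illustrative names, not part of the gateway API; in practice an agent would raise `RetryableError` when the proxy returns 429 or 503:

```python
import time

class RetryableError(Exception):
    """Stand-in for a 429/503 response from the gateway."""

def with_backoff(call, max_retries: int = 3, base_delay: float = 1.0):
    """Run call(); on RetryableError, sleep exponentially longer and retry."""
    for attempt in range(max_retries):
        try:
            return call()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the failure
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Keep the total backoff budget small: retries still count against the 240-second sandbox timeout.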

---

## Caching

The gateway implements request-level caching to increase consensus stability among validators, optimize performance, and reduce API costs.
@@ -1066,7 +1188,7 @@
- The `run_id` field is excluded from cache key calculation
- This means identical requests from different executions hit the same cache

This is crucial for increasing consensus stability per validator, given the variance of LLM outputs when the same prompt is sent twice.

**Prompt rules.** Use consistent prompts across executions to ensure the cache is hit. In practice, **DO NOT** include dynamic timestamps or random data in prompts.
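To see why dynamic values break caching: the cache key is derived from the request payload with `run_id` excluded, per the rules above. A sketch of that idea, where the hashing scheme is illustrative and not the gateway's actual implementation:

```python
import hashlib
import json

def cache_key(payload: dict) -> str:
    """Hash the request payload, ignoring run_id, so identical requests
    from different executions map to the same cache entry."""
    relevant = {k: v for k, v in payload.items() if k != "run_id"}
    canonical = json.dumps(relevant, sort_keys=True)  # deterministic serialization
    return hashlib.sha256(canonical.encode()).hexdigest()
```

Two runs that differ only in `run_id` share a key and hit the same cache entry; embedding a timestamp in the prompt changes the key and forces a fresh (and differently-behaving) LLM call.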

@@ -1165,14 +1287,14 @@ def query_llm_with_retry(prompt: str, max_retries: int = 3) -> Optional[str]:

### Timeout Management

Plan your execution time to stay within the 240-second sandbox limit:

```python
import time

start_time = time.time()
timeout_buffer = 10 # seconds
max_time = 240 - timeout_buffer  # 230s: the 240s limit minus the safety buffer

def time_remaining():
    elapsed = time.time() - start_time
    return max_time - elapsed
```