20 commits
4441471
Implement KnowledgeGraphTripleDataset for relational reasoning
research-developer Oct 20, 2025
2b5f1f2
Add comprehensive tests for KnowledgeGraphTripleDataset
research-developer Oct 20, 2025
467d2ff
Add knowledge graph dataset example and visualization
research-developer Oct 20, 2025
46cdacc
Add evaluation metrics for knowledge graph reasoning
research-developer Oct 20, 2025
9128a19
Add implementation summary and documentation
research-developer Oct 20, 2025
e0acbc5
Merge main foundation components (NSM-17, NSM-15, NSM-16) into KG branch
research-developer Oct 20, 2025
e20e45e
Merge hierarchical model (pooling + WHY/WHAT) from main
research-developer Oct 20, 2025
a718fca
Merge branch 'main' into dataset-knowledge-graph
research-developer Oct 20, 2025
58d7ba0
Add NSM-23 training script and metrics for knowledge graph domain
research-developer Oct 20, 2025
b0bf9ea
Merge branch 'main' into dataset-knowledge-graph
research-developer Oct 20, 2025
62423f9
Fix KG dataset: Add balanced negative sampling (50/50 split)
research-developer Oct 20, 2025
ddc8ff8
Merge validation framework from main into dataset-knowledge-graph
research-developer Oct 20, 2025
9dacbe9
Update .gitignore to ignore generated files (NSM-28)
research-developer Oct 20, 2025
cb99aa3
Fix KG label shape to match test expectations
research-developer Oct 20, 2025
295d918
Enable 3-level hierarchy for knowledge graph domain (Phase 1.5)
research-developer Oct 20, 2025
64c4432
Improve 3-level hierarchy documentation and metrics
research-developer Oct 24, 2025
8812248
Merge main into dataset-kg-3level
research-developer Oct 24, 2025
3e2f6b6
Add merge verification test for single-pass and dual-pass modes
research-developer Oct 24, 2025
622468a
Make dual-pass mode the default for 3-level hierarchy
research-developer Oct 24, 2025
076b687
Fix critical issues identified in PR review
research-developer Oct 24, 2025
21 changes: 20 additions & 1 deletion .gitignore
@@ -43,6 +43,7 @@ env/
*.pt
*.ckpt
checkpoints/
checkpoints/*/ # All subdirectories
*.pkl

# Jupyter
@@ -102,8 +103,15 @@ datasets/
!configs/*.yaml
!configs/*.yml

# Domain-specific data (generated at runtime)
data/causal/
data/kg/
data/planning/
!data/*/.gitkeep

# Results & Outputs
results/
results/*/ # All subdirectories
outputs/
figures/
plots/
@@ -131,4 +139,15 @@ Desktop.ini
.git/worktrees/

# Keep empty directories
!.gitkeep
!.gitkeep

# Branch-specific summary documents (NSM-26)
*_DATASET_SUMMARY.md
*_SUMMARY.md
*_ANALYSIS.md
*_FINDINGS.md
*_STATUS.md

# Auto-generated scripts (NSM-26)
experiments/run_*.sh
experiments/training_log.jsonl
245 changes: 245 additions & 0 deletions KG_IMPLEMENTATION_SUMMARY.md
@@ -0,0 +1,245 @@
# Knowledge Graph Dataset Implementation Summary

## Overview

Implements a Knowledge Graph dataset generator for NSM Phase 1 exploration (NSM-10). It is one of three parallel dataset explorations (knowledge graph, planning, causal) intended to determine empirically which domain best suits 2-level hierarchical reasoning.

## What Was Implemented

### 1. KnowledgeGraphTripleDataset (`nsm/data/knowledge_graph_dataset.py`)

**Core Features:**
- Generates 20K synthetic triples across 5K entities
- 50+ predicate types spanning biographical, geographic, creative, and conceptual relations
- 2-level hierarchy: L1 (facts/instances) and L2 (types/categories)
- Confidence scores in the range 0.5-1.0 to model partial observability
- 6 entity categories: People, Places, Organizations, Concepts, Awards, Dates

**Entity Generation:**
- Named entities: Einstein, Curie, Newton, Paris, London, MIT, Harvard, etc.
- Rich biographical data: born_in, works_at, won, studied_at
- Geographic hierarchies: city → country → continent
- Type hierarchies: Person → Living_Being → Entity

**Reasoning Support:**
- Multi-hop query generation (2-hop paths)
- Type consistency checking pairs
- Analogical reasoning support
- Link prediction labels
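
To make the multi-hop support concrete, here is a minimal sketch of how 2-hop paths can be enumerated from a list of triples. It is not the code in `nsm/data/knowledge_graph_dataset.py`; the toy triples and the `two_hop_paths` helper are hypothetical and only illustrate the idea of chaining one triple's object to another triple's subject.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not dataset output.
TRIPLES = [
    ("Einstein", "born_in", "Ulm"),
    ("Ulm", "located_in", "Germany"),
    ("Curie", "born_in", "Warsaw"),
    ("Warsaw", "located_in", "Poland"),
]


def two_hop_paths(triples):
    """Return (s, p1, mid, p2, o) for every pair of triples that chain via `mid`."""
    # Index outgoing edges by subject so the second hop is a dictionary lookup.
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    paths = []
    for s, p1, mid in triples:
        for p2, o in by_subject.get(mid, []):
            paths.append((s, p1, mid, p2, o))
    return paths


paths = two_hop_paths(TRIPLES)
for path in paths:
    print(" -> ".join(path))
```

A query such as "in which country was Einstein born?" then corresponds to the path `born_in` followed by `located_in`.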

### 2. Comprehensive Tests (`tests/data/test_kg_dataset.py`)

**21 Test Cases:**
- Dataset generation and initialization
- Triple structure validation
- Level distribution (L1 vs L2)
- Confidence variance and diversity
- Predicate type coverage (50+)
- Entity diversity and categories
- Named entity inclusion
- Multi-hop reasoning paths
- Type hierarchy validation
- PyG interface compliance
- Caching and reproducibility

**Test Results:**
- ✅ 21/21 tests passing
- ✅ 98% code coverage of `nsm/data/knowledge_graph_dataset.py`
- ✅ Reproducibility verified
- ✅ PyG DataLoader compatible

### 3. Example Script (`examples/knowledge_graph_example.py`)

**16 Demonstration Sections:**
1. Dataset creation and statistics
2. Sample triples (L1 and L2)
3. Predicate type distribution
4. Entity category breakdown
5. Named entity examples
6. PyG graph structure
7. Multi-hop reasoning queries
8. Type consistency pairs
9. Biographical reasoning chains
10. Type hierarchy display
11. Confidence distribution
12. PyG DataLoader batching
13. Reasoning pattern examples
14. Instance-of relations
15. Geographic chains
16. Professional relations

### 4. Evaluation Metrics (`nsm/evaluation/kg_metrics.py`)

**Metrics Implemented:**
- **Link Prediction:** Hits@K, MRR, Mean/Median Rank
- **Analogical Reasoning:** A:B :: C:D with vector arithmetic
- **Type Consistency:** Precision, Recall, F1, Confusion Matrix
- **Multi-hop Reasoning:** Exact match, Hits@K, Average Precision
- **Calibration:** ECE, MCE, Calibration Curves
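
For reference, the two headline link prediction metrics can be computed from the 1-based rank of the true entity in each query. The sketch below uses the standard definitions and is not the code in `nsm/evaluation/kg_metrics.py`; the function names and the example ranks are assumptions made for illustration.

```python
def hits_at_k(ranks, k):
    """Fraction of queries whose true entity appears in the top k (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the correct tail entity for four queries.
ranks = [1, 2, 4, 10]
h3 = hits_at_k(ranks, 3)          # two of four ranks are <= 3
mrr = mean_reciprocal_rank(ranks)  # (1 + 1/2 + 1/4 + 1/10) / 4
print(f"Hits@3={h3:.2f} MRR={mrr:.4f}")
```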

## Design Decisions

### 1. Entity-Centric Knowledge Representation
**Rationale:** Knowledge graphs excel at entity relationships and type hierarchies, making them ideal for testing hierarchical abstraction in NSM.

### 2. 50+ Predicate Types
**Rationale:** Rich relation vocabulary enables diverse reasoning patterns and tests R-GCN's basis decomposition (NSM-17).
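
As a rough illustration of why a large predicate vocabulary exercises basis decomposition: in the standard R-GCN formulation, each relation's weight matrix is a linear combination of a small set of shared basis matrices, so the parameter count grows slowly with the number of relations. The sketch below only counts relation weights (no self-loop weights or biases). The relation count of 66 comes from the test run reported in this summary; the feature size of 64 and the 8 bases are hypothetical.

```python
def rgcn_weight_params(num_relations, d_in, d_out, num_bases=None):
    """Count relation-weight parameters in one R-GCN layer.

    Without bases: one d_in x d_out matrix per relation.
    With bases: num_bases shared matrices plus one coefficient
    per (relation, basis) pair.
    """
    per_matrix = d_in * d_out
    if num_bases is None:
        return num_relations * per_matrix
    return num_bases * per_matrix + num_relations * num_bases


# 66 relations as in the reported test run; 64-dim features and 8 bases are assumed.
full = rgcn_weight_params(66, 64, 64)
shared = rgcn_weight_params(66, 64, 64, num_bases=8)
print(full, shared)
```

Under these assumed sizes the decomposed layer needs 33,296 relation-weight parameters instead of 270,336.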

### 3. Confidence Variance (0.5-1.0)
**Rationale:** Partial observability tests NSM's confidence propagation (NSM-12) and provenance semiring implementation.
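
A minimal sketch of confidence propagation along a reasoning path, assuming the product semiring combines the confidences of chained triples by multiplication. How NSM combines alternative paths is not stated here, so only the along-path product is shown; `path_confidence` is a hypothetical helper.

```python
import math


def path_confidence(edge_confidences):
    """Combine per-triple confidences along one path by multiplication."""
    for c in edge_confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError(f"confidence out of range: {c}")
    return math.prod(edge_confidences)


# Hypothetical 2-hop path with confidences 0.9 and 0.8.
conf = path_confidence([0.9, 0.8])
print(round(conf, 2))
```

With confidences drawn from 0.5-1.0, a 2-hop path can fall as low as 0.25, which is why the wide range is a meaningful test of propagation.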

### 4. Named Entity Inclusion
**Rationale:** Real-world entities (Einstein, Paris) make debugging and interpretation easier during development.

### 5. Reproducible Generation with Seeds
**Rationale:** Essential for comparing across exploration branches (NSM-10, NSM-12, NSM-11).
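
One common way to get this kind of reproducibility is to draw all random values from a generator that is seeded locally rather than from global state. The sketch below is an assumption about the approach, not the dataset's actual code; `sample_confidences` is a hypothetical helper that draws confidence scores in the 0.5-1.0 range.

```python
import random


def sample_confidences(n, seed):
    """Draw n confidence scores in [0.5, 1.0] from a locally seeded generator."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    return [rng.uniform(0.5, 1.0) for _ in range(n)]


run_a = sample_confidences(5, seed=42)
run_b = sample_confidences(5, seed=42)
run_c = sample_confidences(5, seed=43)
print(run_a == run_b, run_a == run_c)
```

The same seed yields the same sequence on every run, so results from different branches can be compared on identical data.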

## Integration Points

### With NSM-18 (PyG Infrastructure):
- ✅ Extends `BaseSemanticTripleDataset`
- ✅ Uses `GraphConstructor` for graph building
- ✅ Compatible with `TripleVocabulary`
- ✅ Returns PyG `Data` objects

### With NSM-17 (R-GCN):
- ✅ Edge types for 50+ predicates
- ✅ Confidence as edge attributes
- ✅ Typed relations ready for basis decomposition

### With NSM-12 (Confidence Exploration):
- ✅ Wide confidence range (0.5-1.0)
- ✅ Product semiring evaluation ready
- ✅ Calibration metrics implemented
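
For context, Expected Calibration Error (ECE) bins predictions by confidence and averages the gap between mean confidence and accuracy in each bin, weighted by bin size. The sketch below follows that standard definition with equal-width bins. It is not the code in `nsm/evaluation/kg_metrics.py`, and the example inputs are hypothetical.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Weighted average of |mean confidence - accuracy| over equal-width bins."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and the same length")

    bins = [[] for _ in range(num_bins)]
    for c, y in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(c * num_bins), num_bins - 1)
        bins[idx].append((c, y))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


# Hypothetical predictions: confidence and whether the prediction was correct (1/0).
ece = expected_calibration_error([0.55, 0.65, 0.95, 0.95], [1, 0, 1, 1])
print(round(ece, 3))
```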

### With NSM-14 (Training Loop):
- ✅ Link prediction labels
- ✅ Batch loading compatible
- ✅ Evaluation metrics ready

## Testing Results

```
======================== 21 passed, 3 warnings in 4.43s ========================

Coverage:
nsm/data/knowledge_graph_dataset.py: 98%
nsm/data/dataset.py: 69%

Key Metrics:
- 5000 triples generated
- 1298+ unique entities
- 66 predicates (50+ expected, extras from random generation)
- L1/L2 ratio: ~87%/13% (facts vs types)
- Average confidence: 0.77
```

## Commits Made

1. **4441471** - Implement KnowledgeGraphTripleDataset for relational reasoning
   - Core dataset class with entity/predicate generation
   - Multi-hop query support
   - Type hierarchy implementation
   - Fix for the PyTorch 2.6 change to the `torch.load` `weights_only` default

2. **2b5f1f2** - Add comprehensive tests for KnowledgeGraphTripleDataset
   - 21 test cases covering all functionality
   - 98% code coverage
   - Reproducibility and caching tests

3. **467d2ff** - Add knowledge graph dataset example and visualization
   - 16 demonstration sections
   - Reasoning chain examples
   - PyG DataLoader integration

4. **46cdacc** - Add evaluation metrics for knowledge graph reasoning
   - Link prediction (Hits@K, MRR)
   - Analogical reasoning
   - Type consistency checking
   - Calibration metrics (ECE/MCE)

## Next Steps for NSM-10 Evaluation

### Comparison Criteria (from CLAUDE.md):
1. **Task accuracy (40%):** Link prediction, type inference
2. **Calibration (20%):** ECE on confidence scores
3. **Multi-hop (20%):** 2-hop reasoning accuracy
4. **Interpretability (20%):** Debugging and explainability
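
The weights above suggest a single weighted score per branch. A minimal sketch, assuming each criterion has already been normalized to [0, 1] with higher meaning better (so calibration would be reported as something like 1 - ECE). The key names and the example scores are hypothetical, and this is not the logic of `compare_results.py`.

```python
# Weights taken from the comparison criteria; key names are hypothetical.
WEIGHTS = {
    "task_accuracy": 0.40,
    "calibration": 0.20,
    "multi_hop": 0.20,
    "interpretability": 0.20,
}


def weighted_score(scores):
    """Combine normalized per-criterion scores into one number in [0, 1]."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical normalized scores for one branch.
kg_score = weighted_score({
    "task_accuracy": 0.8,
    "calibration": 0.9,
    "multi_hop": 0.7,
    "interpretability": 0.6,
})
print(round(kg_score, 2))
```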

### Evaluation Protocol:
```bash
# Run evaluation suite
python -m tests.evaluation_suite --dataset knowledge_graph --output results/kg.json

# Compare with other branches
python compare_results.py results/kg.json results/planning.json results/causal.json
```

### Expected Strengths:
- ✅ Rich predicate diversity (50+ types)
- ✅ Clear type hierarchies (instance_of, subclass_of)
- ✅ Multi-hop paths (2-hop queries)
- ✅ Entity-centric interpretability

### Potential Weaknesses:
- ⚠️ Less hierarchical structure than planning domain
- ⚠️ May need deeper hierarchies for full NSM evaluation
- ⚠️ Random relation generation may create noise

## Files Changed

```
nsm/data/knowledge_graph_dataset.py (new, 682 lines)
nsm/data/dataset.py (modified, +1 line for weights_only fix)
tests/data/test_kg_dataset.py (new, 394 lines)
examples/knowledge_graph_example.py (new, 216 lines)
nsm/evaluation/__init__.py (new, 13 lines)
nsm/evaluation/kg_metrics.py (new, 449 lines)
```

**Total:** 1,755 lines added (1,754 in new files plus 1 modified line)

## Domain Properties

### Level 1 (Facts/Instances):
- **Biographical:** born_in, died_in, works_at, studied_at, won
- **Geographic:** located_in, capital_of, borders, adjacent_to
- **Creative:** created, authored, composed, designed, invented
- **Professional:** employed_by, founded, leads, member_of
- **Temporal:** occurred_in, started_on, ended_on

### Level 2 (Types/Categories):
- **Type hierarchy:** instance_of, subclass_of, category_of
- **Generalizations:** typically_has, usually_in, commonly_has
- **Abstract:** related_to, similar_to, implies, requires, enables

### Mathematical Foundation:
```
Knowledge Graph G = (E, R, T), annotated with functions L and C, where:
- E: Set of entities (5K)
- R: Set of typed relations (50+)
- T ⊆ E × R × E: Set of triples (20K)
- L: T → {1, 2}: Level function
- C: T → [0.5, 1.0]: Confidence function
```
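
The definition above maps naturally onto a small record type. The sketch below is one possible encoding, not the representation used by `BaseSemanticTripleDataset`; the `Triple` class and its field names are hypothetical. It enforces the ranges of L and C at construction time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One element of T with its level L(t) and confidence C(t)."""
    subject: str
    predicate: str
    obj: str
    level: int         # L(t): 1 = fact/instance, 2 = type/category
    confidence: float  # C(t) in [0.5, 1.0]

    def __post_init__(self):
        if self.level not in (1, 2):
            raise ValueError(f"level must be 1 or 2, got {self.level}")
        if not 0.5 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0.5, 1.0], got {self.confidence}")


# Hypothetical examples of an L1 fact and an L2 type assertion.
fact = Triple("Einstein", "born_in", "Ulm", level=1, confidence=0.9)
type_fact = Triple("Einstein", "instance_of", "Person", level=2, confidence=1.0)
print(fact, type_fact, sep="\n")
```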

## Conclusion

✅ **Implementation Complete**
- Fully functional KG dataset generator
- Comprehensive test coverage (21/21 passing)
- Rich evaluation metrics
- Ready for NSM-10 parallel exploration

✅ **NSM-18 Integration Verified**
- Compatible with BaseSemanticTripleDataset
- PyG Data objects working
- Vocabulary and graph construction validated

✅ **Ready for Evaluation**
- Evaluation metrics implemented
- Comparison protocol defined
- Documentation complete

**Branch:** dataset-knowledge-graph
**Status:** ✅ Ready for evaluation and PR (once the NSM-10 exploration is complete)
Bug: Branch-Specific Summary Document Not Ignored

This summary document should be ignored according to the .gitignore rules added in this same PR (line 146: "*_SUMMARY.md"). The file KG_IMPLEMENTATION_SUMMARY.md matches this pattern and should not be committed. This appears to be a branch-specific summary document that was meant to be excluded per NSM-26.
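
A quick way to sanity-check the claim is to test the filename against the patterns added in this PR. The sketch below uses Python's `fnmatch`, which only approximates gitignore matching for simple basename globs like these; `git check-ignore -v <path>` is the authoritative check. The `matching_patterns` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the "Branch-specific summary documents" block in .gitignore.
IGNORE_PATTERNS = [
    "*_DATASET_SUMMARY.md",
    "*_SUMMARY.md",
    "*_ANALYSIS.md",
    "*_FINDINGS.md",
    "*_STATUS.md",
]


def matching_patterns(filename):
    """Return the ignore patterns that match the given basename."""
    return [p for p in IGNORE_PATTERNS if fnmatchcase(filename, p)]


matches = matching_patterns("KG_IMPLEMENTATION_SUMMARY.md")
print(matches)
```

Note that ignore rules only affect untracked files. A file that is already tracked, or that was added with `git add -f`, stays in the repository until it is removed from the index.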
