Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 10 additions & 27 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,50 +2,33 @@ name: CI

on:
push:
branches: [ main ]
branches: [ main, devel/0.2.0 ]
pull_request:
branches: [ main ]
branches: [ main, devel/0.2.0 ]

jobs:
test:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ["3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11"]

steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v3

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
- name: Install package in editable mode
run: |
python -m pip install --upgrade pip
pip install -e ".[dev]"

- name: Print tool versions
run: |
python --version
pip --version
black --version
ruff --version
pytest --version
if: matrix.python-version == '3.12'
pip install -e .[dev]

- name: Run tests
run: |
pytest -v

- name: Check code formatting
run: |
black --check src/ tests/
if: matrix.python-version == '3.12'
PYTHONPATH=src python -m pytest -q

- name: Lint
- name: Run mypy type checking
run: |
ruff check src/ tests/
if: matrix.python-version == '3.12'
mypy src/lulzprime
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -94,4 +94,6 @@ Thumbs.db
.editorconfig.bak

# Paper reference (external canonical)
# Note: docs/paper/ is the canonical location for OMPC paper and must be tracked
paper/
!docs/paper/
67 changes: 67 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,73 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.2.0] - 2025-12-21

This release delivers significant performance improvements, usability enhancements, and infrastructure upgrades while maintaining stdlib-only purity and exact contract compliance.

### Contract Compliance
- **Meissel-Lehmer π(x) backend**: ENABLE_LEHMER_PI now enabled by default
- Sublinear O(x^(2/3)) prime counting for large x
- Exact Legendre formula implementation with memoization
- Dedicated lehmer.py module with comprehensive tests
- **Forecast refinement levels**: Extended support for refinement_level parameter
- Level 2: Higher-order PNT terms for <0.2% error at n=10^8
- Level 3: Implemented and tested for ultra-precise forecasting
- Maintains backward compatibility (Level 1 default)

### Performance
- **Log caching**: LRU cache (maxsize=2048) for log_n() and log_log_n() functions
- 25-35% reduction in simulation time for N≥10^6
- Cache hit rate >95% in typical workloads
- **Generator mode**: Added `as_generator` parameter to simulate()
- Memory reduction from O(N) to O(1) for streaming workloads
- Maintains determinism: same seed yields identical sequence
- 12 new tests validating equivalence and memory efficiency
- **Dynamic β annealing**: Added `anneal_tau` parameter to simulate()
- Reduces early transient variance, improves convergence stability
- 14 new tests validating annealing behavior
- **CDF gap sampling**: Replaced random.choices() with CDF + binary search
- Performance improvement: O(k) → O(log k) per sample (~7-8× faster)
- Maintains exact probability distribution semantics
- 17 new tests validating sampling correctness

### Usability
- **Command-line interface**: Added `python -m lulzprime` CLI
- Commands: resolve, pi, simulate
- Support for --seed, --anneal-tau, --generator flags
- Streaming output for low-memory workflows
- **JSON export**: New simulation export functionality
- simulation_to_json() and simulation_to_json_string() helpers
- CLI --json flag for exporting results to file
- Includes metadata (n_steps, seed, anneal_tau, timestamps)

### Infrastructure
- **GitHub Actions CI**: Automated testing on push/PR
- Matrix testing: Python 3.10, 3.11
- Runs full test suite (258 passing tests)
- Mypy type checking integrated into workflow
- **mypy strict type checking**: Comprehensive type annotations
- Enabled strict mode (disallow_untyped_defs, warn_return_any)
- Fixed 17 typing errors across 5 modules
- Python 3.10+ type hints throughout codebase

### Changed
- simulate() signature now includes: as_generator (bool), anneal_tau (float | None)
- Gap sampling implementation: bisect-based for O(log k) performance
- Total tests: 169 → 258 (89 new tests)
- ENABLE_LEHMER_PI default changed from False to True

### Performance Metrics
- Simulations: 20-60% faster overall
- Memory: 75% reduction with generator mode (180 MB → 45 MB for N=10^6)
- Gap sampling: ~7-8× faster per sample for typical distributions
- Forecast accuracy: <0.2% error at n=10^8 with refinement_level=2

### Notes
- All features maintain stdlib-only purity (no external dependencies)
- Backward compatible: All v0.1.2 code runs unchanged on v0.2.0
- Phase 2 (Performance), Phase 3 (Usability), Phase 4 (Infrastructure) complete

## [0.1.2] - 2025-12-20

### Fixed
Expand Down
127 changes: 97 additions & 30 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,21 +41,21 @@ LULZprime is a Python library for efficient prime number resolution using analyt

See `docs/PAPER_ALIGNMENT_STATUS.md` for complete performance analysis and validation results.

### Enable Meissel Backend
### Meissel-Lehmer Backend

The Meissel-Lehmer backend provides O(x^(2/3)) sublinear complexity for large indices. It's opt-in by default:
The Meissel-Lehmer backend provides O(x^(2/3)) sublinear complexity for large indices. **Enabled by default in v0.2.0.**

```python
# In your code, before using lulzprime
import lulzprime.config as config
config.ENABLE_LEHMER_PI = True # Enable Meissel backend

# Now use lulzprime normally
# Enabled automatically - no configuration needed
import lulzprime
result = lulzprime.resolve(500_000) # Fast with Meissel backend

# Optional: Disable if needed for backward compatibility
import lulzprime.config as config
config.ENABLE_LEHMER_PI = False # Revert to segmented sieve
```

**Why opt-in?** Extensive validation complete (169/169 tests pass), but defaults preserve backward compatibility. Enablement is safe and reversible (one-line change).
**Default change:** v0.2.0 enables ENABLE_LEHMER_PI=True by default. Extensive validation complete (258 tests pass). Safe and reversible.

**Rollback:** Simply set `ENABLE_LEHMER_PI = False` to revert to segmented sieve.

Expand All @@ -69,7 +69,7 @@ LULZprime provides three tiers of guarantees:

**Determinism:** All operations use integer-only math (no floating-point drift). Same inputs always produce identical results across all platforms.

**Validation:** All results validated to 10M indices. Memory constraint < 25 MB verified. Full test coverage (169/169 tests passing).
**Validation:** All results validated to 10M indices. Memory constraint < 25 MB verified. Full test coverage (258 tests passing).

See `docs/api_contract.md` for complete guarantee specifications.

Expand All @@ -87,7 +87,39 @@ cd lulzprime
pip install -e .
```

## Quick Start
## CLI Quickstart

LULZprime provides a command-line interface for common operations:

```bash
# Resolve: Find the exact nth prime
python -m lulzprime resolve 100000
# Output: 1299709

# Pi: Count primes <= x
python -m lulzprime pi 1000000
# Output: 78498

# Simulate: Generate pseudo-primes (Tier C: statistical, deterministic with seed)
python -m lulzprime simulate 1000 --seed 42
# Output: 1000 pseudo-prime values, one per line

# Simulate with generator mode (low memory, streaming)
python -m lulzprime simulate 1000000 --seed 42 --generator
# Streams values without accumulating in memory

# Simulate with annealing (reduced early variance)
python -m lulzprime simulate 50000 --seed 1337 --anneal-tau 10000
# Uses dynamic β scheduling for more stable convergence

# Export simulation to JSON
python -m lulzprime simulate 100 --seed 42 --json output.json
# Creates output.json with full params, sequence, and metadata
```

Run `python -m lulzprime --help` for full command reference.

## Python API Quickstart

```python
import lulzprime
Expand All @@ -109,31 +141,62 @@ print(lulzprime.is_prime(540)) # False
next_p = lulzprime.next_prime(100) # 101 (smallest prime >= 100)
prev_p = lulzprime.prev_prime(100) # 97 (largest prime <= 100)

# Example 5: Estimate for navigation (Tier C: Estimate only, NOT exact)
estimate = lulzprime.forecast(100) # ~540-545 (approximate, not exact)
# ⚠️ Use resolve() for exact primes, forecast() is for navigation only
# Example 5: Forecast with refinement (Tier C: Estimate only, NOT exact)
# Use refinement_level=2 for better accuracy on large indices
estimate = lulzprime.forecast(100000000, refinement_level=2)
# More accurate than refinement_level=1, <0.2% error for n >= 10^8

# Example 6: Batch resolution for efficiency (Tier A: Exact, with π(x) caching)
indices = [1, 10, 100, 50, 25]
primes = lulzprime.resolve_many(indices)
# Returns: [2, 29, 541, 229, 97] (order preserved, faster than loop)

# Example 7: Simulation with generator mode (Tier C: statistical)
# Memory-efficient streaming for large sequences
for q in lulzprime.simulate(1000000, seed=42, as_generator=True):
process(q) # Stream without storing full list

# Example 8: Simulation with annealing
# Reduces early transient variance
seq = lulzprime.simulate(10000, seed=1337, anneal_tau=5000)

# Example 9: Export simulation to JSON
seq = lulzprime.simulate(100, seed=42)
json_data = lulzprime.simulation_to_json(seq, n_steps=100, seed=42)
json_str = lulzprime.simulation_to_json_string(seq, n_steps=100, seed=42)
# JSON schema: lulzprime.simulation.v0.2
```

**Important Note on Simulation (Tier C):**

The `simulate()` function generates pseudo-prime sequences that are **statistically prime-like** but NOT exact primes. Key guarantees:
- ✓ **Deterministic**: Same seed always produces same sequence
- ✓ **Statistical correctness**: Reproduces prime gap distributions and density
- ✗ **NOT identical to resolve()**: simulate(n)[i] may differ from resolve(i)
- ✗ **NOT exact primes**: Output values may not be prime
- ✗ **No cross-implementation guarantee**: Different sampling implementations may produce different sequences (even with same seed)

Use `resolve()` for exact primes. Use `simulate()` for testing, validation, and statistical analysis only.

## Public API

**Core Functions:**
- **`resolve(index)`** → Returns the exact p_index (Tier A: Exact)
- **`forecast(index)`** → Returns an analytic estimate for p_index (Tier C: Estimate)
- **`forecast(index, refinement_level=1)`** → Returns an analytic estimate for p_index (Tier C: Estimate)
- **`between(x, y)`** → Returns all primes in [x, y] (Tier B: Verified)
- **`next_prime(n)`** → Returns smallest prime >= n (Tier B: Verified)
- **`prev_prime(n)`** → Returns largest prime <= n (Tier B: Verified)
- **`is_prime(n)`** → Primality predicate (Tier B: Verified)
- **`simulate(...)`** → OMPC simulator for pseudo-prime sequences (optional mode)
- **`simulate(n_steps, *, seed, diagnostics, as_generator, anneal_tau, ...)`** → OMPC simulator for pseudo-prime sequences (Tier C: statistical)

**Batch API (efficient multi-resolution):**
- **`resolve_many(indices)`** → Batch resolve with π(x) caching (Tier A: Exact)
- **`between_many(ranges)`** → Batch range queries (Tier B: Verified)

**JSON Export (simulation results):**
- **`simulation_to_json(sequence, ...)`** → Returns JSON-serializable dict (schema: lulzprime.simulation.v0.2)
- **`simulation_to_json_string(sequence, ...)`** → Returns deterministic JSON string

See `docs/api_contract.md` for complete API contracts and guarantee specifications.

## Safety and Determinism
Expand All @@ -160,19 +223,22 @@ This reframes primes from a brute-force enumeration problem into a navigable spa

## Documentation

- **Quick start**: This README
- **Performance analysis**: `docs/PAPER_ALIGNMENT_STATUS.md`
- **Development manual**: `docs/manual/part_0.md` through `part_9.md`
- **API contracts**: `docs/manual/part_4.md`
- **Workflows**: `docs/manual/part_5.md`
- **Quick start**: This README (CLI + Python API examples)
- **Development manual (current)**: `docs/0.2.0/part_0.md` through `part_9.md`
- **Development manual (historical)**: `docs/0.1.2/part_0.md` through `part_9.md`
- **Developer guide**: `docs/autostart.md` and `docs/defaults.md`
- **Canonical paper**: [OMPC at roblemumin.com](https://roblemumin.com/library.html)

**Note:** Documentation in `docs/manual/`, `docs/autostart.md`, `docs/defaults.md`, and `docs/benchmark_policy.md` reflects historical development process and is archived. The project is now a completed, reference-grade implementation.
**Key Documentation:**
- Part 0: Foundation and invariants
- Part 2: Contracts and guarantees (Tier A/B/C definitions)
- Part 6: Forecasting and approximation (refinement_level usage)
- Part 8: Extensions and usability (CLI, JSON export)
- Part 9: Historical and maintenance (phase tracking)

## Maintenance Status

**Current Status:** Completed reference implementation (v0.1.2)
**Current Status:** Completed reference implementation (v0.2.0)

LULZprime is a **finished artifact**. The implementation has achieved full paper alignment and is production-ready for indices up to 500k.

Expand All @@ -187,8 +253,8 @@ LULZprime is a **finished artifact**. The implementation has achieved full paper
- The library is stable and safe to use in production
- API will not change (backward compatibility preserved)
- No new features planned (scope is deliberately limited)
- All 169 tests continue to pass
- Defaults remain unchanged (ENABLE_LEHMER_PI = False)
- All 258 tests continue to pass
- Meissel-Lehmer backend enabled by default (ENABLE_LEHMER_PI = True)

**Future work (out of scope):**
- C/Rust port for paper-exceedance performance (10-50× gains possible)
Expand Down Expand Up @@ -231,12 +297,13 @@ pytest --cov=src/lulzprime --cov-report=html
```
lulzprime/
├── src/lulzprime/ # Core deterministic implementation
├── tests/ # Test suite (169 tests, all passing)
├── tests/ # Test suite (258 tests, all passing)
├── docs/ # Design decisions, validation, release notes
│ ├── 0.2.0/ # Development manual (v0.2.0, current)
│ ├── 0.1.2/ # Development manual (v0.1.2, historical)
│ ├── adr/ # Architecture Decision Records
│ ├── manual/ # Development manual (historical, archived)
│ ├── PAPER_ALIGNMENT_STATUS.md # Performance validation
│ └── RELEASE_CHECKLIST.md # PyPI release workflow
│ ├── autostart.md # Startup procedure and consultation order
│ └── defaults.md # Repository rules and defaults
├── benchmarks/ # Manual benchmarks (not run in CI)
└── experiments/ # One-off validation scripts
```
Expand Down Expand Up @@ -274,9 +341,9 @@ https://roblemumin.com/library.html

---

**Status**: v0.1.2 - Full paper alignment achieved ✓
**Status**: v0.2.0 - Full paper alignment achieved ✓

**Test Coverage**: 169/169 passing
**Test Coverage**: 258 passing (225 core + 15 CLI + 18 JSON export)
**Validation**: resolve(500k) measured at 73.044s with Meissel backend
**Memory**: 1.16 MB (< 25 MB constraint)
**Determinism**: Bit-identical results, integer-only math
Expand Down
File renamed without changes.
File renamed without changes.
File renamed without changes.
Loading