Flaky test: test_split_independence in planning dataset tests #20

## Description

The test `TestPlanningTripleDataset::test_split_independence` in `tests/data/test_planning_dataset.py` is flaky due to random seed sensitivity in dataset splitting.

## Error

```
FAILED tests/data/test_planning_dataset.py::TestPlanningTripleDataset::test_split_independence
AssertionError: Train ratio 0.5711529184756392 not ~0.7
assert 0.65 < 0.5711529184756392
```

## Root Cause

The test expects the split ratios to land near 0.7 / 0.15 / 0.15 (train / val / test), but random sampling makes the observed ratios vary from run to run. The test uses:

```python
assert 0.65 < train_ratio < 0.75, f"Train ratio {train_ratio} not ~0.7"
```

However, with a small dataset or a random generator that is not seeded deterministically, the observed ratio can fall outside this range, as in the failure above (0.571 against a lower bound of 0.65).
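
To get a feel for how much variance to expect, here is a rough sketch. It assumes each example is assigned to the train split independently with probability 0.7, which may not match how the dataset actually splits; `ratio_std` is a hypothetical helper, not part of the codebase. If the split assigns whole groups of examples rather than single examples, the relevant `n` would be closer to the number of groups.

```python
import math


def ratio_std(p: float, n: int) -> float:
    """Standard deviation of the observed split fraction.

    Assumption: each of the n items is assigned to the split
    independently with probability p (binomial sampling).
    """
    return math.sqrt(p * (1.0 - p) / n)


# The 0.65-0.75 window in the test is +/- 0.05 around the 0.7 target.
# Compare that half-width with the spread at a few sample sizes.
for n in (20, 100, 1000):
    print(f"n={n:5d}  std of train ratio = {ratio_std(0.7, n):.4f}")
```

Under this assumption, a ±0.05 window is only about one standard deviation wide at `n = 100`, so occasional failures would be expected at that size.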

## Impact

- **Severity**: Low
- **Scope**: Test infrastructure only
- **User Impact**: None - does not affect production code
- **Merge Impact**: Does not block merges (24/25 tests pass)

## Discovered In

- **PR**: #18 (Phase 1b.1: Sync dataset-planning with main infrastructure)
- **Context**: Branch merge validation testing
- **Test Command**: `pytest tests/data/test_planning_dataset.py -v`

## Suggested Fixes

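1. **Widen tolerance**: Change assertions to allow more variance. Note that the observed ratio of 0.571 would still fail the 0.60 lower bound shown here, so widening alone may not be enough.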
   ```python
   assert 0.60 < train_ratio < 0.80, f"Train ratio {train_ratio} not ~0.7"
   ```

2. **Fix random seed**: Ensure deterministic seeding before split
   ```python
   random.seed(42)
   torch.manual_seed(42)
   ```

3. **Use larger sample**: Increase dataset size for split test to reduce variance

4. **Statistical approach**: Use confidence intervals instead of hard thresholds
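
Options 2 and 4 can be combined. The following is a minimal sketch, not the project's actual test: `toy_split` is a hypothetical stand-in for the real dataset split, `ratio_bounds` is an illustrative helper, and the per-item independent assignment is an assumption. The bounds use the normal approximation to a binomial proportion, so they scale with the sample size instead of being hard-coded.

```python
import math
import random


def ratio_bounds(p: float, n: int, z: float = 3.0) -> tuple[float, float]:
    """Normal-approximation interval for a binomial proportion, clipped to [0, 1]."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def toy_split(n: int, p_train: float, seed: int) -> float:
    """Hypothetical stand-in for the dataset split; returns the observed train ratio."""
    rng = random.Random(seed)  # local generator, so no global state is touched
    n_train = sum(rng.random() < p_train for _ in range(n))
    return n_train / n


n_items = 200
train_ratio = toy_split(n_items, p_train=0.7, seed=42)

# z is a judgment call: larger values mean fewer false alarms
# but a less sensitive test.
low, high = ratio_bounds(0.7, n_items, z=4.0)
assert low < train_ratio < high, (
    f"Train ratio {train_ratio} outside [{low:.3f}, {high:.3f}]"
)
```

With a fixed seed the result is reproducible, and the interval documents how much deviation is considered acceptable for a given dataset size.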

## Related

- Tests: `tests/data/test_planning_dataset.py::TestPlanningTripleDataset::test_split_independence`
- Merge PRs: #16, #17, #18, #19

## Labels

- bug
- tests
- flaky-test
- low-priority

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
