Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
198 changes: 137 additions & 61 deletions Cargo.lock

Large diffs are not rendered by default.

228 changes: 228 additions & 0 deletions PROPTEST_FINDINGS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,228 @@
# Proptest Findings

This document tracks all issues discovered through property-based testing with proptest. Each finding will eventually be converted into a Linear ticket for tracking and resolution.

## Purpose

This file tracks bugs discovered by proptest to enable organizing fixes into separate jj changes:
- **Change 1**: Test infrastructure (proptest setup, generators, property tests)
- **Change 2+**: Individual bug fixes (one change per bug)

Each bug has a corresponding `#[ignore]` test case in the code that will be enabled when the bug is fixed.

## Table of Contents
- [ClickHouse Type Parser](#clickhouse-type-parser)
- [SQL Parser](#sql-parser)
- [String Parsers](#string-parsers)
- [Summary Statistics](#summary-statistics)

---

## ClickHouse Type Parser

Issues found in `apps/framework-cli/src/infrastructure/olap/clickhouse/type_parser.rs`

### Status Legend
- 🔴 **Critical**: Causes crashes, data corruption, or security issues
- 🟡 **High**: Incorrect parsing/serialization, affects functionality
- 🟢 **Medium**: Edge cases, minor inconsistencies
- ⚪ **Low**: Documentation, optimization opportunities

### Issue #1: JSON Type with Empty Parameters Fails Roundtrip ✅ FIXED

**Severity**: 🟡 High
**Parser**: ClickHouse Type Parser
**Date Found**: 2025-11-12
**Date Fixed**: 2025-11-12
**Status**: ✅ **RESOLVED**

**Failing Input** (minimized by proptest):
```rust
ClickHouseTypeNode::JSON(Some([]))
```

**Original Problem**:
- Serialization: `JSON(Some([]))` → `"JSON"`
- Parsing: `"JSON"` → `JSON(None)`
- Result: `JSON(Some([])) != JSON(None)` ✗ (roundtrip failed)

**Fix Applied**:
Implemented **Option C**: Custom `PartialEq` implementation that treats `JSON(Some([]))` and `JSON(None)` as semantically equivalent.

**Implementation** (type_parser.rs:342-382):
```rust
impl PartialEq for ClickHouseTypeNode {
fn eq(&self, other: &Self) -> bool {
match (self, other) {
// Normalize JSON(Some([])) to JSON(None) for comparison
(Self::JSON(Some(params1)), Self::JSON(Some(params2))) if params1.is_empty() && params2.is_empty() => true,
(Self::JSON(Some(params)), Self::JSON(None)) | (Self::JSON(None), Self::JSON(Some(params))) if params.is_empty() => true,
// ... rest of equality checks
}
}
}
```

**Verification**:
- ✅ `test_regression_json_empty_params_roundtrip()` passes
- ✅ `test_roundtrip_property()` passes with 1000 test cases
- ✅ All other type parser tests pass

**Linear Ticket**: [To be created]

---

## SQL Parser

Issues found in `apps/framework-cli/src/infrastructure/olap/clickhouse/sql_parser.rs`

---

**✅ All tests passed! No issues found.**

The SQL parser's manual extraction functions (`extract_engine_from_create_table`, `extract_table_settings_from_create_table`, `extract_sample_by_from_create_table`, `extract_indexes_from_create_table`) handle edge cases correctly:
- No panics on arbitrary strings
- Correct nested parentheses handling
- Proper string escape handling
- Keyword termination works as expected

---

## String Parsers

Issues found in:
- `apps/framework-cli/src/framework/versions.rs`
- `apps/framework-cli/src/framework/scripts/utils.rs`
- Other string parsing functions

### Issue #2: parse_timeout_to_seconds Panics on Multi-byte UTF-8 Characters ✅ FIXED

**Severity**: 🔴 Critical
**Parser**: `parse_timeout_to_seconds` in `apps/framework-cli/src/framework/scripts/utils.rs`
**Date Found**: 2025-11-12
**Date Fixed**: 2025-11-12
**Status**: ✅ **RESOLVED**

**Failing Input** (minimized by proptest):
```rust
parse_timeout_to_seconds("®") // Previously panicked!
```

**Original Problem**:
- Function used byte-level `.split_at(timeout.len() - 1)`
- For multi-byte UTF-8 like "®" (2 bytes), this tried to split in the middle of the character
- Result: **Panic** with "byte index 1 is not a char boundary"

**Fix Applied**:
Implemented **character-aware string slicing** using `.chars().last()` and `.len_utf8()`.

**Implementation** (utils.rs:60-67):
```rust
// Use character-aware slicing to handle multi-byte UTF-8 characters correctly
let unit_char = timeout
.chars()
.last()
.ok_or_else(|| TemporalExecutionError::TimeoutError("Timeout string is empty".to_string()))?;

// Get the byte index where the last character starts
let value_str = &timeout[..timeout.len() - unit_char.len_utf8()];
```

**Why This Works**:
- `.chars().last()` correctly extracts the last Unicode character
- `.len_utf8()` returns the correct byte length of that character (1-4 bytes)
- Slicing is done at proper UTF-8 boundaries, never in the middle of a character

**Verification**:
- ✅ `test_regression_timeout_multibyte_utf8()` passes (tests "®" specifically)
- ✅ `test_parse_timeout_never_panics()` passes with 1000 arbitrary UTF-8 strings
- ✅ `test_timeout_valid_formats()` passes (ensures valid inputs still work)
- ✅ No panics on any input, only proper errors

**Impact Resolution**:
- ✅ No more panics on multi-byte UTF-8 input
- ✅ Function now safely returns errors for invalid input
- ✅ Follows Rust error handling best practices

**Linear Ticket**: [To be created]

---

### Version Parser Results

**✅ All tests passed! No issues found.**

The version parser (`parse_version`, `version_to_string`, `Version` type) handles edge cases correctly:
- No panics on arbitrary strings
- Roundtrip property holds
- Comparison is consistent
- `as_suffix` works correctly

### Schedule Parser Results

**✅ All tests passed! No issues found.**

The schedule parser (`parse_schedule`) handles edge cases correctly:
- No panics on arbitrary strings
- Minute/hour formats produce correct cron expressions
- Cron expressions are preserved

---

## Summary Statistics

| Parser | Tests Run | Issues Found | Fixed | Critical | High | Medium | Low |
|--------|-----------|--------------|-------|----------|------|--------|-----|
| ClickHouse Type Parser | 5 tests | 1 | ✅ 1 | 0 | 0 | 0 | 0 |
| SQL Parser | 7 tests | 0 | - | 0 | 0 | 0 | 0 |
| String Parsers | 12 tests | 1 | ✅ 1 | 0 | 0 | 0 | 0 |
| **Total** | **24 tests** | **2** | **✅ 2** | **0** | **0** | **0** | **0** |

**All discovered bugs have been fixed! 🎉**

---

## Finding Template

Use this template when documenting new findings:

```markdown
### Issue #N: [Brief Description]

**Severity**: 🔴/🟡/🟢/⚪
**Parser**: [Parser name]
**Date Found**: YYYY-MM-DD

**Failing Input** (minimized by proptest):
\`\`\`
[minimal failing test case]
\`\`\`

**Expected Behavior**:
[What should happen]

**Actual Behavior**:
[What actually happens]

**Reproduction**:
\`\`\`rust
// Minimal test case to reproduce
\`\`\`

**Impact**:
[How this affects users/system]

**Proposed Fix**:
[Brief description of potential solution]

**Linear Ticket**: [To be created]
```

---

## Notes

- All findings are discovered through automated property-based testing
- Proptest automatically minimizes failing test cases to their simplest form
- Tests are configured in `apps/framework-cli/proptest.toml`
- Test code is colocated with parser implementations in their respective files
2 changes: 2 additions & 0 deletions apps/framework-cli/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,8 @@ assert_fs = "1.0.13"
predicates = "3.0.4"
reqwest = { version = "0.12", features = ["blocking", "json"] }
serial_test = "3.1.1"
proptest = "1.4"
proptest-derive = "0.4"

[build-dependencies]
protobuf-codegen = "3.7"
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# Seeds for failure cases proptest has generated in the past. It is
# automatically read and these particular cases re-run before any
# novel cases are generated.
#
# It is recommended to check this file in to source control so that
# everyone who runs the test benefits from these saved cases.
cc 5a68274d7b26fe9918b1ad0c8384af85108124bc20aa488d1ec0ea40d229ae80 # shrinks to s = "®"
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# Seeds for failure cases proptest has generated in the past. It is
# automatically read and these particular cases re-run before any
# novel cases are generated.
#
# It is recommended to check this file in to source control so that
# everyone who runs the test benefits from these saved cases.
cc 49804330e6626508a770bfa9454f4e37af23e6058a59b118d05ea40635cb2ce0 # shrinks to node = JSON(Some([]))
26 changes: 26 additions & 0 deletions apps/framework-cli/proptest.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# Proptest configuration for framework-cli tests
# See: https://altsysrq.github.io/proptest-book/proptest/tutorial/config.html

# Number of test cases to generate (per test)
# Higher values find more bugs but take longer
# CI should override this with PROPTEST_CASES env var
cases = 1000

# Maximum shrink iterations when a test fails
# Proptest will try to find the minimal failing case
max_shrink_iters = 10000

# Maximum time to spend shrinking (in milliseconds)
max_shrink_time = 60000 # 1 minute

# Timeout for each individual test case (in milliseconds)
timeout = 5000 # 5 seconds per test case

# RNG seed source
# "default" uses a random seed, "deterministic" uses a fixed seed
# Use deterministic for reproducible test runs
source_file = "default"

# Whether to persist failed test cases
# If true, failed cases are stored and replayed on subsequent runs
persist_failed = true
Loading
Loading