Skip to content

Conversation

@Koopa0
Copy link
Owner

@Koopa0 Koopa0 commented Dec 22, 2025

Changes:

  • Replace 4 channels with single streamEvent union channel
  • Add FuzzPromptInjection test with 20+ attack vectors
  • Fix GetHistory sentinel error propagation (ErrSessionNotFound)
  • Add b.ReportAllocs() to all session benchmarks
  • Document homoglyph detection limitation in prompt.go
  • Update SQLC queries to use source_type column

  Changes:
  - Replace 4 channels with single streamEvent union channel
  - Add FuzzPromptInjection test with 20+ attack vectors
  - Fix GetHistory sentinel error propagation (ErrSessionNotFound)
  - Add b.ReportAllocs() to all session benchmarks
  - Document homoglyph detection limitation in prompt.go
  - Update SQLC queries to use source_type column
@Koopa0 Koopa0 merged commit f4bf4e6 into main Dec 22, 2025
6 of 8 checks passed
Comment on lines +44 to +47
model, err := tui.New(ctx, runtime.Flow, sessionID)
if err != nil {
return fmt.Errorf("failed to create TUI: %w", err)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If tui.New allocates resources or has side effects, and an error occurs after model creation but before or during program.Run(), there is no explicit cleanup for the model. Consider ensuring that any resources allocated by the model are properly released in case of early returns due to errors.

Comment on lines 78 to 87
}
})

tuiModel := tui.New(ctx, runtime.Flow, "test-session-id")
tuiModel, err := tui.New(ctx, runtime.Flow, "test-session-id")
if err != nil {
t.Fatalf("Failed to create TUI: %v", err)
}

// Test /help command by simulating the message flow
// Note: Full TUI testing requires teatest or similar framework

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion in this test only checks that View().Content is not nil, which is a weak guarantee and may not detect regressions in TUI behavior. Consider asserting on specific expected content or properties of the view to ensure the TUI responds correctly to simulated commands.

Recommendation:
Replace:

if view.Content == nil {
    t.Error("View content should not be nil")
}

with a more robust assertion, e.g.:

if !strings.Contains(view.String(), "expected help text") {
    t.Error("/help command did not produce expected output")
}

This will improve test coverage and reliability.

-- Note: Only drop if no other schemas depend on it
-- ============================================================================

DROP EXTENSION IF EXISTS vector;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropping the vector extension unconditionally with DROP EXTENSION IF EXISTS vector; may cause issues if other schemas or database objects depend on it. The script only provides a comment warning (lines 41-42), but does not enforce any dependency checks. Recommended solution: Before dropping the extension, implement logic to check for dependent objects or manually verify that no other parts of the database require the extension. Alternatively, consider removing the extension drop from the migration and handling extension management separately.

Comment on lines +29 to +30
CREATE INDEX idx_documents_metadata_gin
ON documents USING GIN (metadata jsonb_path_ops);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GIN index on the metadata column uses jsonb_path_ops, which only supports a limited set of queries and does not support the @> operator commonly used for containment queries. For broader compatibility and to ensure efficient use of the index with @> queries, consider using jsonb_ops instead:

CREATE INDEX idx_documents_metadata_gin
    ON documents USING GIN (metadata jsonb_ops);

This change will improve query performance and compatibility for typical JSONB containment queries.

Comment on lines +96 to +97
CREATE INDEX idx_message_content_gin
ON message USING GIN (content jsonb_path_ops);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GIN index on the content column uses jsonb_path_ops, which may not support all query patterns, especially those using the @> operator for containment. For maximum compatibility and performance with typical JSONB queries, use jsonb_ops:

CREATE INDEX idx_message_content_gin
    ON message USING GIN (content jsonb_ops);

This will ensure that queries like WHERE content @> ... can efficiently utilize the index.

Comment on lines 156 to 163
return nil
}

resp, err := framework.Agent.ExecuteStream(ctx, sessionID, branch,
resp, err := framework.Agent.ExecuteStream(ctx, sessionID,
"Write a very long story",
false,
callback,
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the context cancellation test (TestChatAgent_StreamingContextCancellation), the callback cancels the context after the first chunk, but the test does not assert that only one chunk is processed post-cancellation. If the streaming implementation does not respect context cancellation immediately, additional chunks could be processed, and the test would not detect this.

Recommendation: Add an assertion after streaming completes to verify that the number of processed chunks matches the expected value (e.g., assert.Equal(t, 1, chunks, "Should process only one chunk after cancellation")).

Comment on lines 78 to 86
defer cleanup()

t.Run("handles empty input gracefully", func(t *testing.T) {
ctx, sessionID, branch := newInvocationContext(context.Background(), framework.SessionID)
ctx, sessionID := newInvocationContext(context.Background(), framework.SessionID)

resp, err := framework.Agent.Execute(ctx, sessionID, branch, "")
resp, err := framework.Agent.Execute(ctx, sessionID, "")
// Should handle empty input without crashing
// Either returns error or empty response
if err == nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling test for empty input allows both error and non-error cases to pass:

if err == nil {
    assert.NotNil(t, resp)
}

This makes the test non-deterministic and may mask unexpected behavior. It is recommended to explicitly assert the expected outcome (either an error or a specific response) to ensure the agent's behavior is well-defined and the test is reliable.

Recommended solution:
Decide on the expected behavior for empty input (e.g., should always return an error, or always return a specific response), and assert accordingly:

require.Error(t, err)

or

require.NoError(t, err)
assert.NotNil(t, resp)
assert.NotEmpty(t, resp.FinalText)

Comment on lines 163 to 169
var wg sync.WaitGroup
wg.Add(numConcurrentQueries)

ctx, sessionID, branch := newInvocationContext(context.Background(), framework.SessionID)
ctx, sessionID := newInvocationContext(context.Background(), framework.SessionID)

// Collect results safely
type result struct {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the concurrent execution test, the same ctx and sessionID are shared across all goroutines. If the underlying agent or session store is not designed to be thread-safe, this could lead to data races or inconsistent state during concurrent access.

Recommended solution:
Verify that the agent and session store are thread-safe for concurrent use. If not, consider isolating sessionIDs per goroutine or using a session store implementation that guarantees safe concurrent access.

Comment on lines +29 to +52
func retryableError(err error) bool {
if err == nil {
return false
}

errStr := err.Error()

// Rate limit errors - always retry
if containsAny(errStr, "rate limit", "quota exceeded", "429") {
return true
}

// Transient server errors - retry
if containsAny(errStr, "500", "502", "503", "504", "unavailable") {
return true
}

// Network errors - retry
if containsAny(errStr, "connection reset", "timeout", "temporary") {
return true
}

return false
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brittle error classification in retryableError

The retryableError function uses substring matching on error messages to determine retryability. This approach is fragile and may not reliably capture all transient or retryable errors, especially if error messages change or are localized. It may also misclassify errors, leading to either unnecessary retries or missed retry opportunities.

Recommended solution:

  • Where possible, use error types or error wrapping (e.g., errors.Is, errors.As) to classify errors more robustly.
  • If substring matching is necessary, consider centralizing error codes or patterns and documenting them clearly.

Comment on lines +82 to +86
if c.rateLimiter != nil {
if err := c.rateLimiter.Wait(ctx); err != nil {
return nil, fmt.Errorf("rate limit wait: %w", err)
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context cancellation not checked before rate limiter wait

In executeWithRetry, the context is only checked for cancellation during the backoff sleep (lines 119-124), but not before calling c.rateLimiter.Wait(ctx) or before executing the prompt. This can result in unnecessary waiting or prompt execution after the context has been canceled, wasting resources.

Recommended solution:

  • Check ctx.Err() before entering the rate limiter and before executing the prompt, and return early if the context is already canceled.

@Koopa0 Koopa0 deleted the refactor/arch branch December 22, 2025 05:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants