revert: remove redundant retry functionality #5720

mldangelo · 2025-09-24T01:18:17Z

Summary

Remove the retry command and --retry-errors flag functionality that was added in #5647, as it is redundant with existing filtering capabilities.

Problem Analysis

The retry functionality provided two ways to re-run failed tests:

promptfoo retry <evalId> - retries ERROR results from a specific evaluation
promptfoo eval --retry-errors - retries ERROR results from latest evaluation

However, the same practical outcome can be achieved more simply with the existing:

promptfoo eval --filter-errors-only <evalId>

Why the existing approach is better

Feature	Retry Commands	Existing --filter-errors-only
Database modification	❌ Deletes ERROR results	✅ No database changes
Implementation complexity	❌ High (deletion + metric recalculation)	✅ Low (simple filtering)
Data integrity	❌ Risk of corruption	✅ Preserves historical data
Result tracking	❌ Replaces deleted results	✅ Creates new evaluation
Same practical outcome	✅ Re-runs error tests	✅ Re-runs error tests

Changes Made

✅ Removed src/commands/retry.ts (entire retry command implementation)
✅ Removed --retry-errors flag and logic from src/commands/eval.ts
✅ Removed retry command registration from src/main.ts
✅ Removed retry documentation from CLI docs
✅ Removed retry testing examples from examples/retry-testing/
✅ Reverted schema changes in site/static/config-schema.json
✅ Reverted type changes in src/types/index.ts

Quality Assurance

✅ All linting and formatting checks pass (npm run l && npm run f)
✅ TypeScript compilation successful (npm run tsc)
✅ All existing tests pass (npm test)
✅ No test changes needed (retry functionality had no dedicated tests)

Implications

Users who were relying on the retry commands can achieve the same result with:

Instead of:

promptfoo retry <evalId>
# or
promptfoo eval --retry-errors

Use:

promptfoo eval --filter-errors-only <evalId>

This approach provides the same functionality while being simpler, safer, and more consistent with the existing codebase architecture.

Remove retry command and --retry-errors flag as they are redundant with existing --filter-errors-only capability. The retry functionality provided two ways to re-run failed tests: 1. promptfoo retry <evalId> 2. promptfoo eval --retry-errors However, the same outcome can be achieved more simply with: promptfoo eval --filter-errors-only <evalId> The existing filter approach is preferable because: - Simpler implementation without database manipulation - No risk of corrupting evaluation data - Preserves evaluation history - Same practical result (re-running error tests) Reverted changes from commit ab6ae2b: - Removed src/commands/retry.ts - Removed --retry-errors flag and logic from eval command - Removed retry command registration from main.ts - Removed retry documentation - Removed retry testing examples - Reverted schema and type changes

use-tusk · 2025-09-24T01:18:27Z

⏩ No test execution environment matched (5c65cd8) View output ↗

View check history

Commit	Status	Output	Created (UTC)
`d46e3dc`	⏩ No test execution environment matched	Output	Sep 24, 2025 1:18AM
`5c65cd8`	⏩ No test execution environment matched	Output	Sep 29, 2025 6:21AM

View output in GitHub ↗

coderabbitai · 2025-09-24T01:26:24Z

📝 Walkthrough

Walkthrough

This change removes the retry-errors functionality across the codebase. It deletes the retry CLI command (src/commands/retry.ts) and its wiring (src/main.ts), removes the --retry-errors option from eval (src/commands/eval.ts), and updates related schemas/types (site/static/config-schema.json, src/types/index.ts). Documentation is updated to remove the option and “Retry Errors” section (site/docs/usage/command-line.md). The retry-testing example and its files are removed (examples/retry-testing/*). Eval command logic is adjusted to rely on resumeRaw for flow control and option application, replacing prior resumeEval checks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title "revert: remove redundant retry functionality" is concise, accurately summarizes the main change (removal of the retry command and --retry-errors flag), and clearly communicates the primary intent to reviewers.
Description Check	✅ Passed	The PR description provides a clear summary, rationale, affected files, and QA steps that match the raw summary of changes, so it is directly related to and explanatory of the changeset.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/revert-retry-functionality

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 995fde4 and d46e3dc.

📒 Files selected for processing (9)

examples/retry-testing/README.md (0 hunks)
examples/retry-testing/errorProvider.js (0 hunks)
examples/retry-testing/promptfooconfig.yaml (0 hunks)
site/docs/usage/command-line.md (0 hunks)
site/static/config-schema.json (0 hunks)
src/commands/eval.ts (5 hunks)
src/commands/retry.ts (0 hunks)
src/main.ts (0 hunks)
src/types/index.ts (0 hunks)

💤 Files with no reviewable changes (8)

site/static/config-schema.json
examples/retry-testing/promptfooconfig.yaml
site/docs/usage/command-line.md
examples/retry-testing/errorProvider.js
examples/retry-testing/README.md
src/types/index.ts
src/commands/retry.ts
src/main.ts

🧰 Additional context used

📓 Path-based instructions (1)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Prefer not to introduce new TypeScript types; use existing interfaces whenever possible

**/*.{ts,tsx}: Follow consistent import order (Biome will handle sorting)
Use consistent curly braces for all control statements
Prefer const over let; avoid var
Use object shorthand syntax whenever possible
Use async/await for asynchronous code
Use consistent error handling with proper type checks

Files:

src/commands/eval.ts

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.606Z
Learning: Applies to site/src/pages/**/*.md : Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.

Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.606Z
Learning: Applies to site/blog/**/*.md : Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.

Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.606Z
Learning: Applies to site/src/pages/**/*.mdx : Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.

Learnt from: CR
PR: promptfoo/promptfoo#0
File: .cursor/rules/docusaurus.mdc:0-0
Timestamp: 2025-07-18T17:24:58.606Z
Learning: Applies to site/docs/**/*.md : Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.

🧬 Code graph analysis (1)

src/commands/eval.ts (1)

src/index.ts (1)

loadApiProvider (164-164)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: Redteam (Production API)
GitHub Check: Share Test
GitHub Check: Build Docs
GitHub Check: webui tests
GitHub Check: Redteam (Staging API)
GitHub Check: Run Integration Tests
GitHub Check: Test on Node 22.x and ubuntu-latest
GitHub Check: Test on Node 24.x and windows-latest
GitHub Check: Test on Node 24.x and ubuntu-latest
GitHub Check: Test on Node 22.x and macOS-latest
GitHub Check: Test on Node 22.x and windows-latest
GitHub Check: Test on Node 20.x and macOS-latest
GitHub Check: Test on Node 20.x and windows-latest
GitHub Check: Test on Node 20.x and ubuntu-latest
GitHub Check: Generate Assets
GitHub Check: Build on Node 24.x
GitHub Check: Build on Node 20.x
GitHub Check: Style Check
GitHub Check: Build on Node 22.x
GitHub Check: Analyze (javascript-typescript)

🔇 Additional comments (6)

src/commands/eval.ts (6)

292-304: Gating test filtering behind non-resume is correct

Skipping filters during resume preserves test indices and result alignment. Looks good.

316-321: Provider filtering correctly skipped during resume

Avoiding provider filtering on resume is consistent with preserving prior run topology. LGTM.

341-348: Grader provider injection conditioned on non-resume is reasonable

Setting defaultTest.options.provider only when not resuming avoids changing persisted behavior. LGTM.

349-355: Merging --var into defaultTest only when not resuming is correct

This prevents altering variable sets for resumed runs. LGTM.

356-358: generateSuggestions gating matches intent

Not enabling suggestions for a resumed run is consistent with reproducibility. LGTM.

699-765: Watch mode disabled during resume is appropriate

Avoids unexpected re-entry into a resumed run and aligns with reproducibility. LGTM.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

mldangelo requested a review from typpo September 29, 2025 06:20

Merge branch 'main' into feature/revert-retry-functionality

5c65cd8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

revert: remove redundant retry functionality #5720

revert: remove redundant retry functionality #5720

mldangelo commented Sep 24, 2025

Uh oh!

use-tusk bot commented Sep 24, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Sep 24, 2025

Walkthrough

Estimated code review effort

Uh oh!

Uh oh!

Uh oh!

revert: remove redundant retry functionality #5720

Are you sure you want to change the base?

revert: remove redundant retry functionality #5720

Conversation

mldangelo commented Sep 24, 2025

Summary

Problem Analysis

Why the existing approach is better

Changes Made

Quality Assurance

Implications

Uh oh!

use-tusk bot commented Sep 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot commented Sep 24, 2025

Walkthrough

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

Uh oh!

use-tusk bot commented Sep 24, 2025 •

edited

Loading