ML Intern Backlog Prioritization
Generated: 2026-05-04T13:18:56.926357+00:00
Model: anthropic/claude-opus-4-6
Sources: github_issue=24, github_pr=50, hf_discussion=14
Summary
Critical security PRs (#96, #85, #83) and agent-quality fix (#88) are reviewed and ready to merge. Hosted Space has multiple reliability regressions (session zombies, Pro detection, onboarding CSS, blank pages) blocking paying users. Provider expansion (local models, Bedrock, Azure, Gemini) is the top feature theme with 6+ PRs. Cost guardrails and session durability are strategic priorities.
Can Be Closed
No high-confidence resolved-in-main candidates found.
Highest Impact Next
Merge 3 security PRs: CVE patch, sandbox auth, SSRF fix (impact 5/5, effort 1/5, confidence 0.95)
Merge thinking_blocks fix (PR #88, "fix: thread thinking_blocks through history so extended thinking survives tool turns") (impact 5/5, effort 1/5, confidence 0.95)
Fix onboarding CSS overflow blocking Start button (impact 5/5, effort 1/5, confidence 0.95)
Recommendation: Change onboarding container overflow:hidden to overflow-y:auto — root cause fully documented with screenshots and confirmed fix
Rationale: Completely blocks first-time users on 1366x768 screens; users must open DevTools to use the product
Next action: Push 1-line CSS fix to main
Sources: hf_discussion#19
Fix MCP tool stall (hub_repo_details hangs 5+ min) (impact 5/5, effort 2/5, confidence 0.9)
Fix session zombie state and quota cleanup (impact 5/5, effort 3/5, confidence 0.85)
Fix Pro plan detection and entitlement checks (impact 4/5, effort 2/5, confidence 0.85)
Merge Bedrock region prefix fix (PR #185, "fix(research): inherit main model's Bedrock region prefix in sub-agent") (impact 4/5, effort 1/5, confidence 0.9)
Add LICENSE file to unblock adoption (impact 4/5, effort 1/5, confidence 0.9)
Recommendation: Add Apache 2.0 LICENSE file — 5 thumbs-up, enterprise users explicitly blocked from testing
Rationale: No license = legal blocker for enterprises and external contributors; 5 reactions + multiple +1 comments
Next action: Add LICENSE file and README badge; use PR #178 ("License and Citation") as reference
Sources: github_issue#41, github_pr#178
Features
Local model support (Ollama/vLLM/OpenAI-compat) (impact 5/5, effort 3/5, confidence 0.75)
Provider adapter refactor + Bedrock/Azure/Gemini (impact 4/5, effort 3/5, confidence 0.75)
Image, file, and dataset attachments for web+CLI (impact 4/5, effort 4/5, confidence 0.6)
OpenRouter / custom OpenAI-compatible base URL (impact 4/5, effort 1/5, confidence 0.8)
Recommendation: Implement OPENAI_BASE_URL passthrough — minimal change, follows ecosystem convention, unlocks 300+ models
Next action: Accept a small PR respecting OPENAI_BASE_URL env var
Sources: github_issue#197, github_pr#188
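A minimal sketch of what the OPENAI_BASE_URL passthrough could look like, assuming the client base URL is resolved in one place. The helper name `resolve_base_url` and the default constant are hypothetical and not taken from the repository.

```python
import os

# OpenAI's public endpoint, used when no override is present.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(env=None):
    """Return OPENAI_BASE_URL when set and non-empty, else the default.

    `env` is injectable for testing; it defaults to the process environment.
    """
    env = os.environ if env is None else env
    value = (env.get("OPENAI_BASE_URL") or "").strip()
    # Strip a trailing slash so later path joins do not produce "//".
    return value.rstrip("/") if value else DEFAULT_BASE_URL
```

Usage: `resolve_base_url({"OPENAI_BASE_URL": "https://openrouter.ai/api/v1/"})` returns the OpenRouter endpoint without the trailing slash, and an unset or empty variable falls back to the default.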
Cost guardrails: prompt caching, iteration cap, research concurrency (impact 5/5, effort 4/5, confidence 0.85)
Recommendation: Sprint on P0 items: add cache_control markers, lower max_iterations 300→40, cap concurrent research subagents, default cheaper research model
Next action: Internal sprint; start with max_iterations cap and prompt caching
Sources: github_issue#61
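The two cheapest guardrails above (iteration cap and research concurrency cap) can be sketched as follows. This is an illustration under assumptions: `run_agent_loop`, `run_research`, and the concurrency value of 2 are hypothetical, and only the 40-iteration cap comes from the recommendation.

```python
import asyncio

MAX_ITERATIONS = 40         # cap proposed above (down from 300)
MAX_RESEARCH_SUBAGENTS = 2  # hypothetical value; the source gives no number


def run_agent_loop(step, max_iterations=MAX_ITERATIONS):
    """Call step() until it returns True; return the number of iterations used.

    Raises RuntimeError instead of looping past the cap.
    """
    for iteration in range(1, max_iterations + 1):
        if step():
            return iteration
    raise RuntimeError(f"stopped after {max_iterations} iterations without finishing")


async def run_research(tasks, worker, limit=MAX_RESEARCH_SUBAGENTS):
    """Run worker(task) for every task with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(task):
        async with semaphore:
            return await worker(task)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(task) for task in tasks))
```

Raising on the cap, rather than returning silently, keeps a runaway session visible to whatever layer reports cost to the user.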
Expand pre-flight checks for hf_jobs approval (impact 4/5, effort 2/5, confidence 0.7)
Recommendation: Add 4 static checks (timeout, hub_model_id, flash-attn, trackio) + credit pre-check to prevent wasted GPU spend
Next action: Accept PR for reliability_checks.py expansion; add test coverage
Sources: github_issue#203, github_issue#125
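One way the four static checks might look, as a sketch only. The source names the checks but not their logic, so the conditions below (substring matches on the job script and a dependency list) are assumptions, as is the function name `preflight_warnings`. It does not reflect the actual contents of reliability_checks.py.

```python
from typing import List, Optional


def preflight_warnings(script: str, dependencies: List[str],
                       timeout: Optional[str]) -> List[str]:
    """Return warnings for a job before approval; an empty list means no flags.

    All four heuristics are illustrative guesses at what each check covers.
    """
    # Normalize "pkg==1.2" style pins down to the bare package name.
    deps = {d.split("==")[0].strip().lower() for d in dependencies}
    warnings = []
    if not timeout:
        warnings.append("timeout: no explicit job timeout set")
    if "push_to_hub" in script and "hub_model_id" not in script:
        warnings.append("hub_model_id: push_to_hub is used without hub_model_id")
    if "flash_attention_2" in script and "flash-attn" not in deps:
        warnings.append("flash-attn: flash_attention_2 requested but flash-attn is not a dependency")
    if "trackio" in script and "trackio" not in deps:
        warnings.append("trackio: script references trackio but it is not a dependency")
    return warnings
```

Returning warnings rather than raising lets the approval prompt show every problem at once, before any GPU time is spent.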
Notify user when agent needs input/approval (impact 3/5, effort 1/5, confidence 0.7)
Evaluation and benchmarking CLI (impact 4/5, effort 4/5, confidence 0.5)
Background sessions on Mongo control plane (impact 5/5, effort 5/5, confidence 0.7)
Recommendation: Fix P0 blockers (_enforce_gated_model_quota crash, reconnect 404) before merge; PR has 83 tests and thorough design
Next action: Author to fix remaining P0s; re-trigger automated review
Sources: github_pr#206
Claude Code project-mode / plugin support (impact 4/5, effort 4/5, confidence 0.5)
Opt-in LangFuse observability callback (impact 3/5, effort 2/5, confidence 0.65)
Frontend UX: example prompts, sidebar, copy/regenerate (impact 3/5, effort 3/5, confidence 0.8)
Fixes
CVE-2026-27962 authlib upgrade (impact 5/5, effort 1/5, confidence 0.9)
Recommendation: Fix the PR title mismatch (title says 1.6.9, the actual upgrade is to 1.7.0), vet the joserfc dependency, and merge urgently
Sources: github_pr#96
Sandbox API unauthenticated RCE (impact 5/5, effort 1/5, confidence 0.95)
Recommendation: Rebase and merge — reviewed APPROVE, adds bearer-token auth to all sandbox endpoints
Sources: github_pr#85
SSRF in fetch_hf_docs leaks HF token (impact 5/5, effort 1/5, confidence 0.95)
Recommendation: Merge — reviewed safe-to-merge, origin validation with good test coverage
Sources: github_pr#83
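For context, origin validation of the kind described can be sketched as below. This is not the code in PR #83: the allowlist contents and the function name `is_allowed_docs_url` are assumptions for illustration.

```python
from urllib.parse import urlsplit

# Assumption: only this host should ever receive the HF token.
ALLOWED_HOSTS = {"huggingface.co"}


def is_allowed_docs_url(url):
    """True only for https URLs whose host is exactly on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    # .hostname is lowercased and excludes any userinfo or port, so
    # "https://huggingface.co@evil.example/" resolves to "evil.example".
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

An exact-match allowlist rejects look-alike hosts such as `huggingface.co.evil.example`. Redirects would need the same check applied to each hop before the token is attached.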
Extended thinking drops after first tool call (impact 5/5, effort 1/5, confidence 0.95)
Recommendation: Merge — 51 lines, 1 file, 'SAFE TO MERGE, LOW RISK'
Sources: github_pr#88
Start session button hidden by CSS overflow (impact 5/5, effort 1/5, confidence 0.95)
Recommendation: 1-line CSS fix: overflow:hidden → overflow-y:auto on onboarding container
Sources: hf_discussion#19
Concurrent plan overwrites + CORS rejection on Spaces (impact 4/5, effort 1/5, confidence 0.8)
Read tool crashes on string offset/limit (impact 4/5, effort 1/5, confidence 0.9)
Recommendation: Merge after also fixing bash_handler timeout (same bug); reviewed 'ready to merge'
Sources: github_pr#110
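A sketch of the kind of argument coercion that avoids this crash, assuming the model sometimes sends numeric tool arguments as strings. The helper name `coerce_int` is hypothetical; the same helper could in principle cover the bash_handler timeout.

```python
def coerce_int(value, name, default=None, minimum=0):
    """Accept an int or a numeric string and return an int.

    None or "" yields `default`; anything else non-numeric raises ValueError
    with a message naming the offending argument.
    """
    if value is None or value == "":
        return default
    if isinstance(value, bool):
        # bool is a subclass of int; reject it explicitly.
        raise ValueError(f"{name} must be an integer, got {value!r}")
    try:
        number = int(str(value).strip())
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {value!r}") from None
    if number < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {number}")
    return number
```

Usage: `coerce_int("10", "offset")` returns the integer 10, while `coerce_int("abc", "limit")` raises a ValueError that can be reported back to the model instead of crashing the tool.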
HF Pro subscription not detected after upgrade (impact 4/5, effort 2/5, confidence 0.85)
Stopped sessions don't free quota; catch-up stuck (impact 5/5, effort 3/5, confidence 0.85)
Chat messages spontaneously disappear mid-session (impact 4/5, effort 3/5, confidence 0.65)
Dotenv ignored when shell has stale env var (impact 3/5, effort 1/5, confidence 0.8)
Recommendation: Confirm maintainer intent (ambiguous close comment); merge if acceptable — reviewed LGTM
Sources: github_pr#120
401 Client Error on sandbox creation (impact 4/5, effort 3/5, confidence 0.55)
Other / Watchlist
Add LICENSE file (Apache 2.0) (impact 4/5, effort 1/5)
Add CONTRIBUTING.md (impact 2/5, effort 1/5)
Update README model example + env var annotations (impact 2/5, effort 1/5)
Fix CI review workflow for fork PRs (impact 3/5, effort 1/5)
Recommendation: Rebase and merge — blocks automated review for all external contributors
Sources: github_pr#107
Restructure system prompt tone (impact 3/5, effort 2/5)
Recommendation: Re-add 3 dropped behavioral rules flagged in review before merging
Sources: github_pr#33
Use huggingface_hub.get_token() as fallback (impact 3/5, effort 1/5)
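A sketch of the fallback, assuming the token is currently read only from the HF_TOKEN environment variable. The wrapper name `resolve_hf_token` is hypothetical; `huggingface_hub.get_token()` is the library call named in the item.

```python
import os


def resolve_hf_token(env=None):
    """Prefer HF_TOKEN from the environment, else ask huggingface_hub.

    Returns None when neither source yields a token.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if token:
        return token
    try:
        # Imported lazily so the helper still works if the package is absent.
        from huggingface_hub import get_token
    except ImportError:
        return None
    # get_token() also picks up a token saved by a prior CLI login.
    return get_token()
```

With this in place, a user who has already logged in through the Hugging Face CLI would not need to export the variable separately.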
Clusters
Security vulnerabilities
Session reliability & state persistence
Pro/payment detection failures
Provider expansion (local, Azure, Bedrock, Gemini)
Summary: Users locked to Anthropic/OpenAI APIs; demand for local models, Bedrock, Azure, Gemini, OpenRouter across 10+ PRs and issues
Sources: github_issue#94, github_pr#166, github_pr#44, github_pr#55, github_pr#60, github_pr#66, github_pr#80, github_pr#131, github_pr#95, github_pr#182, github_issue#197, github_pr#188, github_pr#68
File & data uploads
Notifications when agent needs input
Agent stalls & cost runaway
Bedrock region & sub-agent routing
Summary: Hard-coded 'us.' prefix breaks research sub-agent for all non-US AWS regions; 17-line fix ready
Sources: github_issue#184, github_pr#185
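The idea behind the fix (inherit the main model's region prefix instead of hard-coding 'us.') can be sketched as below. This is not the code in PR #185. The function names, the prefix list, and the model ID shape (`route/prefix.vendor.model`) are assumptions, and the model names in the usage note are placeholders.

```python
# Assumption: a non-exhaustive set of Bedrock geography prefixes.
KNOWN_PREFIXES = ("us", "eu", "apac")


def region_prefix(model_id):
    """Return the region prefix of a model ID, or None if it has none."""
    name = model_id.rsplit("/", 1)[-1]  # drop an optional "bedrock/" style route
    head, sep, _ = name.partition(".")
    return head if sep and head in KNOWN_PREFIXES else None


def with_region_prefix(sub_model, main_model):
    """Give the sub-agent model the same region prefix as the main model."""
    prefix = region_prefix(main_model)
    if prefix is None:
        return sub_model  # main model is not region-scoped; leave as-is
    route, sep, name = sub_model.rpartition("/")
    existing = region_prefix(sub_model)
    if existing:
        name = name[len(existing) + 1:]  # strip the old prefix and its dot
    return f"{route}{sep}{prefix}.{name}"
```

Usage: with a main model of `bedrock/eu.vendor.main-model`, a sub-agent configured as `bedrock/us.vendor.small-model` becomes `bedrock/eu.vendor.small-model`.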
Sandbox creation & auth errors
Claude Code integration
Documentation & i18n
Observability & evaluation