Skip to content

Upgrade: Handled Test cases Like Large Doc chunking One doc test case…#3

Open
techieworld2 wants to merge 1 commit intomainfrom
feature/test-cases-handling
Open

Upgrade: Handled Test cases Like Large Doc chunking One doc test case…#3
techieworld2 wants to merge 1 commit intomainfrom
feature/test-cases-handling

Conversation

@techieworld2
Copy link
Copy Markdown
Collaborator

… and prompt injection security

Copilot AI review requested due to automatic review settings February 8, 2026 18:01
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves SpecGap’s robustness when analyzing large documents and handling LLM output variability, while adding additional defenses against prompt injection and enabling a single-document “self-audit” flow when only one file is provided.

Changes:

  • Added safe JSON extraction/parsing utilities and updated engines to retry on parse failures.
  • Introduced document sanitization + standardized document-context wrapping in prompts to reduce prompt injection risk.
  • Added large-document chunking + map-reduce condensation, and smart comparison logic (single-doc audit vs cross-doc comparison).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
specgap/app/services/workflow.py Wraps council document context with a standardized document delimiter.
specgap/app/services/tech_engine.py Uses safe parsing + document-context wrapping for tech analysis prompts.
specgap/app/services/biz_engine.py Uses safe parsing + document-context wrapping for legal/business analysis prompts.
specgap/app/services/cross_check.py Uses safe parsing + wrapping; adds single-document audit and smart comparison routing.
specgap/app/services/parser.py Sanitizes extracted text to reduce prompt injection and control characters.
specgap/app/services/sanitizer.py New sanitizer + context wrapper utilities.
specgap/app/services/safe_parse.py New robust JSON extraction/repair and consistent parsing helper.
specgap/app/services/chunker.py New chunking + map-reduce condensation for very large documents.
specgap/app/services/init.py Exposes new utilities and comparison functions at services package level.
specgap/app/main.py Integrates condensation for council; switches deep/full-spectrum synthesis to smart comparison; adds single-doc support.
specgap/.env.example Removes the example env file.
Comments suppressed due to low confidence (1)

specgap/.env.example:1

  • This PR deletes .env.example, but the repository README still instructs users to run cp .env.example .env during setup. Either keep a maintained .env.example (recommended for onboarding) or update the README and any tooling/docs that reference it so new installs don’t break.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread specgap/app/main.py
Comment on lines 421 to 426
@app.post("/api/v1/documents/classify", tags=["Documents"])
async def classify_uploaded_document(
file: UploadFile = File(..., description="Document to classify")
):
"""
Classify a document to determine recommended analysis agents.

Useful for understanding what type of document you're uploading
before running a full analysis.
"""

await file.seek(0)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These endpoint functions no longer have docstrings. In FastAPI, the function docstring is used as the operation description in the generated OpenAPI docs; removing it reduces the usefulness of /docs and /redoc. Consider restoring the docstring (or providing description= in the decorator) for this endpoint.

Copilot uses AI. Check for mistakes.
Comment thread specgap/app/main.py
"""
List saved audit records with optional filtering.
"""

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation string for this endpoint was removed, which means the generated OpenAPI docs will no longer include an operation description here. If the project relies on Swagger/ReDoc for usability, consider restoring the docstring (or setting summary/description on the route decorator).

Suggested change
"""
List stored audits with optional filtering and pagination.
This endpoint returns a paginated list of audit records, which can be
filtered by audit type and risk level for easier browsing of history.
"""

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +72
matches = pattern.findall(cleaned)
if matches:
injection_count += len(matches)
cleaned = pattern.sub("[REDACTED-INSTRUCTION]", cleaned)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanitize_document_text uses pattern.findall(cleaned) to count matches and then pattern.sub(...), which allocates potentially large match lists for big documents (you’re sanitizing up to 500k chars). Using pattern.subn(...) would do the replacement and return the substitution count without building a list, reducing memory/time overhead.

Suggested change
matches = pattern.findall(cleaned)
if matches:
injection_count += len(matches)
cleaned = pattern.sub("[REDACTED-INSTRUCTION]", cleaned)
cleaned, num_subs = pattern.subn("[REDACTED-INSTRUCTION]", cleaned)
if num_subs:
injection_count += num_subs

Copilot uses AI. Check for mistakes.
"why_unrealistic": "Why this is infeasible"
}}
],
"completeness_score": 0-100,
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSON example in SINGLE_DOC_PROMPT contains non-JSON placeholders like "completeness_score": 0-100. Models often mirror the example, which can increase invalid-JSON outputs and parse retries. Use a valid JSON example value (e.g., an integer like 85) and describe the allowed range in prose instead.

Suggested change
"completeness_score": 0-100,
"completeness_score": 85,

Copilot uses AI. Check for mistakes.
Comment on lines +119 to +123
result = safe_parse_llm_response(
response.text,
expected_keys=["critical_gaps", "ambiguity_score"]
)

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_clean_json_response is now unused after migrating to safe_parse_llm_response. Consider removing the unused helper (or updating it to call into safe_parse) to keep a single, clear JSON-parsing path.

Copilot uses AI. Check for mistakes.
Comment on lines +112 to +124
prompt = (
f"You are a document analyst preparing content for {purpose}.\n"
f"This is section {idx + 1} of {len(chunks)} from a large document.\n\n"
"TASK: Extract and preserve ALL of the following from this section:\n"
"- Specific requirements, obligations, and commitments\n"
"- Financial terms, dates, deadlines, and SLAs\n"
"- Legal clauses, liability terms, and penalties\n"
"- Technical specifications and architecture decisions\n"
"- Any ambiguous or concerning language\n\n"
"Preserve EXACT QUOTES for important clauses. Be thorough — do not summarize.\n"
"Output a structured extraction, NOT a summary.\n\n"
f"--- SECTION {idx + 1}/{len(chunks)} ---\n{chunk}"
)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chunk content is injected directly into the summarization prompt (...\n{chunk}) without using wrap_as_document_context. Since this is untrusted document text, it can still contain prompt-injection strings that bypass the extraction task (even after earlier sanitization). Consider wrapping each chunk in the document-context delimiter (and/or sanitizing here) so the model is more likely to treat it as data.

Copilot uses AI. Check for mistakes.
Comment on lines +152 to +156
result = safe_parse_llm_response(
response.text,
expected_keys=["contradictions", "strategic_synthesis"]
)

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After switching to safe_parse_llm_response, _clean_json_response is no longer referenced in this module. To avoid dead code and confusion about which parser is canonical, consider removing _clean_json_response (or delegating to safe_parse from it) now that parsing is handled elsewhere.

Copilot uses AI. Check for mistakes.
Comment on lines +120 to +124
result = safe_parse_llm_response(
response.text,
expected_keys=["leverage_score", "trap_clauses"]
)

Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After switching to safe_parse_llm_response, _clean_json_response is no longer referenced in this file. Consider deleting it to reduce dead code and avoid confusion about which parser to use.

Copilot uses AI. Check for mistakes.
Comment on lines +214 to +216
# Sanitize extracted text to prevent prompt injection (Test Case 5)
if not text.startswith("Error:"):
text = sanitize_document_text(text, max_length=500000)
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sanitizer max length is hard-coded to 500,000 chars. This is a policy/limit that likely belongs in configuration (e.g., a Settings field) so it can be tuned per environment and stays consistent with MAX_FILE_SIZE_MB / MAX_CONTEXT_CHARS. Consider moving this value to settings and referencing it here.

Copilot uses AI. Check for mistakes.
# === MULTIPLE FILES: Real cross-document comparison ===
logger.info(f"{len(file_texts)} files detected, running cross-document comparison")

filenames = list(file_texts.keys())
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable filenames is not used.

Suggested change
filenames = list(file_texts.keys())

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants