Upgrade: Handled Test cases Like Large Doc chunking One doc test case… by techieworld2 · Pull Request #3 · aunraza19/SpecGap

techieworld2 · 2026-02-08T18:01:24Z

… and prompt injection security

Copilot

Pull request overview

This PR improves SpecGap’s robustness when analyzing large documents and handling LLM output variability, while adding additional defenses against prompt injection and enabling a single-document “self-audit” flow when only one file is provided.

Changes:

Added safe JSON extraction/parsing utilities and updated engines to retry on parse failures.
Introduced document sanitization + standardized document-context wrapping in prompts to reduce prompt injection risk.
Added large-document chunking + map-reduce condensation, and smart comparison logic (single-doc audit vs cross-doc comparison).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 13 comments.

Show a summary per file

File	Description
specgap/app/services/workflow.py	Wraps council document context with a standardized document delimiter.
specgap/app/services/tech_engine.py	Uses safe parsing + document-context wrapping for tech analysis prompts.
specgap/app/services/biz_engine.py	Uses safe parsing + document-context wrapping for legal/business analysis prompts.
specgap/app/services/cross_check.py	Uses safe parsing + wrapping; adds single-document audit and smart comparison routing.
specgap/app/services/parser.py	Sanitizes extracted text to reduce prompt injection and control characters.
specgap/app/services/sanitizer.py	New sanitizer + context wrapper utilities.
specgap/app/services/safe_parse.py	New robust JSON extraction/repair and consistent parsing helper.
specgap/app/services/chunker.py	New chunking + map-reduce condensation for very large documents.
specgap/app/services/init.py	Exposes new utilities and comparison functions at services package level.
specgap/app/main.py	Integrates condensation for council; switches deep/full-spectrum synthesis to smart comparison; adds single-doc support.
specgap/.env.example	Removes the example env file.

Comments suppressed due to low confidence (1)

specgap/.env.example:1

This PR deletes .env.example, but the repository README still instructs users to run cp .env.example .env during setup. Either keep a maintained .env.example (recommended for onboarding) or update the README and any tooling/docs that reference it so new installs don’t break.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-08T18:07:19Z

 @app.post("/api/v1/documents/classify", tags=["Documents"])
 async def classify_uploaded_document(
    file: UploadFile = File(..., description="Document to classify")
 ):
-    """
-    Classify a document to determine recommended analysis agents.
-
-    Useful for understanding what type of document you're uploading
-    before running a full analysis.
-    """
+
    await file.seek(0)


These endpoint functions no longer have docstrings. In FastAPI, the function docstring is used as the operation description in the generated OpenAPI docs; removing it reduces the usefulness of /docs and /redoc. Consider restoring the docstring (or providing description= in the decorator) for this endpoint.

Copilot · 2026-02-08T18:07:19Z

-    """
-    List saved audit records with optional filtering.
-    """
+


The documentation string for this endpoint was removed, which means the generated OpenAPI docs will no longer include an operation description here. If the project relies on Swagger/ReDoc for usability, consider restoring the docstring (or setting summary/description on the route decorator).

Suggested change

"""

List stored audits with optional filtering and pagination.

This endpoint returns a paginated list of audit records, which can be

filtered by audit type and risk level for easier browsing of history.

"""

Copilot · 2026-02-08T18:07:20Z

+        matches = pattern.findall(cleaned)
+        if matches:
+            injection_count += len(matches)
+            cleaned = pattern.sub("[REDACTED-INSTRUCTION]", cleaned)


sanitize_document_text uses pattern.findall(cleaned) to count matches and then pattern.sub(...), which allocates potentially large match lists for big documents (you’re sanitizing up to 500k chars). Using pattern.subn(...) would do the replacement and return the substitution count without building a list, reducing memory/time overhead.

Suggested change

matches = pattern.findall(cleaned)

if matches:

injection_count += len(matches)

cleaned = pattern.sub("[REDACTED-INSTRUCTION]", cleaned)

cleaned, num_subs = pattern.subn("[REDACTED-INSTRUCTION]", cleaned)

if num_subs:

injection_count += num_subs

Copilot · 2026-02-08T18:07:20Z

+            "why_unrealistic": "Why this is infeasible"
+        }}
+    ],
+    "completeness_score": 0-100,


The JSON example in SINGLE_DOC_PROMPT contains non-JSON placeholders like "completeness_score": 0-100. Models often mirror the example, which can increase invalid-JSON outputs and parse retries. Use a valid JSON example value (e.g., an integer like 85) and describe the allowed range in prose instead.

Suggested change

"completeness_score": 0-100,

"completeness_score": 85,

Copilot · 2026-02-08T18:07:20Z

+            result = safe_parse_llm_response(
+                response.text,
+                expected_keys=["critical_gaps", "ambiguity_score"]
+            )
+


_clean_json_response is now unused after migrating to safe_parse_llm_response. Consider removing the unused helper (or updating it to call into safe_parse) to keep a single, clear JSON-parsing path.

Copilot · 2026-02-08T18:07:21Z

+        prompt = (
+            f"You are a document analyst preparing content for {purpose}.\n"
+            f"This is section {idx + 1} of {len(chunks)} from a large document.\n\n"
+            "TASK: Extract and preserve ALL of the following from this section:\n"
+            "- Specific requirements, obligations, and commitments\n"
+            "- Financial terms, dates, deadlines, and SLAs\n"
+            "- Legal clauses, liability terms, and penalties\n"
+            "- Technical specifications and architecture decisions\n"
+            "- Any ambiguous or concerning language\n\n"
+            "Preserve EXACT QUOTES for important clauses. Be thorough — do not summarize.\n"
+            "Output a structured extraction, NOT a summary.\n\n"
+            f"--- SECTION {idx + 1}/{len(chunks)} ---\n{chunk}"
+        )


The chunk content is injected directly into the summarization prompt (...\n{chunk}) without using wrap_as_document_context. Since this is untrusted document text, it can still contain prompt-injection strings that bypass the extraction task (even after earlier sanitization). Consider wrapping each chunk in the document-context delimiter (and/or sanitizing here) so the model is more likely to treat it as data.

Copilot · 2026-02-08T18:07:21Z

+            result = safe_parse_llm_response(
+                response.text,
+                expected_keys=["contradictions", "strategic_synthesis"]
+            )
+


After switching to safe_parse_llm_response, _clean_json_response is no longer referenced in this module. To avoid dead code and confusion about which parser is canonical, consider removing _clean_json_response (or delegating to safe_parse from it) now that parsing is handled elsewhere.

Copilot · 2026-02-08T18:07:22Z

+            result = safe_parse_llm_response(
+                response.text,
+                expected_keys=["leverage_score", "trap_clauses"]
+            )
+


After switching to safe_parse_llm_response, _clean_json_response is no longer referenced in this file. Consider deleting it to reduce dead code and avoid confusion about which parser to use.

Copilot · 2026-02-08T18:07:22Z

+    # Sanitize extracted text to prevent prompt injection (Test Case 5)
+    if not text.startswith("Error:"):
+        text = sanitize_document_text(text, max_length=500000)


The sanitizer max length is hard-coded to 500,000 chars. This is a policy/limit that likely belongs in configuration (e.g., a Settings field) so it can be tuned per environment and stays consistent with MAX_FILE_SIZE_MB / MAX_CONTEXT_CHARS. Consider moving this value to settings and referencing it here.

Copilot · 2026-02-08T18:07:22Z

+    # === MULTIPLE FILES: Real cross-document comparison ===
+    logger.info(f"{len(file_texts)} files detected, running cross-document comparison")
+
+    filenames = list(file_texts.keys())


Variable filenames is not used.

Suggested change

filenames = list(file_texts.keys())

Upgrade: Handled Test cases Like Large Doc chunking One doc test case…

5b61041

… and prompt injection security

Copilot AI review requested due to automatic review settings February 8, 2026 18:01

Copilot started reviewing on behalf of techieworld2 February 8, 2026 18:01 View session

Copilot AI reviewed Feb 8, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upgrade: Handled Test cases Like Large Doc chunking One doc test case…#3

Upgrade: Handled Test cases Like Large Doc chunking One doc test case…#3
techieworld2 wants to merge 1 commit intomainfrom
feature/test-cases-handling

techieworld2 commented Feb 8, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Copilot AI Feb 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

+    """
+    List stored audits with optional filtering and pagination.
+    This endpoint returns a paginated list of audit records, which can be
+    filtered by audit type and risk level for easier browsing of history.
+    """

Conversation

techieworld2 commented Feb 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants