Integrate top 6 tools from Accountability Agent pipeline into Bill Translator by Copilot · Pull Request #15 · Leerrooy95/Bill_Translator

Copilot · 2026-04-03T00:32:11Z

…hecker, CI/CD, security headers, enhanced gitignore; add CLAUDE.md; delete tar.xz Agent-Logs-Url: https://github.com/Leerrooy95/Bill_Translator/sessions/0ca699cf-bacd-46ef-800d-2249ad9ef4fe Co-authored-by: YouGotCuboned <261487993+YouGotCuboned@users.noreply.github.com>

…ion, add CSP comment Agent-Logs-Url: https://github.com/Leerrooy95/Bill_Translator/sessions/0ca699cf-bacd-46ef-800d-2249ad9ef4fe Co-authored-by: YouGotCuboned <261487993+YouGotCuboned@users.noreply.github.com>

Agent-Logs-Url: https://github.com/Leerrooy95/Bill_Translator/sessions/0ca699cf-bacd-46ef-800d-2249ad9ef4fe Co-authored-by: YouGotCuboned <261487993+YouGotCuboned@users.noreply.github.com>

Copilot

Pull request overview

This PR integrates additional “Accountability Agent” pipeline capabilities into the Bill Translator Flask app, adding secure multi-format document ingestion and a Brave Search + Claude fact-checking feature, alongside centralized configuration, documentation, and CI validation.

Changes:

Added centralized Flask configuration (config.py) and applied it in web_app.py.
Introduced PDF/Markdown upload processing (document_processor.py) and a new /fact-check endpoint (fact_checker.py).
Expanded documentation, CI workflow, dependencies, and test coverage for the new modules and security headers.

Reviewed changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
web_app.py	Uses centralized Config, adds security headers, supports `.pdf/.md` upload processing, and adds `/fact-check` route
document_processor.py	New ingestion/extraction helper for txt/pdf/md with hashed filenames and basic path validation
fact_checker.py	New Brave Search + Claude claim verification utility
config.py	New centralized configuration for uploads, sessions, model defaults, etc.
tests.py	Adds unit tests for document processor, fact checker, config, and security headers
templates/index.html	Updates file input accept list and UI copy to include `.pdf`/`.md`
requirements.txt	Adds `pdfplumber` and `requests`, reorganizes dependency list
README.md	Documents new features (PDF/MD support, fact-checking, security/CI additions)
CLAUDE.md	Adds comprehensive developer/setup guide and repo conventions
.github/workflows/validate.yml	Adds CI job to compile-check, test, scan for secrets, and validate templates
.gitignore	Expands ignore rules for secrets, uploads, outputs, venvs, and tooling artifacts
.env.example	Documents additional optional env vars (upload size, model, Brave key, etc.)

Comments suppressed due to low confidence (1)

web_app.py:118

This validation error message still says “upload a .txt file”, but the app now accepts .pdf and .md as well. Updating the message will avoid confusing users when they submit an empty form.

    elif request.form.get("bill_text", "").strip():
        raw_text = request.form["bill_text"].strip()
    else:
        flash("Please upload a .txt file or paste bill text.", "error")
        return redirect(url_for("index"))
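One way to keep this message from going stale again is to build it from the extension allowlist instead of hardcoding it. A minimal sketch, assuming a hypothetical helper `empty_form_message` (not part of the PR) and the dot-prefixed set shown in the `web_app.py` diff:

```python
# Mirrors the dot-prefixed allowlist from the web_app.py diff in this PR.
ALLOWED_EXTENSIONS = {".txt", ".pdf", ".md"}


def empty_form_message(extensions):
    """Hypothetical helper: derive the flash message from the allowlist."""
    names = sorted(extensions)
    if len(names) == 1:
        listed = names[0]
    else:
        # Join all but the last with commas, then add "or <last>".
        listed = ", ".join(names[:-1]) + " or " + names[-1]
    return f"Please upload a {listed} file or paste bill text."


print(empty_form_message(ALLOWED_EXTENSIONS))
```

With this shape, adding a new extension to the allowlist updates the user-facing copy automatically.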

Copilot · 2026-04-03T00:45:33Z

-ALLOWED_EXTENSIONS = {".txt"}
-MAX_CONTENT_LENGTH = 2 * 1024 * 1024  # 2 MB
-app.config["MAX_CONTENT_LENGTH"] = MAX_CONTENT_LENGTH
+ALLOWED_EXTENSIONS = {".txt", ".pdf", ".md"}


ALLOWED_EXTENSIONS is redefined here with dot-prefixed extensions, while Config.ALLOWED_EXTENSIONS and document_processor.ALLOWED_EXTENSIONS use non-dot extensions. This duplication risks inconsistencies (e.g., updating one list but not the others) and makes it unclear which source of truth is authoritative. Prefer deriving the web layer’s allowlist from Config (or reusing document_processor.allowed_file) to keep extension policy centralized.

Suggested change

-ALLOWED_EXTENSIONS = {".txt", ".pdf", ".md"}
+ALLOWED_EXTENSIONS = {
+    f".{ext.lower().lstrip('.')}" for ext in Config.ALLOWED_EXTENSIONS
+}
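To illustrate how the derived allowlist might be used at the upload check, here is a small sketch. The `Config` class below is a stand-in with assumed contents (extensions stored without a leading dot, as the comment describes), and `dotted_extensions` and `is_allowed` are hypothetical names, not functions from the PR:

```python
import os


class Config:
    # Assumed shape: extensions without a leading dot.
    ALLOWED_EXTENSIONS = {"txt", "pdf", "md"}


def dotted_extensions(extensions):
    """Normalize to lowercase, dot-prefixed form, tolerating either input style."""
    return {f".{ext.lower().lstrip('.')}" for ext in extensions}


def is_allowed(filename, extensions=None):
    """Return True when the filename's extension is in the central allowlist."""
    if extensions is None:
        extensions = Config.ALLOWED_EXTENSIONS
    _, ext = os.path.splitext(filename)
    return ext.lower() in dotted_extensions(extensions)
```

Because the web layer only derives its set from `Config`, a change to the central list is picked up everywhere without a second edit.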

Copilot · 2026-04-03T00:45:34Z

+    if not claim:
+        return jsonify({"error": "No claim provided."}), 400
+
+    brave_key = request.form.get("brave_key", "").strip() or session.get("brave_key")


The repo documents BRAVE_SEARCH_API_KEY in CLAUDE.md/.env.example, but the route only accepts a key from the form or session. Consider falling back to the server-side env var (e.g., os.environ.get("BRAVE_SEARCH_API_KEY")) when no key is provided, so the documented deployment path works without requiring users to paste the key each session.

Suggested change

-    brave_key = request.form.get("brave_key", "").strip() or session.get("brave_key")
+    brave_key = (
+        request.form.get("brave_key", "").strip()
+        or session.get("brave_key")
+        or os.environ.get("BRAVE_SEARCH_API_KEY", "").strip()
+    )
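The lookup order in this suggestion (form, then session, then environment) can be isolated in a small function so it is testable without a Flask request. A sketch under stated assumptions: `resolve_brave_key` is a hypothetical name, and plain dicts stand in for `request.form`, `session`, and `os.environ`:

```python
import os


def resolve_brave_key(form, session, environ=os.environ):
    """Return the first non-empty key from form, session, then environment."""
    candidates = (
        form.get("brave_key", ""),
        session.get("brave_key", ""),
        environ.get("BRAVE_SEARCH_API_KEY", ""),
    )
    for value in candidates:
        value = (value or "").strip()
        if value:
            return value
    # No key anywhere: the caller can respond with a 400.
    return None
```

If the session lookup is dropped, as a separate comment on this PR recommends, only the middle candidate needs to be removed.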

Copilot · 2026-04-03T00:45:34Z

+    brave_key = request.form.get("brave_key", "").strip() or session.get("brave_key")
+    if not brave_key:
+        return jsonify({"error": "Brave Search API key is required for fact-checking."}), 400
+
+    user_api_key = request.form.get("api_key", "").strip() or session.get("api_key")
+    if not user_api_key:
+        return jsonify({"error": "Anthropic API key is required."}), 400
+
+    # Store keys in session for convenience
+    session["brave_key"] = brave_key
+    session["api_key"] = user_api_key
+


Storing api_key/brave_key in Flask’s default session means the keys end up in the client-side session cookie (signed but not encrypted). That conflicts with the “never saved”/BYOK security model and increases exposure risk if cookies are leaked (or if server logs/debug tools capture them). Prefer not persisting these secrets in the session at all (require them per request), or switch to a server-side session backend (e.g., Redis/filesystem via Flask-Session) and ensure cookies are Secure/HttpOnly with appropriate rotation.

Suggested change

-    brave_key = request.form.get("brave_key", "").strip() or session.get("brave_key")
-    if not brave_key:
-        return jsonify({"error": "Brave Search API key is required for fact-checking."}), 400
-
-    user_api_key = request.form.get("api_key", "").strip() or session.get("api_key")
-    if not user_api_key:
-        return jsonify({"error": "Anthropic API key is required."}), 400
-
-    # Store keys in session for convenience
-    session["brave_key"] = brave_key
-    session["api_key"] = user_api_key
+    brave_key = request.form.get("brave_key", "").strip()
+    if not brave_key:
+        return jsonify({"error": "Brave Search API key is required for fact-checking."}), 400
+
+    user_api_key = request.form.get("api_key", "").strip()
+    if not user_api_key:
+        return jsonify({"error": "Anthropic API key is required."}), 400
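To make the "signed but not encrypted" point concrete, the sketch below builds a toy signed cookie with the standard library. This is a simplified stand-in and not Flask's actual cookie format; the function names and the example key are hypothetical. It only shows the general property that a signature prevents tampering while leaving the payload readable by anyone who holds the cookie:

```python
import base64
import hashlib
import hmac
import json


def sign_cookie(data, secret):
    """Toy signed cookie: base64 JSON payload plus an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode()).decode()
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def read_without_secret(cookie):
    """Decode the payload without knowing the signing secret."""
    payload, _sig = cookie.rsplit(".", 1)
    return json.loads(base64.urlsafe_b64decode(payload))


def verify(cookie, secret):
    """Check integrity; this is the only step that needs the secret."""
    payload, sig = cookie.rsplit(".", 1)
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)


cookie = sign_cookie({"api_key": "sk-example"}, b"server-secret")
print(read_without_secret(cookie))  # the key is visible without the secret
```

Requiring the keys per request, or moving to a server-side session store, avoids placing the secret in the cookie at all.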

Copilot · 2026-04-03T00:45:34Z

+    # Session
+    SESSION_COOKIE_HTTPONLY = True
+    SESSION_COOKIE_SAMESITE = "Lax"
+    SESSION_COOKIE_SECURE = os.environ.get("FLASK_ENV") == "production"


SESSION_COOKIE_SECURE is derived from FLASK_ENV, which is deprecated/removed in newer Flask versions (including Flask 3.x). In practice this can silently leave cookies non-Secure in production unless the deploy explicitly sets FLASK_ENV=production. Prefer controlling this via an explicit env var (e.g., SESSION_COOKIE_SECURE) with a safe default, or set it unconditionally in production deployments where HTTPS is enforced.

Suggested change

-    SESSION_COOKIE_SECURE = os.environ.get("FLASK_ENV") == "production"
+    SESSION_COOKIE_SECURE = os.environ.get("SESSION_COOKIE_SECURE", "true").strip().lower() in {
+        "1",
+        "true",
+        "t",
+        "yes",
+        "y",
+        "on",
+    }
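If more than one boolean setting ends up being read this way, the parsing could be factored into a helper. A minimal sketch, assuming a hypothetical `env_flag` function (not part of the PR) that uses the same truthy set as the suggestion and defaults to secure:

```python
import os

_TRUTHY = {"1", "true", "t", "yes", "y", "on"}


def env_flag(name, default="true", environ=os.environ):
    """Parse a boolean environment variable; unset falls back to `default`."""
    return environ.get(name, default).strip().lower() in _TRUTHY


class Config:
    # Secure unless the deployment explicitly opts out (e.g. local HTTP dev).
    SESSION_COOKIE_SECURE = env_flag("SESSION_COOKIE_SECURE")
```

With this default, local development over plain HTTP would need `SESSION_COOKIE_SECURE=false` set explicitly, which is the safer failure mode.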

Copilot · 2026-04-03T00:45:35Z

+_BRAVE_SEARCH_URL = "https://api.search.brave.com/res/v1/web/search"
+
+
+def _brave_search(query, api_key, count=5):
+    """Search Brave for evidence related to a claim."""
+    headers = {
+        "Accept": "application/json",
+        "Accept-Encoding": "gzip",
+        "X-Subscription-Token": api_key,
+    }
+    params = {"q": query, "count": count}


This module hardcodes the Brave Search URL and default result count, while similar values are already present in Config (BRAVE_SEARCH_URL, BRAVE_MAX_RESULTS). Using the centralized config (or accepting these as parameters) would reduce duplication and make it easier to adjust endpoints/limits in one place.
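A sketch of what reading these values from the central config could look like. `BRAVE_SEARCH_URL` and `BRAVE_MAX_RESULTS` are the attribute names mentioned above; the value `5` is an assumption taken from the `count=5` default in the diff, and `build_brave_request` is a hypothetical helper that only assembles the request without sending it:

```python
from urllib.parse import urlencode


class Config:
    BRAVE_SEARCH_URL = "https://api.search.brave.com/res/v1/web/search"
    BRAVE_MAX_RESULTS = 5  # assumed value, matching count=5 in the diff


def build_brave_request(query, api_key, url=None, count=None):
    """Return (full_url, headers), falling back to Config at call time."""
    url = url or Config.BRAVE_SEARCH_URL
    count = Config.BRAVE_MAX_RESULTS if count is None else count
    headers = {
        "Accept": "application/json",
        "Accept-Encoding": "gzip",
        "X-Subscription-Token": api_key,
    }
    return f"{url}?{urlencode({'q': query, 'count': count})}", headers
```

Resolving the defaults inside the function body, rather than in the signature, means a test or deployment that overrides `Config` is respected on the next call.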

Copilot · 2026-04-03T00:45:35Z

+      - name: Install dependencies
+        run: |
+          pip install -r requirements.txt
+          pip install requests


pip install -r requirements.txt already installs requests (and pins its minimum version). The extra pip install requests is redundant and can be removed to keep CI steps minimal and deterministic.

Suggested change

-          pip install requests

Copilot · 2026-04-03T00:45:35Z

+        name1 = secure_filename_hash("same.pdf")
+        name2 = secure_filename_hash("same.pdf")


This test is probabilistic: it expects two calls to secure_filename_hash() to always differ, but the implementation uses randomness (secrets.token_hex) so collisions are theoretically possible and can create rare flaky failures. Consider patching secrets.token_hex (or injecting a deterministic RNG) to make the test fully deterministic while still asserting that the function uses the token value and preserves the extension.

Suggested change

-        name1 = secure_filename_hash("same.pdf")
-        name2 = secure_filename_hash("same.pdf")
+        with patch(
+            "document_processor.secrets.token_hex",
+            side_effect=[
+                "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
+                "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
+            ],
+        ):
+            name1 = secure_filename_hash("same.pdf")
+            name2 = secure_filename_hash("same.pdf")
+        self.assertEqual(name1, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.pdf")
+        self.assertEqual(name2, "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb.pdf")
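For reference, a self-contained version of the same idea that runs outside the repository. The `secure_filename_hash` below is a hypothetical stand-in (random hex token plus the original extension); the real implementation in `document_processor.py` may differ. Because the stand-in lives in the same file, the patch target here is `secrets.token_hex` rather than `document_processor.secrets.token_hex`:

```python
import os
import secrets
import unittest
from unittest.mock import patch


def secure_filename_hash(filename):
    """Hypothetical stand-in: 32-char random hex token plus the extension."""
    _, ext = os.path.splitext(filename)
    return f"{secrets.token_hex(16)}{ext.lower()}"


class SecureFilenameHashTest(unittest.TestCase):
    def test_uses_token_and_keeps_extension(self):
        # side_effect feeds one fixed token per call, removing the randomness.
        with patch("secrets.token_hex", side_effect=["a" * 32, "b" * 32]):
            name1 = secure_filename_hash("same.pdf")
            name2 = secure_filename_hash("same.pdf")
        self.assertEqual(name1, "a" * 32 + ".pdf")
        self.assertEqual(name2, "b" * 32 + ".pdf")
```

The test still checks that two calls yield different names, but the difference now comes from the injected tokens instead of chance.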

Copilot AI and others added 3 commits April 3, 2026 00:28

Address code review: fix score-only PDF support, update requests vers…

77ba1cb

…ion, add CSP comment Agent-Logs-Url: https://github.com/Leerrooy95/Bill_Translator/sessions/0ca699cf-bacd-46ef-800d-2249ad9ef4fe Co-authored-by: YouGotCuboned <261487993+YouGotCuboned@users.noreply.github.com>

Fix stack trace exposure in error responses

34b69ca

Agent-Logs-Url: https://github.com/Leerrooy95/Bill_Translator/sessions/0ca699cf-bacd-46ef-800d-2249ad9ef4fe Co-authored-by: YouGotCuboned <261487993+YouGotCuboned@users.noreply.github.com>

Copilot AI assigned Copilot and YouGotCuboned Apr 3, 2026

Copilot created this pull request from a session on behalf of YouGotCuboned April 3, 2026 00:33 View session

Copilot AI requested a review from YouGotCuboned April 3, 2026 00:33

Copilot finished work on behalf of YouGotCuboned April 3, 2026 00:33

Leerrooy95 marked this pull request as ready for review April 3, 2026 00:41

Copilot AI review requested due to automatic review settings April 3, 2026 00:41

Leerrooy95 merged commit b5a2216 into main Apr 3, 2026
2 checks passed

Copilot started reviewing on behalf of Leerrooy95 April 3, 2026 00:42 View session

Copilot AI reviewed Apr 3, 2026

Leerrooy95 merged 3 commits into main from copilot/update-template-with-tools-again

4 participants