[PRIVACY]: Strip Sensitive Metadata and Creator Signatures from Output PDFs #315

@Cubix33

📝 Description

Currently, when FireForm generates a filled incident report, the final PDF inherits all of the hidden metadata stored in the original template. This metadata often contains sensitive or unprofessional information, such as:

  • The Original Author's Name: e.g., a specific developer's or student's name.
  • Creation Software: Signatures like "Google Docs Renderer" or "Microsoft Word".
  • Legacy Timestamps: Creation dates from years ago that do not match the actual incident date.

In a professional emergency services context, these "digital fingerprints" are a privacy risk and can compromise the legal "Chain of Custody" for official records.

💡 Rationale

We should implement a "Metadata Scrubbing" step in the PDF generation pipeline. Before the final file is saved, the system will programmatically wipe the PDF's internal Info dictionary and replace it with a standardized, anonymous profile. This ensures every report produced by FireForm is free of inherited author names, software signatures, and stale timestamps, and is ready for legal use.

🛠️ Proposed solution

I will modify the fill_form method to reset the document properties using pdfrw.

  • Initialize a Clean Dictionary: Use PdfDict() to create a fresh, empty metadata container.
  • Inject Standardized Keys: Use PdfName keys to set professional values for /Title and /Author (e.g., "FireForm Automated Report").
  • Sanitize Producer Info: Ensure the /Producer and /Creator fields no longer point to third-party software like Google Docs.

✅ Acceptance Criteria

  • Anonymity Verified: Checking "Document Properties" in a browser or using pdfinfo shows no personal names or software signatures.
  • Consistent Branding: The internal PDF title is standardized across all generated reports regardless of the source template.
  • Zero Corruption: The output PDF remains perfectly readable and the form fields remain filled, proving that the scrubbing did not break the file structure.
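
The "Anonymity Verified" criterion could also be checked automatically alongside a manual pdfinfo inspection. The sketch below is one possible approach, not part of the proposal itself: the helper names read_info and find_metadata_leaks are hypothetical, and the banned list only contains the two software signatures named in this issue, so a real check would need to add the personal names it wants to rule out.

```python
# Signatures mentioned in this issue; extend with names that must never appear.
BANNED_SIGNATURES = ("google docs", "microsoft word")


def read_info(path):
    """Load a PDF's Info dictionary as plain {key: text} pairs using pdfrw."""
    # Imported here so the checker below can be used without pdfrw installed.
    from pdfrw import PdfReader

    info = PdfReader(path).Info or {}
    return {
        str(key): value.decode() if hasattr(value, "decode") else str(value)
        for key, value in info.items()
    }


def find_metadata_leaks(info, banned=BANNED_SIGNATURES):
    """Return the sorted Info keys whose values contain a banned signature."""
    return sorted(
        key
        for key, value in info.items()
        if any(term in value.lower() for term in banned)
    )
```

A test could then assert that find_metadata_leaks(read_info(output_path)) returns an empty list for every generated report, where output_path is whatever path fill_form writes to.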
