Skip to content

Features-ui: Add download audit modals, ZIP bundle generation, and PDF proxy endpoints#14

Open
nando-bingani wants to merge 74 commits intov2-prodfrom
v2-prod-protologue
Open

Features-ui: Add download audit modals, ZIP bundle generation, and PDF proxy endpoints#14
nando-bingani wants to merge 74 commits intov2-prodfrom
v2-prod-protologue

Conversation

@nando-bingani
Copy link
Copy Markdown
Collaborator

Implement client-side download audit system with user metadata collection, ZIP bundle generation, and PDF proxy endpoints for CORS-free architecture.

Key Features:

  • Download Audit Modals - Collect user details (name, email, organisation) before downloads
  • Client-Side ZIP Generation - JSZip library creates bundles with metadata PDFs + data files
  • PDF Proxy Endpoint - Flask proxy to ODP API for CORS-free PDF generation
  • Form Caching - LocalStorage saves user details (30-day expiry)
  • Email Validation - Client-side validation with regex pattern
  • Download Audit Logging - Posts audit data to /catalog/download-audit proxy

New Features:

  1. Download Modal (#download-popup) - Single file downloads from detail page
  2. Bundle Modal (#download-audit-modal) - Multi-record ZIP downloads
  3. ZIP Generation - Client-side bundle creation with JSZip
  4. PDF Proxy - /catalog/format/metadata.pdf endpoint
  5. Audit Proxy - /catalog/download-audit endpoint forwards to ODP API

Files Modified:

  • odp/ui/base/views/catalog.py - Proxy endpoints (PDF + audit)
  • odp/ui/base/static/scripts/catalog.js - Download handlers, ZIP generation, form caching
  • odp/ui/base/templates/catalog_index.html - Search results with checkboxes
  • odp/ui/base/templates/catalog_record.html - Detail page download button
  • odp/ui/base/macros/catalog.j2 - Download modals and buttons

Bug Fixes:

  • Fixed binary PDF response handling (requests.post() instead of cli.post())
  • Fixed download-audit proxy (removed .json() on already-parsed response)
  • Removed unused variables and debug console.log statements
  • Removed commented-out code blocks

Production Ready:

  • All debug code removed
  • Clean code with no print statements
  • Proper error handling
  • Email validation enforced

Browser Compatibility:

  • Chrome, Firefox, Safari, Edge (all modern browsers)
  • LocalStorage for form caching
  • JSZip 3.10.1 for ZIP generation

onclick="toggleSelectAll(this)">
<label class="form-check-label" for="select_all">Select All</label>
</div>
{# <div class="form-check">#}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't leave commented out code. Remove it if it's not needed

{% macro result_list(records,app_name=None) %}

<form id="recordForm">
{% if app_name == 'MIMS' %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Maybe move the code in this if into it's own macro and call that macro here.
  2. I think it would be better if this file didn't know about things like MIMS. The ides is for these to be re-usable chunks of code. So maybe pass in a variable called something like: 'show_bulk_download_options'. You can also set that to false by default which means you will only have to pass in true when you're calling it from MIMS and you don't have to change anything else from other areas.

onclick="toggleSelectAll(this)">
<label class="form-check-label" for="select_all">Select All</label>
</div>
{# <div class="form-check">#}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this code is not needed remove it rather than commenting it out

{% macro result_list(records,app_name=None) %}

<form id="recordForm">
{% if app_name == 'MIMS' %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Maybe move the code that lies in this if to it's own macro. This will help declutter the current macro.
  2. This macro probably shouldn't know about app names like MIMS and such. It might be better to pass in a variable like 'show_download_options' that defaults to false.

</span>
</button>

{# <button#}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code

<button class="btn btn-link p-0 no-outline"
type="button"
onclick="openSingleDownloadModal('{{ download_url }}', '{{ record.doi }}', '{{ record.id }}')">
<img src="{{ url_for('static', filename='images/download-icon.png') }}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have a look at the formatting of this file. Try keep the indenting right so that it's more readable.

{%- if not loop.last %}, {% endif %}
</span>
{% set download_urls = [] %}
{% for record in records %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets look into this flow.


</form>

<div class="modal fade" id="download-popup" tabindex="-1">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this looks like a repetition of the above?

(No DOI)
{% endif %}
</span>
{% macro result_list(records,app_name=None) %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change the variable to something like "show_download_options" so that this file doesn't have to know about MIMS and such.

onclick="toggleSelectAll(this)">
<label class="form-check-label" for="select_all">Select All</label>
</div>
{# <div class="form-check">#}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out code should be removed if it's not being used

{% macro result_list(records,app_name=None) %}

<form id="recordForm">
{% if app_name == 'MIMS' %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would maybe make this more readable if the code within this 'if' was moved to it's own macro and the macro was called here

</span>
</button>

{# <button#}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code

{%- if not loop.last %}, {% endif %}
</span>
{% set download_urls = [] %}
{% for record in records %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this should be stored in the DOM. Lets discuss


</form>

<div class="modal fade" id="download-popup" tabindex="-1">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very similar to the modal above

{% elif prop == 'Keywords' %}
{{ record.keywords | sort | join(', ') }}
{% elif prop == 'Keywords' %}
{% for keyword in record.keywords | sort %}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this maybe be better in a server side function that gets the keywords data in the right state before it's delivered to the template

<input class="form-check-input" type="checkbox" id="accept-terms-of-use-popup">
<label class="form-check-label" for="accept-terms-of-use-popup">
<small class="text-muted">
<strong>Data Usage Acknowledgment:</strong> These data are made available with the express understanding that any such use will properly acknowledge the originator(s) and publisher and cite the accession numbers and/or associated Digital Object Identifiers. Anyone wishing to use these data should properly cite and attribute the data providers listed as authors in the metadata provided with each dataset. It is expected that all the conditions of the data license will be strictly honoured. Use of any material herein should be properly cited using the dataset's persistent identifiers, such as accession numbers and DOIs.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been repeated 3 times

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this should be here?

}
}

function clearDownloadCache() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't being used apparently

window.location.href = redirectUrl;
}

function selectedRecordListLink(event) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also not in use apparently

updateButtonStates();
}

function toggleUnSelectAll() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used

if (cb?.type === 'checkbox') cb.checked = true;
}

function downloadSelectedRecords(event, buttonEl, record_id) {
Copy link
Copy Markdown
Contributor

@dylanpivo dylanpivo Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small issue, but pay attention to variable naming conventions. snake_case in python and camelCase in js. Then the actual name as well.

const disclaimerCheckbox = document.getElementById('disclaimer-acknowledgment');

// Validate all required fields
if (!nameInput.value.trim()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach of validating and handling the requests in JS works, but it's not traditionally how the systems handles this sort of thing. We would generally have a form within the modal with a submit target. The validation would be handled by wtforms and the submission logic would be handled in a view


</div>
{{ pagination }}
{# <div class="d-flex justify-content-end p-3">#}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented code



@bp.route('/format/metadata.pdf', methods=['POST'])
def format_metadata_pdf():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an unusual code flow. You are taking the metadata from the front-end, sending it to the back-end for pdf generation and then sending it back? Should it not just send a list of record id's back which will build the pdf by getting the metadata from the db and then sending it back, or something along those lines?

# The cli.post() method tries to parse JSON, but PDF is binary
# So we use requests directly to get the binary content
api_url = config.ODP.API_URL
response = requests.post(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use either the api or cli objects so that communication between the client and API is authorised.

event.preventDefault();
const targetInput = document.getElementById('record-subset-link');
if (!targetInput) {
console.error("Missing target input 'record-subset-link'");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please look out for console logs

from flask import Blueprint, abort, current_app, make_response, redirect, render_template, request, url_for
from flask import Blueprint, abort, current_app, make_response, redirect, render_template, request, url_for, Response, \
jsonify, send_file
from io import BytesIO
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format inputs. Check shortcuts. Mine is Ctrl+Alt+o

</div>
{% endfor %}

<div class="modal fade" id="download-audit-modal" tabindex="-1" aria-labelledby="downloadModalLabel" aria-hidden="true">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check out the formatting, Especially with HTML. Use the shortcuts to help. Mine is Ctrl+Alt+l

page = 1 # request.args.getlist('page')[0]

size = 5 # request.args.getlist('size')[0]
catalog_record_list = cli.get(f'/catalog/{catalog_id}/subset?{record_ids_query}&page={page}&size={size}')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the convention is to pass the page and size in as parameters rather than on the url string. Have a look at index()

Copy link
Copy Markdown
Contributor

@dylanpivo dylanpivo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can sit and talk about the flow that would be great

- Replaced client-side JSZip with server-side ZIP generation
- Simplified _performZipDownload() to use new /catalog/generate-zip-bundle endpoint
- Removed manual audit logging from frontend (server handles it automatically)
- Reduced frontend code: 73 lines of JSZip → 40 lines of clean server call
- Added UI proxy endpoint for CORS handling (UI port 8080 → API port 2020)
- Fixed modal field ID mismatch by passing userData as parameter

Benefits:
- fewer HTTP requests (2N+1 → 1)
- Lower browser resource usage
- atomic audit logging
- No CORS errors

Author: nando-bingani <n.bingani@saeon.nrf.ac.za>
- Clean up: Removed commented-out code and unused JavaScript functions in `catalog.j2`, `catalog_subset.html`, and `catalog.js`.
- Refactor (Jinja):
    - Replaced 'app_name' check with 'show_bulk_download_options' boolean in `result_list` macro.
    - Extracted download audit logic into its own reusable macro `download_audit_modal`.
    - Improved HTML indentation and formatting across template files.
- Refactor (Python/Views):
    - Updated `subset_record_list` to pass 'page' and 'size' as parameters to `cli.get` rather than string formatting the URL.
    - Standardized facet formatting and removed internal-only facets ('Keyword', 'SDG Variables') in a centralized helper.
    - Fixed imports and removed redundant Response/jsonify definitions.
- JavaScript:
    - Renamed variables to use camelCase (e.g., `recordId`) to match JS conventions.
    - Removed production console logs.
- Download Flow:
    - Migrated download validation to use a formal WTForm (`DownloadAuditForm`) and server-side ZIP generation via an API proxy.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants