
Upload a bulk search #582

@hoyla

Description


Feasibility Assessment: Bulk CSV Search Endpoint

The core building blocks already exist — this is mostly plumbing work.

  1. ES _msearch is already used in ElasticsearchPages.scala and Pages2.scala — these execute multiple search queries in a single ES round-trip via multi().

  2. Query/filter construction is already modular. SearchParameters is a clean case class, and the filter-building logic in SearchContext operates on a single SearchParameters — it can be called in a loop without modification.

  3. Permission checking can be done once per request (the user's visibility doesn't change between queries), then reused across all search terms.

  4. Results are already JSON-serializable via the existing SearchResults model.
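As a sketch of point 2, the shared UI filters can be expanded into per-term parameters by copying one parameter set in a loop, with the permission check resolved once up front. The `SearchParameters` fields below are illustrative stand-ins, not the real case class:

```scala
// Hypothetical, simplified stand-in for the real SearchParameters case class;
// field names here are illustrative only.
final case class SearchParameters(q: String, mimeTypes: List[String], pageSize: Int)

object BulkPlan {
  // Expand one set of shared UI filters into per-term parameters.
  // Visibility/permission checks happen once per request, outside this loop,
  // and the results are reused for every term.
  def perTermParameters(
      terms: List[String],
      shared: SearchParameters
  ): List[SearchParameters] =
    terms.map(t => shared.copy(q = t))
}
```

This keeps the existing single-query filter-building code untouched: it just gets called once per element of the returned list.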

What would need to be built

| Layer | Work |
| --- | --- |
| Route | New `POST /api/search/bulk` (POST because the payload can be large) |
| Controller | Parse the CSV/JSON body into a `List[String]` of search terms, apply the shared UI filters to each, and call `verifyParameters` per query |
| Index service | New `queryBatch(params: List[(SearchParameters, SearchContext)])` method using the existing `multi()` msearch pattern |
| Response | An array of `SearchResults`, each tagged with its originating search term for CSV correlation |
| Frontend | CSV upload UI plus a new `SearchApi` method; the existing filters (collections, workspaces, MIME types, dates) can be reused as-is since they're just query params |
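The controller's CSV-parsing step might look like the sketch below. It assumes one term per row in the first column, and deliberately skips quoted-field handling (a real implementation would use a CSV library); `parseTerms` and its term cap are hypothetical names, not existing code:

```scala
object BulkCsvParser {
  // Parse an uploaded CSV body into distinct, non-empty search terms.
  // Takes the first column of each row; blank rows and duplicates are dropped.
  // Returns Left with a user-facing error for empty or oversized uploads.
  def parseTerms(body: String, maxTerms: Int = 1000): Either[String, List[String]] = {
    val terms = body.linesIterator
      .map(_.split(',').headOption.getOrElse("").trim)
      .filter(_.nonEmpty)
      .toList
      .distinct
    if (terms.isEmpty) Left("No search terms found in upload")
    else if (terms.length > maxTerms) Left(s"Too many terms (max $maxTerms)")
    else Right(terms)
  }
}
```

Validating and deduplicating up front keeps the msearch batch as small as possible before any ES work happens.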

Key design decisions

  • Aggregations — probably omit per-query aggs in bulk mode (expensive and not useful per-term).
  • Pagination — limit bulk results to page 1 with a small pageSize (e.g. 10–20 hits per term). Deep pagination across thousands of queries would be very expensive.
  • Batch size cap — ES _msearch limits fan-out via its max_concurrent_searches setting. For thousands of terms, we'd want to chunk into batches (e.g. 100 at a time) and aggregate the results.
  • Response format — for very large batches, consider streaming the response or returning a downloadable CSV/JSON rather than a single massive JSON payload.
  • Rate limiting — this endpoint may need some guard (max terms per request, request timeout).

The hardest part isn't the search itself — it's deciding on the UX for presenting results from potentially thousands of queries (columnar export? hit counts only? full highlights?).
