WesternConcrete/archives-api-client

Archives API Client

A minimal TypeScript/JavaScript client for querying the JFK files through the Archives API. With this package you can:

  • Query the JFK files using:
    • Full-text search (text)
    • Semantic vector search (vector)
    • Metadata-only filtering (metadata)
  • Retrieve page text or presigned page PNG URLs for specific page IDs

This project was developed with love by Wes Convery <3.


Installation

npm install archives-api-client
# or
yarn add archives-api-client

Quick Start

import { ArchivesApiClient } from "archives-api-client";

// Initialize the client with your provided API key from archives-api.com
const { jfk } = new ArchivesApiClient({
  apiKey: "your-api-key",
});

Searching Documents

The client supports three main search methods:

1. Text Search

Search for documents by matching your query string within the full text:

const results = await jfk.search.text({
  query: "Lee Harvey Oswald",
});

2. Vector (Semantic) Search

Search using vector embeddings for semantic similarity. Note: vector search does not accept a metadata filter on comments.

const results = await jfk.search.vector({
  query: "multiple shooters",
  metadata: {
    document_type: { operator: "eq", value: "MEMORANDUM" },
  },
  limit: 20,
});

3. Metadata-Only Search

Search using metadata filters only, without matching any text or vectors:

const results = await jfk.search.metadata({
  metadata: {
    nara_release_date: { operator: "between", value: [new Date("1963-01-01"), new Date("1970-01-01")] },
    file_number: { operator: "eq", value: "124-10001-10110" },
  },
});

Each search returns an object with the following structure:

{
  hits: Array<JFKDocumentResult>;
  limit: number;
  total: number; // not available for vector searches
}

Here, hits is an array of matching documents along with their metadata.
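
Because total is missing from vector results, code that reports counts should not assume it is present. The sketch below uses local stand-in types rather than the package's own exports; marking total as optional and the summarize helper are assumptions made for illustration.

```typescript
// Local stand-ins for the package's types. Only the field names shown in the
// structure above come from the README; the rest is assumed for illustration.
interface DocumentHit {
  metadata: { file_number?: string; document_type?: string };
}

interface SearchResult {
  hits: DocumentHit[];
  limit: number;
  total?: number; // assumed optional: not available for vector searches
}

// Hypothetical helper: describe a result without assuming `total` exists.
function summarize(result: SearchResult): string {
  const shown = result.hits.length;
  return result.total === undefined
    ? `${shown} shown, total unknown (limit ${result.limit})`
    : `${shown} shown of ${result.total} (limit ${result.limit})`;
}
```

A vector result with one hit and a limit of 20 would be described as "1 shown, total unknown (limit 20)".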


Retrieving Pages

Get Page Text

Retrieve the full text for one or more pages using their IDs:

const pageTextMap = await jfk.pages.getText({
  page_ids: ["page-id-1", "page-id-2"],
});

The response maps page IDs to their text:

{
  "page-id-1": { "text": "Full page text..." },
  "page-id-2": { "text": "Full page text..." }
}

Get Page PNG

Retrieve presigned URLs for PNG images of pages:

const pagePngMap = await jfk.pages.getPng({
  page_ids: ["page-id-1", "page-id-3"],
});

The response maps page IDs to URLs:

{
  "page-id-1": { "url": "https://..." },
  "page-id-3": { "url": "https://..." }
}
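
Since both page responses are keyed by page ID, they can be joined into one list per page. This is a minimal sketch with locally defined types that mirror the two response shapes above; PageView and mergePages are hypothetical names, not part of the package.

```typescript
// Shapes mirroring the two responses shown above.
type PageTextMap = Record<string, { text: string }>;
type PagePngMap = Record<string, { url: string }>;

// Hypothetical combined view of a page.
interface PageView {
  id: string;
  text?: string;
  url?: string;
}

// Join the two maps on page ID. A page that appears in only one map
// keeps the other field undefined.
function mergePages(texts: PageTextMap, pngs: PagePngMap): PageView[] {
  const ids = Object.keys({ ...texts, ...pngs });
  return ids.map((id) => ({
    id,
    text: texts[id]?.text,
    url: pngs[id]?.url,
  }));
}
```

With the two example responses above, the result would hold three entries: page-id-1 with both text and URL, page-id-2 with text only, and page-id-3 with a URL only.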

Metadata Filters & Keys

When constructing metadata filters, you can use one or more of these filter types:

  • TextFilter

    { operator: "contains", value: "search string" }
    { operator: "isNull" }
  • KeywordFilter

    { operator: "eq", value: "specific keyword" }
    { operator: "isNull" }
  • DateFilter

    { operator: "gte" | "lte" | "gt" | "lt" | "eq", value: new Date() }
    { operator: "between", value: [new Date(), new Date()] }
    { operator: "isNull" }
  • NumberFilter

    { operator: "eq" | "gt" | "lt" | "gte" | "lte", value: 123 }
    { operator: "between", value: [100, 200] }

For example, a composite filter may look like this:

{
  document_type: { operator: "eq", value: "REPORT" },
  document_date: { operator: "between", value: [new Date("1963-01-01"), new Date("1963-12-31")] },
  pages_released: { operator: "gt", value: 10 },
}

These filters are defined in the package's type definitions, so TypeScript can check the shape of each search query at compile time.
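
One way to model the filter shapes listed above is as discriminated unions on operator. The definitions below are a sketch reconstructed from this README, not the package's actual source, and isOrderedRange is a hypothetical helper added for illustration.

```typescript
// Sketch of the four filter shapes, reconstructed from the lists above.
type TextFilter =
  | { operator: "contains"; value: string }
  | { operator: "isNull" };

type KeywordFilter =
  | { operator: "eq"; value: string }
  | { operator: "isNull" };

type DateFilter =
  | { operator: "gte" | "lte" | "gt" | "lt" | "eq"; value: Date }
  | { operator: "between"; value: [Date, Date] }
  | { operator: "isNull" };

type NumberFilter =
  | { operator: "eq" | "gt" | "lt" | "gte" | "lte"; value: number }
  | { operator: "between"; value: [number, number] };

// Hypothetical helper: a "between" range should run from low to high.
// Any other operator has no range to check and passes trivially.
function isOrderedRange(filter: DateFilter | NumberFilter): boolean {
  if (filter.operator !== "between") return true;
  const [low, high] = filter.value;
  return Number(low) <= Number(high); // Number(Date) yields its timestamp
}
```

Checking operator first lets the compiler narrow value to a two-element tuple, so a reversed range such as [200, 100] can be caught before a request is sent.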


cURL Usage

You can also call the API directly with cURL. The examples below cover each search type and both page actions:

Example 1: Text Search

Send a POST request to perform a text search:

curl -X POST http://localhost:3000/ \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "action": "search",
    "payload": {
      "query": "Lee Harvey Oswald",
      "searchType": "text"
    }
  }'

Explanation:

  • POST: The request method.
  • Headers: Set the content type and include your API key (x-api-key).
  • Payload: JSON object specifying the search action and payload.
    • query: The text to search.
    • searchType: Specifies that it is a text search.

Example 2: Vector Search with Metadata Filter

Perform a vector search with a metadata filter. Do not filter on comments, since vector search does not allow it:

curl -X POST http://localhost:3000/ \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "action": "search",
    "payload": {
      "query": "multiple shooters",
      "searchType": "vector",
      "metadata": {
        "document_type": { "operator": "eq", "value": "MEMORANDUM" }
      },
      "limit": 20
    }
  }'

Key Points:

  • The metadata object includes filter keys (e.g., document_type).
  • The limit parameter controls the number of results returned.

Example 3: Metadata-Only Search

Perform a metadata search by supplying filters:

curl -X POST http://localhost:3000/ \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "action": "search",
    "payload": {
      "searchType": "metadata",
      "metadata": {
        "nara_release_date": { "operator": "between", "value": ["1963-01-01", "1970-01-01"] },
        "file_number": { "operator": "eq", "value": "124-10001-10110" }
      }
    }
  }'

Details:

  • The value for a between date filter is an array of date strings.
  • This request does not include a text query since it is a metadata-only search.

Example 4: Retrieve Page Text

Request page text for specific pages:

curl -X POST http://localhost:3000/ \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "action": "pageText",
    "payload": {
      "page_ids": ["page-id-1", "page-id-2"]
    }
  }'

Example 5: Retrieve Page PNG

Request presigned PNG URLs for specific pages:

curl -X POST http://localhost:3000/ \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{
    "action": "pagePng",
    "payload": {
      "page_ids": ["page-id-1", "page-id-3"]
    }
  }'

These examples illustrate how to use cURL to interact with the API endpoints directly. Adjust the endpoint URL, payload, and headers as required for your environment.
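
The same requests can be built in TypeScript without the client. The sketch below only assembles the request pieces seen in the cURL examples (POST, JSON body with action and payload, x-api-key header); buildRequest and ApiRequest are hypothetical names, and the localhost URL is simply the one used above.

```typescript
// The three actions that appear in the cURL examples.
type Action = "search" | "pageText" | "pagePng";

interface ApiRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assemble the pieces of a request, mirroring the
// flags passed to cURL above (-X, -H, -d).
function buildRequest(
  baseUrl: string,
  apiKey: string,
  action: Action,
  payload: Record<string, unknown>
): ApiRequest {
  return {
    url: baseUrl,
    method: "POST",
    headers: { "Content-Type": "application/json", "x-api-key": apiKey },
    body: JSON.stringify({ action, payload }),
  };
}

// Usage with the global fetch in Node 18+ or a browser (not executed here):
// const req = buildRequest("http://localhost:3000/", "your-api-key", "search", {
//   query: "Lee Harvey Oswald",
//   searchType: "text",
// });
// const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
// const data = await res.json();
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before it is sent.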


Types & Interfaces

Key Interfaces

  • Search Input Types:

    • TextSearchInput
    • VectorSearchInput
    • MetadataSearchInput
  • Page Input:

    • PageInput for retrieving page text or PNG images.
  • Document Result:

    • JFKDocumentResult
      This interface represents the structure of documents returned from the JFK file group search. It is defined as:
      export interface JFKDocumentResult extends BaseDocumentResult {
        metadata: {
          comments?: string;
          document_date?: Date;
          document_type?: string;
          file_name?: string;
          file_number?: string;
          formerly_withheld?: string;
          from_name?: string;
          nara_release_date?: Date;
          originator?: string;
          pages_released?: number;
          page_count?: number;
          record_number?: string;
          review_date?: Date;
          to_name?: string;
        };
      }
      With this type, every search hit carries both the basic document links and a metadata object whose fields, such as document_date and file_number, are all optional.
  • Filter Types:
    The package defines several filter types used in metadata queries:

    • TextFilter
    • KeywordFilter
    • DateFilter
    • NumberFilter

Helper functions in the package convert date filters and format payloads consistently for all requests.
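
As an illustration of what such a conversion might involve, the sketch below turns Date values in a filter into YYYY-MM-DD strings, the form used in the cURL metadata example. It is not the package's actual helper; the function names, the loose Filter type, and the choice of UTC are assumptions.

```typescript
// Loose filter shape for this sketch; the package's own types are stricter.
type FilterValue = string | number | Date | Array<Date | number>;

interface Filter {
  operator: string;
  value?: FilterValue;
}

// Format a Date as YYYY-MM-DD, read in UTC (an assumption of this sketch).
function toDateString(d: Date): string {
  return d.toISOString().slice(0, 10);
}

// Hypothetical helper: replace Date values with date strings and leave
// strings, numbers, and value-less filters such as isNull untouched.
function serializeFilter(filter: Filter): { operator: string; value?: unknown } {
  const { operator, value } = filter;
  if (value instanceof Date) {
    return { operator, value: toDateString(value) };
  }
  if (Array.isArray(value)) {
    return {
      operator,
      value: value.map((v) => (v instanceof Date ? toDateString(v) : v)),
    };
  }
  return value === undefined ? { operator } : { operator, value };
}
```

Applied to the between filter from the metadata search example, this would produce the value ["1963-01-01", "1970-01-01"], matching the JSON sent in cURL Example 3.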


License

This project is open source and available for use, modification, and distribution. See the LICENSE file for details.
