Duplicate job IDs for multi-office postings (same role, different locations) #17

@prPMDev

Finding

Discovered during real Claude Desktop testing. Companies like Brex post the same role across multiple offices as distinct Greenhouse requisitions. The normalizer generates the same `id` for all of them because the hash inputs are identical.

Example

One query returned 3 entries for Brex "Group Product Manager":

| Location | greenhouseId | Normalized id |
| --- | --- | --- |
| San Francisco | 8438580002 | `377f4d5c4c50` |
| Seattle | 8438582002 | `377f4d5c4c50` |
| New York | 8438581002 | `377f4d5c4c50` |

Same normalized id, different postings, different URLs.

Root cause

`src/normalizer.js` line 17:
```js
id: jobId(raw.company || raw.companySlug, raw.title, ats)
```

The hash takes only company, title, and ats as inputs. Location, greenhouseId, and url are not part of the identity, so postings that differ only in those fields collide.
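A minimal sketch of the collision, for illustration only. The `jobId` below is a hypothetical stand-in (SHA-256 truncated to 12 hex characters); the library's real hash function may differ, and the `"greenhouse"` value passed as `ats` is an assumption. It will not reproduce the exact `377f4d5c4c50` value, only the pattern:

```js
const { createHash } = require("node:crypto");

// Hypothetical stand-in for the library's jobId(); the real hash and
// truncation length may differ.
function jobId(company, title, ats) {
  const key = [company, title, ats].join("|").toLowerCase();
  return createHash("sha256").update(key).digest("hex").slice(0, 12);
}

// The three Brex requisitions from the example above.
const postings = [
  { company: "Brex", title: "Group Product Manager", location: "San Francisco", greenhouseId: "8438580002" },
  { company: "Brex", title: "Group Product Manager", location: "Seattle", greenhouseId: "8438582002" },
  { company: "Brex", title: "Group Product Manager", location: "New York", greenhouseId: "8438581002" },
];

// location and greenhouseId never reach the hash, so every posting gets the same id.
const ids = postings.map((raw) => jobId(raw.company, raw.title, "greenhouse"));
const uniqueIds = new Set(ids);

console.log(uniqueIds.size); // 1
```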

Impact

  • Low for conversational use — AI handles duplicates fine
  • Higher for any downstream system that uses `id` as primary key (e.g., "track jobs I've seen", dedupe across scans)
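To make the downstream risk concrete, here is a rough sketch of a hypothetical "jobs I've seen" store keyed by `id`. The record shape is an assumption; the values come from the Example section:

```js
// Hypothetical normalized results, using the values from the Example section.
const results = [
  { id: "377f4d5c4c50", greenhouseId: "8438580002", location: "San Francisco" },
  { id: "377f4d5c4c50", greenhouseId: "8438582002", location: "Seattle" },
  { id: "377f4d5c4c50", greenhouseId: "8438581002", location: "New York" },
];

// A consumer that treats id as a primary key.
const seen = new Map();
for (const job of results) {
  seen.set(job.id, job); // later postings silently overwrite earlier ones
}

console.log(seen.size); // 1
console.log(seen.get("377f4d5c4c50").location); // "New York"
```

Two of the three postings are lost without any error, which is why this matters more for storage and tracking than for conversational use.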

Options

  1. Include url or greenhouseId in the hash → true posting-level uniqueness
  2. Dedupe at adapter level → treat multi-office postings as one role
  3. Document `id` as "role identity" (by design) and add a separate `postingId` field for per-requisition uniqueness

Option 3 is cheapest and backward-compatible. Options 1 and 2 change behavior for existing library consumers.
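One possible shape for Option 3, as a sketch under stated assumptions rather than a proposed patch. `shortHash` and `normalize` are hypothetical helpers, the raw field names `greenhouseId` and `url` are assumed to be available at normalization time, and the hash itself is a stand-in for whatever `jobId` actually uses:

```js
const { createHash } = require("node:crypto");

// Hypothetical helper; the library's real hash may differ.
function shortHash(...parts) {
  return createHash("sha256").update(parts.join("|")).digest("hex").slice(0, 12);
}

// Hypothetical normalize(): id keeps its current inputs (role identity),
// postingId is a new field for per-requisition uniqueness.
function normalize(raw, ats) {
  const company = raw.company || raw.companySlug;
  const id = shortHash(company, raw.title, ats);
  // Assumption: prefer the ATS requisition id, fall back to the posting URL.
  const postingKey = raw.greenhouseId ?? raw.url;
  const postingId = postingKey == null ? null : shortHash(id, postingKey);
  return { id, postingId, title: raw.title, location: raw.location };
}

const sf = normalize(
  { company: "Brex", title: "Group Product Manager", location: "San Francisco", greenhouseId: "8438580002" },
  "greenhouse"
);
const sea = normalize(
  { company: "Brex", title: "Group Product Manager", location: "Seattle", greenhouseId: "8438582002" },
  "greenhouse"
);

console.log(sf.id === sea.id); // true: same role
console.log(sf.postingId === sea.postingId); // false: distinct requisitions
```

Because `id` is computed from the same inputs as before, existing consumers see no change; only consumers that need per-posting uniqueness read `postingId`.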

Priority

Post-MVP. Doesn't block the MCP launch. Revisit when a user builds something on top of the library that depends on per-posting uniqueness.

Discovered in

Second Claude Desktop validation run (fintech sector fan-out, Brex results showed the pattern).

Labels: enhancement
