feat(BUY-8819): NL query preprocessor in search endpoint#43
Open
feat(BUY-8819): NL query preprocessor in search endpoint#43
Conversation
fix(cicd): use correct SSH secret name in deploy workflow
fix(cicd): sync deploy workflow fix to master
…UY-5779) Cloud Run (managed) rejects images from ghcr.io — only gcr.io, docker.pkg.dev, and docker.io are supported. Switch the site deploy workflow to build and push to Artifact Registry (asia-southeast1-docker.pkg.dev/gaia-calendar-488606/buywhere/site) and add an idempotent repo-create step. Co-Authored-By: Paperclip <noreply@paperclip.ing>
…UY-5779) fix(ci): push site image to Artifact Registry for Cloud Run deploy (BUY-5779)
…penapi.json Preferred fix per BUY-6224: replace stale public alias with canonical spec via nginx 308 redirect. Co-Authored-By: Paperclip <noreply@paperclip.ing>
… (BUY-5219) - Add app/currency.py with get_exchange_rate() and convert_price() - Live rates from open.er-api.com (free, no key required) with 1-hour in-process cache - Supported currencies: USD, SGD, VND, THB, MYR - Precision fix: 4dp fallback when 2dp rounds to zero (e.g. small VND amounts) - Add tests/test_multi_region_api.py with 11 currency tests — all green - Unblocks BUY-5170 currency conversion blocker Co-Authored-By: Paperclip <noreply@paperclip.ing>
…, >80% coverage (BUY-5090) - Add Jest + ts-jest + supertest to api/package.json devDependencies - Create jest.config.js with 80% coverage threshold enforced via coverageThreshold - Write 86 unit tests across 6 test files: - agentDetect: 13 tests covering UA heuristics and X-Agent-Framework header - apiKey: 15 tests for hashKey, requireApiKey (401/valid), checkRateLimit (429/fail-open) - auth: 9 tests for POST /register (validation, hashing, signup channel inference) - categories: 12 tests for GET /categories and GET /categories/:slug - products: 30 tests for search, deals, compare, price-history, prices, similar, GET/:id, POST ingest - queryLog: 7 tests for classifyIsAgent logic and fire-and-forget DB logging - Final coverage: 91.6% statements, 76.1% branches, 84.9% functions, 94.8% lines - Add .github/workflows/test-coverage.yml — CI fails if coverage drops below threshold Co-Authored-By: Paperclip <noreply@paperclip.ing>
…(BUY-4844) Lowercase region values like "us" were passed directly to the DB filter, causing mismatches against uppercase-stored region codes (US, SEA, etc.). Apply .toUpperCase() normalization — same as country_code already does. Adds regression test to confirm lowercase input reaches the DB as uppercase. Co-Authored-By: Paperclip <noreply@paperclip.ing>
… (BUY-3904) Writes the missing script from BUY-3902 (Sol's artifact was never committed). Calls existing analytics API endpoints: /v1/analytics/overview, /v1/analytics/ query-count, /v1/analytics/geo-scorecard, /v1/analytics/agents, /v1/analytics/ conversions. Requires ADMIN_API_KEY and/or BUYWHERE_API_KEY env vars. Also documents the /v1/growth/metrics/activation-funnel gap from BUY-3902. NOTE: analytics endpoints currently return 404 on api.buywhere.ai — they exist in the compiled dist but are inaccessible in production. Requires diagnosis of production deployment before this script can produce live KPI data. Co-Authored-By: Paperclip <noreply@paperclip.ing>
- POST /v1/auth/register: actionable validation errors with hints and docs links; next_steps array in 201 response guides new devs to first API call - requireApiKey: add hint and quickstart link to 401 errors; set X-BuyWhere-Docs response header to surface docs on auth failures - Add GET /docs/quickstart: 5-minute REST quick-start guide covering registration, first search, common queries, error reference, Python/TS examples - Redirect GET /docs root to /docs/quickstart (was MCP guide) - Update apiKey test mock to include res.set for header assertions Co-Authored-By: Paperclip <noreply@paperclip.ing>
- Add GET /healthz — lightweight liveness probe (no DB dependency) for Knative/GCP Cloud Run health checks - Add 404 fallback middleware so unknown routes return JSON error - Add Dockerfile.mcp (root build context) for standalone MCP Cloud Run build Closes BUY-6595 Co-Authored-By: Paperclip <noreply@paperclip.ing>
The startup and liveness probes were hitting /health which does a
blocking DB query. If DB isn't ready during startup, the probe fails
and Cloud Run rolls back the deployment.
/healthz returns {status: ok} immediately with no DB dependency — correct
for liveness/startup probes. /health (with DB query) remains available
for readiness checks by monitoring systems.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
The VM cannot pull from Artifact Registry without Docker authentication. Generates a short-lived GCP access token in CI and passes it to the remote SSH script for docker login before docker pull. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When buywhere_default network was created outside docker compose it lacks the required labels and causes 'docker compose up --no-deps api' to abort. Detect the missing labels and fall back to full stack restart (down + network rm + up) to let compose recreate it correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…twork fix docker compose down --remove-orphans may not remove containers that were started outside compose. Force-remove all project-prefixed containers to ensure docker compose up can create them cleanly after network recreation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rk reset After compose down + container rm, port 8000 may still be occupied by a container not matched by the project-name filter. Explicitly free any container publishing port 8000 before compose up. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A stale container may hold port 8000 even after network fix. Free it explicitly before docker compose up --no-deps api in the normal path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add Pro tier card (S$49/mo, 50k req/day) alongside Free tier card - Remove "not a subscription API business" statement from FAQ - Update business model FAQ to reflect combined subscription + referral model - Replace "paid tiers coming later" FAQ with concrete Pro tier details - Update "For developers" section with real Pro pricing copy - Fix billing.ts FALLBACK_BILLING_TIERS: Pro price 29 USD → 49 SGD Unblocks BUY-4293 merchant re-entry sends. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous health check used a single curl with set -euo pipefail which propagated curl exit codes (56 = recv error) masking the actual failure. This adds: - || echo "000" to prevent set -e exit on curl failure - Retry loop (6 attempts, 5s between) so slow-starting containers succeed - docker logs on failure to expose the actual container crash reason Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
STDIO MCP server that proxies to api.buywhere.ai/mcp. Exposes 5 tools: search_products, get_product, compare_products, get_deals, list_categories. - Uses McpServer + StdioServerTransport from @modelcontextprotocol/sdk v1.29.0 - Zod schemas for all tool parameters (type-safe + auto-generates JSON Schema) - Proxies tool calls to hosted MCP endpoint with Bearer auth - Smoke tested: initialize + tools/list both return correct JSON-RPC responses - Published to npm as @buywhere/mcp-server@0.1.6 (public, tag: latest) - smithery.yaml manifest added for MCP registry listing Co-Authored-By: Paperclip <noreply@paperclip.ing>
…ction VM The live nginx for buywhere.ai proxies to a local Next.js process on the production VM, not to Cloud Run. The buywhere.ai.conf (which would switch to Cloud Run + add a native 308 redirect for /openapi.json) has never been deployed due to a sudo permission issue on /etc/nginx/sites-enabled/buywhere.ai. This workflow SSHes to the VM, pulls the latest main branch, runs npm run build to compile the updated route.ts (which now returns 308 instead of the stale .io spec), clears the ISR cache, and restarts the site process. Once the nginx deploy permission issue is fixed (chown on sites-enabled/buywhere.ai), this workflow can be retired in favour of the Cloud Run + nginx-level redirect path. Co-Authored-By: Paperclip <noreply@paperclip.ing>
Add a pre-deploy diagnostics step to read the live nginx config, detect running processes, and find the Next.js site directory. Required before we can reliably update the local site — first run failed because the candidate path list was incomplete. Co-Authored-By: Paperclip <noreply@paperclip.ing>
Diagnostic run (25263041990) confirmed: - Live nginx for buywhere.ai: proxy_pass http://127.0.0.1:3006 (not 3000 or Cloud Run) - next.config.mjs found at $HOME/buywhere-site/ and $HOME/buywhere-api/ - No systemd service matching 'buywhere|next|site' — likely pm2 Update candidate list to prioritise $HOME/buywhere-site and $HOME/buywhere-api. Add port-3006 listener check to diagnostics to confirm the running process. Co-Authored-By: Paperclip <noreply@paperclip.ing>
The buywhere-site on the VM has local modifications to route.ts and other files that diverge from origin/main. git pull fails with "divergent branches". Switch to git reset --hard origin/main to force-sync and pick up the BUY-7473 308 redirect fix. Co-Authored-By: Paperclip <noreply@paperclip.ing>
… BUY-7302 docs Need to find who owns the process on port 3006 and how to restart it. BUY-7302 commit in buywhere-site reportedly documents the runtime owner and systemd restart path. Co-Authored-By: Paperclip <noreply@paperclip.ing>
…deploy Diagnostics from run 25263249341 confirmed: - Site runs as nohup (NOT systemd yet), PID at .runtime/buywhere-site.pid - Standalone root: .next-deploy/standalone/server.js on PORT=3006 HOSTNAME=127.0.0.1 - Build fails because old root-owned files in .next-deploy/standalone/.next-deploy/ - Sudo rules: NOPASSWD only for nginx — no buywhere-site restart available Changes: 1. Pre-build: rm -rf .next-deploy/standalone/.next-deploy to clear root-owned files (deploy user owns the .next-deploy parent, so rm should succeed) 2. Restart: kill via PID file, start new nohup process with same env vars Co-Authored-By: Paperclip <noreply@paperclip.ing>
…y files rm -rf on .next-deploy/standalone/.next-deploy fails because the deploy user cannot remove root-owned subdirectories (chmod also fails since they don't own the directories to change their permissions). Solution: temporarily patch next.config.mjs to use distDir='.next-fresh' for this build, then restart the server from .next-fresh/standalone/server.js. The root-owned .next-deploy can be cleaned up by an admin later. Co-Authored-By: Paperclip <noreply@paperclip.ing>
Replace stale inline spec (servers: api.buywhere.io/v1) with a 301 redirect to https://api.buywhere.ai/openapi.json. This makes buywhere.ai the authoritative consumer of the FastAPI-generated spec, eliminating the .io/.ai divergence permanently. Co-Authored-By: Paperclip <noreply@paperclip.ing>
…doc with sed The column-0 Python heredoc content broke GitHub's YAML parser, causing workflow_dispatch to be unrecognized and dispatch to fail with 422. Replace python3 heredoc with equivalent sed in-line commands. Co-Authored-By: Paperclip <noreply@paperclip.ing>
EADDRINUSE on port 3006 indicated old process running with different PID than PID file. Add port-based kill (lsof/fuser) after PID-file kill to ensure port 3006 is free before starting new process. Co-Authored-By: Paperclip <noreply@paperclip.ing>
…t 3006 When lsof/fuser/ps can't kill the existing process (root-owned), attempt to start the new Next.js on port 3007 and update nginx proxy_pass via sudo sed + sudo systemctl reload nginx (which IS in the deploy user's sudo rules). Falls back gracefully with an escalation message if swap fails. Co-Authored-By: Paperclip <noreply@paperclip.ing>
…all back to port-swap Previous restart logic used PORT_PID=$(lsof|fuser|ps|grep|grep -v grep|awk|head) which exits non-zero via pipefail when no process is visible. With set -euo pipefail, this killed the script immediately after the PID kill step — no output, exit 1 in 130ms. Fix: use lsof/fuser || true (safe), then attempt to start on port 3006 directly. If the process exits within 6s (port blocked by root-owned proc), fall back to port-swap (start on 3007, update nginx proxy_pass, reload nginx). This eliminates the false 'cannot identify process' failure path. Co-Authored-By: Paperclip <noreply@paperclip.ing>
…l restart Previous script used hard `sudo cp` and `sudo nginx -s reload` which require a password prompt and always fail non-interactively. Replace with: - Plain cp first, then sudo -n cp, then sudo -n tee (no password required) - systemctl restart nginx with sudo -n fallback (matches working pattern) - validate_nginx() that treats pid-file permission errors as non-fatal - RESTART_ONLY input to skip file write when config already exists - Fix DEST path: drop .conf suffix (sites-enabled uses hostname-only files) - Smoke test: use MCP initialize method (tools/list is deprecated) Co-Authored-By: Paperclip <noreply@paperclip.ing>
…(B) > description(C) Adds a BEFORE INSERT OR UPDATE trigger that auto-populates search_vector with weighted tsvector values so full-text search prioritises title over brand over description. Also backfills NULL search_vector rows. Co-Authored-By: Paperclip <noreply@paperclip.ing>
Extract price constraints, country mentions, and sort intent from natural language queries before passing the cleaned text to PostgreSQL FTS. - New preprocessSearchQuery() in api/src/lib/queryPreprocessor.ts - Integrates into GET /v1/products/search with explicit-param priority - Detects: under/above/between prices, in/for <country>, sort intent - Removes noise words, price literals, and stop words from FTS query - Adds sort=price_asc|price_desc|rating_desc ORDER BY support - Updated cache key includes cleaned query and sort param - 53 unit tests in api/tests/queryPreprocessor.test.mjs Co-Authored-By: Paperclip <noreply@paperclip.ing>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a natural language query preprocessor to the search endpoint.
Changes
Example
GET /v1/products/search?q=best laptop under 1000 in Singapore→ cleaned:laptop, maxPrice:1000, countryCode:SG, sort:rating_descTesting
All 71 tests pass (53 NL preprocessor + 18 existing).