mel-ugaddan/product_search

Product Search Benchmarking with BERT and pgvector

An exploration of semantic search using BERT embeddings, PostgreSQL's pgvector extension, and HNSW indexing for low-latency semantic results.

Project Overview

This project evaluates the effectiveness of transformer-based semantic search for e-commerce product discovery. By combining BERT's text representations with pgvector's vector similarity search and HNSW (Hierarchical Navigable Small World) indexing, we explore whether semantic understanding improves product search beyond traditional keyword matching.
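
To make the idea concrete, here is a minimal sketch of ranking by embedding similarity. It assumes cosine similarity as the metric and uses tiny hand-written 3-dimensional vectors in place of real BERT embeddings; the names `cosine_similarity`, `rank_products`, `toy_products`, and `toy_query` are illustrative and not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_products(query_vec, product_vecs, top_k=5):
    """Return the top_k product names, most similar to the query first."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in product_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy vectors invented for illustration; real BERT embeddings have
# hundreds of dimensions and come from the model, not from hand.
toy_products = {
    "digital kitchen scale": [0.9, 0.1, 0.0],
    "drip coffee maker": [0.1, 0.9, 0.1],
    "usb wall charger": [0.0, 0.1, 0.9],
}
toy_query = [0.8, 0.2, 0.1]  # stand-in for the embedding of "weighing scale"

print(rank_products(toy_query, toy_products, top_k=2))
```

A query with no word in common with a product name can still rank that product highly, as long as their embeddings point in a similar direction. That is the property keyword matching lacks.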

Author Note: This mini-blog is part of my software engineering portfolio, demonstrating practical applications of modern NLP and vector database technologies. I've spent time in academia studying transformers (how modern NLP works), and I have experience working with technologies such as Elasticsearch.

Technical Stack

  • BERT (Bidirectional Encoder Representations from Transformers): For generating dense vector representations of product descriptions and search queries
  • PostgreSQL + pgvector: Vector database extension for storing and querying embeddings
  • HNSW Indexing: Approximate nearest neighbor algorithm for fast similarity search
  • Python: Primary language for embeddings generation and benchmarking
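
As a rough sketch of how these pieces could fit together, the snippet below holds the kind of SQL pgvector expects, kept as Python strings so it can be read without a running database. The table name `products`, the index name, the 768-dimension column (the hidden size of BERT-base), and the helper `to_vector_literal` are assumptions for illustration; the repository's actual schema may differ. The `m` and `ef_construction` values shown are pgvector's documented defaults.

```python
EMBEDDING_DIM = 768  # assumption: BERT-base hidden size

# Enable pgvector and create a table with one embedding per product.
SCHEMA_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS products (
    id BIGSERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    embedding vector({EMBEDDING_DIM})
);
"""

# HNSW index over cosine distance.
INDEX_SQL = """
CREATE INDEX IF NOT EXISTS products_embedding_hnsw
ON products USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
"""

# <=> is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT name
FROM products
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. [1.0,2.5]."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


print(to_vector_literal([1, 2.5, -0.25]))
```

With a PostgreSQL driver that uses `%s` placeholders (psycopg, for instance), the query would be run along the lines of `cursor.execute(SEARCH_SQL, (to_vector_literal(query_embedding), 5))`. Ordering directly by the distance expression is what allows the planner to use the HNSW index.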

Search Examples

Example 1

Search Text: "Weighing Scale"

Rank Product Name
1 Glun Multipurpose Portable Electronic Digital Weighing Scale Weight Machine (10 Kg - with Back Light)
2 Kitchenwell Multipurpose Portable Electronic Digital Weighing Scale Weight Machine
3 CARDEX Digital Kitchen Weighing Machine Multipurpose Electronic Weight Scale With Back Lite LCD Display for Measuring Food, Cake, Vegetable, Fruit (KITCHEN SCALE)
4 Dr Trust Electronic Kitchen Digital Scale Weighing Machine (Blue)
5 Gadgetronics Digital Kitchen Weighing Scale & Food Weight Machine for Health, Fitness, Home Baking & Cooking (10 KGs,1 Year Warranty & Batteries Included)

Query Time: 32.19 ms


Example 2

Search Text: "Iphone charger"

Rank Product Name
1 USB Charger, Oraimo Elite Dual Port 5V/2.4A Wall Charger, USB Wall Charger Adapter for iPhone 11/Xs/XS Max/XR/X/8/7/6/Plus, iPad Pro/Air 2/Mini 3/Mini 4, Samsung S4/S5, and More
2 ESR USB C to Lightning Cable, 10 ft (3 m), MFi-Certified, Braided Nylon Power Delivery Fast Charging for iPhone 14/14 Plus/14 Pro/14 Pro Max, iPhone 13/12/11/X/8 Series, Use with Type-C Chargers, Black
3 iPhone Original 20W C Type Fast PD Charger Compatible with I-Phone13/13 mini/13pro/13 pro Max I-Phone 12/12 Pro/12mini/12 Pro Max, I-Phone11/11 Pro/11 Pro Max 2020 (Only Adapter)
4 Portronics Adapto 20 Type C 20W Fast PD/Type C Adapter Charger with Fast Charging for iPhone 12/12 Pro/12 Mini/12 Pro Max/11/XS/XR/X/8/Plus, iPad Pro/Air/Mini, Galaxy 10/9/8 (Adapter Only) White
5 Kanget [2 Pack] Type C Female to USB A Male Charger Charging Cable Adapter Converter compatible for iPhone 14, 13, 12,11 Pro Max/Mini/XR/XS/X/SE, Samsung S20 ultra/S21/S10/S8/S9/MacBook Pro iPad (Grey)

Query Time: 44.42 ms


Example 3

Search Text: "Coffee maker"

Rank Product Name
1 Morphy Richards New Europa 800-Watt Espresso and Cappuccino 4-Cup Coffee Maker (Black)
2 InstaCuppa Milk Frother for Coffee - Handheld Battery-Operated Electric Milk and Coffee Frother, Stainless Steel Whisk and Stand, Portable Foam Maker for Coffee, Cappuccino, Lattes, and Egg Beaters
3 Amazon Basics 650 Watt Drip Coffee Maker with Borosilicate Carafe
4 PHILIPS Drip Coffee Maker HD7432/20, 0.6 L, Ideal for 2-7 cups, Black, Medium
5 Oratech Coffee Frother electric, milk frother electric, coffee beater, cappuccino maker, Coffee Foamer, Mocktail Mixer, Coffee Foam Maker, coffee whisker electric, Froth Maker, coffee stirrers electric, coffee frothers, Coffee Blender, (6 Month Warranty) (Multicolour)

Query Time: 33.42 ms

Inference Time

Method Inference Time
Transformer module (Baseline) 15.20 ms
llama.cpp module 5.19 ms

Note: Average over 5 runs
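
For reference, an average like this can be measured with a small timing harness. The sketch below is not the benchmarking script used for the numbers above; `average_latency_ms` and `fake_encode` are hypothetical names, and `fake_encode` is a stand-in workload rather than a real model call.

```python
import time


def average_latency_ms(fn, runs=5, timer=time.perf_counter):
    """Call fn `runs` times and return the mean wall-clock time in milliseconds."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    total = 0.0
    for _ in range(runs):
        start = timer()
        fn()
        total += timer() - start
    return total / runs * 1000.0


def fake_encode():
    """Placeholder workload standing in for an embedding call."""
    sum(i * i for i in range(10_000))


print(f"{average_latency_ms(fake_encode):.2f} ms")
```

Five runs is a small sample. A warm-up call before timing, plus reporting the spread alongside the mean, would make the comparison more robust.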


Potential Improvements

  • Run the search engine as a full backend service
  • Python backend services can be slow; I am considering porting this one to a language such as Go or Rust
  • Hybrid indexing, for example combining HNSW with other methods
  • A larger product database
  • Personalized search, similar to YouTube
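
One possible reading of the hybrid idea, offered only as a sketch, is to fuse a keyword ranking with the vector ranking at query time rather than to change the index itself. The example below uses reciprocal rank fusion (RRF), a technique swapped in for illustration; the function name, the constant `k=60`, and the sample result lists are all assumptions and not part of this project.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; each item scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# Made-up result lists standing in for a keyword search and a vector search.
keyword_hits = ["drip coffee maker", "espresso machine", "coffee mug"]
vector_hits = ["drip coffee maker", "milk frother", "espresso machine"]

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Items that appear near the top of both lists rise to the top of the fused list, while items found by only one method are kept but ranked lower.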

About

A simple product search using representations from a pretrained language model such as BERT
