
Conversation

Contributor

@zhaih zhaih commented Sep 22, 2025

Description

In #14331 @mayya-sharipova made a great change introducing a smart algorithm that computes the join set and then uses it to accelerate the graph merge. This PR tries to make that algorithm work with the concurrent graph merge.

Progress

I made a first version that naively lets each thread handle one segment and then processes the rest of the nodes as before, and got quite mixed and interesting benchmark results. Basically the changed code only performs better in the 10M-doc case, which kind of makes sense: since each thread handles one segment, if there aren't enough segments it can't make use of all the threads.
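The one-segment-per-thread dispatch described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual Lucene code from this PR: `Segment` and `mergeConcurrently` are hypothetical stand-ins, and the "join" step is simulated by collecting node ids into a set. It does show the imbalance being discussed: with fewer segments than threads, the extra threads simply sit idle.

```java
import java.util.*;
import java.util.concurrent.*;

public class PerSegmentMerge {
    // Hypothetical stand-in for a segment's graph: just a list of node ids.
    record Segment(int id, List<Integer> nodes) {}

    // Dispatch one segment per task on a fixed pool; each task contributes its
    // segment's nodes to the merged result (the real PR would insert them into
    // the merged HNSW graph instead).
    static Set<Integer> mergeConcurrently(List<Segment> segments, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (Segment seg : segments) {
                // One task per segment: if segments.size() < threads,
                // the remaining threads have nothing to do.
                futures.add(pool.submit(seg::nodes));
            }
            Set<Integer> merged = ConcurrentHashMap.newKeySet();
            for (Future<List<Integer>> f : futures) {
                merged.addAll(f.get());
            }
            // Remaining nodes (those outside the join sets) would be handled
            // here "like before", i.e. inserted sequentially or work-stolen.
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Segment> segs = List.of(
                new Segment(0, List.of(1, 2, 3)),
                new Segment(1, List.of(4, 5)));
        // Two segments on a four-thread pool: two threads stay idle.
        Set<Integer> merged = mergeConcurrently(segs, 4);
        System.out.println(merged.size()); // 5
    }
}
```

A finer-grained work split (e.g. partitioning nodes rather than segments across threads) would avoid the idle-thread problem when segment counts are small, at the cost of more coordination.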

Benchmark

cand

recall  latency(ms)  netCPU  avgCpuCount      nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.784        0.254   0.249        0.980   1000000   100      50       64        250         no     28.48      35108.66           98.94             1          435.66       381.470      381.470       HNSW
 0.761        0.266   0.261        0.981   2000000   100      50       64        250         no     61.42      32562.15          231.87             1          875.69       762.939      762.939       HNSW
 0.738        0.281   0.276        0.982   5000000   100      50       64        250         no    182.75      27359.33          624.20             1         2213.94      1907.349     1907.349       HNSW
 0.716        0.310   0.295        0.952  10000000   100      50       64        250         no    394.47      25350.47          782.60             1         4456.40      3814.697     3814.697       HNSW

baseline

recall  latency(ms)  netCPU  avgCpuCount      nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.817        0.284   0.270        0.951   1000000   100      50       64        250         no     28.31      35318.22           66.91             1          444.55       381.470      381.470       HNSW
 0.794        0.303   0.288        0.950   2000000   100      50       64        250         no     63.60      31447.53          158.45             1          894.32       762.939      762.939       HNSW
 0.774        0.306   0.301        0.984   5000000   100      50       64        250         no    180.95      27632.40          253.59             1         2274.45      1907.349     1907.349       HNSW
 0.755        0.334   0.329        0.985  10000000   100      50       64        250         no    433.43      23071.94         1022.34             1         4591.37      3814.697     3814.697       HNSW

@msokolov
Contributor

The results are confusing indeed! I would suggest multiple runs with a smaller index to see if there is substantial noise. Also, why do you say there is only a single segment in these smaller indexes? There should be many segments produced during the indexing run. I do see the result indicates 1 segment, but probably that is because you chose -forceMerge and it just reports the number of segments at the end?
