Vector Store search: correct chunk only retrieved at top_k ≥ 45
When searching a Vector Store via the /vector_stores/{id}/search endpoint, the chunk with the highest similarity score is not returned unless max_num_results is set to 45 or higher. This violates the basic invariant that top_k=N results should be a strict prefix of top_k=M results when M > N.
This appears to be a severe HNSW recall issue, not a ranking issue — the correct chunk exists in the index with a high score (0.88), but the retrieval engine fails to surface it at small top_k values.
Environment
API: POST /v1/vector_stores/{vector_store_id}/search
Vector store: ~40 markdown files, default chunking strategy
Embedding model: default
No filters, no custom ranking options, no rewrite_query
Query: a short non-English search phrase (4 words)
Expected top result: a specific document (referred to as Doc A below) that is the most semantically relevant to the query
Reproduction
Same query, same vector store, same API call — only max_num_results changes.
Run 1: max_num_results = 50
Rank 1: Doc A score = 0.8838 ← correct, highest score
Rank 2: Doc B score = 0.8300
Rank 3: Doc C score = 0.7991
Rank 4: Doc D score = 0.7846
...
Doc A ranks #1 with the highest score (0.8838).
Run 2: max_num_results = 2
Rank 1: Doc B score = 0.8300 ← wrong document
Rank 2: Doc C score = 0.7991
Doc A is completely missing, despite having the highest score (0.8838) in Run 1.
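A minimal sketch of the two runs above, for anyone who wants to rerun them. It assumes the base URL `https://api.openai.com/v1`, a bearer-token API key, and a response body that carries its hits in a `data` list with `filename` and `score` fields. The base URL, the response field names, and the helper names are assumptions for illustration and are not taken from the runs above.

```python
import json
import urllib.request

# Assumption: base URL of the API; it is not stated in this report.
API_BASE = "https://api.openai.com/v1"


def build_search_request(vector_store_id, query, max_num_results, api_key,
                         api_base=API_BASE):
    """Build the POST request; only max_num_results differs between runs."""
    url = f"{api_base}/vector_stores/{vector_store_id}/search"
    payload = {"query": query, "max_num_results": max_num_results}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def search(vector_store_id, query, max_num_results, api_key):
    """Send the request and return (filename, score) pairs in rank order."""
    request = build_search_request(vector_store_id, query, max_num_results, api_key)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: hits live under "data", each with "filename" and "score".
    return [(hit["filename"], hit["score"]) for hit in body["data"]]
```

Calling `search(...)` once with `max_num_results=50` and once with `max_num_results=2`, then comparing the two lists, is the whole reproduction.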
Threshold testing
I tested max_num_results at multiple values to find when Doc A first appears in the results:
| max_num_results | Doc A in results? | Doc A rank |
|---|---|---|
| 2 | No | - |
| 5 | No | - |
| 10 | No | - |
| 20 | No | - |
| 30 | No | - |
| 44 | No | - |
| 45 | Yes | 1 |
| 50 | Yes | 1 |
Doc A only appears starting at max_num_results = 45, and when it does appear, it is ranked #1 with the highest score by a clear margin.
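The sweep can be automated with a small helper like the one below. This is a sketch: `first_k_containing` and `simulated_search` are hypothetical names, and `simulated_search` is a stand-in that mimics the behavior reported in the table instead of calling the real API.

```python
def first_k_containing(search_fn, target, k_values):
    """Return the smallest k in k_values whose results contain target, else None."""
    for k in sorted(k_values):
        results = search_fn(k)
        if target in results:
            return k
    return None


def simulated_search(k):
    """Stand-in that mimics the reported behavior; not real API output."""
    ranking = ["Doc A", "Doc B", "Doc C", "Doc D"]
    if k < 45:
        # Reported symptom: Doc A is dropped below the threshold.
        ranking = [doc for doc in ranking if doc != "Doc A"]
    return ranking[:k]


threshold = first_k_containing(
    simulated_search, "Doc A", [2, 5, 10, 20, 30, 44, 45, 50]
)
print(threshold)  # prints 45
```

In a real run, `search_fn` would wrap the actual search call and return the document names for a given `max_num_results`.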
Why this is a bug, not expected behavior
In a correctly functioning vector search:
top_k = N results MUST be a strict prefix of top_k = M results when M > N, assuming deterministic ranking by score
Recall@10 on a small index (~few hundred chunks) should be ≥95% for HNSW with reasonable parameters
A chunk with score 0.8838 should never be excluded from results that include chunks with scores 0.7991 and below
All three properties are violated here. The most likely root cause is that ef_search (HNSW exploration parameter) is set too low and/or scales with top_k, causing graph traversal to terminate before reaching the node containing Doc A’s embedding.
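The first and third properties can be checked mechanically against the numbers from Run 1 and Run 2. The sketch below uses hypothetical helper names and only the scores already listed above. It does not measure recall, which would need ground-truth nearest neighbors for the index.

```python
def is_prefix(small, large):
    """True if the smaller result list equals the start of the larger one."""
    return large[:len(small)] == small


def wrongly_excluded(small, large):
    """Items in large that outscore the weakest item in small yet are absent from it."""
    if not small:
        return []
    floor = min(score for _, score in small)
    present = {doc for doc, _ in small}
    return [(doc, score) for doc, score in large
            if score > floor and doc not in present]


# Scores copied from Run 1 (top four rows) and Run 2 above.
run_50 = [("Doc A", 0.8838), ("Doc B", 0.8300), ("Doc C", 0.7991), ("Doc D", 0.7846)]
run_2 = [("Doc B", 0.8300), ("Doc C", 0.7991)]

print(is_prefix(run_2, run_50))         # False
print(wrongly_excluded(run_2, run_50))  # [('Doc A', 0.8838)]
```

A healthy search would print `True` and an empty list for any pair of runs that differ only in `max_num_results`.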