
Native binary embeddings experiment: curious about your thoughts

Hugging Face Forums [Unofficial] June 25, 2026

Thanks a lot for your time and your responses, really helpful for someone still learning.

One point I want to clarify on the bit budget comparison, because I think it reframes things a bit:

  • Float 384-dim = 384 × 32 = 12 288 bits
  • Post-hoc binary 384-dim = 384 bits
  • Native binary 2048-dim = 2 048 bits

So post-hoc binary isn’t a middle ground; it’s an extreme compression, and that shows clearly in the results (Recall@10 drops from 0.313 to 0.236, about −25%). Native binary 2048 sits between the two: 6× fewer bits than float but about 5.3× more than post-hoc, and it recovers meaningful recall (0.276) while being 12× faster on CPU.
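For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the ratios and relative recall changes from the numbers quoted above. The variable and function names are illustrative, and float32 storage is assumed, matching the 384 × 32 figure.

```python
# Bit budgets for the three setups discussed in this post.
FLOAT_BITS_PER_DIM = 32  # assumption: float32, matching 384 x 32 above

budgets = {
    "float_384": 384 * FLOAT_BITS_PER_DIM,  # 12288 bits
    "posthoc_binary_384": 384,              # 1 bit per dimension
    "native_binary_2048": 2048,             # 1 bit per dimension
}

def ratio(a, b):
    """How many times larger budget `a` is than budget `b`."""
    return budgets[a] / budgets[b]

def relative_change(before, after):
    """Signed relative change, e.g. -0.25 means a 25% drop."""
    return (after - before) / before

float_vs_native = ratio("float_384", "native_binary_2048")
native_vs_posthoc = ratio("native_binary_2048", "posthoc_binary_384")

# Recall@10 values quoted in the post.
posthoc_drop = relative_change(0.313, 0.236)
native_drop = relative_change(0.313, 0.276)

print(f"float / native bits:    {float_vs_native:.2f}x")
print(f"native / post-hoc bits: {native_vs_posthoc:.2f}x")
print(f"post-hoc recall change: {posthoc_drop:+.1%}")
print(f"native recall change:   {native_drop:+.1%}")
```

The native-binary recall change is derived here from the quoted 0.313 and 0.276. It is not a separately reported result.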

The goal of the experiment isn’t really to prove native binary beats post-hoc at equal bit budget. It’s more basic than that: can you trade weight precision for more dimensions and still get useful retrieval on CPU-only hardware? The results suggest yes, and 2048-dim seems to be a reasonable sweet spot for that trade-off.
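To make the CPU side of that trade-off concrete, here is a rough, stdlib-only sketch of retrieval over binary codes using XOR plus a bit count (Hamming distance). It is not the code from the experiment. The `binarize`, `hamming`, `top_k`, and `recall_at_k` helpers are hypothetical names, sign thresholding at zero is an assumed binarization rule, and the Recall@k definition shown is only one common variant.

```python
import random

def binarize(vector):
    """Pack sign bits of a float vector into one int (assumed rule: x > 0 -> 1)."""
    code = 0
    for x in vector:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")

def top_k(query, corpus, k=10):
    """Indices of the k corpus codes closest to the query in Hamming distance."""
    order = sorted(range(len(corpus)), key=lambda i: hamming(query, corpus[i]))
    return order[:k]

def recall_at_k(retrieved, relevant, k=10):
    """Fraction of relevant ids found in the top k (one common definition)."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

# Toy demo on random 2048-bit codes (synthetic data, not the experiment's).
rng = random.Random(0)
DIM = 2048
corpus = [rng.getrandbits(DIM) for _ in range(100)]
query = corpus[42] ^ 0b111  # document 42 with its three lowest bits flipped
hits = top_k(query, corpus, k=10)
print(hits[0], hamming(query, corpus[42]), recall_at_k(hits, [42]))
```

A real implementation would pack the codes into byte arrays and use a hardware popcount, but the comparison itself stays this simple, which is where the speed on CPU comes from.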

Your suggestions on the dimension sweep (512, 1024) and bit diagnostics are genuinely useful next steps. I’ll add them, along with Q4 quantization of the float 384 baseline for comparison.

The equal-bit ablation is also interesting, but it’s a different research question than what I was trying to answer here.

I appreciate the BPR/JPQ/RepCONC references; I wasn’t familiar with all of them.

I’ll keep you posted.
