
Benchmarking Self-Hosted Embedding Models

Posted on 14 December 2025 by Sven

Vector embeddings power a lot of modern search and retrieval systems. In practice, though, choosing an embedding model is less about leaderboards and more about engineering tradeoffs:

How many tokens per minute can I push through it?
How much GPU memory does it need?

In this post I will walk through a small benchmark setup for four popular self-hosted embedding models.
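To make the first question concrete, here is a minimal sketch of how tokens per minute could be measured. It is not the benchmark harness itself: `tokens_per_minute`, `embed_batch`, and `count_tokens` are hypothetical names chosen for illustration, and you would plug in your own model call and tokenizer.

```python
import time
from typing import Callable, Sequence


def tokens_per_minute(
    embed_batch: Callable[[Sequence[str]], object],
    texts: Sequence[str],
    count_tokens: Callable[[str], int],
    batch_size: int = 32,
) -> float:
    """Time embed_batch over all texts and return throughput in tokens per minute.

    embed_batch and count_tokens are placeholders (assumptions): pass in
    whatever calls your model and tokenizer actually expose.
    """
    # Count tokens up front so tokenization cost is not part of the timing.
    total_tokens = sum(count_tokens(t) for t in texts)

    start = time.perf_counter()
    for i in range(0, len(texts), batch_size):
        embed_batch(texts[i:i + batch_size])
    # Guard against a zero reading from a trivially fast stub.
    elapsed = max(time.perf_counter() - start, 1e-9)

    return total_tokens / elapsed * 60.0
```

For a real run you would probably want a warm-up pass first and enough text that the timing is not dominated by startup cost. GPU memory needs a framework-specific measurement and is not covered by this sketch.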