Categories: AI

Benchmarking Self Hosted Embedding Models

Vector embeddings power a lot of modern search and retrieval systems. In practice, though, choosing an embedding model is less about leaderboards and more about engineering tradeoffs:

  • How many tokens per minute can I push through it?
  • How much GPU memory does it need?

In this post, I will walk through a small benchmark setup for four popular self-hosted embedding models.
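To make the throughput question concrete, here is a minimal sketch of how tokens per minute could be measured. It is not the benchmark harness from the post: `measure_tokens_per_minute`, `fake_embed`, and `whitespace_tokens` are hypothetical names, the embedding call is a stub, and the whitespace token count is only a rough stand-in for the model's real tokenizer.

```python
import time
from typing import Callable, List, Sequence


def measure_tokens_per_minute(
    embed_fn: Callable[[Sequence[str]], object],
    texts: List[str],
    count_tokens: Callable[[str], int],
    batch_size: int = 32,
) -> float:
    """Return throughput in tokens per minute for one pass over `texts`."""
    total_tokens = sum(count_tokens(t) for t in texts)
    start = time.perf_counter()
    # Feed the texts to the model in fixed-size batches.
    for i in range(0, len(texts), batch_size):
        embed_fn(texts[i : i + batch_size])
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast stubs.
    return total_tokens / elapsed * 60.0 if elapsed > 0 else float("inf")


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real model call: one zero vector per input.
    return [[0.0] * 8 for _ in batch]


def whitespace_tokens(text: str) -> int:
    # Crude token count; a real benchmark would use the model's tokenizer.
    return len(text.split())


if __name__ == "__main__":
    docs = ["vector embeddings power search"] * 1000
    tpm = measure_tokens_per_minute(fake_embed, docs, whitespace_tokens)
    print(f"{tpm:,.0f} tokens/minute")
```

To benchmark a real model, swap `fake_embed` for the model's encode call and run a warm-up pass before timing. For the memory question, if the model runs on PyTorch, `torch.cuda.max_memory_allocated()` reports the peak memory allocated to tensors on the GPU.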

Categories: AI, ElasticSearch, Search

Setting Up ElasticSearch for Semantic Search with ELSER

In today’s data-driven world, efficient search is critical. Traditional keyword search falls short when it comes to understanding context and user intent. Semantic search addresses this by matching on meaning rather than on exact words.

In this tutorial, I’ll guide you through setting up Elasticsearch with the Elastic Learned Sparse EncodeR (ELSER) model for powerful semantic search capabilities. ELSER is Elastic’s specialized ML model that creates sparse vector representations to efficiently capture semantic meaning while maintaining computational performance.
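As a rough illustration of what a sparse vector representation means, the sketch below models each text as a small map from tokens to weights and scores a query against a document by summing the products of the weights they share. The tokens and weights are invented for illustration and are not real ELSER output, and `sparse_dot` is a hypothetical helper, not an Elasticsearch API.

```python
from typing import Dict


def sparse_dot(query: Dict[str, float], doc: Dict[str, float]) -> float:
    """Sum the weight products over tokens present in both sparse vectors."""
    # Iterate over the smaller map so the cost depends on the shorter vector.
    small, large = (query, doc) if len(query) <= len(doc) else (doc, query)
    return sum(w * large[tok] for tok, w in small.items() if tok in large)


# Made-up expansions: the model assigns weight to related terms,
# not only to words that literally appear in the text.
query_vec = {"laptop": 2.0, "notebook": 1.0, "computer": 0.5}
doc_vec = {"notebook": 1.5, "computer": 1.0, "battery": 0.5}

score = sparse_dot(query_vec, doc_vec)
print(score)  # 1.0 * 1.5 + 0.5 * 1.0 = 2.0
```

Because most tokens have no weight in any given text, only the overlapping entries contribute to the score, which is why sparse representations can stay cheap to store and compare.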