AI – SearchLayer Blog

Benchmarking Self-Hosted Embedding Models

Posted on 14 December 2025 by Sven

Vector embeddings power a lot of modern search and retrieval systems. In practice, though, choosing an embedding model is less about leaderboards and more about engineering tradeoffs:

How many tokens per minute can I push through it?
How much GPU memory does it need?

In this post I will walk through a small benchmark setup for four popular self-hosted embedding models.
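To make the throughput question concrete, here is a minimal sketch of how tokens per minute could be measured. It is not the benchmark from the post: `fake_embed` is a hypothetical stand-in for a real model call, and the whitespace token count is a rough assumption, since each real model ships its own tokenizer.

```python
import time


def fake_embed(texts):
    # Hypothetical stand-in for a real embedding model call.
    # Returns one tiny "vector" per input text.
    return [[float(len(t))] for t in texts]


def count_tokens(text):
    # Assumption: whitespace splitting as a crude token count.
    # A real benchmark would use the model's own tokenizer.
    return len(text.split())


def tokens_per_minute(embed, texts, batch_size=32):
    # Time how long it takes to embed all texts in batches,
    # then scale the token count to a per-minute rate.
    total_tokens = sum(count_tokens(t) for t in texts)
    start = time.perf_counter()
    for i in range(0, len(texts), batch_size):
        embed(texts[i:i + batch_size])
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return total_tokens / elapsed * 60


if __name__ == "__main__":
    sample = ["vector embeddings power modern search"] * 1000
    print(f"{tokens_per_minute(fake_embed, sample):.0f} tokens/minute")
```

Swapping `fake_embed` for a function that calls an actual model keeps the timing loop unchanged, which makes it easier to compare models under the same batch size.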

AI ElasticSearch Search

Setting Up Elasticsearch for Semantic Search with ELSER

Posted on 9 March 2025 by Sven

In today’s data-driven world, efficient search is critical. Traditional keyword search falls short at capturing context and user intent. Semantic search addresses this by matching on meaning rather than on exact words.

In this tutorial, I’ll guide you through setting up Elasticsearch with the Elastic Learned Sparse EncodeR (ELSER) model for powerful semantic search capabilities. ELSER is Elastic’s specialized ML model that creates sparse vector representations to efficiently capture semantic meaning while maintaining computational performance.

AI OpenSearch

Simple Guide to RAG Setup with OpenSearch – part 2

Posted on 9 February 2025 by Sven

This is part two of the three-part RAG series. In this part, we will set up a Python script to load a dataset and pass the embedded text to OpenSearch.
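As a rough idea of what such a script has to produce, the sketch below builds a request body in the newline-delimited JSON format that the OpenSearch `_bulk` endpoint expects. It is an illustration only, not the script from the post: the index name `rag-demo`, the field names `text` and `embedding`, and the `fake_embed` function are all hypothetical.

```python
import json


def fake_embed(text):
    # Hypothetical stand-in for a real embedding model.
    return [float(len(text)), float(text.count(" "))]


def build_bulk_body(docs, index_name, embed):
    # The _bulk format alternates an action line and a document line,
    # one JSON object per line, and must end with a newline.
    lines = []
    for doc_id, text in enumerate(docs):
        action = {"index": {"_index": index_name, "_id": str(doc_id)}}
        # Field names "text" and "embedding" are assumptions for this sketch.
        document = {"text": text, "embedding": embed(text)}
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body(["hello world", "open search"], "rag-demo", fake_embed)
    print(body)
```

The resulting string can be sent as the body of a POST request to `_bulk` with the content type `application/x-ndjson`; a client library can do the same job with less manual work.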

AI OpenSearch

Simple Guide to RAG Setup with OpenSearch – part 1

Posted on 8 February 2025 by Sven

In this guide, we explore how to get started with OpenSearch and OpenSearch Dashboards using Docker. This setup forms the foundation for creating a local RAG environment.