Research & Innovation

The Science of Semantic Understanding

Beyond Distance: Exploring the information-theoretic foundations of semantic search and pushing the boundaries of AI-powered information retrieval.

Vector Database Comparison

Performance Benchmarks

Comparing Moorcheh against leading vector databases, with retrieval quality scored by two LLM judges (ChatGPT and Gemini)

Combined Performance Score

Overall performance combining relevance and completeness metrics

Evaluation Methodology

1. Models Used

gpt-4o-mini and gemini-pro-2.5

2. Sample Query

"Tesla's Financial Outlook?"

3. Evaluation Process

  • Query processing through AI model pipelines
  • Document chunk retrieval from vector databases
  • Context evaluation using the standardized prompt
  • Scoring and rationale recording by LLM judges
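The four steps above can be sketched as a small loop. This is only an illustration under stated assumptions: the `Judgment` record, the `Retriever` and `Judge` callable signatures, and the `evaluate` function are hypothetical names introduced here, not part of Moorcheh's tooling or of any vendor SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Judgment:
    """One LLM judge's verdict on the chunks retrieved for one query."""
    judge: str
    relevance: int      # 0-100
    completeness: int   # 0-100
    rationale: str


# Hypothetical signatures: a retriever maps a query to text chunks;
# a judge maps (prompt, query, chunks) to a Judgment.
Retriever = Callable[[str], Sequence[str]]
Judge = Callable[[str, str, Sequence[str]], Judgment]


def evaluate(
    query: str,
    retrievers: Dict[str, Retriever],
    judges: List[Judge],
    prompt: str,
) -> Dict[str, List[Judgment]]:
    """Retrieve chunks from each database, then have every judge score them."""
    results: Dict[str, List[Judgment]] = {}
    for db_name, retrieve in retrievers.items():
        chunks = retrieve(query)  # document chunk retrieval
        # Every judge sees the same standardized prompt and the same chunks.
        results[db_name] = [judge(prompt, query, chunks) for judge in judges]
    return results
```

Passing the same prompt and chunks to every judge keeps the comparison between databases and between judges like-for-like.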

4. Evaluation Criteria

Relevance (0-100)

How related are the document chunks to the query?

• 100 = Entirely focused on query subject
• 50 = Partially related content
• 0 = Topically unrelated

Completeness (0-100)

How complete is the answer using only the given context?

• 100 = All necessary information included
• 50 = Some key information missing
• 0 = No necessary information present
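The page does not state how the two criteria are merged into the combined performance score. As a minimal sketch, assuming an unweighted mean of the two 0-100 scores (the function name `combined_score` and the equal weighting are assumptions made for illustration):

```python
def combined_score(relevance: float, completeness: float) -> float:
    """Combine the two 0-100 criteria into one 0-100 score."""
    for name, value in (("relevance", relevance), ("completeness", completeness)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in [0, 100], got {value}")
    # Assumption: both criteria are weighted equally.
    return (relevance + completeness) / 2
```

For example, a relevance of 80 and a completeness of 60 would combine to 70 under this assumption.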

Detailed Evaluation Prompt

Answer the two following questions based on the retrieved passages provided from each query.

1. Relevance Evaluation
Does the retrieved context directly pertain to the topic and scope of the query? Provide a relevance score between 0 and 100, where:
• 100 = The context is entirely focused on the query's subject
• 50 = The context is partially related (e.g. correct company but wrong financial quarter)
• 0 = The context is topically unrelated to the query
Rationale: Briefly explain what aspects of the context are topically aligned with the query. If the context includes off-topic information, describe it.

2. Completeness Evaluation
If someone were to answer the query using only this context, how complete and sufficient would their answer be? Provide a completeness score between 0 and 100, where:
• 100 = The context includes all necessary information to fully answer the query
• 50 = The context includes some, but not all, key information (e.g., only Q1 revenue when the query asks for full-year revenue)
• 0 = The context includes none of the necessary information to answer the query
Rationale: Clearly state whether the context contains the required facts, figures, or explanations needed to construct a complete answer. If any crucial components are missing, specify what they are.
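The prompt asks each judge for two scores but does not fix a reply format. The sketch below shows one way the scores might be extracted, assuming the judge writes lines such as "Relevance score: 85" and "Completeness score: 40". That reply format and the `parse_scores` helper are assumptions for illustration only.

```python
import re


def parse_scores(reply: str) -> dict[str, int]:
    """Extract the relevance and completeness scores from a judge's reply."""
    scores: dict[str, int] = {}
    for label in ("relevance", "completeness"):
        # Assumed reply format: "<Label> score: <integer>", case-insensitive.
        match = re.search(
            rf"{label}\s+score\s*[:=]\s*(\d{{1,3}})", reply, re.IGNORECASE
        )
        if match is None:
            raise ValueError(f"no {label} score found in judge reply")
        value = int(match.group(1))
        if value > 100:
            raise ValueError(f"{label} score {value} is outside 0-100")
        scores[label] = value
    return scores
```

Rejecting missing or out-of-range values early keeps a malformed judge reply from silently skewing an average.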

5. Key Findings

Moorcheh consistently scores high on both relevance and completeness across all queries

Specialized vector databases generally outperform general-purpose databases

Different LLMs may evaluate the same results differently, with varying preferences for context relevance and completeness
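One simple way to quantify how much two judges differ is the mean absolute gap between their scores on the same set of queries. The sketch below is an illustration only: the `judge_gap` helper is a hypothetical name, and any numbers passed to it would be the reader's own measurements, not results from this benchmark.

```python
from statistics import mean
from typing import Sequence


def judge_gap(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Mean absolute difference between two judges' scores, query by query."""
    if not scores_a or len(scores_a) != len(scores_b):
        raise ValueError("both judges must score the same non-empty set of queries")
    return mean(abs(a - b) for a, b in zip(scores_a, scores_b))
```

A gap near zero suggests the judges broadly agree; a large gap suggests the ranking of databases may depend on which judge is used.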

Benchmarking & Validation

Measuring Performance

We are committed to transparently evaluating Moorcheh's performance against industry standards.

Our Vision for Research

Moorcheh is a research-driven company. We are continuously exploring new frontiers in information theory and its applications to AI.

• Advanced binarization techniques
• Refinements to the ITS model
• Applications in explainable AI and adversarial robustness
• Novel methods for knowledge graph construction from semantic relationships
• Future benchmarks on standard datasets (SIFT, GloVe, etc.)

Interested in our research?

Explore our publications or get in touch to discuss collaborations and partnerships.