Now Available · On-Device

Moorcheh on EdgeVector search that leaves room for the LLM.

moorcheh-edge compresses embeddings with MIB and searches them locally — text or precomputed vectors, optional on-device RAG via Ollama. No API keys. Proven on Arduino UNO Q at ~17 ms for 10k vectors.

Get Started See how it works ↓

~17 ms

search @ 10k

10K

store cap

~20 MiB

server RAM

moorcheh-edge terminalInstall

$ ▋

Client

Server

MIB

Store

InstallDeployUploadSearch

How It Works

Your Device. Your Knowledge.

One CLI, one Docker image, one local store — text or vectors, optional local RAG, no API keys.

On-device perimeter · No egress

Stage 01 / 03

pip installmoorcheh-edge

moorcheh-edge upDocker · :8080

Ready~/.moorcheh-edge

One command. On-device.

moorcheh-edge up pulls moorcheh/moorcheh-edge, starts the API on localhost:8080, and warms BGE-small for text search. Add --with-llm for local RAG via Ollama (qwen2.5:0.5b-instruct). No API keys — Python 3.10+ and Docker.

Why On Edge

Built for Local-First Vector Workflows

Same Moorcheh MIB/EDM engine. Sized for one device — search, optional RAG, and room for the LLM.

100%

On-device

Nothing leaves the host

Vectors are binarized and stored on disk at ~/.moorcheh-edge/data. Queries are scored locally. Optional RAG calls Ollama on the same machine — no cloud vector DB.

Zero

API keys

No signup, no tokens

Run moorcheh-edge up and start uploading. There's no account, no billing, no rate limiter — just the local HTTP server on localhost:8080.

MIB

Engine

Information-theoretic codes

Maximum Information Binarization compresses float embeddings into compact one-bit codes scored with EDM — ~96 B per 768-d vector vs ~3 KB float32.

~20 MiB

Server

Tiny Docker footprint

On Arduino UNO Q we measured ~20 MiB container RSS at 10k vectors and ~17 ms median search — leaving RAM for BGE and a local LLM.

384-d

Text default

Text or bring your own

Text mode embeds with BGE-small (384-d) on the client. Vector mode accepts precomputed floats from 128–1536 dimensions. Same MIB store either way.

Offline

Capable

Works without the internet

After the first image and model pull, moorcheh-edge runs entirely offline. Built for air-gapped boxes, field devices, and kiosks like Arduino UNO Q.

Limits

Transparent Edge Quotas

What you get with the free moorcheh-edge store. Need more? Talk to us about Moorcheh.

10,000

max items

One local store

Each moorcheh-edge store holds up to 10,000 items. Upload beyond the cap returns HTTP 409. Check usage via moorcheh-edge status.

384 / 128–1536

dimensions

Text or vector mode

Text mode locks to 384-d (BGE-small). Vector mode accepts 128, 256, 384, 768, 1024, or 1536 floats — set on first upload. Finite values only.

RAG

optional

Search + local answer

Search works without an LLM. For moorcheh-edge answer, run up --with-llm to install Ollama and pull qwen2.5:0.5b-instruct on Linux.

moorcheh-edge status

{
  "status": "ok",
  "dimension": 384,
  "store_mode": "text",
  "embedding_model": "BAAI/bge-small-en-v1.5",
  "items": 42,
  "max_items": 10000,
  "remaining": 9958
}

Get started

Three commands. On-device.pip install · up --with-llm · search or answer locally

Quickstart Guide Full Documentation