Moorcheh on Edge
Vector search on the device.
Bring your own 1024-dim embeddings. moorcheh-edge binarizes and searches them locally — no cloud calls, no API keys, no Ollama. Just Docker, a single command, and a sub-millisecond store.
Your Device. Your Vectors.
One CLI, one Docker image, one local store — no Ollama, no API keys, no cloud.
One command. On-device.
moorcheh-edge up pulls the moorcheh/moorcheh-edge image and starts the API on localhost:8080. No Ollama, no API keys, no signup — just Python 3.10+ and Docker.
Built for Local-First Vector Workflows
Same Moorcheh engine. None of the cloud machinery. Runs anywhere Docker runs.
Nothing leaves the host
Vectors are binarized and stored on disk at ~/.moorcheh-edge/data. Queries are scored locally. The API only listens on localhost.
No signup, no tokens
Run moorcheh-edge up and start uploading. There's no account, no billing, no rate limiter — just the local HTTP server.
Information-theoretic codes
Maximally Informative Binarization compresses 1024-dimensional float vectors into compact binary codes that are scored with bitwise distance. It is the same engine that powers cloud Moorcheh.
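To make the idea of binary codes and bitwise distance concrete, here is a minimal Python sketch. It does not implement Maximally Informative Binarization, whose details are not described on this page. It swaps in plain sign-threshold binarization (one bit per dimension, an assumption) and Hamming distance, and the function names are illustrative only.

```python
import random

DIM = 1024  # vector dimension stated on this page


def binarize(vec):
    """Pack one sign bit per dimension into an int (bit i is set when vec[i] > 0).

    Assumption: a simple sign threshold, used here as a stand-in for the real codec.
    """
    code = 0
    for i, x in enumerate(vec):
        if x > 0.0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Count the bit positions where two codes differ (XOR, then popcount)."""
    return bin(a ^ b).count("1")


rng = random.Random(0)
base = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
near = [x + rng.gauss(0.0, 0.1) for x in base]  # small perturbation of base
far = [rng.gauss(0.0, 1.0) for _ in range(DIM)]  # unrelated random vector

code_base, code_near, code_far = binarize(base), binarize(near), binarize(far)

# The perturbed vector should land much closer to base than the unrelated one.
print("near:", hamming(code_base, code_near))
print("far: ", hamming(code_base, code_far))
```

Under this one-bit-per-dimension assumption, a 1024-dimensional vector shrinks from 4,096 bytes as float32 to 128 bytes, and comparing two codes needs only XOR and a bit count.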
Docker does the rest
moorcheh-edge up pulls moorcheh/moorcheh-edge:latest, wires the volume, and exposes localhost:8080. No Ollama. No services to wire.
Bring your own embeddings
Use any embedding model upstream — OpenAI, Cohere, BGE, custom — as long as you produce 1024 floats per vector. No model is bundled.
Works without the internet
After the first image pull, moorcheh-edge runs entirely offline. Perfect for air-gapped boxes, field laptops, and on-device prototypes.
Transparent Edge Quotas
What you get with the free moorcheh-edge store. Need more? Talk to us about Moorcheh.
One local store
Each moorcheh-edge store holds up to 10,000 vectors. Uploads beyond the cap return HTTP 409. Check usage with moorcheh-edge status.
Fixed vector shape
Every vector must be exactly 1024 finite floats — no NaN or Infinity. Bring embeddings from any upstream model that outputs 1024-d.
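Because the shape rule is strict, it can help to check vectors in your own code before uploading them. The sketch below is a client-side check only. The helper name validate_vector is hypothetical and not part of moorcheh-edge, and the server's own validation may differ in detail.

```python
import math

VECTOR_DIM = 1024  # required length stated on this page


def validate_vector(vec):
    """Return a list of problems; an empty list means the vector fits the stated shape."""
    problems = []
    if len(vec) != VECTOR_DIM:
        problems.append(f"expected {VECTOR_DIM} values, got {len(vec)}")
    for i, x in enumerate(vec):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(x, bool) or not isinstance(x, (int, float)):
            problems.append(f"index {i}: not a number")
        elif not math.isfinite(x):
            problems.append(f"index {i}: non-finite value {x!r}")
    return problems


good = [0.5] * VECTOR_DIM
bad = [0.5] * (VECTOR_DIM - 1) + [float("nan")]

print(validate_vector(good))
print(validate_vector(bad))
```

Running a check like this upstream keeps a NaN or a wrong-sized embedding from surfacing later as a rejected upload.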
No plain-text queries
Edge is vector-only. Generate embeddings in your app or upstream service first — there's no Ollama, no text path, no tokenizer on the device.
{
  "status": "ok",
  "vector_dimension": 1024,
  "items": 42,
  "max_items": 10000,
  "remaining": 9958
}
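A status payload like the one above can be used to decide whether a batch fits before you upload it. This is a small Python sketch that works on the JSON text alone. How you fetch the payload is not specified on this page, so that step is left out, and the helper names are hypothetical.

```python
import json

# Sample payload copied from this page.
SAMPLE_STATUS = """
{
  "status": "ok",
  "vector_dimension": 1024,
  "items": 42,
  "max_items": 10000,
  "remaining": 9958
}
"""


def remaining_capacity(status_body):
    """Parse a status payload and return how many more vectors the store can hold."""
    status = json.loads(status_body)
    remaining = status["max_items"] - status["items"]
    # Cross-check against the value the payload itself reports.
    if remaining != status["remaining"]:
        raise ValueError("status payload is internally inconsistent")
    return remaining


def batch_fits(status_body, batch_size):
    """True when a batch of batch_size vectors stays within the store's cap."""
    return batch_size <= remaining_capacity(status_body)


print(remaining_capacity(SAMPLE_STATUS))
print(batch_fits(SAMPLE_STATUS, 500))
```

Checking capacity first lets a client split or skip a batch instead of handling an HTTP 409 after the fact.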