MemTurbo

A Precog Labs App

Beta

Store, search, and retrieve agent memories with industry-leading vector compression. Cut storage costs by up to 95% while maintaining semantic accuracy.

MemTurbo uses TurboQuant compression to reduce vector storage by up to 95% while preserving semantic search quality. Built for AI agents that need persistent, searchable memory across conversations.

How It Works

Three steps from raw text to searchable, compressed memory

Step 1

Store Memories

Send text content via REST API, Python/TypeScript SDK, or MCP server. MemTurbo ingests it with metadata like agent ID, tags, and session context.
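As a rough illustration of the store call, here is a thin typed client sketch. The base URL, endpoint path, field names, and auth header below are assumptions for illustration only; consult the SDK docs for the real surface.

```python
# Hypothetical client sketch -- endpoint, fields, and auth header
# are assumptions, not MemTurbo's documented API.
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    content: str
    agent_id: str
    session_id: str = ""
    tags: list = field(default_factory=list)

@dataclass
class MemTurboClient:
    api_key: str
    base_url: str = "https://api.example.com/v1"  # placeholder host

    def _headers(self) -> dict:
        return {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

    def store_request(self, record: MemoryRecord) -> tuple:
        # Builds (method, url, headers, body); hand the result to any
        # HTTP library (requests, httpx) to actually send it.
        body = {
            "content": record.content,
            "agent_id": record.agent_id,
            "session_id": record.session_id,
            "tags": record.tags,
        }
        return ("POST", f"{self.base_url}/memories", self._headers(), body)
```

Separating request construction from transport keeps the sketch testable without a live endpoint; the shipped SDKs wrap the full send-and-parse cycle.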

Step 2

Compress & Embed

TurboQuant generates embeddings and compresses vectors using Lloyd-Max quantization with QJL residual correction for 95% storage reduction.
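TurboQuant's internals are not spelled out here, but its first stage names a textbook technique. A minimal sketch of 1-D Lloyd-Max quantization (the scalar analogue of k-means) shows where the storage savings come from; the actual codebook layout and the QJL residual step are beyond this illustration.

```python
# Sketch only: per-dimension Lloyd-Max quantization of one embedding.
# 3-bit codes (8 levels) shrink a float32 vector by roughly 10x,
# before any codebook overhead or residual correction.
import numpy as np

def lloyd_max(samples, levels=8, iters=50):
    """Alternate nearest-centroid assignment and centroid updates."""
    centroids = np.linspace(samples.min(), samples.max(), levels)
    for _ in range(iters):
        idx = np.abs(samples[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(levels):
            members = samples[idx == k]
            if members.size:
                centroids[k] = members.mean()
    return centroids

rng = np.random.default_rng(0)
vec = rng.standard_normal(1024).astype(np.float32)   # one embedding
codebook = lloyd_max(vec, levels=8)                   # 8 levels = 3 bits
codes = np.abs(vec[:, None] - codebook[None, :]).argmin(axis=1)

raw_bytes = vec.nbytes                # 1024 dims * 4 bytes = 4096
packed_bytes = (len(codes) * 3) // 8  # 3 bits per dim = 384
```

Quantizing alone gets roughly 90% reduction here; the residual-correction stage is what recovers search accuracy at these bit rates.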

Step 3

Search & Retrieve

Query memories with semantic search powered by pgvector HNSW indexing. Results are ranked by cosine similarity score and can be filtered by tag, agent, or session.
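The ranking itself is plain cosine similarity; in the service it runs inside Postgres via pgvector's HNSW index, but a small NumPy sketch shows the math behind the scores.

```python
# Illustration of cosine-similarity top-k over an in-memory array.
# The real system does this via a pgvector HNSW index, not NumPy.
import numpy as np

def cosine_top_k(query, memory_vecs, k=3):
    q = query / np.linalg.norm(query)
    m = memory_vecs / np.linalg.norm(memory_vecs, axis=1, keepdims=True)
    scores = m @ q                    # dot of unit vectors == cosine
    order = np.argsort(-scores)[:k]   # highest similarity first
    return [(int(i), float(scores[i])) for i in order]

memories = np.array([[1.0, 0.0],     # toy 2-D stand-ins for embeddings
                     [0.0, 1.0],
                     [1.0, 1.0]])
query = np.array([1.0, 0.1])
hits = cosine_top_k(query, memories, k=2)
```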

Features

Everything you need for AI agent memory management

TurboQuant Compression

Lloyd-Max quantization with QJL residual correction reduces vector storage by up to 95% while maintaining cosine similarity accuracy.

Semantic Search

Find memories by meaning, not just keywords. Cosine similarity search across compressed vectors with relevance scoring via pgvector HNSW.

Multi-Tenant Isolation

Row-level security ensures complete data isolation between organizations and projects. Built for production multi-tenancy from day one.

Agent Memory

Tag memories by agent, user, and session. Build persistent context for AI agents that remember across conversations and interactions.

Version History

Track memory changes over time with automatic versioning. Every update creates a new version, preserving the complete history.
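The append-only model described above (every update creates a new version, nothing is overwritten) fits in a few lines. The dict store and `update_memory` helper below are illustrative, not the service's API.

```python
# Sketch of append-only versioning: each update appends a new
# version record instead of mutating the previous one.
versions = {}  # memory_id -> list of version records

def update_memory(memory_id, content):
    history = versions.setdefault(memory_id, [])
    history.append({"version": len(history) + 1, "content": content})
    return history[-1]
```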

REST API & SDKs

Full REST API with TypeScript and Python SDKs. Store, search, update, and delete memories programmatically with typed clients.

MCP Server

Model Context Protocol server for direct integration with AI assistants like Claude. Expose memories as tools for LLM agents to use natively.

Background Processing

Async embedding pipeline with configurable worker pools. Non-blocking ingestion keeps your API fast while heavy compute runs in the background.
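The pattern behind this is a bounded queue feeding a fixed pool of async workers: ingestion enqueues and returns, workers drain the queue. Names here (`embed`, `WORKERS`) are illustrative stand-ins, not MemTurbo's pipeline.

```python
# Minimal asyncio sketch of a bounded-queue worker pool.
import asyncio

WORKERS = 4
results = []

async def embed(text):         # stand-in for the heavy embedding call
    await asyncio.sleep(0)     # yield control, as a real model call would
    return f"vec({text})"

async def worker(queue):
    while True:
        text = await queue.get()
        results.append(await embed(text))
        queue.task_done()

async def main(texts):
    queue = asyncio.Queue(maxsize=100)  # bound gives back-pressure
    pool = [asyncio.create_task(worker(queue)) for _ in range(WORKERS)]
    for t in texts:
        await queue.put(t)              # fast, non-blocking ingestion path
    await queue.join()                  # wait for in-flight work to finish
    for task in pool:
        task.cancel()

asyncio.run(main(["a", "b", "c"]))
```

The `maxsize` bound is the knob that keeps a burst of ingestion from overwhelming the workers: once the queue is full, producers wait instead of piling up work.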

API Key Authentication

Simple, secure API key auth with per-key rate limiting. Create multiple keys with different permissions and rate limits per project.
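Per-key rate limiting is commonly built as one token bucket per API key; the sketch below shows that shape. The rates, capacities, and `check` helper are assumptions for illustration, not MemTurbo's actual limiter.

```python
# Illustrative per-key token-bucket rate limiter.
import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity   # tokens/sec, max burst
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}  # one bucket per API key

def check(api_key, rate=5, capacity=5):
    bucket = buckets.setdefault(api_key, TokenBucket(rate, capacity))
    return bucket.allow()
```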

What It Provides

A complete memory infrastructure for AI applications

Memory Storage

Persistent storage with metadata, tags, agent IDs, and session tracking

Vector Search

Semantic similarity search with cosine distance scoring via pgvector

95% Compression

TurboQuant 3-bit compression with PolarQuant and QJL residuals

Multi-Platform

REST API, TypeScript SDK, Python SDK, MCP server, and web dashboard

Simple, Transparent Pricing

Start free, upgrade when you need more storage or collaboration

Free: $0/mo
  • 1,000 memories
  • Basic embedding model
  • REST API access

Pro (Most Popular): $29/mo
  • 50,000 memories
  • Advanced models (BGE-M3)
  • Priority embedding queue

Team: $49/user/mo
  • Unlimited memories
  • Organizations & RBAC
  • MCP server access

Enterprise: Custom pricing
  • On-prem deployment
  • Bring your own embeddings
  • SLA & uptime guarantees

Ready to add memory to your AI agents?

Get started with the dashboard or integrate directly via the REST API. No credit card required for the free tier.

Launch Dashboard