Search 9 million+
mathematical theorems

Describe a result in natural language — TheoremSearch finds it across arXiv, the Stacks Project, and five other sources. 70% more accurate than LLM search.

Luke AlexanderEric LeonenSophie SzetoArtemii RemizovIgnacio TejedaGiovanni InchiostroVasily Ilin

TheoremSearch screenshot
9M+
Theorems indexed
7
Sources
70%
More accurate than LLM search
~5s
Query latency

Why TheoremSearch?

Mathematical knowledge is scattered across millions of papers. Important results hide as lemmas in obscure sources, and existing search tools only operate at the document level.

For mathematicians, the recent AI “breakthroughs” on Erdős problems illustrate this: most turned out to be rediscoveries of results already in the literature. As Tao observed, many “open” problems are open through obscurity rather than difficulty. DeepMind’s Aletheia confirmed this: most of its correct solutions were identifications of existing literature.

For AI agents, the bottleneck is the same. Without the relevant literature, LLMs fabricate incorrect arguments. In our experiments, Claude answered a research-level algebraic geometry question incorrectly on its own, but correctly when given access to TheoremSearch as a RAG tool.

How It Works

  1. 1

    Parse theorems. We extract over 9 million theorem statements from LaTeX sources across arXiv and seven other sources using a combination of plasTeX, TeX logging, and regex-based parsing.

  2. 2

    Generate slogans. Each theorem is summarized into a concise natural-language description ("slogan") by DeepSeek V3, converting formal LaTeX notation into searchable English text.

  3. 3

    Embed and index. Slogans are embedded using Qwen3-Embedding-8B and stored in a PostgreSQL database with pgvector, using an HNSW index with binary quantization for fast approximate nearest-neighbor search.

  4. 4

    Retrieve. User queries are embedded with the same model. We retrieve the top-k theorems by Hamming distance, then re-rank by cosine similarity.

Performance & Coverage

Retrieval Performance (Hit@10)

MODELTHM-LEVELPAPER-LEVEL
Google Search0.378
ChatGPT 5.20.180
Gemini 3 Pro0.252
Ours0.4320.505

Theorem-level = retrieval of exact theorem statements
Paper-level = retrieval of the correct paper containing the theorem

Data Sources

SOURCETHEOREMS
arXiv9,246,761
ProofWiki23,871
Stacks Project12,693
Open Logic Project745
CRing Project546
Stacks and Moduli506
HoTT Book382
An Infinitely Large Napkin231

REST API

TheoremSearch provides a production REST API for semantic theorem search.

curl https://api.theoremsearch.com/search \
  -H "Content-Type: application/json" \
  -d '{"query": "smooth DM stack codimension one", "n_results": 5}'

Returns theorem-level results with metadata and similarity scores. Full API reference →

MCP Tool

TheoremSearch is available as an MCP tool for AI agents via a single theorem_search tool.

ENDPOINThttps://api.theoremsearch.com/mcp