Skip to main content

ProofsPath · 30 min

Navigating Mathlib4

Important
For:MLGeneralResearch

Why This Matters

Mathlib4 contains, as of 2026, roughly 1.6 million lines of Lean 4 code: about 200,000 theorems and definitions covering analysis, algebra, number theory, topology, category theory, measure theory, and combinatorics. The library doubled in size between 2023 and 2025 and continues to grow.

For a Lean user, the productive question is rarely "how do I prove this?" The productive question is "has someone already proved this in Mathlib, and what is it called?" The dominant time sink for new Mathlib contributors is re-proving lemmas that already exist with slightly different names or in unexpected namespaces. The library is large enough that knowing it cold is impossible; the operational skill is searching effectively.

This page covers the four mechanisms that make Mathlib navigable: the naming convention, the search tactics, the documentation site, and Loogle / Moogle.

The Naming Convention

Mathlib4 lemma names are constructed by gluing together fragments that describe the structure of the statement. The fragments are concatenated with underscores and the result reads (approximately) left-to-right as the lemma reads.

Pattern: <conclusion>_<assumption_1>_<assumption_2>_...

Or, when the lemma is a property of one object, <Namespace>.<property>.

Examples:

  • add_zero : a + 0 = a: the conclusion is "add zero", which equals "a".
  • zero_add : 0 + a = a: distinct lemma; the convention reads "the conclusion has zero on the left."
  • mul_pos : 0 < a → 0 < b → 0 < a * b: "mul is pos" given the two positivity hypotheses.
  • Nat.succ_le_succ : n ≤ m → n.succ ≤ m.succ: namespace Nat because it is about natural numbers.
  • List.length_append : (l₁ ++ l₂).length = l₁.length + l₂.length: namespace List, the statement is about length of append.
  • Finset.sum_comm : ∑ i in s, ∑ j in t, f i j = ∑ j in t, ∑ i in s, f i j: namespace Finset, the property is "sum commutes."

The convention is consistent enough that you can guess the name of a lemma you have not seen. "Sum over a finset of zero is zero": Finset.sum_const_zero. "Cardinality of a union is at most the sum of cardinalities": Finset.card_union_le. The conventions document is at leanprover-community.github.io/contribute/naming.html.

Search Tactics

Inside Lean 4 you have several tactics that search Mathlib for the lemma you need.

exact? searches for a single Mathlib lemma whose application closes the current goal. Place exact? where you would write the proof; Lean returns the named lemma if it finds one. If no lemma closes the goal directly, exact? returns nothing.

example (n : Nat) : n + 0 = n := by exact?
-- Try this: exact rfl

apply? is exact?'s weaker cousin: it finds a lemma whose conclusion unifies with the goal, leaving any hypotheses as new subgoals. Useful when the goal can be reduced to simpler goals via a single inference step.

example (n m : Nat) (h : n ≤ m) : n.succ ≤ m.succ := by apply?
-- Try this: exact Nat.succ_le_succ h

rw? searches for rewrite rules that change the goal. Less precise than exact? but useful when you know the goal is one rewrite away from a closeable form.

library_search (community-maintained) is the broader search that handles trickier patterns; superseded by exact? in most cases but occasionally finds things exact? misses.

The search tactics are productive during proof writing. The recommended workflow:

  1. State the theorem.
  2. Open the tactic proof with by exact? or by apply?.
  3. If a lemma is found, use it.
  4. If not, decompose the goal (with intro, cases, rcases) and call exact? again on the simpler subgoals.

Most theorems in Mathlib decompose to two or three search-tactic invocations.

The Documentation Site

The auto-generated docs at leanprover-community.github.io/mathlib4_docs/ are searchable by lemma name, namespace, or full-text. The conventions:

  • The header of each lemma page shows the full signature, the file location, and the source line number.
  • The body shows the docstring (if any) and the inline source.
  • The "uses" and "used by" sections show the dependency graph.
  • Each file has its own page; the directory tree mirrors the import structure.

For a Mathlib contributor, the docs are the canonical reference. For a learner, the docs are a sometimes-overwhelming firehose; the search tactics are usually the more productive entry point.

Loogle and Moogle

Two community tools complement the in-editor search:

Loogle (loogle.lean-lang.org) is type-based search. Type a Lean expression like ?n * 0 and Loogle returns lemmas whose statement matches that pattern, including under reordering and renaming. Operators with question marks are wildcards. Examples:

  • ?a + 0 = ?aadd_zero, Nat.add_zero, Int.add_zero, etc.
  • ?a * ?b = ?b * ?amul_comm, Nat.mul_comm, etc.
  • 0 ≤ ?a^2 → lemmas about non-negativity of squares.

Loogle's index covers all of Mathlib4 and is updated within hours of new lemma merges.

Moogle (moogle.ai) is natural-language semantic search. Type a sentence describing what you want; Moogle returns relevant lemmas. Useful when you know the math but not the Lean name. Example queries:

  • "the integral of a non-negative function is non-negative"
  • "the kernel of a group homomorphism is a normal subgroup"
  • "Markov's inequality"

Moogle is less precise than Loogle but better for cases where you do not know the canonical Lean phrasing.

File Structure

Mathlib4 organizes files by topic. The top-level directories under Mathlib/:

  • Algebra/: groups, rings, fields, modules, polynomials, linear algebra.
  • Analysis/: limits, derivatives, integration, normed spaces, complex analysis.
  • CategoryTheory/: categories, functors, limits, sheaves.
  • Combinatorics/: finite sums, graphs, Young tableaux.
  • Data/: basic data types (Nat, Int, Real, List, Finset).
  • Geometry/: Euclidean geometry, manifolds.
  • LinearAlgebra/: modules-as-vector-spaces, matrices, determinants.
  • MeasureTheory/: measures, integrals, LpL^p spaces.
  • NumberTheory/: arithmetic functions, prime distributions, pp-adics.
  • Order/: partial orders, lattices, filter convergence.
  • Probability/: probability mass and density, martingales.
  • Topology/: topological spaces, compactness, manifolds.

Within each directory, files have descriptive names: Mathlib/Probability/Independence/Basic.lean, Mathlib/Analysis/InnerProductSpace/Basic.lean. Importing a file gives you everything in that file plus its transitive dependencies. The Mathlib.Tactic family contains the tactic implementations.

Worked Example: Find the Right Lemma

You want to prove "for a,b>0a, b > 0, the product ab>0a b > 0." Search workflow:

Loogle. Query 0 < ?a * ?b. Returns mul_pos, Nat.mul_pos, Int.mul_pos, etc. Pick the one matching your type.

exact?. State the theorem and call exact?:

example (a b : ℝ) (ha : 0 < a) (hb : 0 < b) : 0 < a * b := by exact?
-- Try this: exact mul_pos ha hb

Naming convention. Guess from convention: "mul is positive" = mul_pos. The statement is exactly 0 < a → 0 < b → 0 < a * b, so the lemma name matches the structure.

Mathlib docs. Search mul_pos in the docs. The lemma is in Mathlib.Algebra.Order.Ring.Lemmas, with signature theorem mul_pos [Mul α] [Zero α] [PartialOrder α] [PosMulStrictMono α] (ha : 0 < a) (hb : 0 < b) : 0 < a * b.

All four routes converge to the same lemma. The dominant skill is being comfortable starting with whichever route is fastest for your guess: exact? if you do not know the name, Loogle if you know the shape, the convention if you can reconstruct the name, the docs if you need to verify.

Common Misconceptions

Watch Out

Lemma names are not arbitrary

The naming convention is followed consistently. If a lemma seems oddly named, check Mathlib's Mathlib.Tactic.Linarith.Lemmas or related auxiliary files; sometimes a lemma is named after its purpose (Nat.lt_irrefl) rather than its full statement. The fast way to learn the convention is to read the source of a few well-known files (Mathlib/Algebra/Order/Group/Defs.lean, Mathlib/Data/Real/Basic.lean) and observe the patterns.

Watch Out

exact? does not always find the closing lemma

exact? uses Lean's unifier and a curated index. It can miss lemmas that require nontrivial elaboration, especially under heavy type-class chains. When exact? fails, try apply?, then decompose the goal, then Loogle. Multiple search strategies catch different gaps.

Watch Out

Mathlib is not 'frozen'

Mathlib evolves daily. Lemma names occasionally change (e.g., a refactor moves MeasureTheory.Integrable.add to a different namespace). The TheoremPath FormalSLT manifest pins the Mathlib commit hash precisely for this reason: a theorem checked against Mathlib at commit X is not necessarily valid against the head of master. See /methodology for the pinning policy.

Watch Out

Naming-convention lookups beat blind search

For a known lemma class, the naming convention is faster than any search tool. "Is 0 + a = a in Mathlib?" → guess zero_add and check. The lookup is two seconds. Search tactics are for lemmas you do not know exist; the convention is for lemmas you can guess.

Exercise

Problem. Find the Mathlib lemma that says "for any real number aa, if a0a \neq 0 then a>0|a| > 0." State it, find its full name in Mathlib4, and write a Lean theorem that invokes it.

Hint. The Mathlib name follows the convention "absolute value positive given nonzero," with namespace dependent on the type. For real numbers, the namespace is at the top level.

Solution.

The lemma is abs_pos, statement abs_pos : a ≠ 0 ↔ 0 < |a| (note: it is an iff, so you take the .mpr direction). In a Lean file:

import Mathlib.Analysis.Normed.Order.Basic

example (a : ℝ) (h : a ≠ 0) : 0 < |a| := abs_pos.mpr h

Or with exact?:

example (a : ℝ) (h : a ≠ 0) : 0 < |a| := by exact?
-- Try this: exact abs_pos.mpr h

The naming convention route: "absolute value is positive" reads as abs_pos. The iff is the standard packaging; for the strictly forward direction you would name it abs_pos_of_ne_zero if it existed separately, which it does (abs_pos_of_ne_zero : a ≠ 0 → 0 < |a| is also defined, syntactically abs_pos.mpr).

Implementation Note

Configure your Lean editor for productive search:

  • VS Code. Install the Lean 4 extension; the docstring popover shows the lemma signature on hover.
  • CLI lookup. From the terminal in a Mathlib project, grep -rn "theorem mul_pos" Mathlib/ gives the source. Faster than scrolling the docs site for known lemmas.
  • Zulip. The Lean community Zulip (leanprover.zulipchat.com) has a #new members channel; questions of the form "I want to prove X, which Mathlib lemma do I use?" are answered within hours.
  • Loogle on the command line. Loogle has a CLI version (pip install loogle) for batch search.

The productivity rule for a new Mathlib contributor: if you have been writing a proof for more than 10 minutes and have not yet run a search tactic, run one. Most "this should be easy" proofs are one Mathlib lemma away.

Connection to TheoremPath

The FormalSLT manifest lists each kernel-checked theorem and the Mathlib lemmas it invokes. The TheoremPath methodology requires that any lemma used in a FormalSLT proof must be in the pinned Mathlib commit; new lemmas added to head-of-Mathlib are not available until the manifest pin advances. See /methodology for the version-pinning policy and /evidence for the per-theorem audit trail.

The worked Lean proof tour walks through one FormalSLT proof file and shows every Mathlib lemma invoked, with the docs link and naming-convention rationale.