Simplicial Complexes

Overview & Motivation

Take 200 points sampled from a ring (an annulus) and 200 points sampled from a Gaussian blob. Match their means and variances exactly. Now try to tell them apart: a $t$ -test sees nothing, PCA sees nothing, and the covariance matrices are indistinguishable. Yet the ring has a hole through its center and the blob doesn’t — and that topological difference can be the signal that matters, whether you’re classifying dynamical regimes, detecting sensor network anomalies, or finding structure in high-dimensional embeddings.

The problem is that point clouds are discrete — they don’t have “holes” in any obvious sense. To detect shape, we need to build a continuous structure on top of the raw data. That structure is a simplicial complex: a systematic way of connecting nearby points with edges, triangles, and their higher-dimensional analogues to create a combinatorial scaffolding that does have well-defined topology.

Once you have a simplicial complex, the full machinery of algebraic topology becomes available — you can count connected components ( $\beta_0$ ), detect loops ( $\beta_1$ ), find enclosed voids ( $\beta_2$ ), and quantify how these features persist across scales. If you’ve used ripser, gudhi, or scikit-tda, you’ve been building simplicial complexes without thinking about it. This topic makes the implicit explicit.

Formal Framework

The Building Blocks: Simplices

Definition 1.

A $k$ -simplex $\sigma$ is the convex hull of $k+1$ affinely independent points $v_0, v_1, \ldots, v_k$ in $\mathbb{R}^n$ :

$\sigma = [v_0, v_1, \ldots, v_k] = \left\{ \sum_{i=0}^{k} t_i v_i \;\middle|\; t_i \geq 0, \; \sum_{i=0}^{k} t_i = 1 \right\}$

The number $k$ is the dimension of the simplex. The points $v_0, \ldots, v_k$ are its vertices.

In concrete terms:

A 0-simplex is a point
A 1-simplex is an edge (line segment between two points)
A 2-simplex is a filled triangle
A 3-simplex is a filled tetrahedron
A $k$ -simplex for $k \geq 4$ follows the same pattern — it’s the “solid” version of the shape, not just the boundary

Definition 2.

A face of a simplex $\sigma = [v_0, \ldots, v_k]$ is any simplex formed by a non-empty subset of its vertices. If $\tau$ is a face of $\sigma$ , we write $\tau \leq \sigma$ .

A proper face satisfies $\tau < \sigma$ (i.e., $\tau \neq \sigma$ ).

Example 1.

The triangle $[v_0, v_1, v_2]$ has seven faces:

Three 0-faces (vertices): $[v_0]$ , $[v_1]$ , $[v_2]$
Three 1-faces (edges): $[v_0, v_1]$ , $[v_0, v_2]$ , $[v_1, v_2]$
One 2-face (itself): $[v_0, v_1, v_2]$

In general, a $k$ -simplex has $\binom{k+1}{j+1}$ faces of dimension $j$ , for a total of $2^{k+1} - 1$ faces.

From Simplices to Complexes

Definition 3.

A simplicial complex $K$ is a finite collection of simplices satisfying two properties:

Face closure: If $\sigma \in K$ and $\tau \leq \sigma$ , then $\tau \in K$ .
Intersection property: If $\sigma, \tau \in K$ , then $\sigma \cap \tau$ is either empty or a face of both $\sigma$ and $\tau$ .

The first condition says “if you include a triangle, you must include all its edges and vertices.” The second says “two simplices can only intersect along a shared face — no partial overlaps.” Together, these ensure the complex is combinatorially well-behaved.

Definition 4.

The dimension of a simplicial complex $K$ is the maximum dimension of any simplex in $K$ :

$\dim(K) = \max_{\sigma \in K} \dim(\sigma)$

Remark.

In practice, TDA pipelines rarely compute simplices beyond dimension 2 or 3. Computing the full Vietoris-Rips complex is combinatorially explosive — for $n$ points, the number of potential $k$ -simplices is $\binom{n}{k+1}$ . Algorithms like Ripser (Bauer, 2021) use clever algebraic shortcuts to compute persistent homology without ever explicitly constructing high-dimensional simplices.

Abstract vs. Geometric

Definition 5.

An abstract simplicial complex on a vertex set $V$ is a collection $\mathcal{A}$ of finite subsets of $V$ such that if $\sigma \in \mathcal{A}$ and $\tau \subseteq \sigma$ , then $\tau \in \mathcal{A}$ .

The distinction matters: a geometric simplicial complex lives in $\mathbb{R}^n$ with coordinates and convex hulls. An abstract simplicial complex is purely combinatorial — it’s a list of which vertex subsets are “connected.” For TDA, we work with abstract complexes (defined by distance thresholds) and use geometric realizations only for visualization.

The Vietoris-Rips Construction

This is the construction that matters for data analysis. Given a point cloud and a scale parameter, it builds a simplicial complex by connecting points that are “close enough.”

Definition 6.

Let $(X, d)$ be a finite metric space and $\varepsilon \geq 0$ . The Vietoris-Rips complex $\text{VR}(X, \varepsilon)$ is the abstract simplicial complex whose simplices are:

$\text{VR}(X, \varepsilon) = \{ \sigma \subseteq X \mid d(x, y) \leq \varepsilon \text{ for all } x, y \in \sigma \}$

Equivalently, a subset $\sigma$ of $k+1$ points forms a $k$ -simplex if and only if every pair of points in $\sigma$ is within distance $\varepsilon$ .

Remark.

Čech vs. Vietoris-Rips. The pairwise condition in Definition 6 is what makes VR complexes computationally tractable — you only need the distance matrix. The theoretically “correct” construction is the Čech complex $\check{C}(X, \varepsilon)$ , where a subset $\sigma \subseteq X$ forms a simplex if and only if the $\varepsilon/2$ -balls centered at its points have a common intersection:

$\check{C}(X, \varepsilon) = \left\{ \sigma \subseteq X \;\middle|\; \bigcap_{x \in \sigma} B(x, \varepsilon/2) \neq \emptyset \right\}$

The Čech complex satisfies the Nerve Theorem: its geometric realization is homotopy equivalent to the union of balls $\bigcup_{x \in X} B(x, \varepsilon/2)$ , which means it captures the “true” topology of the thickened point cloud. The VR complex doesn’t have this guarantee — it can introduce spurious high-dimensional simplices. However, the two complexes bracket each other:

$\check{C}(X, \varepsilon) \subseteq \text{VR}(X, \varepsilon) \subseteq \check{C}(X, \sqrt{2}\,\varepsilon)$

In practice, this means VR and Čech usually give very similar persistent homology (up to a constant factor in the scale parameter), and for $H_0$ and for the most prominent $H_1$ features they tend to agree in many datasets — though VR can still produce short-lived phantom $H_1$ cycles and extra higher-dimensional simplices that do not appear in the Čech complex. Since the Čech complex requires computing higher-order ball intersections (expensive in high dimensions) while VR needs only pairwise distances, every major TDA library — ripser, gudhi, dionysus — defaults to Vietoris-Rips. For the full construction, proof of the Nerve Theorem, and a detailed analysis of where Čech and VR diverge, see Čech Complexes & Nerve Theorem.

Example 2.

Consider five points in $\mathbb{R}^2$ : three forming an equilateral triangle with side length 1.0, and two more at distance 0.6 from each other but far from the triangle.

At $\varepsilon = 0.3$ : Only vertices (5 isolated points). No edges.
At $\varepsilon = 0.8$ : The distant pair connects (one edge). The triangle vertices are still isolated.
At $\varepsilon = 1.1$ : The triangle’s edges appear (three edges forming a cycle). The pair is connected.
At $\varepsilon = 1.6$ : The triangle fills in (one 2-simplex). All pairwise distances in the triangle are $\leq \varepsilon$ , so the face closure condition is satisfied automatically.

This progression — vertices → edges → triangles appearing at increasing scales — is the filtration that persistent homology tracks.

Homology Preview

A simplicial complex has well-defined shape — but how do we quantify that shape? The answer is homology: an algebraic machine that assigns integer invariants to a complex, counting its connected components, loops, voids, and higher-dimensional “holes.”

We give the key definitions here at an intuitive level. The full algebraic development — including the matrix computations that make homology practical — is developed on the Persistent Homology page.

Definition 7.

The $k$ -chain group $C_k(K; \mathbb{Z}_2)$ is the vector space over $\mathbb{Z}_2 = \{0, 1\}$ whose basis elements are the $k$ -simplices of $K$ . A $k$ -chain is a formal sum $c = \sum_i a_i \sigma_i$ with $a_i \in \mathbb{Z}_2$ .

Working over $\mathbb{Z}_2$ means we don’t need to worry about orientations: coefficients are 0 or 1, and $1 + 1 = 0$ . This is the standard choice in TDA software.

Definition 8.

The boundary operator $\partial_k : C_k(K) \to C_{k-1}(K)$ maps each $k$ -simplex to the sum of its $(k-1)$ -dimensional faces (over $\mathbb{Z}_2$ ):

$\partial_k [v_0, \ldots, v_k] = \sum_{i=0}^{k} [v_0, \ldots, \hat{v}_i, \ldots, v_k]$

where $\hat{v}_i$ denotes omission of the $i$ -th vertex.

The boundary of an edge $[v_0, v_1]$ is $[v_0] + [v_1]$ — its two endpoints. The boundary of a triangle $[v_0, v_1, v_2]$ is $[v_0, v_1] + [v_0, v_2] + [v_1, v_2]$ — its three edges. The fundamental property is that the boundary of a boundary is always zero: $\partial_{k-1} \circ \partial_k = 0$ (proved on the Persistent Homology page, Theorem 1).

Definition 9.

The $k$ -th Betti number $\beta_k$ of a simplicial complex $K$ is:

$\beta_k = \dim\left(\ker(\partial_k) / \text{im}(\partial_{k+1})\right)$

Equivalently, $\beta_k$ counts the number of independent $k$ -dimensional “holes” — $k$ -cycles that are not the boundary of any $(k+1)$ -chain.

In geometric terms:

$\beta_0$ = number of connected components
$\beta_1$ = number of independent loops (cycles not bounding a filled region)
$\beta_2$ = number of enclosed voids (cavities)

Example 3.

Consider the wireframe triangle $K_1 = \{[v_0,v_1], [v_0,v_2], [v_1,v_2], [v_0], [v_1], [v_2]\}$ (three edges, three vertices, no fill).

The three edges form a 1-cycle: $\partial_1([v_0,v_1] + [v_0,v_2] + [v_1,v_2]) = 0$ (over $\mathbb{Z}_2$ ). But there is no 2-simplex whose boundary is this cycle, so $\beta_1 = 1$ — the wireframe triangle has one hole.

Now add the filled face to get $K_2 = K_1 \cup \{[v_0,v_1,v_2]\}$ . The boundary of the 2-simplex is exactly our 1-cycle: $\partial_2[v_0,v_1,v_2] = [v_0,v_1] + [v_0,v_2] + [v_1,v_2]$ . The cycle is now a boundary, so $\beta_1 = 0$ — the filled triangle has no holes.

Both complexes have $\beta_0 = 1$ (one connected component) and $\beta_2 = 0$ (no enclosed voids).

This is the core mechanism: the same cycle can be a hole or not, depending on whether something fills it in. Persistent homology tracks exactly when filling events happen as the scale parameter $\varepsilon$ increases.

The Euler Characteristic

The Euler characteristic is the oldest and simplest topological invariant — Euler discovered it for polyhedra in 1758, long before homology was invented. It turns out to be a shadow of the Betti numbers.

Definition 10.

The Euler characteristic of a simplicial complex $K$ is the alternating sum of simplex counts:

$\chi(K) = \sum_{k=0}^{\dim(K)} (-1)^k f_k$

where $f_k$ is the number of $k$ -simplices in $K$ . Equivalently, $\chi = f_0 - f_1 + f_2 - f_3 + \cdots$ (vertices minus edges plus triangles minus tetrahedra, and so on).

For a surface, this reduces to the familiar $V - E + F$ formula. For a convex polyhedron, Euler proved that $V - E + F = 2$ — regardless of whether the polyhedron is a cube, a dodecahedron, or any other convex shape. This is a topological invariant: it depends only on the shape’s topology (homeomorphic to a sphere), not on the specific triangulation.

Theorem 1 (Euler–Poincaré Formula).

For any finite simplicial complex $K$ :

$\chi(K) = \sum_{k=0}^{\dim(K)} (-1)^k \beta_k$

The Euler characteristic equals the alternating sum of Betti numbers.

Proof.

The proof uses the rank-nullity theorem applied to the chain complex. For each boundary operator $\partial_k : C_k \to C_{k-1}$ , rank-nullity gives $f_k = \dim(C_k) = \text{rank}(\partial_k) + \dim(\ker(\partial_k))$ . Since $\beta_k = \dim(\ker(\partial_k)) - \text{rank}(\partial_{k+1})$ , the alternating sum telescopes:

$\sum_{k=0}^{n} (-1)^k \beta_k = \sum_{k=0}^{n} (-1)^k \bigl[\dim(\ker(\partial_k)) - \text{rank}(\partial_{k+1})\bigr]$

Expanding $\dim(\ker(\partial_k)) = f_k - \text{rank}(\partial_k)$ and collecting terms, each $\text{rank}(\partial_k)$ appears once with sign $(-1)^{k-1}$ (from the $\beta_{k-1}$ term) and once with sign $(-1)^k$ (from the $f_k$ expansion), so the rank terms cancel pairwise, leaving $\sum(-1)^k f_k = \chi(K)$ .

∎

Example 4.

Three classical surfaces and their invariants:

Sphere $S^2$ (tetrahedron boundary: 4V, 6E, 4F): $\chi = 4 - 6 + 4 = 2$ . Betti numbers $\beta_0 = 1, \beta_1 = 0, \beta_2 = 1$ , giving $1 - 0 + 1 = 2$ . ✓
Torus $T^2$ (standard triangulation: 9V, 27E, 18F): $\chi = 9 - 27 + 18 = 0$ . Betti numbers $\beta_0 = 1, \beta_1 = 2, \beta_2 = 1$ , giving $1 - 2 + 1 = 0$ . ✓
Real projective plane $\mathbb{R}P^2$ (minimal triangulation: 6V, 15E, 10F): $\chi = 6 - 15 + 10 = 1$ . Betti numbers $\beta_0 = 1, \beta_1 = 0, \beta_2 = 0$ (over $\mathbb{Z}_2$ ), giving $1 - 0 + 0 = 1$ . ✓

In every case, the simplex-counting formula agrees with the Betti number formula — as the Euler–Poincaré theorem guarantees.

Remark.

The Euler characteristic connects combinatorial topology to differential geometry via the Gauss–Bonnet theorem: for a compact orientable surface $M$ with Gaussian curvature $K$ ,

$\int_M K \, dA = 2\pi \chi(M)$

The total curvature is a topological invariant. This remarkable bridge between geometry and topology is developed on the Geodesics & Curvature page.

The explorer below lets you build a simplicial complex by adding and removing simplices, watching the Euler characteristic update in real time:

Euler Characteristic Explorer

Interactive visualization coming soon — build a simplicial complex and watch χ = V − E + F update in real time.

Other Complex Constructions

The Vietoris-Rips complex is the workhorse of TDA, but it is not the only way to build topology from data. Two important alternatives trade off geometric fidelity against computational cost.

Definition 11.

The alpha complex $\text{Alpha}_\varepsilon(P)$ of a point cloud $P \subset \mathbb{R}^d$ is the subcomplex of the Delaunay triangulation $\text{Del}(P)$ consisting of all simplices $\sigma$ whose smallest enclosing ball (the miniball of $\sigma$ ) has radius at most $\varepsilon$ :

$\text{Alpha}_\varepsilon(P) = \{\sigma \in \text{Del}(P) \mid \text{miniball}(\sigma) \leq \varepsilon\}$

Equivalently, $\text{Alpha}_\varepsilon(P) = \text{Del}(P) \cap \check{C}_\varepsilon(P)$ — it is the Delaunay-restricted Čech complex.

The alpha complex has a crucial advantage: its size is bounded by $O(n^{\lceil d/2 \rceil})$ regardless of $\varepsilon$ , because the Delaunay triangulation has at most that many simplices. In low dimensions ( $d = 2, 3$ ), this is nearly linear in $n$ — far smaller than the $\binom{n}{k+1}$ potential simplices of the VR complex. Moreover, the Nerve Theorem applies to the alpha complex (it is a subcomplex of the Čech complex), so its homology faithfully captures the topology of the union of balls.

Definition 12.

The witness complex $W_\varepsilon(L, P)$ is defined relative to a set of landmark points $L \subset P$ and the full data $P$ . A simplex $\sigma = \{l_0, \ldots, l_k\} \subseteq L$ belongs to $W_\varepsilon$ if there exists a witness $w \in P$ such that:

$\max_{i} d(w, l_i) \leq d(w, L \setminus \sigma) + \varepsilon$

Intuitively, $w$ is a data point that is “closer to $\sigma$ than to any other landmark,” up to tolerance $\varepsilon$ .

Remark.

The witness complex is designed for large datasets where the VR complex is infeasible. By choosing a small set of landmarks, the witness complex has size independent of the full dataset. The trade-off: the witness complex may miss topological features that the VR complex would detect, because the landmark subsampling can distort the geometry.

Computational complexity comparison:

Vietoris-Rips: Up to $2^n - 1$ simplices. Per-simplex check: pairwise distances only. Best for general-purpose TDA.
Čech: Same size bound as VR. Per-simplex check: ball intersection (expensive in high dimension). Best for theoretical exactness.
Alpha: Bounded by the Delaunay triangulation size, roughly $O(n)$ in $\mathbb{R}^2$ and $O(n^2)$ in $\mathbb{R}^3$ . Best for low-dimensional data.
Witness: Size depends on the number of landmarks, not the full dataset. Best for large point clouds.

Computational Notes

Proposition 1 (Simplex Count Bound).

A Vietoris-Rips complex on $n$ points has at most $C(n, k+1)$ potential $k$ -simplices, where $C(n,m)$ is the binomial coefficient. The full VR complex (at $\varepsilon = \infty$ ) has $2^n - 1$ simplices in total. For realistic datasets ( $n$ in the thousands), this is astronomically large for $k \geq 3$ .

This combinatorial explosion is why no one actually constructs the full VR complex. The state-of-the-art approach, implemented in Ripser (Bauer, 2021), avoids explicit construction entirely:

Implicit representation: simplices are never stored — they are generated on-the-fly from the distance matrix during the matrix reduction algorithm.
Clearing optimization: when a simplex $\sigma$ creates a birth event, the simplex that kills it (its “partner”) can be identified without processing the remaining columns — roughly halving the work.
Cohomology: Ripser computes cohomology (the dual of homology) because the coboundary matrix is sparser than the boundary matrix for VR complexes, leading to faster column reduction.
Apparent pairs: many birth-death pairs can be identified from the combinatorial structure of the filtration without any matrix reduction at all.

These optimizations combine to make Ripser orders of magnitude faster than naive implementations. For 1000 points in $\mathbb{R}^{10}$ , computing $H_0$ and $H_1$ persistence takes seconds, not hours.

Remark.

The practical implication for data scientists: you almost never need to think about the combinatorial complexity of simplicial complexes. Libraries like ripser, gudhi, and dionysus handle the optimization transparently. What you do need to think about is the choice of complex (VR vs. alpha vs. witness), the maximum homology dimension (computing $H_2$ is much more expensive than $H_1$ ), and the scale range of interest.

Visual Intuition

The visualization below shows a Vietoris-Rips complex built from points sampled near a circle. Drag the $\varepsilon$ slider to watch the complex form:

At small $\varepsilon$ : only isolated vertices — the complex captures no structure
At moderate $\varepsilon$ : edges connect nearby points, and you can see the circular arrangement emerge
At large $\varepsilon$ : triangles fill in, eventually collapsing the loop into a solid disk

ε = 0.30

What to notice: There’s a “sweet spot” for $\varepsilon$ where the edges form a cycle around the circle but the interior hasn’t filled in yet. At that scale, the complex has a one-dimensional hole — this is the $H_1$ feature that persistent homology will detect. Finding that sweet spot automatically, across all possible scales simultaneously, is exactly what persistent homology does.

The Betti number tracker below shows $\beta_0$ and $\beta_1$ updating as the scale parameter changes — watch the loop appear and then get filled in:

Betti Number Tracker

Interactive visualization coming soon — drag ε and watch β₀, β₁ update as the complex evolves.

Working Code

Building the Vietoris-Rips Complex

Building the Vietoris-Rips complex from scratch in Python:

import numpy as np
from itertools import combinations
from scipy.spatial.distance import pdist, squareform

def vr_complex(points, epsilon):
    """Build Vietoris-Rips complex at scale epsilon.

    Parameters
    ----------
    points : ndarray of shape (n, d)
        Point cloud in R^d.
    epsilon : float
        Scale parameter. Points within this distance are connected.

    Returns
    -------
    vertices : list of int
    edges : list of (int, int)
    triangles : list of (int, int, int)
    """
    dist_matrix = squareform(pdist(points))
    n = len(points)
    vertices = list(range(n))
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if dist_matrix[i, j] <= epsilon]
    triangles = [(i, j, k) for i, j, k in combinations(range(n), 3)
                 if all(dist_matrix[a, b] <= epsilon
                        for a, b in combinations([i, j, k], 2))]
    return vertices, edges, triangles

# Example: 12 points on a circle
theta = np.linspace(0, 2 * np.pi, 12, endpoint=False)
points = np.column_stack([np.cos(theta), np.sin(theta)])

for eps in [0.3, 0.8, 1.2, 1.8]:
    v, e, t = vr_complex(points, eps)
    print(f"ε={eps:.1f}: {len(v)} vertices, {len(e)} edges, {len(t)} triangles")

ε=0.3: 12 vertices, 0 edges, 0 triangles
ε=0.8: 12 vertices, 12 edges, 0 triangles
ε=1.2: 12 vertices, 24 edges, 12 triangles
ε=1.8: 12 vertices, 42 edges, 64 triangles

At $\varepsilon = 0.8$ , we get exactly 12 edges forming a cycle — the complex “sees” the circle. By $\varepsilon = 1.8$ , the complex is nearly complete (66 possible edges, 220 possible triangles) and the topological signal is lost in combinatorial noise. This is why we need persistent homology: it tracks which features are real (they persist across a range of scales) and which are artifacts.

For real-world analysis, use ripser instead of this brute-force construction:

from ripser import ripser

# ripser builds the VR filtration and computes persistence in one step
result = ripser(points, maxdim=1)
print(result['dgms'][0][:5])  # H₀ intervals (connected components)
print(result['dgms'][1][:5])  # H₁ intervals (loops)

Betti Numbers from Boundary Matrices

Computing Betti numbers directly from the boundary matrices, using the rank-nullity formula $\beta_k = \dim(\ker \partial_k) - \text{rank}(\partial_{k+1})$ :

import numpy as np

def boundary_matrix_1(vertices, edges):
    """Build the boundary operator ∂₁ over Z₂.
    Rows = vertices, columns = edges. Entry (v, e) = 1 if v is an endpoint of e."""
    n_v, n_e = len(vertices), len(edges)
    D1 = np.zeros((n_v, n_e), dtype=int)
    v_idx = {v: i for i, v in enumerate(vertices)}
    for j, (a, b) in enumerate(edges):
        D1[v_idx[a], j] = 1
        D1[v_idx[b], j] = 1
    return D1 % 2  # Z₂ coefficients

def boundary_matrix_2(edges, triangles):
    """Build the boundary operator ∂₂ over Z₂.
    Rows = edges, columns = triangles. Entry (e, t) = 1 if e is a face of t."""
    e_set = {tuple(sorted(e)): i for i, e in enumerate(edges)}
    n_e, n_t = len(edges), len(triangles)
    D2 = np.zeros((n_e, n_t), dtype=int)
    for j, (a, b, c) in enumerate(triangles):
        for face in [(a,b), (a,c), (b,c)]:
            key = tuple(sorted(face))
            if key in e_set:
                D2[e_set[key], j] = 1
    return D2 % 2

def gf2_rank(matrix):
    """Compute matrix rank over Z₂ via Gaussian elimination (XOR)."""
    A = np.array(matrix, dtype=np.uint8, copy=True) % 2
    n_rows, n_cols = A.shape
    rank = 0
    pivot_row = 0
    for col in range(n_cols):
        found = None
        for row in range(pivot_row, n_rows):
            if A[row, col]:
                found = row
                break
        if found is None:
            continue
        if found != pivot_row:
            A[[pivot_row, found]] = A[[found, pivot_row]]
        for row in range(pivot_row + 1, n_rows):
            if A[row, col]:
                A[row] ^= A[pivot_row]
        rank += 1
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rank

def betti_numbers(vertices, edges, triangles):
    """Compute β₀, β₁ over Z₂ via rank-nullity."""
    D1 = boundary_matrix_1(vertices, edges)
    D2 = boundary_matrix_2(edges, triangles)
    rank_D1 = gf2_rank(D1)
    rank_D2 = gf2_rank(D2)
    beta_0 = len(vertices) - rank_D1
    beta_1 = (len(edges) - rank_D1) - rank_D2
    return beta_0, beta_1

# Wireframe triangle (no fill): should give β₀=1, β₁=1
verts = [0, 1, 2]
edgs = [(0,1), (0,2), (1,2)]
print(f"Wireframe triangle: β₀={betti_numbers(verts, edgs, [])[0]}, "
      f"β₁={betti_numbers(verts, edgs, [])[1]}")

# Filled triangle: should give β₀=1, β₁=0
print(f"Filled triangle:    β₀={betti_numbers(verts, edgs, [(0,1,2)])[0]}, "
      f"β₁={betti_numbers(verts, edgs, [(0,1,2)])[1]}")

Wireframe triangle: β₀=1, β₁=1
Filled triangle:    β₀=1, β₁=0

Alpha Complexes with GUDHI

For low-dimensional data, the alpha complex is dramatically more efficient than VR:

import gudhi
import numpy as np

# 100 points on a noisy circle in R²
theta = np.random.uniform(0, 2 * np.pi, 100)
r = 1.0 + np.random.normal(0, 0.1, 100)
circle_pts = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

# Alpha complex: size bounded by O(n) in R²
alpha = gudhi.AlphaComplex(points=circle_pts)
stree = alpha.create_simplex_tree()
print(f"Alpha complex: {stree.num_simplices()} simplices")

# Compare with VR complex at a similar scale
vr = gudhi.RipsComplex(points=circle_pts, max_edge_length=0.8)
stree_vr = vr.create_simplex_tree(max_dimension=2)
print(f"VR complex:    {stree_vr.num_simplices()} simplices")

# Compute persistence — both should detect the circle's H₁ feature
stree.persistence()
dgm_alpha = stree.persistence_intervals_in_dimension(1)
print(f"Alpha H₁ features: {len(dgm_alpha)}, max persistence: "
      f"{max(d-b for b, d in dgm_alpha if d < float('inf')):.3f}")

Euler Characteristic Verification

Verify the Euler–Poincaré formula on the VR complex at various scales:

for eps in [0.3, 0.8, 1.2, 1.8]:
    v, e, t = vr_complex(points, eps)
    chi_simplex = len(v) - len(e) + len(t)
    b0, b1 = betti_numbers(list(range(len(v))), e, t)
    chi_betti = b0 - b1  # β₂ = 0 for dim ≤ 2
    print(f"ε={eps:.1f}: χ(simplex)={chi_simplex}, χ(Betti)={chi_betti}, "
          f"match={chi_simplex == chi_betti}")

Connections & Cross-Track Bridges

Simplicial complexes are the combinatorial substrate for nearly everything in topological data analysis and connect deeply to several other areas of mathematics on this site.

Within the Topology & TDA track:

Persistent Homology tracks how the homology of $\text{VR}(X, \varepsilon)$ changes as $\varepsilon$ increases. The filtration — the nested sequence of VR complexes at increasing scales — is the input to the persistence algorithm.
Čech Complexes & Nerve Theorem develops the geometrically exact alternative to VR, where the Nerve Theorem guarantees that the complex faithfully captures the topology of the union of balls.
The Mapper Algorithm constructs a 1-dimensional simplicial complex (a graph) that captures the data’s topological skeleton via the nerve of a cover. The Mapper graph is the 1-skeleton of the nerve of a refined open cover.
Sheaf Theory defines cellular sheaves on the face poset of a simplicial complex — assigning vector spaces to simplices and linear maps to face inclusions. The coboundary map $\delta_0$ in sheaf cohomology is a generalization of the boundary operator $\partial_k$ developed here.

Across tracks:

Graph Laplacians & Spectrum: The 1-skeleton of a simplicial complex is a graph, and its Laplacian spectrum governs spectral clustering. The boundary operator $\partial_1$ and the graph Laplacian $L = \partial_1^T \partial_1$ (over $\mathbb{R}$ ) are directly related — the Laplacian measures how “boundary-like” a 0-chain is, and $\ker(L) = \ker(\partial_1^T)$ counts connected components.
Categories & Functors: Simplicial complexes form a category $\textbf{SComp}$ with simplicial maps as morphisms. Homology $H_k(-; \mathbb{F})$ is a functor from $\textbf{SComp}$ to the category of vector spaces — a simplicial map $f: K \to L$ induces a linear map $f_*: H_k(K) \to H_k(L)$ that respects composition. This functoriality is what makes persistent homology well-defined: the inclusion maps $K_i \hookrightarrow K_j$ in a filtration induce the maps between homology groups that the persistence algorithm tracks.
Message Passing & GNNs: The message-passing framework on graphs generalizes naturally to simplicial complexes. Simplicial neural networks pass messages along triangles and higher simplices — not just edges — enabling them to capture higher-order interactions that graph neural networks miss.

Exercises

Foundational

Exercise 1. A 3-simplex (tetrahedron) $[v_0, v_1, v_2, v_3]$ has faces of dimensions 0, 1, 2, and 3. Count the number of faces of each dimension and verify that the total number of faces is $2^4 - 1 = 15$ . Hint: A $j$ -face corresponds to a choice of $j+1$ vertices from 4.

Exercise 2. The octahedron can be triangulated as a simplicial complex with 6 vertices, 12 edges, and 8 triangular faces (it is homeomorphic to $S^2$ ). Verify the Euler–Poincaré formula: compute $\chi = V - E + F$ directly, then confirm that $\beta_0 = 1$ , $\beta_1 = 0$ , $\beta_2 = 1$ gives $\chi = 1 - 0 + 1 = 2$ .

Intermediate

Exercise 3. Prove that the Vietoris-Rips complex is a flag complex — that is, it is completely determined by its 1-skeleton (the graph of edges). Specifically, show that a subset $\sigma \subseteq X$ is a simplex of $\text{VR}(X, \varepsilon)$ if and only if every pair of vertices in $\sigma$ is connected by an edge. Hint: This follows immediately from Definition 6, but spell out why the face closure property is also satisfied.

Exercise 4. Let $P$ be 6 points equally spaced on the unit circle in $\mathbb{R}^2$ . For the Vietoris-Rips complex $\text{VR}(P, \varepsilon)$ :

At what value of $\varepsilon$ do the first edges appear?
At what value of $\varepsilon$ does $\beta_1$ first become 1 (a loop appears)?
At what value of $\varepsilon$ does $\beta_1$ return to 0 (the loop gets filled)?

Compute these thresholds analytically from the geometry, then verify with ripser.

Advanced

Exercise 5. Implement the witness complex from scratch for a set of $n = 500$ points on a noisy torus in $\mathbb{R}^3$ with $|L| = 50$ landmarks chosen via farthest-point sampling. Compare the persistence diagram (up to $H_1$ ) with the VR diagram on the full dataset. How many landmarks are needed for the witness complex to detect both independent loops ( $\beta_1 = 2$ ) of the torus?

Exercise 6. The Čech-to-VR interleaving inequality states $\check{C}_\varepsilon(P) \subseteq \text{VR}_\varepsilon(P) \subseteq \check{C}_{2\varepsilon}(P)$ . Construct a specific point configuration in $\mathbb{R}^2$ where the VR complex contains a 2-simplex (filled triangle) that the Čech complex does not, at the same scale parameter. Compute the Betti numbers of both complexes and show that they disagree on $\beta_1$ . This is a concrete instance of the phantom cycle problem.

Overview & Motivation

Formal Framework

The Building Blocks: Simplices

From Simplices to Complexes

Abstract vs. Geometric

The Vietoris-Rips Construction

Homology Preview

The Euler Characteristic

Other Complex Constructions

Computational Notes

Visual Intuition

Working Code

Building the Vietoris-Rips Complex

Betti Numbers from Boundary Matrices

Alpha Complexes with GUDHI

Euler Characteristic Verification

Connections & Cross-Track Bridges

Exercises

Foundational

Intermediate

Advanced

Connections

References & Further Reading