
Persistent Homology

Tracking topological features across scales — the workhorse of topological data analysis

Prerequisites: Simplicial Complexes

Overview & Motivation

Consider two point clouds with identical first- and second-order statistics — same mean, same variance, same covariance. One is sampled from an annulus (a ring), the other from a Gaussian cluster. Every test built on those moments says they’re the same. But one has a hole and the other doesn’t, and that difference can be the signal you need — whether you’re detecting anomalies in sensor networks, classifying dynamical regimes in time series, or finding structure in high-dimensional embeddings.

The problem is: what does “has a hole” even mean for a finite point cloud? Points are discrete; they don’t have holes in any obvious sense. The Vietoris-Rips construction (see Simplicial Complexes) gives us a way to build a continuous-ish structure from discrete data, but it introduces a new problem: the result depends on a scale parameter $\varepsilon$. Choose $\varepsilon$ too small and you see only isolated points. Too large and everything collapses into a single blob.

Persistent homology is the answer: instead of choosing a single scale, track the topology at every scale simultaneously. Features that persist across a wide range of $\varepsilon$ are real structure; features that flicker in and out are noise. This insight — that persistence is a proxy for significance — is what makes TDA useful for machine learning.

Not all complexes are created equal: the Vietoris-Rips complex can introduce phantom cycles — topological artifacts that do not correspond to real features of the data. The Čech complex avoids this via the Nerve Theorem.


Formal Framework

Filtrations

Definition 1.

A filtration of a simplicial complex $K$ is a nested sequence of subcomplexes:

$$\emptyset = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_m = K$$

Each $K_i$ is a simplicial complex, and each inclusion $K_i \hookrightarrow K_{i+1}$ adds one or more simplices.

For the Vietoris-Rips complex, the filtration is parameterized by $\varepsilon$: as $\varepsilon$ increases from 0 to $\infty$, simplices enter the complex at their birth time (the minimum $\varepsilon$ at which they appear). An edge $[v_i, v_j]$ is born at $\varepsilon = d(v_i, v_j)$. A triangle $[v_i, v_j, v_k]$ is born at $\varepsilon = \max(d(v_i, v_j), d(v_i, v_k), d(v_j, v_k))$.
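These birth-time formulas translate directly into code. A minimal sketch (the helper `vr_birth_times` is my own name, not a library function) that enumerates the simplices of a small point set and assigns each its Vietoris-Rips birth time:

```python
import numpy as np
from itertools import combinations

def vr_birth_times(D, max_dim=2):
    """Birth time of each simplex in the Vietoris-Rips filtration.

    D : (n, n) symmetric distance matrix. A simplex is born at the
    largest pairwise distance among its vertices (0 for vertices).
    """
    n = D.shape[0]
    births = {}
    for k in range(max_dim + 1):
        for simplex in combinations(range(n), k + 1):
            if k == 0:
                births[simplex] = 0.0
            else:
                births[simplex] = max(D[i, j] for i, j in combinations(simplex, 2))
    return births

# Three points with pairwise distances 1, 2, 3
D = np.array([[0., 1., 2.],
              [1., 0., 3.],
              [2., 3., 0.]])
births = vr_birth_times(D)
print(births[(0, 1)])     # edge [v0,v1] born at 1.0
print(births[(0, 1, 2)])  # triangle born at max(1, 2, 3) = 3.0
```

Brute-force enumeration like this is exponential in the number of points; real libraries (Ripser, GUDHI) build the filtration lazily, but the birth-time rule is exactly the one above.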

Chain Groups and the Boundary Operator

To detect “holes,” we need algebra. The key idea: a loop is a collection of edges that forms a cycle, and a hole is a loop that isn’t the boundary of something filled in.

Definition 2.

The $k$-chain group $C_k(K)$ of a simplicial complex $K$ is the free abelian group (or $\mathbb{Z}_2$-vector space) generated by the $k$-simplices of $K$. A $k$-chain is a formal sum of $k$-simplices:

$$c = \sum_i a_i \sigma_i, \quad a_i \in \mathbb{Z}_2, \quad \sigma_i \in K \text{ with } \dim(\sigma_i) = k$$

Remark.

Working over $\mathbb{Z}_2$ (coefficients are 0 or 1, with $1 + 1 = 0$) simplifies everything: we don’t need to worry about orientations. Most TDA software uses $\mathbb{Z}_2$ coefficients. Over $\mathbb{Z}$, you get torsion information (e.g., distinguishing a Möbius band from a cylinder), but this is rarely needed in data analysis.

Definition 3.

The boundary operator $\partial_k : C_k(K) \to C_{k-1}(K)$ maps each $k$-simplex to the alternating sum of its $(k-1)$-dimensional faces:

$$\partial_k [v_0, v_1, \ldots, v_k] = \sum_{i=0}^{k} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_k]$$

where $\hat{v}_i$ denotes omission. Over $\mathbb{Z}_2$, the signs disappear and this reduces to the sum of all faces.

Example 1.

For the triangle $[v_0, v_1, v_2]$ with vertices, edges, and the filled face:

$\partial_2$: The boundary of the triangle is the sum of its edges: $\partial_2 [v_0, v_1, v_2] = [v_1, v_2] - [v_0, v_2] + [v_0, v_1]$

$\partial_1$: The boundary of each edge is the difference of its endpoints: $\partial_1 [v_i, v_j] = [v_j] - [v_i]$

Written as matrices (with $\mathbb{Z}$ coefficients):

$$\partial_2 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}, \quad \partial_1 = \begin{pmatrix} -1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}$$

The rows of $\partial_1$ correspond to vertices $v_0, v_1, v_2$; columns to edges $[v_0,v_1], [v_0,v_2], [v_1,v_2]$.

The following property is the engine of homology:

Theorem 1 (Fundamental Property of the Boundary Operator).

For all $k \geq 1$:

$$\partial_{k-1} \circ \partial_k = 0$$

That is, the boundary of a boundary is always zero.

Proof.

It suffices to check on generators. Let $\sigma = [v_0, v_1, \ldots, v_k]$ be a $k$-simplex. We compute $\partial_{k-1}(\partial_k(\sigma))$ by expanding both operators:

$$\partial_k(\sigma) = \sum_{i=0}^{k} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_k]$$

Applying $\partial_{k-1}$ to each term, we omit a second vertex $v_j$. For each term $(-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_k]$, the boundary is:

$$\partial_{k-1}\bigl((-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_k]\bigr) = (-1)^i \sum_{\substack{j=0 \\ j \neq i}}^{k} (-1)^{j'} [v_0, \ldots, \hat{v}_i, \ldots, \hat{v}_j, \ldots, v_k]$$

where the exponent $j'$ depends on the position of $v_j$ in the face after $v_i$ has been removed: $j' = j$ if $j < i$, and $j' = j - 1$ if $j > i$.

Now consider a specific $(k-2)$-face $[v_0, \ldots, \hat{v}_i, \ldots, \hat{v}_j, \ldots, v_k]$ with $i < j$. It appears exactly twice in the double sum:

  1. First omit $v_i$, then omit $v_j$: sign is $(-1)^i \cdot (-1)^{j-1}$
  2. First omit $v_j$, then omit $v_i$: sign is $(-1)^j \cdot (-1)^{i}$

The sum of these two signs is:

$$(-1)^{i+j-1} + (-1)^{i+j} = (-1)^{i+j-1}\bigl(1 + (-1)\bigr) = 0$$

Every $(k-2)$-face appears exactly twice with opposite signs, so the entire sum vanishes. Over $\mathbb{Z}_2$, the argument is even simpler: each face appears twice, and $1 + 1 = 0$.

Example 2.

We can verify this directly for the triangle. From Example 1:

$$\partial_1 \circ \partial_2 = \begin{pmatrix} -1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

The boundary of the triangle is the cycle $[v_0,v_1] - [v_0,v_2] + [v_1,v_2]$, and the boundary of that cycle is zero — exactly because it closes up.
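The same check takes one line in NumPy — multiplying the matrices from Example 1 confirms that $\partial_1 \circ \partial_2 = 0$ over $\mathbb{Z}$, and hence also over $\mathbb{Z}_2$:

```python
import numpy as np

# Boundary matrices from Example 1 (integer coefficients)
d2 = np.array([[1], [-1], [1]])
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]])

print(d1 @ d2)        # zero vector: the boundary of a boundary vanishes
print((d1 @ d2) % 2)  # and likewise after reducing mod 2
```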

Homology Groups

The $\partial \circ \partial = 0$ property gives us two important subgroups:

Definition 4.

The cycle group is $Z_k = \ker(\partial_k)$ — chains with zero boundary (they “close up”).

The boundary group is $B_k = \operatorname{im}(\partial_{k+1})$ — chains that are the boundary of something.

Since $\partial_k \circ \partial_{k+1} = 0$, every boundary is a cycle: $B_k \subseteq Z_k$.

Definition 5.

The $k$-th homology group of $K$ is the quotient:

$$H_k(K) = Z_k / B_k = \ker(\partial_k) / \operatorname{im}(\partial_{k+1})$$

The $k$-th Betti number is $\beta_k = \operatorname{rank}(H_k)$.

Homology counts holes that are not filled in:

  • $\beta_0$ = number of connected components
  • $\beta_1$ = number of loops (independent 1-cycles not bounding a filled region)
  • $\beta_2$ = number of voids (enclosed cavities)

Example 3.

For the filled triangle $K = \{[v_0,v_1,v_2], [v_0,v_1], [v_0,v_2], [v_1,v_2], [v_0], [v_1], [v_2]\}$:

  • $\ker(\partial_1)$ has rank 1 (one independent cycle: the loop around the triangle)
  • $\operatorname{im}(\partial_2)$ has rank 1 (that loop is the boundary of the filled face)
  • $\beta_1 = 1 - 1 = 0$ — no holes. The apparent loop is a boundary.
  • $\beta_0 = 1$ — one connected component.

Now remove the 2-simplex (just the wireframe triangle, no fill):

  • $\ker(\partial_1)$ still has rank 1
  • $\operatorname{im}(\partial_2)$ now has rank 0 (no 2-simplices to bound anything)
  • $\beta_1 = 1 - 0 = 1$ — one hole. The loop is no longer a boundary.

This is the key insight: the same cycle can be a hole or not, depending on whether there’s something to fill it.
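Both computations reduce to a short $\mathbb{Z}_2$ rank calculation — $\beta_k = \dim\ker\partial_k - \operatorname{rank}\partial_{k+1} = (n_k - \operatorname{rank}\partial_k) - \operatorname{rank}\partial_{k+1}$. A sketch (the helpers `rank_z2` and `betti` are my own names, not library functions):

```python
import numpy as np

def rank_z2(M):
    """Rank of a 0/1 matrix over Z2, by Gaussian elimination."""
    M = M.copy() % 2
    rank, row = 0, 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(row, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[row, pivot]] = M[[pivot, row]]          # swap pivot row into place
        for r in range(M.shape[0]):
            if r != row and M[r, col]:
                M[r] = (M[r] + M[row]) % 2         # eliminate the column
        rank += 1
        row += 1
    return rank

def betti(d_k, d_k1, n_k):
    """beta_k = (n_k - rank d_k) - rank d_{k+1}, over Z2."""
    return n_k - rank_z2(d_k) - rank_z2(d_k1)

# Triangle: d1 maps the 3 edges to the 3 vertices; d2 maps the face to the edges
d1 = np.array([[1, 1, 0],
               [1, 0, 1],
               [0, 1, 1]])
d2_filled = np.array([[1], [1], [1]])
d2_empty = np.zeros((3, 1), dtype=int)   # no 2-simplices in the wireframe

print(betti(d1, d2_filled, 3))  # 0: the loop bounds the filled face
print(betti(d1, d2_empty, 3))   # 1: the wireframe has one hole
```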

Persistent Homology

Now we combine filtrations with homology. As $\varepsilon$ increases, new simplices enter the complex. Each new simplex either:

  1. Creates a topological feature (a new cycle appears that isn’t a boundary), or
  2. Destroys a topological feature (a cycle that was a hole gets filled in)

Definition 6.

Given a filtration $K_0 \subseteq K_1 \subseteq \cdots \subseteq K_m$, persistent homology tracks, for each topological feature:

  • Its birth time $b$: the filtration index where the feature first appears
  • Its death time $d$: the filtration index where the feature is destroyed
  • Its persistence: $d - b$, measuring how long the feature survives

A feature with $d = \infty$ (never destroyed) is called an essential feature.

Definition 7.

The persistence diagram $\mathrm{Dgm}(K_\bullet)$ is the multiset of points $(b_i, d_i) \in \mathbb{R}^2$ corresponding to birth-death pairs, together with infinitely many copies of every point on the diagonal $\{(x,x) \mid x \in \mathbb{R}\}$.

Equivalently, the persistence barcode represents each feature as a horizontal interval $[b_i, d_i)$.

Points far from the diagonal have high persistence — they represent features that survive across a wide range of scales. Points near the diagonal have low persistence — they are topological “noise,” features that appear and immediately disappear. The diagonal points are a technical convenience that makes the space of persistence diagrams well-behaved.

Remark.

There is a deep algebraic structure here that we’re glossing over: the inclusions $K_i \hookrightarrow K_j$ induce maps on homology $H_k(K_i) \to H_k(K_j)$. The persistence module is the entire sequence of these maps, and the Structure Theorem for persistence modules (a consequence of the classification of finitely generated modules over a PID) guarantees that this module decomposes into interval summands — which are exactly the bars in the barcode. This decomposition is unique and computable via matrix reduction.


The Matrix Reduction Algorithm

Persistent homology is defined algebraically, but it is computed via matrix reduction — a variant of Gaussian elimination applied to the boundary matrix. Understanding this algorithm demystifies what ripser and gudhi are actually doing.

Definition 9.

The filtered boundary matrix $D$ is the boundary matrix of the full complex $K$, with columns ordered by filtration time. Each column corresponds to a simplex $\sigma_j$, and the column entries encode the boundary $\partial(\sigma_j)$ over $\mathbb{Z}_2$. If $\sigma_j$ enters the filtration at time $t_j$, columns are ordered so that $t_1 \leq t_2 \leq \cdots \leq t_m$.

The key operation is column reduction: for each column $j$ (processed left to right), find the lowest nonzero entry (the “pivot” or “low” of column $j$). If another column $i < j$ has the same pivot row, add column $i$ to column $j$ (over $\mathbb{Z}_2$, addition is XOR). Repeat until no two columns share the same pivot.

Theorem 3 (Persistence from Reduced Matrix).

After column reduction of the filtered boundary matrix $D$ to its reduced form $R$:

  1. If column $j$ has its pivot in row $i$ (i.e., $\mathrm{low}(j) = i$), then the simplex $\sigma_i$ is a creator and $\sigma_j$ is a destroyer. The pair $(\sigma_i, \sigma_j)$ records a topological feature born at time $t_i$ and dying at time $t_j$.

  2. If column $j$ is entirely zero after reduction, the simplex $\sigma_j$ is a creator. If, in addition, row $j$ never appears as the pivot of any other column, the feature it creates is never destroyed — an essential feature with birth time $t_j$ and death time $\infty$.

  3. Every simplex is therefore either a creator (zero reduced column) or a destroyer (nonzero reduced column), and the pivot positions of $R$ determine the complete persistence pairing.

Example 4.

Consider three points $v_0, v_1, v_2$ with distances $d(v_0, v_1) = 1$, $d(v_0, v_2) = 2$, $d(v_1, v_2) = 3$. The filtration adds simplices in order of birth time:

  1. $v_0, v_1, v_2$ at $\varepsilon = 0$ (three vertices)
  2. $[v_0, v_1]$ at $\varepsilon = 1$
  3. $[v_0, v_2]$ at $\varepsilon = 2$
  4. $[v_1, v_2]$ at $\varepsilon = 3$

The boundary matrix $D$ (rows = vertices, columns = edges in filtration order):

$$D = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$

Reducing: Column 1 has low = row 1. Column 2 has low = row 2 (different from column 1, so no reduction needed). Column 3 has low = row 2, conflicting with column 2. Add column 2 to column 3: new column 3 is $(0, 1, 1)^T + (1, 0, 1)^T = (1, 1, 0)^T$, which has low = row 1, conflicting with column 1. Add column 1: $(1, 1, 0)^T + (1, 1, 0)^T = (0, 0, 0)^T$. Column 3 is now zero.

Reading the persistence pairs:

  • Column 1 (edge $[v_0,v_1]$, born at $\varepsilon = 1$) has low = row 1 ($v_1$, born at $\varepsilon = 0$): the edge merges $v_1$ into $v_0$’s component. This gives the $H_0$ pair $(0, 1)$.
  • Column 2 (edge $[v_0,v_2]$, born at $\varepsilon = 2$) has low = row 2 ($v_2$, born at $\varepsilon = 0$): the edge merges $v_2$. This gives the $H_0$ pair $(0, 2)$.
  • Column 3 (edge $[v_1,v_2]$, born at $\varepsilon = 3$) is zero after reduction: this edge creates a 1-cycle that is never destroyed (no 2-simplices in our complex). This gives an essential $H_1$ feature born at $\varepsilon = 3$.
  • $v_0$ (row 0) never appears as a pivot: it is the surviving $H_0$ component, essential with pair $(0, \infty)$.

Proposition 1 (Uniqueness of Persistence Pairing).

The persistence pairing produced by column reduction is unique: it does not depend on the order in which column operations are performed (as long as we only add earlier columns to later ones). This is because the set of pivot positions in the reduced matrix $R$ is an invariant of the persistence module, not of the particular reduction.

Remark.

The standard algorithm has complexity $O(m^3)$ where $m$ is the number of simplices. Ripser achieves dramatic speedups via four key optimizations: (1) implicit matrix representation (never store the full boundary matrix — generate columns on demand from the distance matrix), (2) the clearing optimization (when a column $j$ creates a death, clear its partner column without processing it), (3) computing cohomology instead of homology (the coboundary matrix is sparser for VR complexes), and (4) the apparent pairs shortcut (many birth-death pairs can be read off from the filtration structure without any matrix reduction). These reduce the practical running time to near-linear in the number of features detected.

Matrix Reduction Animator

Interactive visualization coming soon — step through the column reduction algorithm and watch birth-death pairs emerge.


Visual Intuition

The visualization below is the core interactive experience. The top panel shows a Vietoris-Rips complex built from points sampled near a circle. The bottom panels show the persistence diagram and barcode.

Drag the $\varepsilon$ slider and watch all three panels update together:

Persistence Diagram

Persistence Barcode

What to notice:

  1. $H_0$ features (teal): At $\varepsilon = 0$, every point is its own connected component — 15 features are born simultaneously. As $\varepsilon$ increases, components merge. Each merge event kills a component: you see teal points appear in the persistence diagram with increasing death values. One component survives to $\infty$ (the essential $H_0$ feature).

  2. $H_1$ features (purple): At a critical $\varepsilon$ around 0.7–0.9, the edges form a cycle around the point cloud but the interior hasn’t filled in yet — a loop is born. As $\varepsilon$ increases further, 2-simplices fill the interior, and the loop dies. The persistence (death minus birth) of this $H_1$ feature is the key topological signal: it tells you “this point cloud has a circular structure.”

  3. Signal vs. noise: The long-lived $H_1$ bar stands out from the short bars near the diagonal. That gap between persistent and ephemeral features is what makes TDA work in practice — it’s a built-in notion of statistical significance, grounded in geometry rather than distributional assumptions.


The Stability Theorem

The practical utility of persistent homology rests on a single theorem: small perturbations of the input data produce small changes in the persistence diagram. Without this, the method would be useless — any noise would scramble the output.

Definition 8.

The bottleneck distance between two persistence diagrams $D$ and $D'$ is:

$$d_B(D, D') = \inf_\gamma \sup_{p \in D} \| p - \gamma(p) \|_\infty$$

where the infimum is over all bijections $\gamma: D \to D'$ (including diagonal points).
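For very small diagrams, the infimum can be brute-forced straight from the definition, which makes a useful sanity check against library implementations. A naive sketch (`bottleneck_naive` is my own name; exponential in diagram size, for intuition only), using the standard trick of padding each diagram with the diagonal projections of the other's points:

```python
from itertools import permutations

def bottleneck_naive(D1, D2):
    """Brute-force bottleneck distance between two tiny diagrams.

    Each off-diagonal point may be matched either to a point of the other
    diagram or to its nearest diagonal point; we pad both sides with
    diagonal projections and minimize the worst L-infinity displacement
    over all bijections of the padded sets.
    """
    diag = lambda p: ((p[0] + p[1]) / 2,) * 2   # nearest point on the diagonal
    A = list(D1) + [diag(q) for q in D2]
    B = list(D2) + [diag(p) for p in D1]

    def cost(p, q):
        # two diagonal points match for free (slide along the diagonal)
        if p[0] == p[1] and q[0] == q[1]:
            return 0.0
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    return min(max(cost(p, q) for p, q in zip(A, perm))
               for perm in permutations(B))

D1 = [(0.0, 2.0)]
D2 = [(0.0, 2.5), (1.0, 1.2)]
d = bottleneck_naive(D1, D2)
print(d)  # 0.5: match the long bars; push (1.0, 1.2) onto the diagonal
```

For diagrams of realistic size, use an optimized implementation such as `persim.bottleneck` (shown in the Working Code section below), which solves the matching problem via bipartite graph methods rather than enumeration.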

Theorem 2 (Stability Theorem (Cohen-Steiner, Edelsbrunner, Harer 2007)).

Let $f, g: X \to \mathbb{R}$ be tame functions on a topological space $X$. Then:

$$d_B(\mathrm{Dgm}(f), \mathrm{Dgm}(g)) \leq \| f - g \|_\infty$$

The bottleneck distance between persistence diagrams is bounded by the $L^\infty$ distance between the functions that generate them.

Proof.

The full proof uses the theory of interleavings of persistence modules and is beyond our scope. The essential idea: if $\|f - g\|_\infty \leq \delta$, then the sublevel set filtrations of $f$ and $g$ are $\delta$-interleaved — $f^{-1}(-\infty, t] \subseteq g^{-1}(-\infty, t + \delta]$ and vice versa — which forces a matching between their persistence diagrams that moves no point by more than $\delta$. See Cohen-Steiner, Edelsbrunner & Harer (2007) for the complete argument.

For point cloud data, the relevant corollary is:

Corollary 1.

Let $X$ and $Y$ be finite point clouds with $d_H(X, Y) \leq \delta$ (Hausdorff distance). Then:

$$d_B(\mathrm{Dgm}(\mathrm{VR}(X)), \mathrm{Dgm}(\mathrm{VR}(Y))) \leq 2\delta$$

In plain language: if you jiggle each point by at most $\delta$, the persistence diagram moves by at most $2\delta$. This is what makes persistent homology safe to use on noisy real-world data.


Extended Persistence

Ordinary persistent homology tracks features born and killed in a sublevel set filtration: as $\varepsilon$ increases from 0 to $\infty$, simplices enter and topological features appear and disappear. But some features are essential — they are born and never die. The most important essential feature is the single surviving connected component ($H_0$ with death $= \infty$), but for spaces with nontrivial higher homology, essential features carry real topological information that ordinary persistence cannot capture.

Extended persistence (Cohen-Steiner, Edelsbrunner & Harer, 2009) solves this by appending a superlevel set filtration to the sublevel set filtration, creating a complete picture where every feature has a finite death time.

Definition 10.

Given a function $f: K \to \mathbb{R}$ on a simplicial complex $K$, the extended filtration consists of two phases:

  1. Sublevel phase: $K_t = f^{-1}(-\infty, t]$ for $t$ increasing from $-\infty$ to $+\infty$. Simplices enter as $t$ increases.
  2. Superlevel phase: $K^t = f^{-1}[t, +\infty)$ for $t$ decreasing from $+\infty$ to $-\infty$. The complex is “peeled back” from above.

The two phases are concatenated into a single sequence that is trivial at both ends (formally, the second phase uses relative homology $H_k(K, K^t)$), so every feature is eventually killed.

Definition 11.

The birth-death pairs in extended persistence fall into three classes:

  • Ordinary pairs $(b, d)$: the feature is born and killed during the sublevel phase. These are the same as standard persistent homology.
  • Relative pairs $(b, d)$: the feature is born during the superlevel phase and killed later in the superlevel phase. These capture topology “from above.”
  • Extended pairs $(b, d)$: the feature is born during the sublevel phase and killed during the superlevel phase. These are the features that are essential in ordinary persistence — extended persistence gives them finite death times.

Example 5.

Consider the height function $f(x, y, z) = z$ on a torus standing upright in $\mathbb{R}^3$. The torus has $\beta_0 = 1$, $\beta_1 = 2$ (two independent loops — meridional and longitudinal), and $\beta_2 = 1$ (an enclosed void).

Sublevel phase: As $z$ increases from the bottom of the torus, the height function passes four critical points. One component is born at the minimum (essential $H_0$). One $H_1$ loop is born at each of the two saddles — both essential, since $\beta_1 = 2$ for the full torus. The $H_2$ void is born at the maximum and is also essential. Ordinary persistence assigns every one of these features an infinite death time.

Extended persistence kills all essential features during the superlevel phase: the component, the two loops, and the void each receive a finite death time. The result is a complete set of finite birth-death pairs capturing the full topology of the torus.

Extended Persistence Diagram

Interactive visualization coming soon — explore ordinary, relative, and extended persistence pairs on a torus.

Remark.

Extended persistence is particularly useful for Morse-theoretic analysis of functions on manifolds. The extended persistence diagram of a Morse function $f: M \to \mathbb{R}$ provides a complete topological summary of $f$ — every critical point of $f$ appears in exactly one persistence pair, and the persistence (death minus birth) measures the “importance” of that critical point.


Persistence Modules

The algebraic structure underlying persistent homology is the persistence module — a one-parameter family of vector spaces connected by linear maps. Understanding this algebraic framework clarifies why barcodes exist, why stability holds, and why multi-parameter persistence is fundamentally harder.

Definition 12.

A persistence module over a field $\mathbb{F}$ is a functor from the poset $(\mathbb{R}, \leq)$ to the category of $\mathbb{F}$-vector spaces. Concretely, it consists of:

  • A family of vector spaces $\{ V_t \}_{t \in \mathbb{R}}$
  • Linear maps $v_{s,t}: V_s \to V_t$ for all $s \leq t$

satisfying $v_{t,t} = \mathrm{id}$ and $v_{s,t} = v_{r,t} \circ v_{s,r}$ for $s \leq r \leq t$.

The persistence module arising from a filtration $K_0 \subseteq K_1 \subseteq \cdots$ is the family of homology groups $V_t = H_k(K_t)$, with the maps $v_{s,t}$ induced by the inclusions $K_s \hookrightarrow K_t$. The maps tell us which features at scale $s$ “survive to” scale $t$.

Theorem 4 (Structure Theorem for Persistence Modules).

Every pointwise finite-dimensional persistence module $\mathbb{V}$ over a field $\mathbb{F}$ decomposes uniquely (up to reordering) into a direct sum of interval modules:

$$\mathbb{V} \cong \bigoplus_{i=1}^{n} \mathbb{I}[b_i, d_i)$$

where $\mathbb{I}[b, d)$ is the module that is $\mathbb{F}$ for $t \in [b, d)$ and $0$ elsewhere. Each interval $[b_i, d_i)$ corresponds to one bar in the persistence barcode.

This is the theorem that makes barcodes well-defined. Without it, the barcode would be an arbitrary visualization choice; with it, the barcode is the unique decomposition of the persistence module into indecomposable summands. The proof relies on the classification of finitely generated modules over a principal ideal domain — the same algebraic machinery that gives the Jordan normal form for matrices.
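One concrete payoff of the Structure Theorem: once the barcode is known, the rank of any structure map $v_{s,t}$ is just the number of bars containing both $s$ and $t$ — no matrix algebra required. A small sketch (the function name is mine), using the half-open convention that a bar $[b, d)$ is alive at $t$ iff $b \leq t < d$:

```python
def rank_vst(barcode, s, t):
    """Rank of the structure map v_{s,t} of a persistence module
    with the given interval decomposition: count the bars [b, d)
    that are alive at both s and t."""
    assert s <= t
    return sum(1 for (b, d) in barcode if b <= s and t < d)

# A barcode with two H1 bars: one long (signal), one short (noise)
barcode = [(0.2, 2.0), (0.9, 1.1)]
print(rank_vst(barcode, 0.5, 1.5))   # 1: only the long bar spans [0.5, 1.5]
print(rank_vst(barcode, 1.0, 1.05))  # 2: both bars are alive on [1.0, 1.05]
```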

Remark.

The Structure Theorem fails for multi-parameter persistence. When the filtration is indexed by $\mathbb{R}^2$ (or any poset more complex than a total order), persistence modules do not decompose into interval summands. There is no barcode. This is the fundamental obstacle to multi-parameter persistence theory, and a major open area in applied algebraic topology. Partial invariants exist (rank invariants, persistence landscapes for multi-parameter data), but no complete discrete invariant comparable to the barcode.

For the metric structure on the space of persistence modules — including the interleaving distance, the Isometry Theorem, and the stability framework — see the Barcodes & Bottleneck Distance page.


Working Code

Computing Persistence with Ripser

import numpy as np
from ripser import ripser
from persim import plot_diagrams

# Generate an annulus (has a hole) and a Gaussian cluster (no hole)
np.random.seed(42)
n = 200

theta = np.random.uniform(0, 2 * np.pi, n)
r = 1.0 + np.random.normal(0, 0.1, n)
annulus = np.column_stack([r * np.cos(theta), r * np.sin(theta)])

cluster = np.random.randn(n, 2) * np.std(annulus, axis=0) + np.mean(annulus, axis=0)

# Compute persistence (H₀ and H₁)
result_annulus = ripser(annulus, maxdim=1)
result_cluster = ripser(cluster, maxdim=1)

# The annulus has a high-persistence H₁ feature (the loop)
dgm1 = result_annulus['dgms'][1]
max_pers = np.max(dgm1[:, 1] - dgm1[:, 0])
print(f"Annulus max H₁ persistence: {max_pers:.4f}")

# The cluster has only low-persistence H₁ noise
dgm1_c = result_cluster['dgms'][1]
max_pers_c = np.max(dgm1_c[:, 1] - dgm1_c[:, 0])
print(f"Cluster max H₁ persistence: {max_pers_c:.4f}")

Verifying the Stability Theorem Empirically

from persim import bottleneck

# Clean circle
n_pts = 100
theta_base = np.linspace(0, 2 * np.pi, n_pts, endpoint=False)
base_cloud = np.column_stack([np.cos(theta_base), np.sin(theta_base)])
base_dgm = ripser(base_cloud, maxdim=1)['dgms'][1]

# Add increasing noise and measure diagram movement
for noise_std in [0.0, 0.1, 0.2, 0.3, 0.5]:
    noise = noise_std * np.random.randn(n_pts, 2)
    noisy_cloud = base_cloud + noise
    noisy_dgm = ripser(noisy_cloud, maxdim=1)['dgms'][1]

    bn = bottleneck(base_dgm, noisy_dgm)
    # Max point displacement bounds the Hausdorff distance d_H(X, Y),
    # so Corollary 1 predicts d_B <= 2 * delta
    delta = np.max(np.linalg.norm(noise, axis=1))
    print(f"noise σ={noise_std:.1f}: max displacement δ={delta:.3f}, d_B={bn:.3f}, "
          f"bound 2δ holds: {bn <= 2 * delta + 1e-10}")

Vectorization: Persistence Images for ML

Persistence diagrams aren’t fixed-size vectors, so we can’t feed them directly to most classifiers. Persistence images (Adams et al., 2017) solve this by placing a Gaussian at each point $(b, d-b)$ and evaluating on a grid:

from persim import PersistenceImager

# Create a persistence image from H₁
pimgr = PersistenceImager(pixel_size=0.05, birth_range=(0, 1.5), pers_range=(0, 1.0))
pimgr.fit([result_annulus['dgms'][1]])

img = pimgr.transform(result_annulus['dgms'][1])
print(f"Feature vector shape: {img.flatten().shape}")
# This fixed-size vector goes into any sklearn classifier

Full Pipeline: Time Series Classification

The Takens delay embedding theorem connects time series analysis to topology:

$$x(t) \mapsto \bigl(x(t),\; x(t+\tau),\; x(t+2\tau),\; \ldots,\; x(t+(d-1)\tau)\bigr)$$

Periodic signals produce loops in the embedding; chaotic signals do not.

  1. 1D time series $x(t)$
  2. Takens delay embedding (dim $d$, delay $\tau$)
  3. Point cloud in $\mathbb{R}^d$
  4. Persistent homology (Ripser)
  5. Feature vector (summary stats of persistence)
  6. Classifier (Random Forest, SVM, …)
  7. Regime label

The full topological-features-for-time-series pipeline. Persistence features are invariant to time reparameterization and stable under noise (by the Stability Theorem), which makes this pipeline work end-to-end on noisy real-world signals.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def takens_embedding(ts, dim=2, delay=10):
    """Takens delay embedding of a 1D time series."""
    n = len(ts) - (dim - 1) * delay
    return np.array([ts[i:i + dim * delay:delay] for i in range(n)])

def topological_features(ts):
    """Extract topological feature vector from a time series."""
    embedded = takens_embedding(ts, dim=2, delay=10)
    result = ripser(embedded, maxdim=1)
    features = []
    for d in range(2):  # H₀ and H₁
        dgm = result['dgms'][d]
        finite = dgm[np.isfinite(dgm[:, 1])]
        if len(finite) > 0:
            pers = finite[:, 1] - finite[:, 0]
            features.extend([
                np.max(pers),        # Max persistence
                np.mean(pers),       # Mean persistence
                np.std(pers),        # Std of persistence
                np.sum(pers > 0.1),  # Count of significant features
                np.sum(pers ** 2),   # Total persistence (L² norm)
            ])
        else:
            features.extend([0, 0, 0, 0, 0])
    return np.array(features)

# Generate labeled time series from 3 dynamical regimes
X, y = [], []
for regime in range(3):
    for _ in range(50):
        t = np.linspace(0, 6 * np.pi, 300)
        noise = 0.05 * np.random.randn(300)
        if regime == 0:    # Periodic
            ts = np.sin(t) + noise
        elif regime == 1:  # Quasi-periodic
            ts = np.sin(t) + 0.5 * np.sin(np.sqrt(2) * t) + noise
        else:              # Chaotic (logistic map)
            ts = np.zeros(300)
            ts[0] = 0.1
            for i in range(1, 300):
                ts[i] = 3.9 * ts[i-1] * (1 - ts[i-1])
        X.append(topological_features(ts))
        y.append(regime)

X, y = np.array(X), np.array(y)
scores = cross_val_score(RandomForestClassifier(n_estimators=100, random_state=42),
                         X, y, cv=5)
print(f"5-fold CV accuracy: {scores.mean():.3f} ± {scores.std():.3f}")

This pipeline — time series → Takens embedding → persistent homology → feature vector → classifier — is one of the most successful applications of TDA in practice. It works because the persistence features are invariant to reparameterization of the time series (you can stretch or compress time and the topology doesn’t change) and stable under noise (by the Stability Theorem).

Matrix Reduction from Scratch

The standard algorithm for computing persistence, implemented directly from Theorem 3:

import numpy as np

def reduce_boundary_matrix(D):
    """Column-reduce a boundary matrix over Z₂.

    Parameters
    ----------
    D : ndarray of shape (m, m) — the filtered boundary matrix (Z₂ entries)

    Returns
    -------
    R : ndarray — the reduced matrix
    pairs : list of (creator_idx, destroyer_idx) or (creator_idx, None)
    """
    R = D.copy() % 2
    m = R.shape[1]
    low = {}     # row_index -> col_index of column that has this row as its pivot
    pairs = []

    for j in range(m):
        # Find the lowest 1 in column j
        nonzero = np.where(R[:, j] == 1)[0]
        while len(nonzero) > 0 and nonzero[-1] in low:
            # Another column i has the same low — add it (Z₂: XOR)
            i = low[nonzero[-1]]
            R[:, j] = (R[:, j] + R[:, i]) % 2
            nonzero = np.where(R[:, j] == 1)[0]

        if len(nonzero) > 0:
            pivot = nonzero[-1]
            low[pivot] = j
            pairs.append((pivot, j))  # pivot creates, j destroys
        # else: column j is zero — creates an essential feature

    # Identify essential features (creators not paired as destroyers)
    all_destroyers = set(j for _, j in pairs)
    all_creators = set(i for i, _ in pairs)
    for j in range(m):
        if j not in all_destroyers and j not in all_creators:
            pairs.append((j, None))  # essential feature

    return R, pairs

# Example: three points with edges at distances 1, 2, 3
# Filtration order: v0, v1, v2, [v0,v1], [v0,v2], [v1,v2]
D = np.array([
    # v0 v1 v2 e01 e02 e12
    [0, 0, 0, 1,  1,  0],  # v0
    [0, 0, 0, 1,  0,  1],  # v1
    [0, 0, 0, 0,  1,  1],  # v2
    [0, 0, 0, 0,  0,  0],  # e01
    [0, 0, 0, 0,  0,  0],  # e02
    [0, 0, 0, 0,  0,  0],  # e12
])
R, pairs = reduce_boundary_matrix(D)
print("Persistence pairs (creator_idx, destroyer_idx):")
for c, d in pairs:
    print(f"  σ_{c} paired with {'σ_' + str(d) if d is not None else '∞'}")
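The reduction returns index pairs; to read off an actual persistence diagram, map each index through its simplex's filtration value. A minimal sketch for the example above, hard-coding the pairs returned by `reduce_boundary_matrix` and the filtration values 0 (vertices) and 1, 2, 3 (edges):

```python
import numpy as np

# Filtration values for the example above: vertices enter at 0,
# the three edges at scales 1, 2, 3
filt = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]

# Index pairs as produced by reduce_boundary_matrix on that example
pairs = [(1, 3), (2, 4), (0, None), (5, None)]

diagram = []
for creator, destroyer in pairs:
    birth = filt[creator]
    death = filt[destroyer] if destroyer is not None else np.inf
    if death > birth:            # drop zero-persistence pairs
        diagram.append((birth, death))

print(diagram)  # [(0.0, 1.0), (0.0, 2.0), (0.0, inf), (3.0, inf)]
```

The two finite H_0 bars record the components of v1 and v2 merging into v0's component at scales 1 and 2; the infinite bars are the last surviving component and the 1-cycle completed by e12 at scale 3, which no triangle ever fills.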

Extended Persistence with GUDHI

import gudhi
import numpy as np

# Height function on a sampled torus
n = 500
theta = np.random.uniform(0, 2 * np.pi, n)
phi = np.random.uniform(0, 2 * np.pi, n)
R, r = 2.0, 1.0
torus = np.column_stack([
    (R + r * np.cos(phi)) * np.cos(theta),
    (R + r * np.cos(phi)) * np.sin(theta),
    r * np.sin(phi)
])

# Build the alpha complex on the torus sample
alpha = gudhi.AlphaComplex(points=torus)
stree = alpha.create_simplex_tree()

# Extended persistence of the height function (z-coordinate).
# extend_filtration() uses only the filtration values at the vertices
# (lower-star filtration), so assign each vertex its height first.
# AlphaComplex may reorder points, so query them back via get_point().
for simplex, _ in stree.get_skeleton(0):
    stree.assign_filtration(simplex, alpha.get_point(simplex[0])[2])

stree.extend_filtration()
dgms = stree.extended_persistence()

print("Extended persistence pairs by type:")
for name, dgm in zip(
        ['Ordinary', 'Relative', 'Extended+', 'Extended-'], dgms):
    print(f"  {name}: {len(dgm)} pairs")
    for dim, (b, d) in dgm[:3]:
        print(f"    H_{dim}: [{b:.2f}, {d:.2f})")

Connections & Cross-Track Bridges

Persistent homology sits at the center of a growing web of mathematical and applied techniques, connecting deeply to several other areas on formalML.

Within the Topology & TDA track:

  • Simplicial Complexes provides the combinatorial substrate: the VR filtration is a nested sequence of simplicial complexes, and the boundary operator defined there is the algebraic engine of homology.
  • Barcodes & Bottleneck Distance develops the metric structure on persistence diagrams. The Structure Theorem (Theorem 4 above) guarantees that barcodes are well-defined; the bottleneck and Wasserstein distances turn the space of diagrams into a metric space where stability makes sense.
  • Statistical TDA treats persistence diagrams as random objects — statistical estimators of the true homology of an underlying distribution. Bootstrap confidence sets separate topological signal from sampling noise.
  • The Mapper Algorithm provides a complementary geometric/visual summary. Where persistent homology produces algebraic invariants (barcodes), Mapper produces an interpretable graph. PH on the Mapper graph quantifies its topological features; multiscale Mapper uses PH to select optimal parameters.

Across tracks:

  • Graph Laplacians & Spectrum: The 0th persistent homology of a VR filtration is equivalent to single-linkage clustering — the threshold ε at which two points first become connected in H_0 equals their single-linkage distance. The number of connected components at scale ε equals the number of zero eigenvalues of the graph Laplacian of the ε-neighborhood graph.
  • Measure-Theoretic Probability: The convergence of empirical persistence diagrams to the population diagram relies on measure-theoretic concentration bounds. The convergence rate O((log n / n)^{1/d}) exhibits the curse of dimensionality — the same dimension-dependent rate that appears in density estimation and minimax theory.
  • Concentration Inequalities: The stability of persistence diagrams under noise is a Lipschitz condition, and the convergence rate of empirical diagrams involves Hausdorff concentration bounds that parallel McDiarmid-type arguments.
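The Laplacian connection above is easy to verify numerically. A minimal sketch on synthetic data (numpy only): the multiplicity of the zero eigenvalue of the ε-neighborhood graph Laplacian counts the connected components alive at scale ε, i.e. dim H_0.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated clusters: at scale eps = 1 they form two components
pts = np.vstack([rng.normal(0.0, 0.1, (10, 2)),
                 rng.normal(5.0, 0.1, (10, 2))])

eps = 1.0
dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
A = ((dists < eps) & (dists > 0)).astype(float)  # eps-neighborhood graph
L = np.diag(A.sum(axis=1)) - A                   # unnormalized graph Laplacian

# Multiplicity of eigenvalue 0 = number of connected components = dim H_0
n_components = int(np.sum(np.linalg.eigvalsh(L) < 1e-8))
print(n_components)  # 2
```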

Applications: Persistence features have been used for protein classification (Cang & Wei, 2017), materials science (Hiraoka et al., 2016), cosmology (Xu et al., 2019), neuroscience (Giusti et al., 2015), and financial market analysis. The Takens delay embedding connects persistent homology to time series classification — see the working code above for the full pipeline.


Exercises

Foundational

Exercise 1. Consider four points at the corners of a unit square in ℝ²: (0,0), (1,0), (1,1), (0,1). Write out the filtered boundary matrix for the VR filtration (vertices born at 0, four side-edges born at 1, two diagonal edges born at √2). Perform column reduction by hand and read off all birth-death pairs for H_0 and H_1.

Exercise 2. Verify ∂_1 ∘ ∂_2 = 0 for the boundary of a tetrahedron [v_0, v_1, v_2, v_3] over Z₂. Write out the 6×4 matrix ∂_2 (rows = edges, columns = triangular faces) and the 4×6 matrix ∂_1 (rows = vertices, columns = edges). Compute the product and verify every entry is 0 mod 2.

Intermediate

Exercise 3. Prove that the longest H_0 bar in any VR persistence barcode always has death time = ∞. Hint: The last surviving connected component is never merged into anything.

Exercise 4. The total persistence of a persistence diagram is Σ_i (d_i − b_i)^p over the finite bars, for some power p > 0. Show that for H_0 persistence of n points, the total persistence (with p = 1) equals the total weight of the minimum spanning tree of the point cloud. Hint: Each finite H_0 death corresponds to an MST edge.
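Exercise 4 can be sanity-checked numerically before attempting the proof. A sketch assuming scipy is available: a Kruskal-style union-find sweep over edges sorted by length records one finite H_0 death per merge, and the summed death times should match the MST weight.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
pts = rng.random((30, 2))
dists = squareform(pdist(pts))
n = len(pts)

# Union-find: sweeping edges by length, each merge is one finite H_0 death
parent = list(range(n))
def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path halving
        x = parent[x]
    return x

total_persistence = 0.0
for w, i, j in sorted((dists[i, j], i, j)
                      for i in range(n) for j in range(i + 1, n)):
    ri, rj = find(i), find(j)
    if ri != rj:                        # two components merge at scale w
        parent[ri] = rj
        total_persistence += w          # finite H_0 bar [0, w)

mst_weight = minimum_spanning_tree(dists).sum()
print(np.isclose(total_persistence, mst_weight))  # True
```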

Advanced

Exercise 5. Implement the full Takens embedding pipeline from the Working Code section on a real dataset: download daily closing prices for a stock index (e.g., S&P 500), compute the delay embedding, run persistent homology, and classify 60-day rolling windows into “high-volatility” vs “low-volatility” regimes using persistence features. Report cross-validation accuracy.

Exercise 6. Prove that persistent homology is invariant under bijective reparameterization of the filtration. That is, if φ: ℝ → ℝ is a monotonically increasing bijection and K_t is a filtration, then the persistence diagram of K_{φ(t)} is related to that of K_t by the coordinate change (b, d) ↦ (φ⁻¹(b), φ⁻¹(d)). Hint: Reparameterization preserves the inclusion order of simplices, hence the matrix reduction is identical.

Connections

  • Builds on: Simplicial Complexes
  • Closely related to: Čech Complexes

References & Further Reading