Chapter 2 Perron-Frobenius Theorem

Friday, January 22, 1993

In this lecture, we use the Perron-Frobenius theory of non-negative matrices to obtain information on the eigenvalues of a graph.

Let \(K = \mathbb{R}\). For \(n\in \mathbb{Z}^{> 0}\), pick a symmetric matrix \(C\in \mathrm{Mat}_n(\mathbb{R})\).

Definition 2.1 The matrix \(C\) is reducible if and only if there is a bipartition \(\{1, 2, \ldots, n\} = X^+ \cup X^-\) (disjoint union of nonempty sets) such that \(C_{ij} = 0\) for all \(i\in X^+,\ j\in X^-\) and for all \(i\in X^-,\ j\in X^+\), i.e., \[ C \sim \begin{pmatrix} \ast & O \\ O & \ast \end{pmatrix},\] where \(\sim\) denotes a simultaneous permutation of rows and columns.

Definition 2.2 The matrix \(C\) is bipartite if and only if there is a bipartition \(\{1, 2, \ldots, n\} = X^+ \cup X^-\) (disjoint union of nonempty sets) such that \(C_{ij} = 0\) for all \(i,j\in X^+\), and for all \(i,j\in X^-\), i.e., \[ C \sim \begin{pmatrix} O & \ast \\ \ast & O \end{pmatrix}.\]

Note.

  1. If \(C\) is bipartite, then for every eigenvalue \(\theta\) of \(C\), \(-\theta\) is also an eigenvalue of \(C\), and \(\mathrm{mult}(\theta) = \mathrm{mult}(-\theta)\).

Indeed, let \(C = \begin{pmatrix} O & A \\ B & O \end{pmatrix}\). Then \[\begin{pmatrix} O & A \\ B & O \end{pmatrix} \begin{pmatrix}x\\y\end{pmatrix} = \theta \begin{pmatrix}x\\y\end{pmatrix}\Leftrightarrow \begin{pmatrix} O & A \\ B & O \end{pmatrix} \begin{pmatrix}x\\-y\end{pmatrix} = -\theta \begin{pmatrix}x\\-y\end{pmatrix}, \] as each side is equivalent to \(Ay = \theta x\) and \(Bx = \theta y\).

  2. If \(C\) is bipartite, \(C^2\) is reducible.

  3. If \(C_{ij} \geq 0\) for all \(i,j\), \(C\) is irreducible, and \(C^2\) is reducible, then \(C\) is bipartite. (Exercise; see the numerical sketch after this list.)
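These notes are easy to check numerically. The following minimal sketch (assuming NumPy is available; the 4-cycle is an illustrative choice, not from the lecture) verifies Note 1 and Note 2 on the 4-cycle and exhibits the bipartition underlying Note 3.

```python
import numpy as np

# Adjacency matrix of the 4-cycle 0-1-2-3-0: symmetric, nonnegative,
# irreducible (the cycle is connected), and bipartite ({0,2} vs {1,3}).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

# Note 1: the spectrum is symmetric about 0.
eig = np.sort(np.linalg.eigvalsh(A))
print(eig)                          # [-2.  0.  0.  2.]
assert np.allclose(eig, -eig[::-1])

# Note 2: A^2 has no entries between the two color classes, so it is
# reducible with respect to the bipartition {0, 2} | {1, 3}.
A2 = A @ A
even, odd = [0, 2], [1, 3]
assert np.all(A2[np.ix_(even, odd)] == 0)
```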

HS MEMO

Note 1. Even if \(C\) is not symmetric, the equivalence \[\begin{pmatrix} O & A \\ B & O \end{pmatrix} \begin{pmatrix}x\\y\end{pmatrix} = \theta \begin{pmatrix}x\\y\end{pmatrix}\Leftrightarrow \begin{pmatrix} O & A \\ B & O \end{pmatrix} \begin{pmatrix}x\\-y\end{pmatrix} = -\theta \begin{pmatrix}x\\-y\end{pmatrix}\] still holds, so the geometric multiplicities of \(\theta\) and \(-\theta\) coincide. What about the algebraic multiplicities?

Note 3. Set \(x \sim y\) if and only if \(C_{xy}>0\) (so the graph \(\Gamma(C)\) may have loops). Then \[(C^2)_{xy} > 0 \Leftrightarrow \textrm{there exists } z\in X \textrm{ such that } x\sim z \sim y.\] Note that \(C\) is irreducible if and only if \(\Gamma(C)\) is connected. Now suppose \(C\) is irreducible and \(C^2\) is reducible. Fix \(x\in X\) and let \[\begin{align} X^+ & = \{y\mid \textrm{there is a path of even length from }x \textrm{ to }y\},\\ X^- & = \{y\mid \textrm{there is no path of even length from }x \textrm{ to }y\}. \end{align}\] Every vertex of \(X^+\) lies in the connected component of \(x\) in \(\Gamma(C^2)\); since \(\Gamma(C^2)\) is disconnected, \(X^- \neq \emptyset\). If there were an edge \(y\sim z\) with \(y, z\in X^+\) (or with \(y, z\in X^-\)), then crossing that edge would switch parity, and every \(w\in X^-\) could be reached from \(x\) by a path of even length, contradicting \(w \in X^-\). So \(\mathrm{e}(X^+, X^+) = \mathrm{e}(X^-, X^-) = 0\), i.e., \(C\) is bipartite.
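Computationally, the construction in this memo is a breadth-first search that tracks the parity of walk lengths from a fixed vertex (walk parity and path parity agree for this purpose). Below is a minimal sketch under that reading (assuming NumPy; the helper name `bipartition` is chosen here, not from the lecture). On the 4-cycle above it returns `([0, 2], [1, 3])`.

```python
import numpy as np
from collections import deque

def bipartition(C):
    """For a symmetric nonnegative C whose graph is connected, return the
    sets (X_plus, X_minus) of Note 3, or None if some vertex is reachable
    from vertex 0 by both even- and odd-length walks (an odd closed walk
    exists, so C is not bipartite)."""
    n = C.shape[0]
    parity = [set() for _ in range(n)]  # walk-length parities from vertex 0
    parity[0].add(0)
    queue = deque([(0, 0)])
    while queue:
        v, p = queue.popleft()
        for w in range(n):
            if C[v, w] > 0 and (1 - p) not in parity[w]:
                parity[w].add(1 - p)
                queue.append((w, 1 - p))
    if any(len(s) == 2 for s in parity):
        return None
    X_plus = [v for v in range(n) if 0 in parity[v]]
    X_minus = [v for v in range(n) if 1 in parity[v]]
    return X_plus, X_minus
```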


Theorem 2.1 (Perron-Frobenius) Let \(C\in \mathrm{Mat}_n(\mathbb{R})\) be a matrix such that

\((a)\) \(C\) is symmetric.
\((b)\) \(C\) is irreducible.
\((c)\) \(C_{ij} \geq 0\) for all \(i,j\).

Let \(\theta_0\) be the maximal eigenvalue of \(C\) with eigenspace \(V_0 \subseteq \mathbb{R}^n\), and let \(\theta_r\) be the minimal eigenvalue of \(C\) with eigenspace \(V_r \subseteq \mathbb{R}^n\). Then the following hold.

\((i)\) Suppose \(0\neq v = \begin{pmatrix}\alpha_1\\\alpha_2\\\vdots\\\alpha_n\end{pmatrix} \in V_0\). Then \(\alpha_i > 0\) for all \(i\), or \(\alpha_i < 0\) for all \(i\).
\((ii)\) \(\mathrm{dim}V_0 = 1\).
\((iii)\) \(\theta_r \geq -\theta_0\).
\((iv)\) \(\theta_r = -\theta_0\) if and only if \(C\) is bipartite.

First, we prove the following lemma.

Lemma 2.1 Let \(\langle \;,\; \rangle\) be the dot product in \(V = \mathbb{R}^n\). Pick a symmetric matrix \(B \in \mathrm{Mat}_n(\mathbb{R})\). Suppose all eigenvalues of \(B\) are nonnegative. (i.e., \(B\) is positive semidefinite.) Then there exist vectors \(v_1, v_2, \ldots, v_n\in V\) such that \(B_{ij} = \langle v_i, v_j\rangle\) for all \(i,j\) \((1\leq i, j \leq n)\).

Proof. By elementary linear algebra, there exists an orthonormal basis \(w_1, w_2, \ldots, w_n\) of \(V\) consisting of eigenvectors of \(B\), say \(Bw_i = \theta_i w_i\). Let \(P\) be the matrix whose \(i\)-th column is \(w_i\), and set \(D = \mathrm{diag}(\theta_1,\ldots, \theta_n)\). Then \(P^\top P = I\) and \(BP = PD\).

Hence, \[B = PDP^{-1} = PDP^\top = QQ^\top,\] where \[Q = P\cdot \mathrm{diag}(\sqrt{\theta_1}, \sqrt{\theta_2}, \ldots, \sqrt{\theta_n}) \in \mathrm{Mat}_n(\mathbb{R}),\] which is well-defined since each \(\theta_i \geq 0\). Now, let \(v_i\) be the \(i\)-th column of \(Q^\top\). Then \[B_{ij} = v_i^\top v_j = \langle v_i, v_j\rangle.\] This proves the lemma.
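Numerically, the lemma is exactly the factorization \(B = QQ^\top\) computed from an orthonormal eigenbasis. A minimal sketch (assuming NumPy; the matrix \(B\) below is an arbitrary illustrative positive semidefinite matrix):

```python
import numpy as np

# An illustrative positive semidefinite B (any M M^T is PSD and symmetric).
M = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 1.0]])
B = M @ M.T

# Orthonormal eigenbasis: B = P D P^T with P^T P = I.
theta, P = np.linalg.eigh(B)
theta = np.clip(theta, 0.0, None)   # guard tiny negative round-off

# Q = P diag(sqrt(theta_i)); the v_i are the columns of Q^T (rows of Q).
Q = P @ np.diag(np.sqrt(theta))

# Check B_ij = <v_i, v_j>, i.e., B = Q Q^T.
assert np.allclose(B, Q @ Q.T)
```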

Now we start the proof of Theorem 2.1.

Proof. \((i)\) Let \(\langle \text{ }, \text{ }\rangle\) denote the dot product on \(V = \mathbb{R}^n\). Set \[B = \theta_0 I - C,\] a symmetric matrix with eigenvalues \(\theta_0 - \theta_i \geq 0\). By Lemma 2.1, we may write \[B = (\langle v_i, v_j\rangle)_{1\leq i,j\leq n}\] for some \(v_1, \ldots, v_n \in V\).

Observe: \[\sum_{i = 1}^n \alpha_iv_i = 0.\]

Pf. \[\begin{align} \left\|\sum_{i = 1}^n \alpha_iv_i\right\|^2 & = \left\langle \sum_{i=1}^n\alpha_iv_i, \sum_{i=1}^n\alpha_iv_i\right\rangle\\ & = \begin{pmatrix} \alpha_1 &\ldots &\alpha_n\end{pmatrix}B\begin{pmatrix} \alpha_1\\\vdots\\\alpha_n\end{pmatrix}\\ & = v^\top Bv\\ & = 0, \end{align}\] since \(Bv = (\theta_0 I - C)v = 0\).

Now set \[s = \textrm{the number of indices } i \textrm{ with } \alpha_i > 0.\] Replacing \(v\) by \(-v\) if necessary, we may assume without loss of generality that \(s\geq 1\). We want to show \(s = n\).

Assume \(s < n\). Without loss of generality, we may assume that \(\alpha_i >0\) for \(1\leq i\leq s\) and \(\alpha_i \leq 0\) for \(s+1 \leq i \leq n\). Set \[ \rho = \alpha_1v_1 + \cdots + \alpha_sv_s = -\alpha_{s+1}v_{s+1} - \cdots - \alpha_nv_n.\] Then, for \(i = 1,\ldots, s\), \[\begin{align} \langle v_i, \rho \rangle & = \sum_{j = s+1}^n (-\alpha_j)\langle v_i, v_j\rangle \quad (\langle v_i, v_j\rangle = B_{ij} = -C_{ij} \textrm{ for } i \neq j)\\ & = \sum_{j = s+1}^n (-\alpha_{j})(-C_{ij})\\ & \leq 0, \end{align}\] as \(\alpha_j \leq 0\) and \(C_{ij} \geq 0\). Hence \[0\leq \langle \rho, \rho\rangle = \sum_{i=1}^s \alpha_i \langle v_i, \rho\rangle \leq 0,\] as \(\alpha_i > 0\) and \(\langle v_i, \rho\rangle \leq 0\). Thus, we have \(\langle \rho, \rho \rangle = 0\) and \(\rho = 0\). For \(j = s+1, \ldots, n\), \[0 = \langle \rho, v_j\rangle = \sum_{i=1}^s \alpha_i\langle v_i, v_j\rangle \leq 0,\] as \(\langle v_i, v_j\rangle = -C_{ij} \leq 0\); hence every term \(\alpha_i\langle v_i, v_j\rangle\) vanishes.

Therefore, since each \(\alpha_i > 0\), \[0 = \langle v_i, v_j \rangle = -C_{ij} \textrm{ for } 1\leq i \leq s, \: s+1 \leq j \leq n.\] Since \(C\) is symmetric, \[C = \begin{pmatrix} \ast & O \\ O & \ast\end{pmatrix}\] with respect to the bipartition \(\{1,\ldots,s\} \cup \{s+1,\ldots,n\}\). Thus \(C\) is reducible, which is not the case. Hence \(s = n\).


\((ii)\) Suppose \(\dim V_0 \geq 2\). Then, \[\dim\left(V_0 \cap \begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}^\bot\right) \geq 1.\] So, there is a vector \[0\neq v = \begin{pmatrix}\alpha_1\\\vdots\\\alpha_n\end{pmatrix} \in V_0\] with \(\alpha_1 = 0\). This contradicts \((i)\).

Now pick \[0\neq w = \begin{pmatrix}\beta_1\\\vdots\\\beta_n\end{pmatrix} \in V_r.\]

\((iii)\) Suppose \(\theta_r < -\theta_0\). Since the eigenvalues of \(C^2\) are the squares of those of \(C\), and \(\theta_r^2 > \theta_0^2 \geq \theta_i^2\) for all \(i\), \(\theta_r^2\) is the maximal eigenvalue of \(C^2\).

Also we have \(C^2w = \theta_r^2w\).

Observe that \(C^2\) is irreducible. (Otherwise, \(C\) would be bipartite by Note 3, and then \(\theta_r = -\theta_0\) by Note 1.) Therefore, applying \((i)\) to \(C^2\) with its maximal eigenvalue \(\theta_r^2\), we get \(\beta_i > 0\) for all \(i\) or \(\beta_i < 0\) for all \(i\). Hence \[\langle v, w\rangle = \sum_{i=1}^n\alpha_i\beta_i \neq 0,\] since all the terms \(\alpha_i\beta_i\) have the same sign. This is a contradiction, as \(V_0 \bot V_r\).

\((iv)\) \(\Rightarrow\): Let \(\theta_r = -\theta_0\). Then \(\theta = \theta_r^2 = \theta_0^2\) is the maximal eigenvalue of \(C^2\), and \(v\) and \(w\) are linearly independent eigenvectors of \(C^2\) for \(\theta\). Hence, for \(C^2\), \(\mathrm{mult}(\theta) \geq 2\).

Thus, by \((ii)\) applied to \(C^2\), \(C^2\) must be reducible. Therefore, \(C\) is bipartite by Note 3.

\(\Leftarrow\): This is Note 1.
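The four conclusions can be observed numerically. Here is a minimal sketch (assuming NumPy; the triangle \(K_3\) is an illustrative choice of a connected, non-bipartite graph, so the inequality in \((iii)\) comes out strict):

```python
import numpy as np

# Adjacency matrix of the triangle K_3: symmetric, nonnegative,
# irreducible (connected), and not bipartite (it is an odd cycle).
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)

theta, P = np.linalg.eigh(A)        # eigenvalues in ascending order
theta_r, theta_0 = theta[0], theta[-1]
print(theta)                        # [-1. -1.  2.]

# (i)  the eigenvector for theta_0 has entries of one strict sign
v = P[:, -1]
assert np.all(v > 0) or np.all(v < 0)

# (ii) theta_0 is simple: the next eigenvalue is strictly smaller
assert theta[-2] < theta_0

# (iii), (iv) theta_r >= -theta_0, strictly since K_3 is not bipartite
assert theta_r > -theta_0
```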

Let \(\Gamma = (X, E)\) be any graph.

Definition 2.3 \(\Gamma\) is said to be bipartite if the adjacency matrix \(A\) is bipartite. That is, \(X\) can be written as a disjoint union of \(X^+\) and \(X^-\) such that \(X^+, X^-\) contain no edges of \(\Gamma\).

Corollary 2.1 Let \(\Gamma\) be a connected graph with \[\mathrm{Spec}(\Gamma) = \begin{pmatrix}\theta_0 & \theta_1 &\cdots & \theta_r\\m_0 & m_1 & \cdots & m_r\end{pmatrix} \:\textrm{ with }\: \theta_0 > \theta_1 > \cdots > \theta_r,\] and let \(V_i\) be the eigenspace of \(\theta_i\). Then the following hold.

  1. Suppose \(0\neq v = \begin{pmatrix} \alpha_1\\\vdots \\\alpha_n \end{pmatrix} \in V_0 \subseteq \mathbb{R}^n\). Then \(\alpha_i > 0\) for all \(i\), or \(\alpha_i < 0\) for all \(i\).

  2. \(m_0 = 1\).

  3. \(\theta_r \geq -\theta_0\), with equality if and only if \(\Gamma\) is bipartite. In this case, \[-\theta_i = \theta_{r-i} \; \textrm{and} \; m_i = m_{r-i} \quad (0\leq i\leq r).\]

Proof. This is a direct consequence of Theorem 2.1 together with Notes 1 and 3.
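To illustrate part 3, here is a minimal check (assuming NumPy; the path \(P_4\) is an illustrative connected bipartite graph) that the spectrum pairs off as \(-\theta_i = \theta_{r-i}\):

```python
import numpy as np

# Adjacency matrix of the path P_4: connected and bipartite.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

theta = np.sort(np.linalg.eigvalsh(A))[::-1]    # descending order
print(theta)    # approx [ 1.618  0.618 -0.618 -1.618]

# Corollary 2.1.3: theta_i = -theta_{r-i} (multiplicities match likewise).
assert np.allclose(theta, -theta[::-1])
```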