Jordan Normal Form, Spectral Radius and Positive Matrices

Topic: Jordan Normal Form, Spectral Radius and Positive Matrices

Date: Nov. 22, Nov. 29 and Dec. 6

Presenter: Liran Li

Source Material: https://iuuk.mff.cuni.cz/~rakdver/linalg/lesson15-9.pdf, lecture notes from a linear algebra course, and https://www.math.upenn.edu/~moose/240S2013/slides7-31.pdf, an introduction to one definition of generalized eigenvectors.

Any square complex matrix is similar to its Jordan normal form. Starting from the Jordan normal form, we can prove several results about element-wise non-negative matrices that become element-wise positive when raised to some power. This series of presentations is an adaptation of the lecture notes by Prof. Zdeněk Dvořák at the Computer Science Institute of Charles University, Prague. We discuss some less familiar results about these positive matrices, which then lead to a potential use case: PageRank.

Mirror Descent

Problem

\min\limits_{x \in C} f(x), where f is a convex function and C is a closed convex set

The Mirror Descent Algorithm

\phi: Legendre function

\phi^*: Legendre Dual with \nabla\phi^* = (\nabla\phi)^{-1}

D_{\phi}(x, y) = \phi(x) - \phi(y) - \langle \nabla\phi(y), x - y\rangle: Bregman divergence

Traditional Algorithm:

x_{k+1} = proj_C^{\phi}(\nabla\phi^*(\nabla\phi(x_k) - \alpha_k g_k)), where proj_C^{\phi}(y) = \arg\min\limits_{x \in C} D_{\phi}(x, y) and g_k \in \partial f(x_k)

Equivalent Expression:

The Mirror Descent algorithm can be equivalently expressed as:

x_{k+1} = \arg\min_{x \in C}\{f(x_k) + \langle g_k, x - x_k\rangle + \frac{1}{\alpha_k}D_{\phi}(x, x_k)\}

Comparison with projected subgradient descent

Projected subgradient descent algorithm:

x_{k+1} = proj_C^{\|\cdot\|_2}(x_k - \alpha_k g_k)

and similarly, it can be equivalently expressed as:

x_{k+1} = \arg\min_{x \in C}\{f(x_k) + \langle g_k, x - x_k\rangle + \frac{1}{2\alpha_k}\|x - x_k\|_2^2\}

So the only difference between Mirror Descent and projected subgradient descent is that the squared Euclidean distance \frac{1}{2}\|x - x_k\|_2^2 is replaced by the more general Bregman divergence D_{\phi}(x, x_k). Indeed, choosing \phi(x) = \frac{1}{2}\|x\|_2^2 (the first example above) gives D_{\phi}(x, y) = \frac{1}{2}\|x - y\|_2^2, and the two methods coincide.

Convergence Rate for Mirror Descent

Assumption:

  1. \phi is \mu-strongly convex with respect to some norm \|\cdot\|, namely \phi(x) \geq \phi(y) + \langle \nabla \phi(y), x - y\rangle + \frac{\mu}{2}\|x - y\|^2
  2. D_{\phi}(x^*, x_1) \leq R^2 and \|g_k\|_*^2 \leq G^2, where \|\cdot\|_* is the dual norm of \|\cdot\|
  3. \alpha_k = \frac{R}{G\sqrt{T}}

Then we have \min\limits_{k = 1, \dots, T} f(x_k) - f(x^*) \leq \frac{RG}{\sqrt{\mu T}}

Reference

  1. Paper: https://web.iem.technion.ac.il/images/user-files/becka/papers/3.pdf
  2. Note on Bregman divergences: http://users.cecs.anu.edu.au/~xzhang/teaching/bregman.pdf