Markov Chains

In this subsection we treat discrete time Markov processes chains with a finite or eventually countable state space $S$, also called Markov Chains. Recall that we only consider time homogenous Markov processes. Defining

\[ \begin{align*} \mu_x:=P[X_0=x] \quad p_{xy}:=P\left[X_{t+1}=y|X_t=x\right]=P[X_1=y|X_0=x] \end{align*} \]

it follows that $0\leq \mu_x, p_{xy}\leq 1$ and $\sum \mu_x=1$, as well as $\sum_y p_{xy}=1$ for every $x$. In other terms, the initial distribution $\mu:=(\mu_x)$ is a random vector and the one step transition probability can be represented by the stochastic matrix $P=(p_{xy})$. For $B\subseteq S$, it holds

\[ \begin{equation*} P_1(x,B)=\sum_{y \in B}p_{xy} \end{equation*} \]

By Chapman Kolmogorov relation, we have $P_t=P_{t-1}P_1$, hence, defining recursively

\[ \begin{equation}\label{eq:repetition_markov} p^t_{xy}=\sum_{z}p^{t-1}_{xz}p_{zy} \end{equation} \]

we have

\[ \begin{equation*} P_t(x,B)=\sum_{y \in B}p^t(x,y) \end{equation*} \]

In the discrete case the following strong Markov property holds.

Strong Markov Property

Let $Z$ be a bounded random variable and $\tau$ a stopping time. Then it holds

\[ \begin{equation*} 1_{\{\tau<\infty\}}E^{\mu}\left[ Z\circ\theta_{\tau}|\mathcal{F}_\tau \right]=1_{\{\tau<\infty\}} E^{X_{\tau}}\left[ Z \right]. \end{equation*} \]

Proof

For every $A \in \mathcal{F}_\tau$ we get by Theorem \ref{thm:markov property} and the fact that $A\cap \{\tau=t\}\in \mathcal{F}_t$ for every $t$,

\[ \begin{align*} E^{\mu}\left[1_A1_{\{\tau<\infty\}} Z\circ\theta_\tau\right] &=\sum_{t} E^\mu \left[1_A1_{\{\tau=t\}} Z\circ\theta_t\right]\\ &= \sum_{t} E^{\mu}\left[1_A1_{\{\tau=t\}} E^{X_t}\left[Z\right]\right]\\ &=E^\mu\left[1_A1_{\{\tau<\infty\}} E^{X_\tau}\left[Z\right]\right], \end{align*} \]

from which the claim follows.

Markov chains are defined locally: The next step is entirely defined by the place where you are now and the transition probability. The natural question therefore is whether you will come back or not. Given a state $y$ in $S$, define recursively

\[ \begin{align*} \tau^0_y & :=0\\ \tau^1_y & := \inf\left\{ t> \tau^0_y\colon X_t=y \right\}\\ \vdots\\ \tau^k_y & := \inf\left\{ t>\tau^{k-1}_y\colon X_t=y \right\}\\ \vdots \end{align*} \]

Here $\tau^k_y$ denotes the $k$-th return time to the state $y$. Let further

\[ \begin{equation*} \rho_{xy}:=P^{x}\left[ \tau_y<\infty \right] \end{equation*} \]

be the probability that starting from $x$, the Markov chain visits the state $y$ at least once, where we denote the first return time as $\tau_y = \tau^1_y$.

Definition

We say that a time homogeneous Markov chain is

recurrent: if $\rho_{xx}=1$;
transient: if $\rho_{xx}<1$.

Theorem

For any two states $x$ and $y$ in $S$, it holds

\[ \begin{equation*} P^x[\tau_y^k < \infty] = \rho_{xy}\rho_{yy}^{k-1}. \end{equation*} \]

Proof

We show it per induction. Per definition, it holds $P^x[\tau_y^1 < \infty]= \rho_{xy}$. Suppose therefore that the claim holds for every $l=1,\ldots,k-1$ and we show it for $k$.

Define $\tau=\tau_y^{k-1}$ and $Z:=1_{\{\tau_y < \infty \}}$. It follows that $\{ \tau_y^k < \infty \} = \{ \tau_y \circ \theta_\tau < \infty \}$. Furthermore, from $1_{\{\tau<\infty\}}$ bounded and $\mathcal{F}_{\tau}$-measurable and $X_{\tau}=y$ on $\{\tau<\infty\}$, using the strong Markov property it follows that

\[ \begin{align*} P^{x}[\tau_y^k < \infty ] &= P^{x}[\tau_y \circ \theta_\tau < \infty ]\\ &= E^x\left[1_{\{\tau < \infty\}} (Z \circ \theta_\tau) \right]\\ &= E^x\left[1_{\{\tau < \infty\}} E^x\left[Z \circ \theta_\tau| \mathcal{F}_\tau\right]\right]\\ &= E^x\left[1_{\{\tau < \infty\}} E^{X_\tau}\left[Z\right] \right]\\ &= E^x\left[1_{\{\tau < \infty\}} P^y[\tau_y<\infty]\right]\\ &= \rho_{yy} P^x[\tau_y^{k-1} < \infty]\\ &= \rho_{yy} \rho_{xy} \rho_{yy}^{k-2}\\ &= \rho_{xy} \rho_{yy}^{k-1}. \end{align*} \]

Theorem

Let $x$ and $y$ be two states in $S$, where $x$ is recurrent and $\rho_{xy}>0$. Then $y$ is recurrent and $\rho_{yx}=1$.

Proof

Let us first show that $\rho_{yx}=1$. Since $x$ is recurrent, it follows that $\tau_x(\omega)<\infty$ for almost all $\omega \in \Omega$. Hence, for almost all $\omega \in \Omega$ such that $\tau_y(\omega)<\infty$ we get $\tau_x \circ \theta_{\tau_y}(\omega) <\infty$.

Thus with $H:=1_{\{\tau_x=\infty\}}$, the strong markov property, and the fact that $X_{\tau_y}=y$, we get

\[ \begin{align*} 0 & =P^x[\tau_y < \infty, \tau_x \circ \theta_{\tau_y} = \infty]\\ & =E^x\left[1_{\{\tau_y<\infty\}} 1_{\{\tau_x \circ \theta_{\tau_y} = \infty\}} \right]\\ & =E^x\left[1_{\{\tau_y<\infty\}} E^{X_{\tau_y}}\left[H\right] \right]\\ & =E^x\left[1_{\{\tau_y<\infty\}} P^y[\tau_x=\infty]\right] \\ & =\rho_{xy}(1-\rho_{yx}). \end{align*} \]

Since $\rho_{xy} > 0$, it must be that $\rho_{yx}=1$.

Let us finally show that $y$ is recurrent. Let $i$ and $j$ be two states in $S$ and $k$ in$\mathbb{N}$. Then by Chapman-Kolmogorov relation and an induction we get $P_i[X_k=j]=p^k_{ij}$. Since $\rho_{xy}>0$ and $\rho_{yx}>0$, there exist $k_1$ and $k_2$ in $\mathbb{N}$ with $p_{xy}^{k_1}>0$ and $p_{yx}^{k_2}>0$. By the Chapman-Kolmogorov relation we get

\[ p_{yy}^{k_1+t+k_2} \geq p_{yx}^{k_2} p_{xx}^t p_{xy}^{k_1} \]

so that

\[ E^y\left[N_y\right] =p_{yx}^{k_2} E^x\left[N_x\right]p_{xy}^{k_1} =\infty. \]

Recurrence/Transience Theorem implies that $y$ is recurrent.

Definition

A set $C \subseteq S$ is called

closed, if for $x\in C$, $y\in S$ and $\rho_{xy}>0$ it holds $y \in C$,
irreducible, if $\rho_{xy}>0$ for all $x,y\in C$.

Theorem

Suppose $C \subset S$ is finite and closed.

There exists $x\in C$ such that $x$ is recurrent.
If $C$ is irreducible, then all its states are recurrent.

Proof

By contradiction, assume that all states are transient, that is $E^x[N_y] < \infty$ for all $x$ and $y$ in $C$ by the recurrent/transient Theorem. But then, since $C$ is finite we get with Tonelli's theorem

\[ \sum_{y\in C} E^x[N_y] =\sum_{y\in C} \sum_{t} P^x [X_t=y] =\sum_{t}\sum_{y\in C} P^x[X_t=y] =\sum_{t} P^x [ X_t\in C] =\sum_{t} 1=\infty. \]

Indeed, if $P^x [X_t\in C] < 1$, then there is $y$ in $C^c$ with $P^x[X_t=y] >0$, that is $\rho_{xy} >0$. Since $C$ is closed, it follows $y \in C$, a contradiction.
If $x \in C$ is recurrent and $\rho_{xy}>0$, then $y$ is recurrent by Theorem the previous theorem.

Theorem

Let $R:=\{ x \in S \colon \rho_{xx} =1 \}$ be the set of recurrent states. Then $R$ is a disjoint union of closed and irreducible sets.

Proof

Fix $x$ in$ R$ and let $C_x :={ y\in R \colon \rho_{xy} >0 } $.

For $x$, $y$ and $z$ in $R$ it holds $\rho_{xy} \geq \rho_{xz} \rho_{zy}$. Indeed, with $H:=1_{\{\tau_y < \infty\}}$ and the strong Markov property using the same argument as in recurrence Theorem we obtain

\[ \begin{align*} \rho_{xy} & =P^x[\tau_y<\infty]\\ & \geq P^x[\tau_z<\infty, \tau_y \circ \theta_{\tau_z} < \infty]\\ & =E^x[1_{\{\tau_z < \infty\}} E^x[H \circ \theta_{\tau_z} | \mathcal{F}_{\tau_z}]]\\ & =E^x[1_{\{\tau_z < \infty\}}] E^z[1_{\{\tau_y < \infty\}}]]\\ & =\rho_{xz}\rho_{zy}. \end{align*} \]
$C_x$ is irreducible and closed. Indeed, for $y$ in $C_x$ and $\rho_{yz} >0$, since $y$ is in $C_x$ we have $\rho_{xy}>0$ and therefore by the previous step $\rho_{xz} >0$, that is $z$ belongs to $C_x$. Furthermore, for $y$ and $z$ in $C_x$, it implies that $\rho_{xy} >0$, $\rho_{xz}>0$. Recurrence Theorem yields $\rho_{yx}>0$ and therefore $\rho_{yz}>0$.
We now show that $C_x \cap C_y \neq \emptyset$ implies $C_x=C_y$. Let $z$ be in $C_x \cap C_y$ and $y^\prime$ in $C_y$. Therefore $\rho_{xz}>0$, $\rho_{yz}>0$, $\rho_{yy^\prime}>0$ and by recurrence Theorem $\rho_{zy}>0$. This means $\rho_{xy^\prime} >0$ which yields $y^\prime\in C_x$, hence $C_y \subset C_x$. Analogously, $C_x \subset C_y$.

Definition

A measure $\mu$ on is called stationary—or invariant—if

$\mu(y)< \infty$ for all $y \in S$,
$\mu(y)=\sum_{x\in S} \mu(x)p_{xy}$.

If $\mu$ is a probability measure, then $\mu$ is called a stationary distribution.

Remark

Suppose that $\mu$ is a stationary distribution, and let $P^\mu$ the measure such that $X$ is Markov with start distribution $\mu$. Then

\[ \begin{align*} P^\mu[X_1=y] & = \sum_{x\in S}P^\mu[X_1=y | X_0=x] P^\mu[X_0=x]\\ & =\sum_{x\in S} \mu(x) p_{xy}\\ & =\mu(y) \end{align*} \]

By induction, we assume that $P^\mu[X_t=y]=\mu(y)$. Then

\[ \begin{align*} P^\mu[X_{t+1}=y] & =\sum_{z\in S} P^\mu[X_{t+1}=y|X_t=z]P^\mu[X_t=z]\\ & =\sum_{z\in S} \mu(z) p_{zy}\\ & =\mu(y) \end{align*} \]

This shows $P^\mu(X_t=y) = \mu(y)$ for every $t$.

Example: Randomw Walk

Let $X_t:=X_0+\sum_{s=1}^{t} Y_s$ and $\mathcal{F}_t:=\sigma(X_0,\dots,X_t)$. Then

\[ \begin{align*} p_{xy} & = P[X_{t+1}=y | X_t=x]=P[\xi_{t+1}=y-X_t | X_t=x]\\ & =P[\xi_1=y-x]\\ & =f(y-x) \end{align*} \]

where $f \colon S \to [0,1]$ is such that $\sum_{y\in S}f(y)=1$, that is $f(y):=P[\xi_1=y]$. In this case $\mu \equiv 1$ is a stationary measure. Indeed,

\[ \begin{align*} \sum_{x\in S} \mu(x)p_{xy} & = \sum_{x\in S} p_{xy}\\ & = \sum_{x\in S} f(y-x) \\ & =\sum_{x^\prime\in S} f(x^\prime) \\ & =1\\ & =\mu(y) \end{align*} \]

Remark

Let $x\in S$ be transient and $\pi$ a stationary distribution. Then $\pi(x)=0$.

Proof

Exercise.

Theorem

Suppose $x$ is recurrent and $\tau:=\inf\{ t\colon X_t = x \}$. Define

\[ \mu(y)=E^x\left[\sum_{t=0}^{\tau-1}1_{\{X_t=y\}}\right]=\sum_{t} P^x[X_t=y, t<\tau] \]

for all $y\in S$. Then $\mu$ is a stationary measure.

Proof

For $z$ in $S$ and $t$ let $\bar{p}_t(x,z):=P^x[X_t=z,\tau>t]$. We claim that $\sum_{t} \bar{p}_t(x,\cdot)$ is a stationary measure.

For $z\neq x$ we have

\[ \begin{align*} \sum_{y\in S} \bar{p}_t(x,y) p_{yz} & = \sum_{y\in S} P^x[X_t=y, \tau>t] p_{yz} \\ & = \sum_{y\in S} P^x(X_t=y,\tau>t,X_{t+1}=z]\\ & = P^x[\tau>t+1,X_{t+1}=z]\\ & =\bar{p}_{t+1}(x,z) \end{align*} \]

Hence,

\[ \begin{align*} \sum_{y\in S}\sum_{t} \bar{p}_t(x,y)p_{yz} & = \sum_{t}\sum_{y\in S} \bar{p}_t(x,y)p_{yz}\\ & = \sum_{t} \bar{p}_{t+1}(x,z)\\ & = \mu(z) \end{align*} \]

since $\bar{p}_0(x,z)=0$.
For $z=x$ we have

\[ \sum_{y\in S} \bar{p}_t(x,y) p_{yx} = \sum_{y\in S} P^x[X_t=y,\tau>t,X_{t+1}=x] = P^x[\tau=t+1]. \]

Hence,

\[ \sum_{t}\sum_{y\in S} \bar{p}_t(x,y)p_{yz} = \sum_{t} P^x[\tau=t+1] = 1 = \mu(x). \]
Finally we show that $\mu(y)<\infty$ for any $y$ in $S$. We have $\mu(x)=1$, and $\mu p^t=\mu$, showing that $\mu(y)<\infty$ if $p^t_{xy}>0$ for some $t$.
- If $P^x[\tau_y<\infty]=0$, then $\mu(y)=0$.
- If $P^x[\tau_y<\infty]>0$, then by recurrence Theorem we have $P^y[\tau_x<\infty]>0$ so that $p_{yx}^t>0$ for some $t$, hence $\mu(y)<\infty$.