Markov Chains
In this subsection we treat discrete-time Markov processes with a finite or at most countable state space \(S\), also called Markov chains. Recall that we only consider time-homogeneous Markov processes. Defining
\[ \mu_x:=P[X_0=x] \qquad\text{and}\qquad p_{xy}:=P[X_1=y\mid X_0=x], \quad x,y\in S, \]
it follows that \(0\leq \mu_x, p_{xy}\leq 1\) and \(\sum_x \mu_x=1\), as well as \(\sum_y p_{xy}=1\) for every \(x\). In other words, the initial distribution \(\mu:=(\mu_x)\) is a probability vector and the one-step transition probabilities can be collected into the stochastic matrix \(P=(p_{xy})\). For \(B\subseteq S\), it holds
By the Chapman-Kolmogorov relation, we have \(P_t=P_{t-1}P_1\); hence, defining recursively
\[ P^t:=P^{t-1}P, \qquad P^1:=P, \]
we have
\[ P_t=P^t, \]
that is, the \(t\)-step transition probabilities \(p^t_{xy}\) are the entries of the \(t\)-th matrix power of \(P\).
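As a quick numerical sanity check of the identity \(P_t=P^t\), the following sketch (the two-state matrix, the initial distribution and the horizon are illustrative choices, not taken from the text) compares the \(t\)-step probabilities obtained from the matrix power with a Monte Carlo estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state chain: states {0, 1}; each row of P sums to one.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
mu = np.array([0.5, 0.5])          # initial distribution
t = 5

# t-step transition probabilities as the t-th matrix power of P.
Pt = np.linalg.matrix_power(P, t)

# Monte Carlo estimate of P^mu[X_t = y].
n_paths = 100_000
hits = np.zeros(2)
for _ in range(n_paths):
    x = rng.choice(2, p=mu)        # sample X_0 from mu
    for _ in range(t):
        x = rng.choice(2, p=P[x])  # one transition step
    hits[x] += 1

print("mu P^t     :", mu @ Pt)
print("simulation :", hits / n_paths)
```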
In the discrete case the following strong Markov property holds.
Strong Markov Property
Let \(Z\) be a bounded random variable and \(\tau\) a stopping time. Then it holds
\[ E^\mu\big[Z\circ\theta_\tau \,\big|\, \mathcal{F}_\tau\big]=E^{X_\tau}[Z] \qquad \text{on } \{\tau<\infty\}. \]
Proof
For every \(A \in \mathcal{F}_\tau\) we get, by Theorem \ref{thm:markov property} and the fact that \(A\cap \{\tau=t\}\in \mathcal{F}_t\) for every \(t\),
\[ \begin{align*} E^\mu\big[1_A 1_{\{\tau<\infty\}}\, Z\circ\theta_\tau\big] & =\sum_{t} E^\mu\big[1_{A\cap\{\tau=t\}}\, Z\circ\theta_t\big]\\ & =\sum_{t} E^\mu\big[1_{A\cap\{\tau=t\}}\, E^{X_t}[Z]\big]\\ & =E^\mu\big[1_A 1_{\{\tau<\infty\}}\, E^{X_\tau}[Z]\big], \end{align*} \]
from which the claim follows.
Markov chains are defined locally: the next step depends only on the current state and the transition probabilities. The natural question is therefore whether the chain will ever return to its starting point. Given a state \(y\) in \(S\), define recursively
\[ \tau^0_y:=0, \qquad \tau^k_y:=\inf\big\{ t>\tau^{k-1}_y \colon X_t=y \big\}, \quad k\geq 1. \]
Here \(\tau^k_y\) denotes the \(k\)-th return time to the state \(y\). Let further
\[ \rho_{xy}:=P^x[\tau_y<\infty] \]
be the probability that, starting from \(x\), the Markov chain visits the state \(y\) at least once, where we write \(\tau_y:=\tau^1_y\) for the first return time.
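The probabilities \(\rho_{xy}\) can be approximated by simulation. The sketch below (the three-state chain, the sample size and the cutoff horizon are assumptions made only for the example) starts many paths at \(x\) and records whether they hit \(y\) within a finite horizon; because of the truncation it only yields a lower bound for \(\rho_{xy}=P^x[\tau_y<\infty]\).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical three-state chain.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.1, 0.9]])

def estimate_rho(x, y, n_paths=20_000, horizon=200):
    """Monte Carlo lower bound for rho_{xy} = P^x[tau_y < infinity],
    truncating each path after `horizon` steps."""
    hits = 0
    for _ in range(n_paths):
        state = x
        for _ in range(horizon):
            state = rng.choice(3, p=P[state])
            if state == y:          # first visit to y after time 0
                hits += 1
                break
    return hits / n_paths

print(estimate_rho(0, 2))
```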
Definition
Given a time-homogeneous Markov chain, we say that a state \(x\in S\) is
- recurrent, if \(\rho_{xx}=1\);
- transient, if \(\rho_{xx}<1\).
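Two simple illustrations (these concrete chains are not part of the text): for the chain on \(S=\mathbb{N}\) with \(p_{x,x+1}=1\) no state is ever revisited, while the two-state chain with \(p_{01}=p_{10}=1\) returns to its starting point after exactly two steps, so that
\[ \rho_{xx}=P^x[\tau_x<\infty]=0 \quad\text{(every state transient)}, \qquad \rho_{00}=P^0[\tau_0=2]=1 \quad\text{(both states recurrent)}. \]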
Theorem
For any two states \(x\) and \(y\) in \(S\) and every \(k\in\mathbb{N}\), it holds
\[ P^x\big[\tau^k_y<\infty\big]=\rho_{xy}\,\rho_{yy}^{k-1}. \]
Proof
We argue by induction over \(k\). By definition, it holds \(P^x[\tau_y^1 < \infty]= \rho_{xy}\), which is the case \(k=1\). Suppose therefore that the claim holds for every \(l=1,\ldots,k-1\); we show it for \(k\).
Define \(\tau:=\tau_y^{k-1}\) and \(Z:=1_{\{\tau_y < \infty \}}\). It follows that \(\{ \tau_y^k < \infty \} = \{\tau<\infty\}\cap\{ \tau_y \circ \theta_\tau < \infty \}\). Furthermore, since \(1_{\{\tau<\infty\}}\) is bounded and \(\mathcal{F}_{\tau}\)-measurable and \(X_{\tau}=y\) on \(\{\tau<\infty\}\), the strong Markov property yields
\[ \begin{align*} P^x\big[\tau_y^k<\infty\big] & =E^x\big[1_{\{\tau<\infty\}}\, Z\circ\theta_\tau\big]\\ & =E^x\big[1_{\{\tau<\infty\}}\, E^{X_\tau}[Z]\big]\\ & =E^x\big[1_{\{\tau<\infty\}}\big]\, E^y\big[1_{\{\tau_y<\infty\}}\big]\\ & =\rho_{xy}\,\rho_{yy}^{k-2}\,\rho_{yy}=\rho_{xy}\,\rho_{yy}^{k-1}, \end{align*} \]
where the induction hypothesis was used in the last line.
Theorem
Let \(x\) and \(y\) be two states in \(S\), where \(x\) is recurrent and \(\rho_{xy}>0\). Then \(y\) is recurrent and \(\rho_{yx}=1\).
Proof
Let us first show that \(\rho_{yx}=1\). Since \(x\) is recurrent, the previous theorem gives \(P^x[\tau^k_x<\infty]=\rho_{xx}^k=1\) for every \(k\), hence \(P^x\)-almost surely the chain visits \(x\) at arbitrarily large times. In particular, for \(P^x\)-almost all \(\omega\) such that \(\tau_y(\omega)<\infty\) we get \(\tau_x \circ \theta_{\tau_y}(\omega) <\infty\).
Thus, with \(H:=1_{\{\tau_x=\infty\}}\), the strong Markov property, and the fact that \(X_{\tau_y}=y\) on \(\{\tau_y<\infty\}\), we get
\[ 0=E^x\big[1_{\{\tau_y<\infty\}}\, H\circ\theta_{\tau_y}\big] =E^x\big[1_{\{\tau_y<\infty\}}\, E^y[H]\big] =\rho_{xy}\,P^y[\tau_x=\infty] =\rho_{xy}\,(1-\rho_{yx}). \]
Since \(\rho_{xy} > 0\), it must be that \(\rho_{yx}=1\).
Let us finally show that \(y\) is recurrent. Let \(i\) and \(j\) be two states in \(S\) and \(k\) in \(\mathbb{N}\). Then by the Chapman-Kolmogorov relation and an induction we get \(P^i[X_k=j]=p^k_{ij}\). Since \(\rho_{xy}>0\) and \(\rho_{yx}>0\), there exist \(k_1\) and \(k_2\) in \(\mathbb{N}\) with \(p_{xy}^{k_1}>0\) and \(p_{yx}^{k_2}>0\). By the Chapman-Kolmogorov relation we get
\[ p^{k_2+t+k_1}_{yy} \geq p^{k_2}_{yx}\, p^{t}_{xx}\, p^{k_1}_{xy} \quad \text{for every } t, \]
so that
\[ \sum_{t} p^{t}_{yy} \geq p^{k_2}_{yx}\, p^{k_1}_{xy}\sum_{t} p^{t}_{xx} = \infty, \]
since \(x\) is recurrent and therefore \(\sum_t p^t_{xx}=E^x[N_x]=\infty\). The recurrence/transience theorem implies that \(y\) is recurrent.
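The key inequality can be checked numerically on any stochastic matrix; in the sketch below (the random matrix and the exponents are arbitrary choices, not from the text) both sides are computed from matrix powers.

```python
import numpy as np

rng = np.random.default_rng(3)

# Random 4x4 stochastic matrix (rows normalized to sum to one).
P = rng.random((4, 4))
P /= P.sum(axis=1, keepdims=True)

x, y = 0, 1
k1, k2, t = 2, 3, 5   # arbitrary exponents

lhs = np.linalg.matrix_power(P, k2 + t + k1)[y, y]
rhs = (np.linalg.matrix_power(P, k2)[y, x]
       * np.linalg.matrix_power(P, t)[x, x]
       * np.linalg.matrix_power(P, k1)[x, y])
print(lhs >= rhs)   # True: the path y -> x -> x -> y is one contribution to p^{k2+t+k1}_{yy}
```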
Definition
A set \(C \subseteq S\) is called
- closed, if for every \(x\in C\) and \(y\in S\) with \(\rho_{xy}>0\) it holds \(y \in C\),
- irreducible, if \(\rho_{xy}>0\) for all \(x,y\in C\).
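As a concrete illustration (this three-state chain is an assumption chosen for the example, not taken from the text), consider \(S=\{1,2,3\}\) with transition matrix
\[ P=\begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} & 0\\ 1 & 0 & 0\\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \end{pmatrix}. \]
Here \(\rho_{12},\rho_{21},\rho_{31}>0\) while \(\rho_{13}=\rho_{23}=0\): the set \(\{1,2\}\) is closed and irreducible, \(\{3\}\) is irreducible (since \(\rho_{33}=\tfrac12>0\)) but not closed, and \(S\) itself is closed but not irreducible.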
Theorem
Suppose \(C \subset S\) is finite and closed.
- There exists \(x\in C\) such that \(x\) is recurrent.
- If \(C\) is irreducible, then all its states are recurrent.
Proof
-
By contradiction, assume that all states in \(C\) are transient; by the recurrence/transience theorem this means \(E^x[N_y] < \infty\) for all \(x\) and \(y\) in \(C\), where \(N_y:=\sum_{t}1_{\{X_t=y\}}\) denotes the number of visits to \(y\). But then, since \(C\) is finite, Tonelli's theorem gives
\[ \sum_{y\in C} E^x[N_y] =\sum_{y\in C} \sum_{t} P^x [X_t=y] =\sum_{t}\sum_{y\in C} P^x[X_t=y] =\sum_{t} P^x [ X_t\in C] =\sum_{t} 1=\infty. \]
Indeed, if \(P^x [X_t\in C] < 1\), then there is \(y\) in \(C^c\) with \(P^x[X_t=y] >0\), that is, \(\rho_{xy} >0\). Since \(C\) is closed, it follows that \(y \in C\), a contradiction. On the other hand, the left-hand side is a finite sum of finite terms, which is the desired contradiction.
-
By the first part, there exists a recurrent state \(x \in C\). If \(C\) is irreducible, then \(\rho_{xy}>0\) for every \(y\in C\), hence every \(y \in C\) is recurrent by the previous theorem.
Theorem
Let \(R:=\{ x \in S \colon \rho_{xx} =1 \}\) be the set of recurrent states. Then \(R\) is a disjoint union of closed and irreducible sets.
Proof
Fix \(x\) in \(R\) and let \(C_x :=\{ y\in R \colon \rho_{xy} >0 \}\).
-
For \(x\), \(y\) and \(z\) in \(R\) it holds \(\rho_{xy} \geq \rho_{xz} \rho_{zy}\). Indeed, with \(H:=1_{\{\tau_y < \infty\}}\) and the strong Markov property, arguing as in the proof of the recurrence theorem we obtain
\[ \begin{align*} \rho_{xy} & =P^x[\tau_y<\infty]\\ & \geq P^x[\tau_z<\infty,\ \tau_y \circ \theta_{\tau_z} < \infty]\\ & =E^x\big[1_{\{\tau_z < \infty\}}\, E^x[H \circ \theta_{\tau_z} \mid \mathcal{F}_{\tau_z}]\big]\\ & =E^x\big[1_{\{\tau_z < \infty\}}\big]\, E^z\big[1_{\{\tau_y < \infty\}}\big]\\ & =\rho_{xz}\rho_{zy}. \end{align*} \]
-
\(C_x\) is irreducible and closed. Indeed, let \(y\) be in \(C_x\) and \(z\) in \(S\) with \(\rho_{yz} >0\). Since \(y\) is recurrent, the recurrence theorem shows that \(z\) is recurrent, and since \(\rho_{xy}>0\), the previous step gives \(\rho_{xz}\geq \rho_{xy}\rho_{yz} >0\), that is, \(z\) belongs to \(C_x\). Furthermore, for \(y\) and \(z\) in \(C_x\) we have \(\rho_{xy} >0\) and \(\rho_{xz}>0\). The recurrence theorem yields \(\rho_{yx}>0\), and therefore \(\rho_{yz}\geq \rho_{yx}\rho_{xz}>0\).
-
We now show that \(C_x \cap C_y \neq \emptyset\) implies \(C_x=C_y\). Let \(z\) be in \(C_x \cap C_y\) and \(y^\prime\) in \(C_y\). Then \(\rho_{xz}>0\), \(\rho_{yz}>0\), \(\rho_{yy^\prime}>0\), and by the recurrence theorem \(\rho_{zy}>0\). By the first step this means \(\rho_{xy^\prime}\geq \rho_{xz}\rho_{zy}\rho_{yy^\prime} >0\), which yields \(y^\prime\in C_x\), hence \(C_y \subset C_x\). Analogously, \(C_x \subset C_y\). Finally, \(\rho_{xx}=1\) gives \(x\in C_x\), hence \(R=\bigcup_{x\in R}C_x\), and by the last step two such sets are either equal or disjoint.
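For a finite state space the decomposition can be computed directly from the reachability relation: by the theorems above, a state \(x\) is recurrent precisely when every state reachable from \(x\) leads back to \(x\), and the recurrent states then split into the classes \(C_x\). The sketch below (the four-state matrix and the helper names are illustrative assumptions, not from the text) implements this.

```python
import numpy as np

# Hypothetical transition matrix on S = {0, 1, 2, 3}:
# {0, 1} and {2} are closed irreducible classes, 3 is transient.
P = np.array([[0.5, 0.5, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.3, 0.0, 0.3, 0.4]])
n = P.shape[0]

# reach[x, y] = True iff p^t_{xy} > 0 for some t >= 1.
adj = (P > 0).astype(int)
reach = adj.copy()
for _ in range(n):
    reach = np.minimum(reach + reach @ adj, 1)
reach = reach.astype(bool)

# In the finite setting, x is recurrent iff every state reachable
# from x can reach x back.
recurrent = [x for x in range(n)
             if all(reach[y, x] for y in range(n) if reach[x, y])]

# Group the recurrent states into the classes C_x = {y recurrent : rho_{xy} > 0}.
classes = []
for x in recurrent:
    cls = frozenset(y for y in recurrent if reach[x, y] or y == x)
    if cls not in classes:
        classes.append(cls)

print("recurrent states:", recurrent)
print("closed irreducible classes:", [sorted(c) for c in classes])
```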
Definition
A measure \(\mu\) on \(S\) is called stationary (or invariant) if
- \(\mu(y)< \infty\) for all \(y \in S\),
- \(\mu(y)=\sum_{x\in S} \mu(x)p_{xy}\) for all \(y \in S\).
If \(\mu\) is a probability measure, then \(\mu\) is called a stationary distribution.
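As a worked instance (the two-state chain is an illustrative choice, not from the text), take \(S=\{1,2\}\) with \(p_{12}=a\) and \(p_{21}=b\) for some \(a,b\in(0,1]\). Solving \(\mu(y)=\sum_{x}\mu(x)p_{xy}\) together with \(\mu(1)+\mu(2)=1\) yields the stationary distribution
\[ \mu(1)=\frac{b}{a+b}, \qquad \mu(2)=\frac{a}{a+b}, \]
as one checks directly: \(\mu(1)(1-a)+\mu(2)\,b=\frac{b(1-a)+ab}{a+b}=\frac{b}{a+b}=\mu(1)\).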
Remark
Suppose that \(\mu\) is a stationary distribution, and let \(P^\mu\) be the measure under which \(X\) is a Markov chain with initial distribution \(\mu\). Then
\[ P^\mu[X_0=y]=\mu(y) \quad \text{for every } y\in S. \]
By induction, we assume that \(P^\mu[X_t=y]=\mu(y)\). Then
\[ P^\mu[X_{t+1}=y]=\sum_{x\in S}P^\mu[X_t=x]\,p_{xy}=\sum_{x\in S}\mu(x)\,p_{xy}=\mu(y). \]
This shows \(P^\mu[X_t=y] = \mu(y)\) for every \(t\).
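Numerically, a stationary distribution of a finite chain can be found by iterating \(\mu\mapsto\mu P\) until it stabilizes; the sketch below (the three-state matrix and the tolerance are arbitrary illustration choices, not from the text) then confirms that the fixed point is preserved at every later step, as in the remark.

```python
import numpy as np

# Hypothetical three-state stochastic matrix.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

# Power iteration for the stationary distribution: mu = mu P.
mu = np.full(3, 1 / 3)
for _ in range(10_000):
    new = mu @ P
    if np.max(np.abs(new - mu)) < 1e-12:
        break
    mu = new

print("stationary distribution:", mu)
print("mu P  :", mu @ P)                              # equals mu up to the tolerance
print("mu P^5:", mu @ np.linalg.matrix_power(P, 5))   # still mu
```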
Example: Random Walk
Let \(X_t:=X_0+\sum_{s=1}^{t} \xi_s\), where \((\xi_s)_{s\geq 1}\) are i.i.d. and independent of \(X_0\), and let \(\mathcal{F}_t:=\sigma(X_0,\dots,X_t)\). Then \(X\) is a Markov chain with transition probabilities
\[ p_{xy}=f(y-x), \]
where \(f \colon S \to [0,1]\) is such that \(\sum_{y\in S}f(y)=1\), that is, \(f(y):=P[\xi_1=y]\). In this case \(\mu \equiv 1\) is a stationary measure. Indeed,
\[ \sum_{x\in S}\mu(x)\,p_{xy}=\sum_{x\in S}f(y-x)=\sum_{z\in S}f(z)=1=\mu(y). \]
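For the simple random walk on \(S=\mathbb{Z}\) (a concrete choice of step distribution, made here only for illustration) with \(f(1)=p\) and \(f(-1)=1-p\), the stationarity of \(\mu\equiv 1\) reads
\[ \sum_{x\in\mathbb{Z}}\mu(x)\,p_{xy}=\mu(y-1)\,f(1)+\mu(y+1)\,f(-1)=p+(1-p)=1=\mu(y). \]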
Remark
Let \(x\in S\) be transient and \(\pi\) a stationary distribution. Then \(\pi(x)=0\).
Proof
Exercise.
Theorem
Suppose \(x\) is recurrent and let \(\tau:=\inf\{ t\geq 1\colon X_t = x \}\) be the first return time to \(x\). Define
\[ \mu(y):=E^x\Big[\sum_{t=0}^{\tau-1} 1_{\{X_t=y\}}\Big]=\sum_{t\geq 0} P^x[X_t=y,\ \tau>t] \]
for all \(y\in S\). Then \(\mu\) is a stationary measure.
Proof
For \(z\) in \(S\) and \(t\) let \(\bar{p}_t(x,z):=P^x[X_t=z,\tau>t]\). We claim that \(\sum_{t} \bar{p}_t(x,\cdot)\) is a stationary measure.
-
For \(z\neq x\) we have
\[ \begin{align*} \sum_{y\in S} \bar{p}_t(x,y)\, p_{yz} & = \sum_{y\in S} P^x[X_t=y,\ \tau>t]\, p_{yz} \\ & = \sum_{y\in S} P^x[X_t=y,\ \tau>t,\ X_{t+1}=z]\\ & = P^x[\tau>t,\ X_{t+1}=z]= P^x[\tau>t+1,\ X_{t+1}=z]\\ & =\bar{p}_{t+1}(x,z), \end{align*} \]
where the third line uses \(z\neq x\). Hence,
\[ \begin{align*} \sum_{y\in S}\sum_{t} \bar{p}_t(x,y)\,p_{yz} & = \sum_{t}\sum_{y\in S} \bar{p}_t(x,y)\,p_{yz}\\ & = \sum_{t} \bar{p}_{t+1}(x,z)\\ & = \mu(z), \end{align*} \]
where we interchanged the sums by Tonelli's theorem and used that \(\bar{p}_0(x,z)=P^x[X_0=z,\ \tau>0]=0\) for \(z\neq x\).
-
For \(z=x\) we have
\[ \sum_{y\in S} \bar{p}_t(x,y)\, p_{yx} = \sum_{y\in S} P^x[X_t=y,\ \tau>t,\ X_{t+1}=x] = P^x[\tau=t+1]. \]
Hence,
\[ \sum_{t}\sum_{y\in S} \bar{p}_t(x,y)\,p_{yx} = \sum_{t} P^x[\tau=t+1] = P^x[\tau<\infty]=1 = \mu(x), \]
since \(x\) is recurrent.
-
Finally we show that \(\mu(y)<\infty\) for any \(y\) in \(S\). We have \(\mu(x)=1\) and \(\mu P^t=\mu\) for every \(t\), hence \(1=\mu(x)=\sum_{z\in S}\mu(z)\,p^t_{zx}\geq \mu(y)\,p^t_{yx}\), showing that \(\mu(y)<\infty\) if \(p^t_{yx}>0\) for some \(t\).
- If \(P^x[\tau_y<\infty]=0\), then \(\mu(y)=0\).
- If \(P^x[\tau_y<\infty]>0\), then by the recurrence theorem we have \(P^y[\tau_x<\infty]>0\), so that \(p_{yx}^t>0\) for some \(t\), hence \(\mu(y)<\infty\).
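The construction in the last theorem can be checked by simulation: the sketch below (the irreducible three-state chain and the sample size are assumptions made for the example, not from the text) estimates \(\mu(y)=E^x\big[\sum_{t=0}^{\tau-1}1_{\{X_t=y\}}\big]\) from excursions started at \(x\) and compares \(\mu P\) with \(\mu\); in particular the estimate returns \(\mu(x)=1\), as in the proof.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical irreducible three-state chain (every state is recurrent).
P = np.array([[0.2, 0.5, 0.3],
              [0.6, 0.1, 0.3],
              [0.3, 0.3, 0.4]])
x = 0

# Estimate mu(y) = E^x[ number of visits to y strictly before returning to x ].
n_excursions = 50_000
mu = np.zeros(3)
for _ in range(n_excursions):
    mu[x] += 1                       # the visit at time 0
    state = rng.choice(3, p=P[x])
    while state != x:                # the excursion ends at the return time tau
        mu[state] += 1
        state = rng.choice(3, p=P[state])
mu /= n_excursions

print("estimated mu :", mu)
print("mu P         :", mu @ P)      # approximately equal to mu
```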