Many natural phenomena are stochastic — they involve randomness in their evolution over time.
In classical physics, this randomness may arise from our lack of knowledge about microscopic details. For example, in a gas, we do not know the precise positions and velocities of each particle, so their collisions with the container walls appear random.
In quantum mechanics, stochasticity is even more intrinsic. The fundamental objects are probability amplitudes, and outcomes are inherently probabilistic.
To describe these systems effectively, we use a coarse-grained probabilistic description, which tracks the likelihood of different outcomes rather than precise trajectories. One of the key tools in this approach is the master equation.
8.1 The Master Equation
Consider a system in microstate \(i\) with energy \(E_i\). It can transition to neighboring microstates \(j\), where the energy difference \(|E_j - E_i|\) is small (within \(\delta E\)).
Let \(\nu_{ij}\) be the rate at which the system jumps from state \(i\) to state \(j\). Over an infinitesimal time interval \(dt\), the probability \(p_i\) changes as:
\[
\frac{dp_i}{dt} = \sum_{j \neq i} \left( \nu_{ji}\, p_j - \nu_{ij}\, p_i \right)
\]
The first term counts jumps into state \(i\) from other states; the second counts jumps out of state \(i\). This is a linear first-order differential equation for the vector of probabilities \(\{p_i\}\), known as the master equation.
Alternatively, in matrix form:
\[
\frac{d\mathbf{p}}{dt} = W \mathbf{p}
\]
where \(W\) is the rate matrix with entries:
\(W_{ij} = \nu_{ji}\) for \(i \neq j\),
\(W_{ii} = -\sum_{j \neq i} \nu_{ij}\).
This structure ensures probability conservation: each column of \(W\) sums to zero, so the total probability \(\sum_i p_i = 1\) remains constant in time.
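As a concrete illustration, here is a minimal Python sketch (the three-state system and its jump rates are invented for the example) that builds the rate matrix \(W\) from a table of rates \(\nu_{ij}\), checks that each column sums to zero, and evolves \(\mathbf{p}(t) = e^{Wt}\,\mathbf{p}(0)\):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical jump rates: nu[i, j] = rate from state i to state j
# (three states chosen arbitrarily for illustration)
nu = np.array([[0.0, 1.0, 0.5],
               [0.2, 0.0, 0.3],
               [0.4, 0.7, 0.0]])

# Rate matrix: W[i, j] = nu[j, i] for i != j, W[i, i] = -sum_j nu[i, j]
W = nu.T - np.diag(nu.sum(axis=1))

# Each column of W sums to zero, guaranteeing probability conservation
assert np.allclose(W.sum(axis=0), 0.0)

# Evolve p(t) = exp(Wt) p(0) from a state localized in state 0
p0 = np.array([1.0, 0.0, 0.0])
for t in [0.0, 0.5, 2.0, 10.0]:
    p = expm(W * t) @ p0
    print(f"t = {t:5.1f}: p = {p.round(4)}, total = {p.sum():.6f}")
```

The total probability printed at each time stays equal to 1, as the column-sum property guarantees.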
An aside on entropy production
The master equation is first order in time and lacks time-reversal symmetry, so it describes an irreversible process. This irreversibility arises from the coarse-graining procedure, which throws away information about the underlying microphysics; that microphysics is described by Newton's equations, which are time-reversible. Only by discarding this information is the entropy allowed to increase, as the second law of thermodynamics requires for an irreversible process. Consequently, the increase of entropy is linked to our knowledge about the system rather than to anything it is doing internally, in a way that may appear dubious. Can macroscopic and reproducible phenomena such as heat flow really depend on how we handle information? Perhaps yes, since the division between work and heat is itself somewhat arbitrary: were we able to track all the particle positions, there would be no need to speak of heat energy or heat flow at all.
8.2 From the Master Equation to the Diffusion Equation
Now consider the situation where the state index \(i\) corresponds to a position in space, \(x_i = ia\), i.e. a one-dimensional lattice with lattice spacing \(a\), and transitions occur only between neighboring lattice sites, each at rate \(\nu\). The master equation then reads
\[
\frac{dp_i}{dt} = \nu \left( p_{i+1} + p_{i-1} - 2 p_i \right)
\]
Treating \(p\) as a smooth function of \(x\) and Taylor expanding to second order in \(a\), \(p_{i \pm 1} \approx p \pm a\, \partial_x p + \tfrac{1}{2} a^2\, \partial_x^2 p\), and defining the diffusion constant \(D = \nu a^2\), we obtain the diffusion equation:
\[
\frac{\partial p}{\partial t} = D \frac{\partial^2 p}{\partial x^2}
\]
The diffusion equation describes the evolution of the probability density of a particle, with diffusion constant (sometimes called diffusivity) \(D = \nu a^2\). The dimensions of \(D\) are \([\text{length}]^2/[\text{time}]\). Typically, after taking the continuum limit, we set the lattice spacing \(a\) to 1. The diffusion equation can also describe many non-interacting diffusing particles: in this case, we replace \(p\) with \(\rho\), the density or concentration of particles, and use the normalization \(\int dx \, \rho = M\), where \(M\) is the number of particles.
The diffusion equation, much like the master equation from which it originates, explicitly violates time-reversal symmetry, thus permitting entropy to increase.
The solution of the diffusion equation for an initial condition where the particle is initially localized at the origin (formally, \(p(x,0) = \delta(x)\)) is a Gaussian:
\[
p(x,t) = (4 \pi D t)^{-1/2} \exp\left[-\frac{x^2}{4 D t}\right]
\]
We explicitly see the arrow of time by examining this Gaussian solution at various times \(t\). As \(t\) increases, the Gaussian “bell-shaped” curve spreads out. Its width grows according to \(\langle x^2 \rangle^{1/2} \sim t^{1/2}\). This is known as “diffusive scaling,” and it implies that, after time \(t\), a particle will typically be found at a distance roughly proportional to \(t^{1/2}\) from its starting point. Conversely, exploring a region of size \(L\) typically requires a time of order \(O(L^2)\).
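A quick numerical check of diffusive scaling (a sketch; the hop rate, lattice spacing, and walker count are arbitrary choices for the example) is to simulate many independent walkers hopping on the lattice and compare \(\langle x^2 \rangle\) with the prediction \(2Dt\) of the Gaussian solution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed parameters for illustration
nu = 1.0        # hop rate to each neighboring site
a = 1.0         # lattice spacing
D = nu * a**2   # diffusion constant, as in the text
dt = 0.01       # time step, small compared with 1/nu
walkers = 20_000
steps = 1000

x = np.zeros(walkers)
for n in range(1, steps + 1):
    # In each step a walker hops right or left with probability nu*dt each
    r = rng.random(walkers)
    x += a * ((r < nu * dt).astype(float) - (r > 1 - nu * dt).astype(float))
    if n % 250 == 0:
        t = n * dt
        print(f"t = {t:5.2f}: <x^2> = {np.mean(x**2):.3f}, 2Dt = {2*D*t:.3f}")
```

The two printed columns agree to within statistical error, confirming \(\langle x^2 \rangle = 2Dt\) for the Gaussian solution.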
The evolution of the solution to the 1d diffusion equation in the spatial region \(x \in [0,1]\) as a function of time is shown in the movie below. The diffusion constant is \(D=0.01\). The movie corresponds to a particle initialised at \(x=0.5\). One sees how the probability density spreads out over the range as time increases. This can be used to model the diffusion of particles down a concentration gradient, as you will see in the next part of the course.
Python code to generate the movie:

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Parameters
L = 1.0    # Length of the domain
T = 1.0    # Total time
nx = 400   # Number of spatial points
nt = 2000  # Number of time steps
D = 0.1    # Diffusion coefficient

# Discretization
dx = L / (nx - 1)
dt = T / nt

# Stability condition auto-adjust
if D * dt / dx**2 > 0.5:
    print("Adjusting dt and nt to satisfy stability condition...")
    dt = 0.4 * dx**2 / D
    nt = int(T / dt)
    dt = T / nt  # Recalculate dt exactly

nt = int(nt / 50)  # Artificial slowdown for animation
print(f"Using dt = {dt:.4e}, nt = {nt}")

# Initialize x and u
x = np.linspace(0, L, nx)

# Initial condition: smooth narrow Gaussian
sigma = 0.01
u = np.exp(-(x - L/2)**2 / (2 * sigma**2))

# Normalize initial condition
u /= np.sum(u) * dx

# Setup figure
plt.rcParams.update({"text.usetex": True, "font.family": "serif"})
fig, ax = plt.subplots()
line, = ax.plot(x, u)

# Compute peak height for setting y-axis
peak_height = 1 / (np.sqrt(2 * np.pi) * sigma)
ax.set_ylim(0, peak_height * 1.05)
ax.set_xlabel(r'$x$', fontsize=20)
ax.set_ylabel(r'$p(x,t)$', fontsize=20)
ax.tick_params(axis='both', which='major', labelsize=14)

# Tiny time counter text
time_text = ax.text(0.85, 0.05, '', transform=ax.transAxes,
                    fontsize=10, verticalalignment='bottom')

# Function to update the plot
def update(frame):
    global u
    unew = np.copy(u)
    unew[1:-1] = u[1:-1] + D * dt / dx**2 * (u[2:] - 2*u[1:-1] + u[:-2])
    u = unew
    # Normalize at every step
    u /= np.sum(u) * dx
    line.set_ydata(u)
    current_time = frame * dt
    time_text.set_text(r'$t=%.4f$' % current_time)
    return line, time_text

ani = animation.FuncAnimation(fig, update, frames=nt, interval=100, blit=True)

# Save the animation as a movie
Writer = animation.writers['ffmpeg']
writer = Writer(fps=15, metadata=dict(artist='Me'), bitrate=1800)
ani.save("../Movies/diffusion_evolution.mp4", writer=writer)
plt.show()
```
8.3 Consequences of time reversal symmetry
As we have seen, by introducing a type of coarse graining, the master equation violates the time-reversal symmetry of the underlying Newtonian dynamics. Remarkably, however, the fact that the underlying microphysics is time-reversal symmetric has several deep consequences which survive the coarse-graining procedure. These results are among the cornerstones of nonequilibrium thermodynamics.
8.3.1 Detailed balance
Recall from year 2 Statistical Mechanics that for an isolated system in equilibrium, the principle of equal a priori probabilities of microstates holds, so \(p_i^{eq} = p_j^{eq}\) for accessible microstates. Moreover, the time-reversal symmetry of the underlying microscopic dynamics implies that the transition rates between such microstates are symmetric, \(\nu_{ij} = \nu_{ji}\). Therefore
\[
\nu_{ij} p_i^{eq} = \nu_{ji} p_j^{eq}
\]
Hence, on average, the actual rate of quantum jumps from \(i\) to \(j\) (the left-hand side) is the same as from \(j\) to \(i\). This is a stronger statement than the master equation, which asserts only that there is overall balance between the rate of jumping into and out of state \(i\) in equilibrium. The result above is known as the principle of detailed balance.
This principle is powerful because it applies not only to individual states but also to any grouping of states.
Exercise: Show that for two groups of states, \(A\) and \(B\), the overall rate of transitions from group \(A\) to group \(B\) is balanced, in equilibrium, by those from \(B\) to \(A\):
\[
\nu_{AB} p_A^{eq} = \nu_{BA} p_B^{eq}
\]
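One way to see this (a sketch of the argument): define the group rate \(\nu_{AB}\) as the probability-weighted average of the individual rates, then sum the single-state detailed balance relation over all pairs \(i \in A\), \(j \in B\):
\[
\nu_{AB} \equiv \frac{1}{p_A^{eq}} \sum_{i \in A} \sum_{j \in B} \nu_{ij}\, p_i^{eq},
\qquad
\nu_{AB}\, p_A^{eq}
= \sum_{i \in A} \sum_{j \in B} \nu_{ij}\, p_i^{eq}
= \sum_{i \in A} \sum_{j \in B} \nu_{ji}\, p_j^{eq}
= \nu_{BA}\, p_B^{eq}.
\]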
Hence, detailed balance arguments can be extended to subsystems within a large isolated system, and even to systems that are not isolated. However, in such cases, the principle is far from obvious, because once states are grouped together:
\[
\nu_{AB} \ne \nu_{BA}, \quad p_A \ne p_B
\]
(This can be easily demonstrated, for example, by considering two groups that contain different numbers of states with similar energies.) Nonetheless, the detailed balance relation holds, in equilibrium, in the general form above.
8.3.2 Computer simulation
In computer simulation, good results will be obtained if one accurately follows the microscopic equations of motion. This is the molecular dynamics (MD) method, which we now outline.
Molecular dynamics
Molecular Dynamics (MD) involves a system of classical particles interacting through specified interparticle forces. The motion of these particles is determined by numerically integrating Newton’s equations of motion. In MD simulations, averages of state variables are obtained as time averages over trajectories in phase space. Typically, the forces acting between particles are conservative, ensuring that the total energy \(E\) remains constant. This conservation implies that the motion is restricted to a \((2dN - 1)\)-dimensional surface in phase space, denoted by \(\Gamma(E)\).
A central aspect of MD is the averaging of observables. For a given observable \(A\), its average is computed as the time average along the trajectory. Mathematically, this is expressed as:
\[
\langle A \rangle = \frac{1}{\tau} \int_0^{\tau} A(t)\, dt
\]
This formula represents the integral of the observable over a time interval \(\tau\), normalized by the length of that interval.
The practical steps of an MD simulation start with generating an initial random configuration of particle positions \(\{ \mathbf{r}_i \}\) and momenta \(\{ \mathbf{p}_i \}\). The system’s equations of motion are then iteratively solved using a suitable algorithm to allow it to reach equilibrium. After equilibration, a production run is performed over many time steps to collect meaningful data. Finally, relevant averages, such as pressure or kinetic energy, are calculated from the collected data.
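As a minimal sketch of such a simulation (assuming the standard velocity Verlet integrator as the "suitable algorithm", and, for brevity, a single particle in a harmonic well rather than a genuinely interacting system), the following Python code conserves the total energy and accumulates a time average of the kinetic energy:

```python
import numpy as np

# Velocity Verlet for a 1d harmonic oscillator: F(x) = -k x
# (a single particle for brevity; a real MD code sums pair forces)
k, m, dt, nsteps = 1.0, 1.0, 0.01, 100_000
x, v = 1.0, 0.0

kinetic_sum = 0.0
f = -k * x
for step in range(nsteps):
    v += 0.5 * dt * f / m   # half-kick with old force
    x += dt * v             # drift
    f = -k * x              # recompute force at new position
    v += 0.5 * dt * f / m   # half-kick with new force
    kinetic_sum += 0.5 * m * v**2

E = 0.5 * m * v**2 + 0.5 * k * x**2
print(f"Total energy after {nsteps} steps: {E:.6f} (initial value 0.5)")
print(f"Time-averaged kinetic energy: {kinetic_sum / nsteps:.4f}")
```

The energy drift stays small over many oscillation periods, which is why symplectic integrators of this type are the standard choice in MD.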
Monte Carlo
However, to obtain the equilibrium properties of the system, it may be much faster to use a dynamics which is nothing like the actual equations of motion.
At first sight, this looks very dangerous; however, if one can prove that the artificial dynamics obey the principle of detailed balance with respect to the required equilibrium distribution, then it is (almost) guaranteed that the steady state found by the simulation is the true equilibrium state.
The best known example is the Monte Carlo method, in which the dynamical algorithm consists of random jumps. The jump rates \(\nu_{AB}\) for all pairs of states \((A, B)\) take the Metropolis form:
\[
\nu_{AB} =
\begin{cases}
\nu_0 & \text{if } E_B \le E_A \\
\nu_0\, e^{-\beta (E_B - E_A)} & \text{if } E_B > E_A
\end{cases}
\]
where \(\nu_0\) is a constant attempt rate.
Exercise: Show that this gives the canonical distribution in steady state.
Solution
To show that this jump-rate rule gives the canonical distribution in steady state, we check that the canonical distribution \(P_A = \frac{1}{Z} e^{-\beta E_A}\) satisfies the detailed balance condition \(\nu_{AB} P_A = \nu_{BA} P_B\) for every pair of states.
Case 1: \(E_B \le E_A\). Then \(\nu_{AB} = \nu_0\) and \(\nu_{BA} = \nu_0 e^{-\beta (E_A - E_B)}\). Using the canonical form: \[
\nu_0 \frac{1}{Z} e^{-\beta E_A} = \nu_0 e^{-\beta (E_A - E_B)} \frac{1}{Z} e^{-\beta E_B}
\]
Both sides are equal. Verified.
Case 2: \(E_B > E_A\). Then \(\nu_{AB} = \nu_0 e^{-\beta (E_B - E_A)}\) and \(\nu_{BA} = \nu_0\). Again using the canonical form: \[
\nu_0 \frac{1}{Z} e^{-\beta E_A} e^{-\beta (E_B - E_A)} = \nu_0 \frac{1}{Z} e^{-\beta E_B} = \nu_0 P_B
\]
Verified.
Hence, in both cases the detailed balance condition is satisfied with the canonical distribution, and the steady state is indeed the canonical distribution.
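A minimal Python sketch of this algorithm (a four-state system with invented energies, \(\beta = 1\), and a uniform random proposal playing the role of the attempt rate \(\nu_0\)) implements the Metropolis acceptance rule and compares the sampled occupancies with the canonical distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

# A small system of discrete states with arbitrary energies (for illustration)
beta = 1.0
E = np.array([0.0, 0.5, 1.0, 2.0])
n_states = len(E)

# Metropolis dynamics: propose a random state, accept with
# probability min(1, exp(-beta * (E_new - E_old)))
state = 0
counts = np.zeros(n_states)
nsweeps = 500_000
for _ in range(nsweeps):
    new = rng.integers(n_states)
    dE = E[new] - E[state]
    if dE <= 0 or rng.random() < np.exp(-beta * dE):
        state = new
    counts[state] += 1

sampled = counts / counts.sum()
canonical = np.exp(-beta * E) / np.exp(-beta * E).sum()
print("sampled:  ", sampled.round(4))
print("canonical:", canonical.round(4))
```

After many sweeps the two rows of output agree to within statistical error, as the detailed balance argument above guarantees.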