The physical Sackur-Tetrode entropy of an ideal gas

Marton Trencseni - Mon 29 November 2021 - Physics


In the previous article, I derived a coarse-grained formula for the entropy of an ideal gas, taking into account only the uncertainty in the position of the particles. The derived expression for entropy cannot be physical, since it depends on how we as observers choose to coarse-grain the volume of the gas ($M$ bins). Also, it doesn't take into account the velocity of the particles. The next step, in this article, is to make physical arguments and compute an expression for $W$ that takes into account the velocities, parametrized using the kinetic energy of the system, and derive the Sackur-Tetrode equation for a monatomic ideal gas of indistinguishable particles. Once we have that, we can also derive the ideal gas equation of state $PV = NkT$.

Macroscopic state

When defining the entropy of an ideal gas, we first need to define some of its macroscopic properties. For example, are we talking about 1 liter of gas or 2 liters? What is the density of the gas? The parameters we need to define are:

  • $V$ volume
  • $N$ number of particles (specifying the number of particles $N$ is the same as specifying density $\rho$, since $\rho = N/V$)
  • $E$ total kinetic energy of the particles making up the gas

The first 2 make sense, but what about the 3rd one? We know from real-world experience that at the human, macroscopic scale, objects have a temperature. We will see that the total kinetic energy of particles is related to the temperature of the gas (slower movement is colder, faster movement is hotter).

Quantum mechanics

In classical physics, both position $x$ and velocity $v$ can be measured to arbitrary accuracy. So, in a thought experiment, we could measure $x$ and $v$ at time $t_0$, and we would know everything about the system at any other time. All we have to do is run a simulation, either backward or forward in time, and we could always tell $x_i(t)$ and $v_i(t)$ for all particles $i=1...N$. So, where does the uncertainty come from? Does it come from our unwillingness or inability to measure all the positions and velocities? If that were the case, entropy would be an observer- or measurement-dependent quantity, similar to the expression derived in the previous article.

The way out is to remember Heisenberg's uncertainty principle from quantum mechanics, which states that although a particle's position or velocity (momentum) can each be measured to arbitrary precision, both cannot be at the same time. There is a joint uncertainty in the measurement:

           $\Delta x_i \Delta p_i \ge h$

Here $x_i$ is the position of the $i$th particle, $p_i=m v_i$ is the $i$th particle's momentum, $m$ is the common particle mass, and $h$ is a physical constant called the Planck constant. In other words, the uncertainty principle introduces a natural coarse-graining on the combined $(x, p)$ space!

Phase space

For an ideal gas, the microscopic state corresponds to specifying the positions $x_i$ and momenta $p_i$ of all $i=1...N$ particles. For our treatment, we will take all of these and build one large phase vector $(x_1, p_1, x_2, p_2, ... x_N, p_N)$. Here each $x$ and $p$ is itself an $(x, y, z)$ Cartesian 3D vector, so this vector lives in the $6N$-dimensional so-called phase space. Each point in phase space defines all the positions and momenta of the particles making up the gas and corresponds to a possible microscopic state of the system.

However, we must put additional constraints on the allowable points in the phase space:

  • the positions $x_i$ have to fall within the volume of the gas
  • the kinetic energies of the particles have to add up to the total kinetic energy $E$ of the gas
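
To make the two constraints concrete, here is a minimal Python sketch that builds one allowed phase vector for a tiny gas. The function names `random_microstate` and `kinetic_energy`, the parameter values, and the sampling method (Gaussian draws rescaled onto the energy shell) are illustrative assumptions, not part of the derivation in this article.

```python
import math
import random

def random_microstate(N, L, E, m, rng):
    """Return (positions, momenta) for N particles obeying both constraints."""
    # Positions: each particle gets 3 coordinates uniformly inside the box of side L.
    positions = [[rng.uniform(0.0, L) for _ in range(3)] for _ in range(N)]
    # Momenta: draw a random direction in 3N dimensions from Gaussians,
    # then rescale so that sum(p^2) / (2m) equals E.
    raw = [rng.gauss(0.0, 1.0) for _ in range(3 * N)]
    norm = math.sqrt(sum(p * p for p in raw))
    radius = math.sqrt(2.0 * m * E)
    flat = [radius * p / norm for p in raw]
    momenta = [flat[3 * i:3 * i + 3] for i in range(N)]
    return positions, momenta

def kinetic_energy(momenta, m):
    """Total kinetic energy E = sum(p^2) / (2m)."""
    return sum(c * c for p in momenta for c in p) / (2.0 * m)

rng = random.Random(0)
positions, momenta = random_microstate(N=5, L=2.0, E=10.0, m=1.0, rng=rng)

# Interleave into the phase vector (x_1, p_1, x_2, p_2, ..., x_N, p_N).
phase_vector = [c for x, p in zip(positions, momenta) for c in x + p]
print(len(phase_vector))             # 6N components
print(kinetic_energy(momenta, 1.0))  # equals E up to rounding
```

Every such vector is one point on the constrained manifold; the rest of the article is about counting how many distinguishable points of this kind exist.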

Marbles, bins and indistinguishable particles

In the previous article, we did not use a phase space formalism. Instead, we coarse-grained the available volume into $M$ bins, and computed the number of ways $W(M, N)$ we can put $N$ particles into the $M$ bins. In the phase space formalism, each point (and after quantization into $h$-sided hypercubes, each hypercube) in phase space already corresponds to a complete specification of the system. In other words, in this formalism, we are not computing $M$, the number of bins, we are directly computing $W$, the number of microscopic states!

How do we take into account the indistinguishability of particles in the phase space formalism? If we pick a phase vector (or surrounding $h$-sided hypercube) where $(x_1, p_1)=A$ and $(x_2, p_2)=B$, that is really the same microscopic state as when $(x_1, p_1)=B$ and $(x_2, p_2)=A$, since we can't tell particles 1 and 2 apart. So in the phase space formalism, if we have indistinguishable particles, we are double (multi) counting microscopic states, and we have to divide $W_{phase}$ by some factor to get the real $W$. What is the factor? We have to use some physical intuition and hand-waving. Because $h$ is a very small number, dividing the phase space into $h$-sided hypercubes yields a very large number of hypercubes, much larger than the number of particles $N$. This means that most of the time, no two particles will be in the same hypercube, i.e. we can ignore cases like $(x_1, p_1)=A$ and $(x_2, p_2)=A$ (where $=A$ denotes "falls into hypercube A"). In that case, since all $N$ particles are in different hypercubes, the degree of overcounting is just $N!$, the number of permutations of $N$ distinct letters.
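
A rough order-of-magnitude check of the "many more hypercubes than particles" claim, as a Python sketch. It counts single-particle cells only (position volume times a ball in momentum space, divided by $h^3$). All inputs other than $h$ are assumptions chosen for illustration: one liter of helium, a particle count of roughly 1 liter of gas at room conditions, and a typical atomic speed of 1400 m/s.

```python
import math

h = 6.62607015e-34   # Planck constant in J s
m_helium = 6.646e-27 # mass of a helium atom in kg (rounded, assumed)
V = 1.0e-3           # 1 liter in m^3 (assumed)
N = 2.4e22           # assumed particle count, roughly 1 liter at 1 atm and 300 K
v_typical = 1400.0   # assumed typical atomic speed in m/s

# Momentum-space ball that a single particle can plausibly occupy.
p_typical = m_helium * v_typical
momentum_volume = 4.0 / 3.0 * math.pi * p_typical ** 3

# Number of h^3-sized cells in the single-particle (x, p) space.
cells = V * momentum_volume / h ** 3
cells_per_particle = cells / N
print(f"cells = {cells:.2e}, cells per particle = {cells_per_particle:.2e}")
```

Under these assumptions there are several hundred thousand cells per particle, so two particles sharing a cell is rare, which is what the $N!$ argument relies on. For a much colder or denser gas the ratio shrinks and the argument becomes less safe.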

Sackur-Tetrode equation

Based on the above considerations, we can now derive the physical entropy of a monatomic ideal gas. Let's assume the gas is in a cube of volume $V=L^3$, and the total kinetic energy of particles is $E= \frac{m}{2} \sum_i v_i^2 $. Let's switch to using momenta $p_i = m v_i$ and $v_i^2 = \frac{1}{m^2} p_i^2$, then the energy in terms of momenta is $E = \frac{1}{2m} \sum_i p_i^2$. So the phase space $(x_1, p_1, x_2, p_2, ..., x_N, p_N)$ is constrained to the manifold:

  • $ 0 < x_i < L$: this constraint on the $x_i$ defines a hypercube of volume $V_{hyper}=V^N$, of dimensionality $3N$ (note that each $x_i$ here has 3 internal coordinates $(x, y, z)$)
  • $E = \frac{1}{2m} \sum_i p_i^2$: this constraint on the $p_i$ defines the $(3N-1)$-dimensional shell of a $3N$-dimensional hypersphere of radius $r = \sqrt{2mE}$ (note that each $p_i$ here has 3 internal coordinates $(p_x, p_y, p_z)$). This shell has area $ A_{hyper} = \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{\Gamma(\frac{3N}{2})} $
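
The shell area formula is the general surface area of a sphere of radius $r$ in $n$ dimensions, $\frac{2 \pi^{n/2} r^{n-1}}{\Gamma(n/2)}$, with $n=3N$ and $r=\sqrt{2mE}$. A small Python sketch (function names are illustrative) checks it against the familiar low-dimensional cases:

```python
import math

def sphere_shell_area(n, r):
    """Surface area of a sphere of radius r embedded in n dimensions."""
    return 2.0 * math.pi ** (n / 2.0) * r ** (n - 1) / math.gamma(n / 2.0)

def momentum_shell_area(N, m, E):
    """The A_hyper of the text: n = 3N dimensions, radius sqrt(2 m E)."""
    return sphere_shell_area(3 * N, math.sqrt(2.0 * m * E))

r = 1.5
print(sphere_shell_area(2, r), 2.0 * math.pi * r)       # circumference of a circle
print(sphere_shell_area(3, r), 4.0 * math.pi * r ** 2)  # area of an ordinary sphere
```

For realistic $N$ the area itself overflows floating point, which is one practical reason to work with $ln[W]$ rather than $W$.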

In lower dimensions, if we take a circle (with circumference $C$) and a square (with area $A$), they form a cylinder with wall area $C A$. The same happens in higher dimensions: a hypersphere with shell area $A_{hyper}$ and a hypercube with volume $V_{hyper}$ form a hypercylinder with wall area $ A_{hyper} V_{hyper} $. So the phase space volume $ P = V_{hyper} A_{hyper} $ is:

           $ P = V^N \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{\Gamma(\frac{3N}{2})} $

Here $\Gamma(x)$ is the Gamma function. Let's assume that $N$ is even, so $\frac{3N}{2}$ is integer, in which case the $\Gamma$ function is $\Gamma(x)=(x-1)!$. With this:

           $ P = V^N \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} $

Now we face a problem. The phase space itself is $6N$-dimensional ($N$ particles, each with 3 position and 3 momentum coordinates), but the manifold is actually a "surface" of $6N-1$ dimensions, because the momentum constraint defines a $(3N-1)$-dimensional shell of a $3N$-dimensional hypersphere. To turn the area into a volume, we need to give the shell some thickness. So we will assume that the shell has a small thickness $\Delta p$ in the radial direction. Since $E=\frac{1}{2m}p^2$, where $p$ is the radius of the hypersphere, we have $p^2=2mE$ and $p(E)=\sqrt{2mE}$. This means that $dp = \sqrt{\frac{m}{2E}} dE$, so the thickness is $ \Delta p = \sqrt{\frac{m}{2E}} \Delta E $. This term will be dropped in the large-$N$ approximation later, so it's not that important.

With the shell thickness included, the phase space volume is:

           $ P = V^N \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E $

Next, we use the uncertainty principle to coarse-grain this volume into hypercubes. Each $x-p$ pair (3 per particle, for $(x, y, z)$ physical 3D space) contributes an $h$, so the hypercubes have volume $h^{3N}$. We divide $P$ by this to get $W_{phase}$:

           $ W_{phase} = \frac{P}{h^{3N}} = \frac{V^N}{h^{3N}} \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E $

Now we just need to divide by the factor $N!$ to take into account the indistinguishability of particles and get rid of the overcounting to get the final expression:

           $ W = \frac{W_{phase}}{N!} = \frac{V^N}{N! h^{3N}} \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E $

The entropy $S$ is then (in units of $k$, the Boltzmann constant):

           $ \frac{S}{k} = ln[W] = ln[ \frac{V^N}{N! h^{3N}} \frac{2 \pi ^ {\frac{3N}{2}} (2mE)^{\frac{3N-1}{2}} }{(\frac{3N}{2}-1)!} \sqrt{\frac{m}{2E}} \Delta E ] $

This looks quite complicated, but can be handled quickly. The trick is to use the properties of the logarithm ($ln[ab]=ln[a]+ln[b]$, $ln[\frac{a}{b}]=ln[a]-ln[b]$ and $ln[a^k]=k ln[a]$) and Stirling's approximation $ln[n!] \approx n ln[n]-n$ to create a long list of log-terms. Then, put it in a form like $ln[W]=N A + B$, where $N A$ collects all the terms that have an $N$ multiplier outside of the logarithm. In the $A$ part, approximate $\frac{3N-1}{2} \approx \frac{3N}{2}$ and $\frac{3N}{2}-1 \approx \frac{3N}{2}$; the entire $B$ part is small compared to the $N A$ part for large $N$, so we are only left with the $N A$ part (I did it with pen and paper..). This leaves us with the famous Sackur-Tetrode equation for the entropy of a monatomic ideal gas:

           $\frac{S}{kN} = ln[\frac{V}{N}(\frac{4 \pi m E}{3 h^2 N})^\frac{3}{2}] + \frac{5}{2}$
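
For readers who want the intermediate steps, here is one way to sketch them, using only the approximations named above. First expand the logarithm of $W$ term by term:

           $ ln[W] = N ln[V] - ln[N!] - 3N ln[h] + ln[2] + \frac{3N}{2} ln[\pi] + \frac{3N-1}{2} ln[2mE] - ln[(\frac{3N}{2}-1)!] + ln[\sqrt{\frac{m}{2E}} \Delta E] $

Apply Stirling's approximation to both factorials, replace $\frac{3N-1}{2}$ and $\frac{3N}{2}-1$ by $\frac{3N}{2}$, and drop the terms that grow at most logarithmically with $N$ (the $ln[2]$ and the $\Delta E$ term):

           $ ln[W] \approx N ln[V] - N ln[N] + N - 3N ln[h] + \frac{3N}{2} ln[\pi] + \frac{3N}{2} ln[2mE] - \frac{3N}{2} ln[\frac{3N}{2}] + \frac{3N}{2} $

The first two terms combine into $N ln[\frac{V}{N}]$, the constants add up to $N + \frac{3N}{2} = \frac{5N}{2}$, and the remaining four terms share the prefactor $\frac{3N}{2}$:

           $ \frac{3N}{2} ( ln[\pi] + ln[2mE] - ln[h^2] - ln[\frac{3N}{2}] ) = \frac{3N}{2} ln[\frac{4 \pi m E}{3 h^2 N}] $

Putting the pieces together:

           $ ln[W] \approx N ( ln[\frac{V}{N}] + \frac{3}{2} ln[\frac{4 \pi m E}{3 h^2 N}] + \frac{5}{2} ) $

which is the equation above after dividing by $N$.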

Note that the entropy is a function $S(V, N, E)$, and that the Planck constant $h$ also appears.
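
How good are the large-$N$ approximations? The Python sketch below compares the unapproximated $ln[W]$ (using `math.lgamma` for the factorials) with the Sackur-Tetrode expression. The units ($m = h = 1$), the fixed values of $V/N$ and $E/N$, the choice $\Delta E = 1$, and the function names are all assumptions made for illustration.

```python
import math

def ln_W_exact(V, N, E, m, h, dE):
    """ln of W before any approximation; lgamma(N + 1) = ln N!, lgamma(3N/2) = ln (3N/2 - 1)!."""
    return (N * math.log(V) - math.lgamma(N + 1) - 3 * N * math.log(h)
            + math.log(2.0) + 1.5 * N * math.log(math.pi)
            + 0.5 * (3 * N - 1) * math.log(2.0 * m * E)
            - math.lgamma(1.5 * N)
            + 0.5 * math.log(m / (2.0 * E)) + math.log(dE))

def ln_W_sackur_tetrode(V, N, E, m, h):
    """S / k from the Sackur-Tetrode equation."""
    return N * (math.log(V / N)
                + 1.5 * math.log(4.0 * math.pi * m * E / (3.0 * h ** 2 * N))
                + 2.5)

def relative_gap(N, v=10.0, e=10.0, m=1.0, h=1.0, dE=1.0):
    """Relative difference between the two expressions at fixed V/N = v and E/N = e."""
    V, E = v * N, e * N
    exact = ln_W_exact(V, N, E, m, h, dE)
    approx = ln_W_sackur_tetrode(V, N, E, m, h)
    return abs(exact - approx) / abs(approx)

for N in (100, 10_000, 1_000_000):
    print(N, relative_gap(N))
```

The gap shrinks as $N$ grows, because the dropped terms grow only like $ln[N]$ while the kept terms grow like $N$. For a macroscopic gas with $N \sim 10^{22}$ the difference is negligible, and so is the arbitrary choice of $\Delta E$.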

Entropy is extensive

The above expression is obviously extensive. If we double our system, which takes $V \rightarrow 2V, N \rightarrow 2N, E \rightarrow 2E$, then due to the $\frac{V}{N}$ and $\frac{E}{N}$ terms the value of the logarithm does not change, while the $N$ outside introduces a factor of 2, so $S(2V, 2N, 2E) = 2S(V, N, E)$.

Temperature, pressure and the equation of state

For a closed system, the 1st law of thermodynamics for an infinitesimal process where $N$ does not change can be written as follows (from here on $P$ denotes pressure, not the phase space volume of the previous section):

           $dE = T dS - P dV$

Now let's take the Sackur-Tetrode equation, and rewrite it like:

           $ S = k N ( ln[V] + \frac{3}{2} ln[E] ) + f(N) $

We kept just the $V$ and $E$ terms, and put all the rest into the function $f$, which is only a function of $N$. Now take the differential:

           $ dS = k N ( \frac{1}{V} dV + \frac{3}{2E} dE ) + f'(N) dN$

Since we're dealing with a process where $N$ does not change, $dN = 0$, and we can ignore the last term. We can rewrite the remaining terms like:

           $ dE = \frac{2E}{3kN} dS - \frac{2E}{3V} dV $

Comparing this with the 1st law, we can read off the expressions for temperature $T=\frac{2E}{3kN}$ and pressure $P=\frac{2E}{3V}$.

Multiplying $P$ by $V$ yields:

           $PV = \frac{2E}{3}$

Multiplying $T$ by $N k$ yields:

           $NkT = \frac{2E}{3}$

We can read off the famous ideal gas equation of state $PV=NkT$. We derived this from computing the entropy from physical considerations and using the 1st law of thermodynamics, which for our purposes effectively defined $T$ and $P$.
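
The same result can be checked numerically without doing the algebra. Rearranging the 1st law gives $dS = \frac{1}{T} dE + \frac{P}{T} dV$, so $\frac{\partial S}{\partial E} = \frac{1}{T}$ and $\frac{\partial S}{\partial V} = \frac{P}{T}$. The Python sketch below estimates both derivatives of the Sackur-Tetrode entropy with central finite differences. The units ($k = m = h = 1$), the sample values of $V$, $N$, $E$, and the function names are assumptions for illustration.

```python
import math

def entropy(V, N, E, m=1.0, h=1.0, k=1.0):
    """Sackur-Tetrode entropy S(V, N, E)."""
    return k * N * (math.log(V / N)
                    + 1.5 * math.log(4.0 * math.pi * m * E / (3.0 * h ** 2 * N))
                    + 2.5)

def temperature_and_pressure(V, N, E, rel_step=1e-6):
    """Estimate T and P from numerical partial derivatives of S at fixed N."""
    dE, dV = E * rel_step, V * rel_step
    dS_dE = (entropy(V, N, E + dE) - entropy(V, N, E - dE)) / (2.0 * dE)
    dS_dV = (entropy(V + dV, N, E) - entropy(V - dV, N, E)) / (2.0 * dV)
    T = 1.0 / dS_dE  # dS/dE at constant V equals 1/T
    P = T * dS_dV    # dS/dV at constant E equals P/T
    return T, P

V, N, E = 2.0, 1000, 3000.0
T, P = temperature_and_pressure(V, N, E)
print(T, P)          # compare with 2E/(3kN) and 2E/(3V)
print(P * V, N * T)  # both should be close to 2E/3
```

With these sample values the analytic results are $T = 2$ and $P = 1000$, so $PV = NkT = 2000 = \frac{2E}{3}$, and the finite-difference estimates agree up to small numerical error.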

$h$ but no $c$

In physics, when a formula contains the Planck constant $h$, it usually means that some aspect of quantum mechanics was taken into account. The Sackur-Tetrode equation contains $h$ because we used the uncertainty principle from quantum mechanics to quantize the phase space. Similarly, if a formula contains the speed of light $c$, it usually means that special relativity was taken into account. The Sackur-Tetrode equation does not contain $c$, so it is not relativistically correct. There is no constraint on the maximum speed any given particle can attain, there is just the global constraint $E = \frac{1}{2m} \sum_i p_i^2$, so a microscopic state where one of the particles has a very high speed exceeding $c$ is allowed, even though it is not physically possible.


This article used the example of the physical entropy of a monatomic ideal gas to show the similarities and differences between entropy as it appears in physics and math (data science). In math, we have well-defined random variables and a simple definition of entropy on top of that. In physics, the concept of entropy is more nuanced, and involves lots of idealizations and approximations (e.g. we silently assumed that all microscopic states we considered have the same probability).