How can it be that the entropy of a gas is not infinite? If we use the formula $S=k\ln(\Omega)$ we get infinity, because there is an infinite number of possible states (there are infinitely many possible positions and momenta for each molecule).
-
If a quantity of gas has $10^{25}$ molecules, how do you infer that $w$ is infinity? – DanielC Nov 30 '17 at 08:53
-
In order for entropy to be well-defined, space has to be quantized in some way. – probably_someone Nov 30 '17 at 08:55
-
@DanielC $w$ in this formula is "the number of possible states of the system". From the point of view of classical mechanics, even a single molecule in a box can be in infinitely many states. – lesnik Nov 30 '17 at 09:27
-
@probably_someone Entropy can be defined in classical mechanics just fine without anything being quantized; all you need is a measure on the phase space of the system (which you get basically for free from the symplectic form). You do need to divide by an appropriate power of $2 \pi \hbar$ to keep the argument of the logarithm dimensionless, but if you replace $2 \pi \hbar$ with any other positive number with units of action you would get an entropy which only differs by an overall constant and everything works just as well (at the level of classical mechanics). – Logan M Nov 30 '17 at 18:51
3 Answers
The problem you are thinking about is known as the question of thermodynamic coarse graining. This will hopefully give you a phrase that you can search on to find out more.
Sometimes possible states of ensemble members are obviously discrete, as they are in a collection of quantum harmonic oscillators. Much of quantum mechanics depends on the underlying Hilbert state space's being separable (i.e. having a countable dense subset), and for Hilbert spaces this is equivalent to the assertion that the vector basis itself is countable. Thus, even if an observable such as momentum or position has a continuous spectrum (i.e. can give a continuous random variable as a measurement), the underlying state space is often discrete. In the OP's particular example, you can model the gas as a system of particles in a 3D box (hat tip to user knzhou for reminding me of this point), so that the state space of ensemble members is clearly discrete. As we increase the volume of our box, the density of states (discussed more in Chris's answer) increases in proportion to the box's spatial volume, and therefore so does the entropy. In the limit of a very large gas volume, the entropy per unit volume is a well-defined, finite limit.
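To make this discreteness concrete, here is a minimal numerical sketch (my own illustration, not part of the original answer) that counts the states of a single particle in a cubic box of side $L$ with energy below a fixed cutoff; the levels scale as $(n_x^2+n_y^2+n_z^2)/L^2$, so the count grows in proportion to the box's volume $L^3$:

```python
import itertools

def count_states(L, E_max):
    """Count single-particle states (n_x, n_y, n_z >= 1) of a particle in a
    cubic box of side L whose energy E = (n_x^2 + n_y^2 + n_z^2) / L^2
    (prefactor set to 1 for illustration) lies below E_max."""
    limit = E_max * L**2             # condition: n_x^2 + n_y^2 + n_z^2 <= limit
    n_max = int(limit**0.5) + 1      # no quantum number can exceed sqrt(limit)
    return sum(
        1
        for nx, ny, nz in itertools.product(range(1, n_max + 1), repeat=3)
        if nx * nx + ny * ny + nz * nz <= limit
    )

for L in (1.0, 2.0, 3.0):
    n = count_states(L, E_max=100.0)
    print(f"L = {L}: states = {n}, states / L^3 = {n / L**3:.1f}")
# The ratio states / L^3 tends to a constant: the state count, and hence
# the entropy, scales with the spatial volume of the box.
```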
In cases where the state space is not obviously discrete, one must resort to either the use of coarse graining or relative entropy.
Coarse graining is the somewhat arbitrary partitioning of ensemble members' state space into discrete subsets, with states belonging to a given partition then being deemed to be the same. Thus a continuous state space is clumped into a discrete approximation. Many conclusions of statistical mechanics are insensitive to such clumping.
Relative entropy, in the information theoretic sense, is defined for a continuous random variable roughly as the entropy change relative to some "standard" continuous random variable, such as one governed by a Gaussian distribution. We see the problem you are dealing with if we try naïvely to work out the Shannon entropy of a continuous random variable with probability distribution $p(x)$ as the limit of a discrete sum:
$$S \approx -\sum\limits_i p(x_i)\,\Delta x \,\log(p(x_i)\,\Delta x) = -\log(\Delta x)\,\sum\limits_i p(x_i)\,\Delta x - \sum\limits_i p(x_i)\,\Delta x \,\log(p(x_i))\tag{1}$$
The two sums in the rightmost expression converge OK but we are thwarted by the factor $\log(\Delta x)$, which of course diverges. However, if we take the difference between the entropy for our $p(x)$ and that of a "standard" distribution, our calculation gives:
$$\Delta S \approx -\log(\Delta x)\,\left(\sum\limits_i p(x_i)\,\Delta x-\sum\limits_i q(x_i)\,\Delta x\right) - \sum\limits_i \left(p(x_i)\,\log(p(x_i))-q(x_i)\,\log(q(x_i))\right)\,\Delta x\tag{2} $$
a quantity which does converge, namely to $-\int\left(p\log p - q\,\log q\right)\mathrm{d}x$. The usual relative entropy is not quite the same as this definition (see the articles – the definition is modified to make the measure independent of reparameterization), but this is the basic idea. Often the constants in the limit of (2) are dropped and one sees the quantity $-\int\,p\,\log p\,\mathrm{d}x$ defined as the unqualified (relative) entropy of the distribution $p(x)$.
Coarse graining, in this calculation, would simply be choosing a constant $\Delta x$ in (1). Equation (1) is then approximately the relative entropy $-\int \,p\,\log p\,\mathrm{d}x$ offset by the constant $-\log(\Delta x)$. Therefore, as long as:
- We stick with a constant $\Delta x$ in a given discussion;
- $\Delta x$ is small enough relative to the variations in the probability density that $\sum\limits_i p(x_i)\,\Delta x \,\log(p(x_i))\approx \int p\,\log p\,\mathrm{d} x$;
- Our calculations and physical predictions are to do only with differences between entropies (as is mostly the case)
then the approaches of coarse graining and relative entropies give identical physical predictions, independently of the exact $\Delta x$ chosen.
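As a quick numerical check of (1) and (2) (my own sketch, using a unit Gaussian for $p$ and a wider Gaussian for $q$ as the "standard" distribution), the naive discrete entropy drifts like $-\log(\Delta x)$ as the grid is refined, while the difference between the two entropies converges:

```python
import numpy as np

def discrete_entropy(pdf, dx, x_range=(-25.0, 25.0)):
    """Naive discretization (1): S = -sum_i p(x_i) dx log(p(x_i) dx)."""
    x = np.arange(x_range[0], x_range[1], dx)
    w = pdf(x) * dx
    w = w[w > 0]                       # drop zero-probability cells: 0 log 0 = 0
    return -np.sum(w * np.log(w))

def gaussian(sigma):
    return lambda x: np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

p, q = gaussian(1.0), gaussian(2.0)    # q plays the role of the "standard" pdf

for dx in (0.1, 0.01, 0.001):
    S_p, S_q = discrete_entropy(p, dx), discrete_entropy(q, dx)
    print(f"dx = {dx}: S_p = {S_p:.4f} (diverges), S_p - S_q = {S_p - S_q:.4f}")
# S_p grows like -log(dx), while S_p - S_q settles at -ln 2 ~ -0.6931,
# the finite limit of (2) for these two Gaussians.
```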
A good review of these ideas, with historical discussion, is to be found in:
- K. Ridderbos, "The coarse-graining approach to statistical mechanics: how blissful is our ignorance?", Studies in History and Philosophy of Modern Physics 33(1), 65–77 (2002).
-
I agree with everything you wrote, but aren't the states for an ideal gas already discrete? It's just a bunch of particles in a box, so the usual method of counting discrete microstates works. – knzhou Dec 01 '17 at 00:24
-
+1, I consider this the right approach. But I'm confused by the use of a difference between two entropies, rather than the usual $\int p\log\frac{p}{q}$. It seems like this version will not be invariant to reparameterisation (e.g. substituting $y=x^2$), whereas the usual relative entropy (aka Kullback-Leibler divergence) is. – N. Virgo Dec 01 '17 at 00:38
-
@knzhou Indeed they are in that particular example. See my edits. – Selene Routley Dec 01 '17 at 00:54
-
@Nathaniel I did say that the "usual relative entropy is not quite the same as this definition (see articles) but this is the basic idea"; I've also changed that phrase to state the problem of reparameterization; I wanted to emphasize the choice of $\Delta x$ as the "main problem". Indeed in older information theoretic texts (including, I believe, Shannon and Weaver), you'll see the relative entropy of a pdf defined simply as $-\int p\,\log p\,\mathrm{d}x$. I don't know whether this is a signal processing / communications theorist's usage as opposed to a statistician's / statistical mechanic's. – Selene Routley Dec 01 '17 at 01:01
-
@WetSavannaAnimalakaRodVance I think it's more a question of the way the field developed historically. Shannon defined continuous entropy as just $-\int p(x)\log p(x) dx$ but it always had the reparameterisation problem. This is solved by the Kullback-Leibler divergence, but that was developed a few years later and in a different field (statistics rather than communication theory), so it took quite a long time for it to percolate through the literature. Nowadays the KL divergence tends to be seen as the most fundamental quantity in information theory, even more so than entropy. – N. Virgo Dec 01 '17 at 02:51
-
(There is a comment in one of Edwin Jaynes' papers to the effect that he thinks Shannon's integral definition is a simple mistake - he points out that Shannon is incredibly careful in deriving the discrete entropy as the unique measure of uncertainty under certain desiderata, but then simply states $-\int p(x)\log p(x) dx$ as the obvious continuous analogue. If one takes more care over deriving the continuum limit one is forced to come up with something analogous to the KL divergence instead.) – N. Virgo Dec 01 '17 at 03:01
-
The link to Ridderbos's paper is dead, and a Google search only turned up things behind a paywall. Can anyone help here? – Mike Wise Aug 25 '18 at 19:31
In a continuous system, $w$ is taken to be an integral over the possible microstates of the system. This is typically referred to as the density of states, and is quite finite. Specifically it's something like $$w(E)=\mathcal N\int d^{3N}x~d^{3N}p~\delta(E-\epsilon(\vec p,\vec x))$$
where $\epsilon(\vec p,\vec x)$ is the energy of the system as a function of all the momenta and positions, $\delta$ is a delta function that fixes the energy of the system and $\mathcal N$ is some normalization.
This leaves some ambiguity (namely, in the normalization), but the ratio of $w$ between two states is well-defined, so $\Delta S=k\ln(\frac{w_f}{w_i})$ is well-defined, and the ambiguity is resolved by defining entropy to be zero at absolute zero, which gives the normalization to use.
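To see the normalization drop out in a concrete case, here is a small sketch (my own illustration, with illustrative numbers) for an ideal gas at fixed energy: the position part of the phase-space integral gives $w\propto V^N$, so a free expansion from $V$ to $2V$ yields $\Delta S = Nk\ln 2$ no matter what constant $\mathcal N$ multiplies $w$:

```python
import math

k_B = 1.380649e-23   # Boltzmann constant in J/K
N = 100              # particle number (illustrative; work with log w to avoid overflow)

def log_w(V, norm):
    """log of w = norm * V**N. At fixed energy the momentum integral is the
    same before and after the expansion, so it is absorbed into norm."""
    return math.log(norm) + N * math.log(V)

for norm in (1.0, 1e-30, 1e30):                        # three arbitrary normalizations
    dS = k_B * (log_w(2.0, norm) - log_w(1.0, norm))   # Delta S = k ln(w_f / w_i)
    print(f"norm = {norm:g}: dS = {dS:.6e} J/K")
# Every choice of norm gives dS = N k ln 2 ~ 9.57e-22 J/K:
# the normalization cancels in the ratio w_f / w_i.
```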
-
Some things I do not understand:
- The epsilon function should always be equal to E because it is the total energy of the system
- The result of this integral is not compatible with the well-known expression of gas entropy (which depends on temperature). Perhaps I did not understand correctly. Can you write the result of this integral?
-
1. Are you familiar with delta functions? The point of it is to mathematically only integrate over the volume in phase space where the energy is $E$. 2. It can be evaluated and then related to the temperature. The evaluation depends on the system – it's different depending on the degrees of freedom in the system and any potential energies, for instance. It is consistent with the entropy for an ideal gas, though it may be a fair amount of work to show that. – Chris Nov 30 '17 at 10:08
-
So this expression actually means summing all the points in the phase space that correspond to energy $E$, but this is exactly my problem: there are infinitely many such points and therefore the result of the integral is infinity. – Jacob Nov 30 '17 at 10:15
-
An integral over phase space is finite, though. For instance, in a box of volume $V$, the position integral comes out to $V^N$, not $\infty$. The momentum integral is also finite. – Chris Nov 30 '17 at 10:35
-
OK you're right. But something here does not make sense, because we sum up an infinite number of possibilities and get a finite number. Can you give me some intuition for this? (In calculating the integral of an area, for example, we sum an infinite number of points but each has a small infinitesimal weight; here each option has the same weight, which is 1.) – Jacob Nov 30 '17 at 11:03
-
Why do you assume the "weight" is 1? It is not - it is the same infinitesimal weight as in a normal integral. – Chris Nov 30 '17 at 11:08
-
@Jacob On an even more basic level, this is the same question as "why does $\sum^\infty_{n=1}\frac{1}{n^2}=\frac{\pi^2}{6}$ instead of $\infty$?", and the answer is that you can force the things you're summing to get small sufficiently fast as your terms increase and get a finite sum as a result. Integrals of continuous functions always do this, by definition -- just look at the actual definition of an integral: https://en.wikipedia.org/wiki/Darboux_integral. – Alec Rhea Nov 30 '17 at 16:20
-
@AlecRhea This is exactly the point: here we sum up a number of possibilities, not infinitesimal things; each option gets the number 1, so it is simply a sum of 1 + 1 + 1 ... (one option + one option + one option ...). So here the idea of an integral seems to me strange and inappropriate. – Jacob Nov 30 '17 at 16:57
-
@Jacob You are missing the point. In a continuous phase space, $\Omega$ is not simply the number of accessible states, which you correctly recognize to be infinite. Instead, $\Omega$ is proportional to the "volume" of the accessible phase space. – J. Murray Nov 30 '17 at 18:27
-
An alternative approach, which is taken in the other answers, is to discretize the phase space so the number of accessible states becomes finite. As long as the discretization scale is sufficiently small, these two approaches are equivalent, and physical quantities never end up depending on the discretization scale anyway. – J. Murray Nov 30 '17 at 18:29
-
@J.Murray Can you explain in simple terms what justifies measuring the number of states by volume in the phase space rather than by what they really are - the number of states? We agree that if we count the number of states in the simple way we get infinity, so what justifies this trick of phase space? – Jacob Nov 30 '17 at 18:40
-
@Jacob It is effectively multiplying the number of states by a constant. Since multiplying two state counts by the same constant does not change the ratio between them, it does not change the difference in entropy either, since $\ln(Cx_1)-\ln(Cx_2)=\ln\left(\frac{Cx_1}{Cx_2}\right)=\ln\left(\frac{x_1}{x_2}\right)$. – Chris Dec 01 '17 at 03:23
-
@Chris We are talking about counting the number of states, not about finding the ratio between two states, because the entropy is simply $\ln(\text{number of states})$. – Jacob Feb 20 '18 at 20:36
-
@Jacob For a finite number of states, that is true. The method I've shown is one of the ways you can handle an infinite number of states. As long as you only talk about differences in entropy between two states, an overall normalization of the number of states like this makes no difference. – Chris Feb 20 '18 at 20:43
-
@Chris But that is the definition of the entropy. Are we changing the definition for the case of an infinite number of states? I don't understand the logic. – Jacob Feb 20 '18 at 20:55
-
@Jacob Note that the number of states, if handled with field theory instead of a classical theory, would not genuinely be infinite. This process is similar to renormalization in QFT, or the inconsistencies you get if you try to apply classical E&M to a point particle. It's an artifact of trying to apply a theory at a smaller distance scale than it is valid for. – Chris Feb 20 '18 at 20:55
-
@Chris If I understand what you mean: although the number of states is infinite, the theory can't handle it without "ignoring" the "real" number, and we are forced to count only a quantized number of states. Am I right? – Jacob Feb 20 '18 at 21:06
-
@Jacob Basically we're saying we know classical theory isn't right at arbitrarily small scales. Since we would need it to be in order to find the entropy, we give up on that and just talk about changes in entropy, which lets us ignore the unphysical infinity you get by naively applying statistical mechanics to a classical theory. – Chris Feb 20 '18 at 21:07
-
@Jacob The other way around. The real number of accessible states is finite, and given by quantum mechanics. We get infinity by applying a classical theory where we really need a quantum one. – Chris Feb 20 '18 at 21:09
-
@Chris But according to QM the number of states of the gas in the box is infinite. – Jacob Feb 20 '18 at 21:12
-
The number of orthogonal states for a bound system, with constraints on the energy, is finite. – Chris Feb 21 '18 at 00:26
-
@Chris This is the most interesting answer because it doesn't require an arbitrary discrete approximation, and it mentions how the normalisation constants are dealt with by the absolute zero requirement. However it is very brief, and I can't find any links that detail this approach. Can you supply a link, paper ref, or a book that deals with this? – Mike Wise Aug 25 '18 at 19:35
-
Well, thanks anyway. And that comment about the renormalization using the absolute zero requirement was extremely helpful - and seemingly applies in the discrete approximation derivations as well - though no one seemed to mention it. And I think it is the key reason why choosing different discretization intervals would lead to the same value for the entropy. I think one of the main problems here is using that phrase "number of microstates". It seems more like "a measure of the multiplicity of the microstates". – Mike Wise Aug 26 '18 at 00:27
I advise you to have a look at any statistical mechanics textbook; this point is usually addressed in the first chapters. You want to compute the micro-canonical entropy $$S(E)=k_B\ln\Omega(E)$$ where $\Omega(E)$ counts the number of micro-states with energy $E$. As you pointed out, this number is infinite if the coordinates are continuous. The standard approach is to discretize the phase space and divide it into cells. Each cell is assumed to correspond to a single micro-state. For $N$ particles in 3D space, the dimension of the phase space is $6N$. The width of a cell is chosen to be $\Delta q$ in the $3N$ directions associated to coordinates and $\Delta p$ in the $3N$ directions associated to momenta. It is then assumed that $$\Delta q\Delta p=h_0$$ Now the number of states $\Omega(E)$ with energy $E$ is finite and you can compute the microcanonical entropy. This is a special case of the coarse-graining mentioned in the previous answers.
The advantage of this approach is that you can easily estimate $\Omega(E)$, the number of cells of energy $E$, as the volume $Vol$ of the phase space corresponding to an energy $E$ divided by the volume $h_0^{3N}$ of a cell (I simplify a bit here: what should be computed first is actually the volume of the phase space corresponding to an energy lower than $E$). This estimate is accurate only if $h_0$ is chosen sufficiently small. See what happens to the entropy: since $\Omega(E)=Vol/h_0^{3N}$, the entropy reads $$S(E)=k_B\ln Vol-3Nk_B\ln h_0$$ The parameter $h_0$ appears only in an additive constant. Since thermodynamic averages are given by derivatives of the entropy, they do not depend on $h_0$ (fortunately, since the discretization of phase space is not physical) and you can give any value to $h_0$, as long as it remains sufficiently small. Usually, no definite value is given to $h_0$.
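As a quick check of this $h_0$-independence (my own sketch, assuming the ideal-gas scaling $Vol(E)\propto V^N E^{3N/2}$ and dropping constant prefactors, which only shift $S$): changing $h_0$ shifts $S$ by a constant, but leaves $\partial S/\partial E = 1/T$ untouched:

```python
import math

k_B = 1.0            # units with k_B = 1
N, V = 50, 1.0       # illustrative particle number and volume

def S(E, h0):
    """S(E) = k ln Vol - 3 N k ln h0, with Vol(E) ~ V**N * E**(3N/2) for an
    ideal gas (constant prefactors dropped: they only shift S)."""
    return k_B * (N * math.log(V) + 1.5 * N * math.log(E)) - 3 * N * k_B * math.log(h0)

E, dE = 10.0, 1e-6
for h0 in (1e-3, 1.0, 1e3):
    dS_dE = (S(E + dE, h0) - S(E, h0)) / dE    # numerical 1/T = dS/dE
    print(f"h0 = {h0:g}: S = {S(E, h0):+.3f} (shifts), dS/dE = {dS_dE:.6f} (fixed)")
# dS/dE = 3N/(2E) = 7.5 for every h0: thermodynamic derivatives are
# independent of the discretization scale.
```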