ISSN 0032-9460, Problems of Information Transmission, 2009, Vol. 45, No. 2, pp. 75–94. © Pleiades Publishing, Inc., 2009.
Original Russian Text © F.J. Piera, P. Parada, 2009, published in Problemy Peredachi Informatsii, 2009, Vol. 45, No. 2, pp. 3–24.
On Convergence Properties of Shannon Entropy
F. J. Piera and P. Parada
Department of Electrical Engineering, University of Chile, Santiago, Chile
Received October 17, 2008; in final form, December 23, 2008
Abstract—Convergence properties of Shannon entropy are studied. In the differential setting, it is known that weak convergence of probability measures (convergence in distribution) is not sufficient for convergence of the associated differential entropies. In that direction, an interesting example is introduced and discussed in light of new general results provided here for the desired differential entropy convergence, which take into account both compactly and non-compactly supported densities. Convergence of differential entropy is also characterized in terms of the Kullback–Leibler divergence for densities with fairly general supports, and it is shown that convergence in variation of probability measures guarantees such convergence under an appropriate boundedness condition on the densities involved. Results for the discrete setting are also provided, allowing for infinitely supported probability measures, by taking advantage of the equivalence between weak convergence and convergence in variation in that setting.
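As a quick illustration of the failure mentioned in the abstract (a standard textbook-style sequence, stated here for orientation and not necessarily the example developed in the paper), consider the densities on $[0,1]$

$$f_n(x) = 1 + \cos(2\pi n x), \qquad n \geq 1.$$

By the Riemann–Lebesgue lemma, $\int g f_n \, dx \to \int_0^1 g \, dx$ for every bounded continuous $g$, so $f_n$ converges weakly to the uniform density on $[0,1]$. Yet the substitution $u = nx$ and periodicity show that, with natural logarithms,

$$h(f_n) = -\int_0^1 f_n(x) \log f_n(x) \, dx = h(f_1) = \log 2 - 1 < 0 = h(\mathrm{Unif}[0,1]),$$

so the differential entropies stay constant and bounded away from the entropy of the weak limit.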
Convergence of the entropies of a sequence of probability measures plays a key role in information theory, from both theoretical and applied points of view, most notably in the context of estimating the entropy of an information source [1–4].
The problem has been partially studied in the context of discrete sources; evidence of this is that some convergence results can be found in today's standard information theory textbooks [5, 6], as well as in some recent works. A more general approach is found in the works of A. Barron, where a proof of the central limit theorem based on entropy convergence and a proof of the entropy convergence of stationary processes are presented. The discussion of information topologies for general sources touches tangentially on the problem of convergence in a more general setting.
However, the focus of many of these works is on continuity rather than convergence properties of Shannon entropy. On the one hand, continuity properties encompass results that guarantee convergence of entropy for all approximating sequences of probability measures converging to a given limiting probability measure; emphasis there is on identifying the largest class of probability measures for which the corresponding convergence of entropy takes place. On the other hand, convergence properties are usually related to deciding whether convergence of entropy takes place for a given fixed family of probability measures, also converging in a certain topology to a limiting probability measure. In the continuity context all requirements are imposed on the limiting probability measure, in order to ensure convergence of entropy for all possible approximating sequences, whereas in the convergence context one can and should exploit any underlying structure of the particular approximating sequence at hand, as is usually done in applied probability problems.
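To fix ideas about the characterization in terms of the Kullback–Leibler divergence mentioned in the abstract, it may help to recall the elementary identity behind such characterizations (a sketch, assuming densities $f_n$, $f$ with $f_n = 0$ wherever $f = 0$ and all integrals finite):

$$D(f_n \,\|\, f) = \int f_n \log \frac{f_n}{f} \, dx = -h(f_n) - \int f_n \log f \, dx.$$

Thus, whenever $\int f_n \log f \, dx \to \int f \log f \, dx = -h(f)$, one has $h(f_n) \to h(f)$ if and only if $D(f_n \,\|\, f) \to 0$; the results that follow can be read as identifying support and boundedness conditions under which such a scheme is valid.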
Supported in part by the Millennium Science Nucleus on Information and Randomness, University of
Chile, project no. P04-069-F.