Applications of Statistical Physics to
Coding Theory
Santa Fe, New Mexico
Jan 10 — Jan 12, 2005
Sun, Jan 9
6:00pm--8:00pm
Mon, Jan 10
7:00am--8:00am
8:00am--8:15am
8:15am--8:30am
8:30am--9:15am
9:15am--10:00am
10:00am--10:15am
10:15am--11:00am
11:00am--11:45am
11:45am--12:30pm
12:30pm--2:00pm
2:00pm--2:45pm
2:45pm--3:30pm
3:30pm--3:45pm
3:45pm--4:30pm
4:30pm--5:30pm
Tue, Jan 11
7:00am--8:30am
8:30am--9:15am
9:15am--10:00am
10:00am--10:15am
10:15am--11:00am
11:00am--11:45am
11:45am--12:30pm
12:30pm--2:00pm
2:00pm--2:45pm
2:45pm--3:30pm
3:30pm--3:45pm
3:45pm--4:30pm
4:30pm--5:30pm
7:00pm--10:00pm
Wed, Jan 12
7:00am--8:30am
8:30am--9:15am
9:15am--10:00am
10:00am--10:15am
10:15am--11:00am
11:00am--11:15am
12:00pm--2:00pm
ABSTRACTS
EXIT functions are a standard tool for predicting the convergence behavior of iterative decoding algorithms.
In this talk I will present some results on EXIT functions for discrete memoryless channels.
In particular, we will consider analytical expressions for the EXIT functions of repetition and single
parity check codes, upper and lower bounds on the EXIT functions of these codes, application of these
results to estimating the performance of LDPC codes, and approximation of the EXIT functions for the Hamming and Simplex codes.
The talk is based on joint work with Yibo Jian, Simon Litsyn, and Eran Sharon.
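For background (a standard special case, not stated in the abstract): over the binary erasure channel with erasure probability $\varepsilon$, the EXIT functions of these two component codes have simple closed forms:

```latex
% EXIT functions over the BEC (standard results):
% Repetition code of length n (variable node of degree n):
I_{E,\mathrm{rep}}(I_A) \;=\; 1 - \varepsilon\,(1 - I_A)^{n-1}
% Single parity check code of length n (check node of degree n):
I_{E,\mathrm{spc}}(I_A) \;=\; I_A^{\,n-1}
```

For general discrete memoryless channels, the subject of the talk, such closed forms are no longer available, which is what motivates the bounds and approximations mentioned above.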
By a well-known sequence of results, serially concatenated codes can be generalized to a multilevel construction which improves their error correction capabilities. In a series of papers with Gilles Zemor we have defined a family of bipartite graph codes that matches the performance of serial concatenations (with improved decoding complexity). This talk will cover generalizations of this family to multilevel parallel concatenations.
We introduce a method based on instanton (optimal fluctuation) calculus that allows analytical or semi-analytical estimation of the post-error-correction bit error rate (BER) when forward error correction (FEC) is applied for transmitting information through a noisy channel. The generic method is applied to low-density parity check (LDPC) codes when an iterative message passing (MP) decoding strategy is used. We utilize the universal tree-like local structure of LDPC codes and identify the phenomena that are determined by local code properties only and, therefore, can be analyzed on the trees. In particular, we show that the Shannon transition for an LDPC code is universal and does not depend on the global structure of the code.
Exploring the statistical physics formulation of the problem, we find that even in the loop-free case the BER decreases nonuniformly with the signal-to-noise ratio (SNR), i.e., crossing over through a sequence of phases. The higher the SNR, the lower the symmetry of the phase that dominates the BER. We also discuss the instanton structure of LDPC codes with loops in the high-SNR case.
Quantum key distribution (QKD), invented 20 years ago by Charles Bennett and Gilles Brassard,
uses single-photon communications to generate the shared, secret random number sequences that
are used to encrypt and decrypt secret communications. The security of QKD is based on the
interplay between fundamental principles of quantum physics and information theory.
An adversary can neither successfully tap the transmissions, nor evade detection (eavesdropping
raises the key error rate above a threshold value). I will describe the quantum-physical and
information-theoretic ingredients of the most widely-adopted QKD protocol, known as BB84, and
illustrate these principles with examples from QKD experiments that our team has performed over
both multi-kilometer line-of-sight atmospheric paths, and over optical fiber networks. I will
also describe the requirements and particular challenges of error-control coding for QKD.
Code division multiple access (CDMA) is a core technology of modern wireless communication, enabling data transmission between multiple
terminals and a single base station. In this scheme, the base station has to detect the multiple transmitted binary signals from the
received noisy mixture of the signals.
This process can be cast as a problem in a certain type of Ising spin-glass system. We show that an approximate detection algorithm of
excellent performance can be developed based on techniques and concepts from the mean-field theory of spin glasses.
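A minimal sketch of the mapping, with illustrative notation not taken from the abstract: $N$ users transmit bits $b_k = \pm 1$ over binary spreading sequences $\eta^\mu_k$ of length $M$, and detection then amounts to sampling a Boltzmann distribution of an Ising system:

```latex
% Received chip-level signal (sigma_0 = noise amplitude):
r^\mu = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} \eta^\mu_k b_k + \sigma_0 n^\mu ,
\qquad \mu = 1,\dots,M .
% Posterior over candidate bit vectors S = (S_1,\dots,S_N), S_k = \pm 1:
P(S \mid r) \propto \exp\!\bigl(-\beta H(S)\bigr), \qquad
H(S) = \frac{1}{2} \sum_{\mu=1}^{M}
  \Bigl( r^\mu - \frac{1}{\sqrt{N}} \sum_{k} \eta^\mu_k S_k \Bigr)^{2} .
% Expanding the square yields two-spin couplings
% J_{kl} = -\frac{1}{N} \sum_\mu \eta^\mu_k \eta^\mu_l  plus random fields,
% i.e. an Ising spin-glass Hamiltonian with random couplings.
```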
Iterative decoding algorithms may be viewed as high-dimensional nonlinear dynamical systems, depending on a large number of parameters.
In this talk, I introduce a simplified description of several iterative decoding algorithms in terms of the a posteriori average entropy,
and study them as a function of a single parameter that closely approximates the signal-to-noise ratio (SNR).
Using this approach, I show that virtually all the iterative decoding schemes in use today exhibit similar qualitative dynamics.
In particular, a whole range of phenomena known to occur in nonlinear systems, such as the existence of multiple fixed points,
oscillatory behavior, bifurcations, chaos, and transient chaos, is found in iterative decoding algorithms.
While it is fair to say that iterative decoding techniques have revolutionized much of communications, the theoretical
understanding of these algorithms is still incomplete. We review the concepts of pseudocodewords in iterative decoding
and their relation to other decoding algorithms. Special attention is given to the relationship of pseudocodewords with
similar concepts such as stopping sets and near-codewords (trapping sets). (Joint work with Pascal O. Vontobel.)
I present a new source coding scheme which uses a low-density factor graph whose function nodes are not parity checks but rather nonlinear nodes. Encoding is done using the survey propagation algorithm; decoding is trivial.
Joint work with S. Ciliberti (Orsay) and R. Zecchina (Trieste).
In this talk, we will introduce some important new problems arising in the area of error-control and constrained coding
for DNA computers, with application to biosensing, DNA storage and ``smart disease diagnostics''. The coding constraints
are determined by the requirement that DNA oligonucleotides used in the process do not hybridize in an undesired fashion,
nor form (individually) secondary or tertiary structures. Biophysical analysis is required to determine the thermodynamic
properties of sequences which would achieve such a goal. This is joint work with N. Kashyap, Queen's University, Canada.
The performance of low-density parity check and turbo codes has often been described in
terms of empirical rules governing the areas under extrinsic entropy curves.
We provide an exact formulation of such empirical rules and prove several consequences
for the asymptotic analysis of code ensembles.
Connections with statistical mechanics will be discussed.
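The best-known such rule has an exact form on the binary erasure channel (a standard statement, recalled here for background, and commonly attributed to Ashikhmin, Kramer, and ten Brink): for a code of rate $R$, the area under its EXIT curve satisfies

```latex
% EXIT area theorem on the BEC:
\int_0^1 I_E(I_A)\, dI_A \;=\; 1 - R .
```

For example, a length-$n$ single parity check code has $I_E = I_A^{n-1}$, whose area $1/n$ indeed equals $1 - R$ for $R = (n-1)/n$.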
In this talk we will review some recent results on scaling laws
for waterfall curves of iterative decoding and a computational
technique used to predict error floors. Some practical implications
and related open issues will be discussed.
A code-division multiple-access (CDMA) system, in which regular low-density parity-check (LDPC) codes are used for channel coding, is analysed in the large-system and the infinite-codelength limits, using a statistical-mechanics framework. An analytical result for performance of the optimum joint detection/decoding scheme applied to the LDPC-coded CDMA system is given. Comparison with the performance of the separate detection/decoding scheme is also discussed (joint work with Toshiyuki Tanaka).
In this talk I will give an overview of methods used to construct Fountain Codes based on threshold phenomena for random graphs. The examples deal primarily with the case of the BEC, where the phenomena are best understood. At the end of the talk I will move towards the case of general symmetric channels, and report on some partial results.
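For background (standard LT-code material over the BEC, not a claim from the talk): a basic starting point for such threshold-driven designs is the ideal soliton degree distribution on $\{1,\dots,k\}$,

```latex
% Ideal soliton distribution for an LT code over k input symbols:
\rho(1) = \frac{1}{k}, \qquad
\rho(i) = \frac{1}{i(i-1)}, \quad i = 2, \dots, k ,
% which is normalized: 1/k + \sum_{i=2}^{k} 1/(i(i-1)) = 1 .
```

which in practice is robustified so that the decoding process (a random-graph peeling phenomenon) stays above threshold with high probability.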
In this talk, I will address the problem of finding the capacity of noiseless, as well as noisy,
finite-state channels used to model input-constrained digital data recording systems.
I will consider channels in both one and two dimensions. The former correspond to the "track-oriented"
recording found in today's ubiquitous magnetic and optical disk drives, while the latter relate to the
"page-oriented" recording typical of more exploratory two-dimensional optical (TwoDOS) and holographic
storage technologies. There are some interesting connections between these information-theoretic problems
and problems that arise in statistical physics; for example, determining the capacity of certain
two-dimensional noiseless "runlength-limited" channels is closely related to "solving" certain
two-dimensional lattice models of gases. In two dimensions, especially, there are many more questions
than answers, and I will highlight some of the challenging problems that remain open.
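To make the lattice-gas connection concrete (standard background, with notation chosen here for illustration): the capacity of the two-dimensional $(1,\infty)$ runlength-limited constraint is the entropy of the hard-square model,

```latex
% Capacity of the 2D (1,infinity)-RLL constraint (hard-square entropy):
C \;=\; \lim_{m,n \to \infty} \frac{1}{mn} \log_2 N(m,n) \;\approx\; 0.5878911 ,
% where N(m,n) counts the m-by-n binary arrays with no two adjacent 1's
% (horizontally or vertically) -- i.e. configurations of a hard-square
% lattice gas.
```

This constant is known only numerically, which illustrates why exact capacity formulas in two dimensions remain open.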
I propose a new method to study computationally difficult problems.
I embed the system I want to study into a larger system.
The original system is recovered by imposing constraints on the large system.
I simulate the large system with the hard constraints replaced by soft constraints. I illustrate the method on the notoriously difficult case of the three-dimensional spin-glass model. I show that the phases of the soft problem have the same properties as the phases of the original model and that the softened model belongs to the same universality class as the original one.
I show that correlation times are much shorter in the larger, soft-constrained system and that it is computationally advantageous to study it instead of the original system.
This method is quite general and can be applied to many other systems.
The instanton method of BER estimation is tested on short codes.
The dependence of the ``optimal'' noise configuration (the most probable one that causes an error)
on SNR is analyzed. It is demonstrated that the configuration can bifurcate several times
as the SNR goes from low values to high ones. The change in the optimal noise configuration with
SNR is conjectured to be connected with the error floor phenomenon.
It has recently been observed that many large-scale information networks, for example the router-level Internet,
the WWW, and some e-mail networks, exhibit power-law degree distributions. Here we review some of
our recent results connecting this observation with the processing efficiency of flows on these networks.
We then apply some of these ideas and study the role of fat tail distributions in erasure correcting graph codes.
Authors: Y.-J. Chung and Z. Toroczkai.
The problem of maximum likelihood (ML) decoding can be formulated as an integer-programming problem that is NP-hard in general. This talk discusses the use of linear programming relaxations for approximate decoding of binary linear codes. The basic idea is to construct an outer bound or relaxation of the codeword polytope, and solve the relaxed linear program. We describe a hierarchy of relaxations associated with codes in graphical form, such as LDPC and turbo codes.
As we discuss, the first-order relaxation has natural connections to standard iterative methods like belief propagation and min-sum decoding. We describe some performance guarantees for LP decoding in the finite-length setting, in terms of pseudocodewords, fractional distance, and expansion properties. We discuss methods to construct stronger relaxations, and natural questions associated with the resulting LP decoders.
Based on joint work with Jon Feldman, David Karger, Tal Malkin, Rocco Servedio, and Cliff Stein.
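In generic form (notation here is illustrative, not taken from the talk): with $\gamma_i$ the log-likelihood ratio for bit $i$, ML decoding is an optimization over the codeword polytope, and the first-order relaxation replaces that polytope by the intersection of local check polytopes:

```latex
% ML decoding as optimization over the codeword polytope conv(C):
\hat{x}_{\mathrm{ML}} = \arg\min_{f \in \mathrm{conv}(C)} \; \sum_i \gamma_i f_i .
% First-order LP relaxation: optimize instead over
%   P = \bigcap_j \mathrm{conv}(C_j) ,
% the intersection of the convex hulls of the local codes C_j defined by
% the individual parity checks:
\hat{x}_{\mathrm{LP}} = \arg\min_{f \in P} \; \sum_i \gamma_i f_i .
```

Vertices of $P$ that are not codewords are the pseudocodewords referred to above; when the LP optimum lands on an integral vertex, it is guaranteed to be the ML codeword.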
I first review the theoretical framework for generalized belief propagation algorithms and their relationship to approximations to the variational free energy developed using "region graphs." I then describe some surprising problems that can occur if an approximation is chosen that does not obey the "maxent-normal" property, and describe a simple heuristic for choosing good approximations.
Alexei Ashikhmin
(Lucent)
On EXIT Functions for Discrete Symmetric Channels
Alexander Barg
(U Maryland)
Multilevel generalizations of expander codes
Vladimir Chernyak
(Wayne State U.)
Local theory of BER for LDPC codes: instantons on a Tree
Richard Hughes
(LANL/P-21)
Error-control coding in Quantum key distribution
Yoshiyuki Kabashima
(Tokyo Institute of Technology)
Statistical mechanical approach to CDMA multiuser detection algorithm
Ljupco Kocarev
(UCSD)
Nonlinear Dynamics of Iterative Decoding
Systems: Analysis and Applications
Ralph Koetter
(UIUC)
Iterative coding techniques, pseudocodewords, and their relationship.
Marc Mezard
(Orsay)
Source coding with low density nonlinear nodes
Olgica Milenkovic
(Boulder)
Coding For DNA Computing: Combinatorial and Biophysical Aspects
Andrea Montanari
(ENS,Paris)
Area theorems for iterative coding systems
Thomas Richardson
(Flarion)
David Saad
(Aston,Birmingham)
Statistical-Mechanics Analysis of LDPC-Coded CDMA
Amin Shokrollahi
(EPFL)
Use of Threshold Phenomena for the Design of Fountain Codes
Paul H. Siegel
(UCSD)
Capacity of noiseless and noisy two-dimensional channels
Nicolas Sourlas
(ENS)
Soft annealing: A new approach to difficult computational problems
Misha Stepanov
(LANL, CNLS/T-13)
Instanton approach for codes without/with loops
Zoltan Toroczkai
(CNLS & T-13)
Fat tail distributions and efficiency of flow processing on complex networks
Martin Wainwright
(Berkeley)
Codeword polytopes and linear programming
relaxations for error-control coding
Jonathan Yedidia
(MERL)
Some Surprises in the Theory of Generalized Belief Propagation