Information

Information?: An Inquiry

Tuesdays, 9:30-11 am
Science Building, Room 227

For further information contact Paul Grobstein.

Discussion Notes
15 June 2004

Participants: Al Albano (Physics), Doug Blank (Computer Science), Peter Brodfuehrer (Biology), Anne Dalke (English, Feminist and Gender Studies), Paul Grobstein (Biology), Eric Raimy (Linguistics, Swarthmore/Trico), George Weaver (Logic/Philosophy), Ted Wong (Biology)

Summary by Paul Grobstein
(presentation notes available;
see also George's additional notes)

Eric's linguistic example last week motivated a consideration of a potentially interesting/significant difference between the deep assumptions about information made by linguists/biologists/neurobiologists on the one hand and physicists on the other. I presented an argument (see notes) that physicists operate on the principle that information can neither be destroyed nor created whereas linguists/biologists/neurobiologists are not only quite comfortable with information destruction and creation but regard it as a fundamental aspect of the processes they are investigating.

There is, of course, substantial heterogeneity within any population of investigators, but a general physics/other distinction nonetheless seems useful as a pointer to distinct "logics" that may help to more generally and rigorously characterize "information". After considerable discussion, what emerged as the touchstone for this distinction was the situation of two differently crumpled pieces of paper each of which was burned. The result in both cases is, consistent with the second law of thermodynamics, smoke and ashes (greater disorder). From the "physics" perspective, the two situations are different. One can in principle (by addition of energy) reverse each of the two burnings and so recover the two differently crumpled pieces of paper. To put it differently, the smoke and ashes in each case still contain all of the information present originally. From the "other" perspective, what occurs with burning is a significant "loss" of information, a genuine "increased randomness": there is no difference between the smoke and ashes in the two cases that would allow reconstruction of the two original crumplings.

The core issue here is whether one interprets the second law of thermodynamics as a description of loss of information accessible to an observer or as loss of information in some relatively observer-independent sense. Boltzmann's characterization of entropy increase involves the former rather than the latter. Burning in the two cases creates "microstates" that are quite distinct and map in a one to one fashion back onto the original observable "macrostate", but the two microstates are indistinguishable by an observer both from each other and from a state that would be produced by a randomizer of particle locations and energies. Early in the 20th century, Poincare called attention to this odd observer-dependence, which results from an assumption that all particle interactions are reversible in time. The alternate interpretation of the second law, more consistent with the "loss of information" perspective, is that the production of the two microstates actually involves some processes that are irreversible (in a sense to be defined below). In this case, the original configurations are not even in principle recoverable.

What emerged from this discussion was a recognition that information "loss" was a bigger concern of physicists than of others because of the successes of physics in characterizing processes using the concept of "reversibility", ie the successes of physical laws stated in terms of equations that are time reversible. Despite these successes, there remains in physics an acknowledged uncertainty about how processes that are locally reversible can yield outcomes that are globally irreversible. Outside of physics, investigators tend to presume varying degrees of irreversibility and so the quandary doesn't arise (or is less sharply felt).

An example from neurobiology was used to explore the notion of information loss outside physics, and its relation to "irreversibility". I argued that what seemed to be information "loss" in the case of a lateral inhibition network in the retina was actually more complicated. A Shannon information comparison of the inputs and outputs of such a network would show a loss since the network removes all representation of low spatial frequencies. But there is also an information "gain". The network contains information (resulting from evolution, and saying essentially "these components of the signal are noteworthy because ...") and so the output signal can be usefully thought of as the input signal combined with the information inherent in the network. Any effort to quantify biological information must thus deal with both loss and gain, and frequently with both occurring simultaneously, and must include the idea that the information content of any particular signal depends in part on the decoders it has (or has not) passed through.
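The "loss" half of this argument can be illustrated with a minimal Python sketch. It is not a model of the retina: the function name lateral_inhibition, the three-unit neighborhood, the surround weight of 0.5, and the repeated end values are all assumptions chosen for illustration. With these choices a uniform level of illumination is cancelled entirely, so two inputs that differ only in overall brightness produce the same output, and only the edge survives.

```python
def lateral_inhibition(signal, surround=0.5):
    """Toy 1-D lateral inhibition (hypothetical, for illustration only).

    Each unit outputs its own input minus `surround` times the input of
    each of its two neighbors. End values are repeated so that every
    unit has two neighbors (an assumption of this sketch).
    """
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [
        padded[i] - surround * (padded[i - 1] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


# Two step-shaped inputs with the same edge but different overall brightness.
dim_step = [1, 1, 1, 5, 5, 5]
bright_step = [3, 3, 3, 7, 7, 7]

dim_out = lateral_inhibition(dim_step)        # [0.0, 0.0, -2.0, 2.0, 0.0, 0.0]
bright_out = lateral_inhibition(bright_step)  # same list as dim_out

# The outputs are identical, so the overall brightness cannot be
# reconstructed from the output alone: that part of the input is "lost".
outputs_match = dim_out == bright_out         # True
```

The nonzero values sit exactly at the step, which is the sense in which the network "reports" edges. The "gain" described above is not captured by the numbers themselves; in this sketch it lives in the choice of neighborhood and surround weight, which stand in for what the network "knows" about which parts of a signal matter.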

The lateral inhibition example also was useful in thinking about "irreversibility". What is going on in this case is not information "compression" in the sense the term is usually used; the point is not to put the information in a small package with the intent of faithfully re-expanding it. The point is instead to describe the input in terms of significant categories (in the lateral inhibition case "edges", invariants in an input pattern which is otherwise largely characterized by continuous change, both temporally and spatially, due to changes in position and intensity of illumination sources). It was noted that not only here but generally "categorization" involves irreversible information "loss" because it involves the representation of inputs with higher dimensionality (more degrees of freedom) in frameworks of lower dimensionality (fewer degrees of freedom). At the same time, categorization involves information "gain" (as above) and yields the potential to create "new" information by the combination of categories (eg a four-dimensional cube may or may not have even occurred in input signals but can be conceived because of the way input signals are categorized by human brains). This led to the suggestion that cycles of information "compression" and "expansion" may be among the most fundamental characteristics of biological information processing, with the objective not being "faithful" representation of information but rather the creation of new information. An intriguing specific to follow up on in this regard is the possible use of the same circuitry in one direction for speech comprehension and in the other for speech production (with a "story teller" to detect, make further use of change/novelty?). 
Just as categorization produces irreversible information "loss", the movement from lower dimensional to higher dimensional representations necessarily involves the de novo production of information (at least locally, and perhaps globally, if the added information results from a random process).

One set of issues that arose in discussion related to the relation between these ideas and the realities of social/political change. If information is "lost" in creating categories, is it necessarily the case that some activities of human beings (and groups of human beings) are "lost" in terms of impact on future human activities? If "history" is the record of the past using categories created by a subset of humans (white males?) what about the activities of other humans? It was suggested that some human activity may indeed be "lost" (is so inevitably?) but that "academic history" is only one set of categories by which traces of the past influence the present/future. That set of categories interacts with other categories (eg folk history, popular culture, individual histories) that may effectively represent (more or less so, depending on individual perspectives) other human activities using other category schemes.

The other set of issues that arose in discussion represented an effort to try and understand why "irreversibility" (in the sense of information loss) would be comfortable terrain for many (who might even regard categorization, and hence irreversibility, as fundamental to information processing) but something to be avoided by many physicists. In this regard it seemed useful to note that "irreversibility" is a characteristic not only of nervous systems but also of mathematics and logic. And that the "categorization" notion could be generalized using the ideas of functions and "equivalence classes".

By mapping elements of one set (an "input" set) to elements of another (an "output" set) functions create "equivalence classes", subsets of elements in an input set that are mapped to the same element in the output set. Equivalence classes of a size larger than 1 constitute "categories". And these equivalence classes can be used to create new equivalence classes whereas equivalence classes of size 1 (one to one mappings) "don't get one anywhere". This formalism thus displays the same complex information loss/gain/potential new creation pattern that emerged from thinking about categories but does so in a realm that may be usefully less laden with what may be context-specific considerations.
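The equivalence-class idea can be made concrete with a short Python sketch. The helper names equivalence_classes and entropy_bits, the eight-element input set, and the choice of parity as the categorizing function are all hypothetical; the entropy figures additionally assume that every input is equally likely.

```python
from collections import defaultdict
from math import log2


def equivalence_classes(inputs, f):
    """Group inputs by the output that f assigns to them."""
    classes = defaultdict(list)
    for x in inputs:
        classes[f(x)].append(x)
    return dict(classes)


def entropy_bits(class_sizes):
    """Shannon entropy of the output, assuming equally likely inputs."""
    total = sum(class_sizes)
    return -sum((n / total) * log2(n / total) for n in class_sizes if n)


inputs = list(range(8))

# A many-to-one function: two classes of size 4 (the "categories").
parity = equivalence_classes(inputs, lambda x: x % 2)
# parity == {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}

# A one-to-one function: eight classes of size 1.
identity = equivalence_classes(inputs, lambda x: x)

bits_after_parity = entropy_bits([len(c) for c in parity.values()])      # 1.0
bits_after_identity = entropy_bits([len(c) for c in identity.values()])  # 3.0
```

Under these assumptions the one-to-one mapping keeps all 3 bits needed to say which input occurred, while the parity mapping keeps 1 bit and discards 2: given only "even", there is no way to tell which of the four even inputs was presented. What the sketch does not measure is the "gain", namely that "even" and "odd" are now available as units for further processing.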

In these terms what is interesting is the contrast between two "logics" (deep thought structures) and possible differences that may be needed to think effectively about matter/energy on the one hand and "information" on the other. The contrast would seem to be an inclination to use functions of differing characters on differing sets:

physics	other
continuous	discrete
determinate	indeterminate
infinite	finite (or at least countable)

This scheme raises some interesting suggestions about the character of progress/change in various sciences (artificial intelligence, mathematics, biology, computer science, physics itself?) and poses an open question as to whether there is something inevitable about the way the three dichotomies associate and whether it relates in some fashion to whether one is studying mass/energy or "information".

One perhaps relevant consideration along the latter, more general line is that one to one mappings are much easier to obtain/presume when one is working with infinite sets (they are impossible in cases of finite sets of different sizes). And that one to one mappings correspond to "determinate" (reversible) systems. In addition, discrete functions (eg thresholding) have a strong tendency to create logical non-reversibility ("information loss") whereas continuous functions don't. A particularly interesting aspect of this is that pattern detection/categorization would seem on the face of it to require discrete/irreversible processing. What is it about matter/energy that makes pattern detection/categorization more effectively done using continuous/infinite presumptions? Is it possible that limitations of this approach in relation to "information" relate to the entangled presumption of determinacy? And a resulting difficulty in detecting patterns in sequential processes? Here there is a noteworthy distinction between "classical" physics and the approach Wolfram typifies in which the presumption of a continuity in time (and space) is abandoned but determinacy is retained. Another way to say this is that physics uses a one to one continuous mapping in time (hence precludes equivalence classes of size greater than one); the inherent determinacy remains if one shifts to a discrete mapping in time. Could this create problems in making sense of "information" (and elsewhere)?
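The contrast between a thresholding function and a reversible one can be sketched in a few lines of Python. The function names, the cutoff of 0.5, the scaling factor of 2.0, and the five sample values are all assumptions made for illustration; the scaling map is simply one convenient example of a continuous function that happens to be one to one, not a claim that every continuous function is.

```python
def threshold(x, cutoff=0.5):
    """Discrete, many-to-one: every input lands on 0 or 1."""
    return 1 if x >= cutoff else 0


def scale(x, factor=2.0):
    """Continuous and one to one: distinct inputs give distinct outputs."""
    return factor * x


def unscale(y, factor=2.0):
    """Inverse of scale, so the original input can be recovered."""
    return y / factor


def is_one_to_one(inputs, f):
    """True if no two of the given inputs share an output under f."""
    outputs = [f(x) for x in inputs]
    return len(set(outputs)) == len(outputs)


samples = [0.0, 0.25, 0.5, 0.75, 1.0]

threshold_reversible = is_one_to_one(samples, threshold)  # False
scale_reversible = is_one_to_one(samples, scale)          # True

# Scaling can be undone exactly for these samples.
recovered = [unscale(scale(x)) for x in samples]          # equals samples
```

Five distinct inputs cannot be mapped one to one onto the two outputs of threshold, which is the finite-set point made above: the inputs 0.5, 0.75 and 1.0 all become 1, and nothing in the output says which one was presented.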

What has been remarkable about physics is its demonstrated capability to create models that have extraordinary predictive power with regard to mass/energy. To do so it has largely depended on "cutting the world into lots of little pieces", establishing rules of local interaction among the local pieces, and then inferring higher order properties/results from the continuous/determinate/infinite models that result. The upshot, however, may be that physics is relatively blind to sequential equivalence classes (patterns/categories in time) and to indeterminacy/irreversibility.

If categorization/indeterminacy/irreversibility are fundamental aspects of "information" (the transformation among "non-random distributions of matter/energy" themselves brought about by non-random distributions of matter/energy), then laws of information need to be compatible with existing laws of matter/energy but may in fact have a quite different "logical" character, more akin to the second logic in the comparisons discussed. This wouldn't make "information" fundamentally a human observer based concept (it depends on a decoder that may or may not be a human) but would require humans to think about information in ways different from the ways that have been successful in thinking about matter/energy. Along this line, it is worth thinking more about how "matter/energy" might differ from "non-random distributions of matter/energy".
