Wednesday, September 26, 2012

Measuring Complexity

How can we measure complexity? Whether we examine the pattern of notes of a piece of music, the pattern of light and color in a painting, or the pattern of architectural features in a building, it is intuitively clear that some patterns are more complex than others. Is there a way to quantify this idea mathematically? Yes; in fact, there are several ways, with interesting connections to Information Theory and Entropy.

Today we begin with a short illustration of one approach (known as Kolmogorov-Chaitin Complexity) from Guest Author Nikos A. Salingaros, Professor of Mathematics, Urbanist & Architectural Theorist at the University of Texas at San Antonio. I will follow his contribution with a few additional remarks of my own.



Kolmogorov-Chaitin Complexity

Nikos A. Salingaros, The University of Texas at San Antonio.

Trying to measure the complexity of a system is not straightforward. The simplest measure of a system's complexity reflects not so much its intrinsic complexity, or the complexity of the process that generated it, but the complexity of the system's description. As such, this approach is not without its limitations.

We can measure the complexity of a one-dimensional system by using a very simple notion: the length of its description in some programming language. This measure is known as the Kolmogorov-Chaitin Complexity. Analyzing a string of letters or numbers might reveal a pattern that repeats, and in that case, its description becomes shorter. I'm going to use binary entries of 0 and 1 in 70-character strings, yet the discussion is valid for any letters or digits. For example, it is obvious that this information string is not complex at all:

1111111111111111111111111111111111111111111111111111111111111111111111

Its generative algorithm is: repeat the digit 1 seventy times, or "1 x 70". On the other hand, the more complex string

1010101010101010101010101010101010101010101010101010101010101010101010

still has only minimal complexity, since its algorithm is a short one: "10 x 35". But it has slightly more complexity than the first trivial string. The next orderly sequence is very different, yet it also possesses low complexity:

1111111111111111111111111111111111100000000000000000000000000000000000

It has the generating algorithm "1 x 35, 0 x 35". Something non-trivial occurs only at the 36th position. For genuinely complex strings of digits, we face two distinct types of complexity. First, there is organized or coherent complexity, such as the following binary sequence:

1000110101100011010110001101011000110101100011010110001101011000110101

Analyzing this information string for patterns, we discover hidden symmetries, and those allow us to reduce its generating code to alternating 10001 with 10101 seven times, or "10001 alternate 10101 x 7". We do have a complex string, yet internal patterns reduce the coding length of its generating algorithm.
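To make these generating algorithms concrete, here is a minimal sketch in Python (my choice of language purely for illustration) that rebuilds each of the orderly example strings from its short description; the point is simply that a few characters of code suffice to regenerate 70 characters of data.

# Each one-line expression below regenerates one of the 70-character examples.
trivial   = "1" * 70                        # "1 x 70"
alternate = "10" * 35                       # "10 x 35"
halves    = "1" * 35 + "0" * 35             # "1 x 35, 0 x 35"
patterned = ("10001" + "10101") * 7         # "10001 alternate 10101 x 7"

# All four strings have length 70, yet each was built from a far shorter recipe.
for s in (trivial, alternate, halves, patterned):
    print(len(s), s)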

The second type of complexity represents random, or disorganized complexity, as for example the following random sequence:

0010001111010101100001011100101111110101111111101001110001101000100101

It has maximal complexity, because its shortest description is exactly as long as the string itself. There is no possible compression, such as hidden patterns in its descriptive rule. We can be sure of that, since we generated this string by using a random-number generator.

Probability arguments produce one complication, however: it is quite possible to obtain internal patterns in a randomly-generated string. But you cannot count on it, and might have to try an astronomical number of times before you come up with a regular pattern.

The key mechanism revealed by these examples is the presence or not of patterns, which then permit compression of the informational string. Total compression is equivalent to informational collapse, and this occurs in cases of extremely low complexity. In examples of strings with non-trivial complexity, the ability to compress them reduces their complexity measure, but never to near zero.
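A rough, hands-on way to see this compression idea at work is to run the example strings through an off-the-shelf compressor and compare output sizes. The sketch below uses Python's zlib module; the compressed length is only a crude stand-in for Kolmogorov-Chaitin complexity (zlib adds fixed header overhead, and 70 characters is a very short input), but the broad pattern -- the more orderly the string, the smaller the compressed output -- echoes the discussion above.

import random
import zlib

# Compressed size as a crude, practical proxy for descriptive complexity.
examples = {
    "1 x 70":                    "1" * 70,
    "10 x 35":                   "10" * 35,
    "1 x 35, 0 x 35":            "1" * 35 + "0" * 35,
    "10001 alternate 10101 x 7": ("10001" + "10101") * 7,
    "random":                    "".join(random.choice("01") for _ in range(70)),
}

for label, s in examples.items():
    compressed = zlib.compress(s.encode("ascii"), 9)   # level 9 = maximum effort
    print(f"{label:28s} raw: {len(s):3d}   compressed: {len(compressed):3d}")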

I will now compute the length of the description of each algorithm for the informational strings presented as examples. This is a simple character count of the description in English (ignoring spaces), and I obtain the following values for all the strings:

  • 1 x 70 → 4 characters
  • 10 x 35 → 5 characters
  • 1 x 35, 0 x 35 → 9 characters
  • 10001 alternate 10101 x 7 → 21 characters
  • [Random] → 70 characters

Since we partially depend upon the English language for our descriptions, these algorithms are not expressed in the most efficient way; hence their length is not unique. The shortest description length obviously varies with the programming language used. But that doesn't matter here, as the model serves to illustrate one-dimensional complexity in a very simple manner. For example, the above complex string with internal patterns could be generated equally by the description "1000110101 x 7", which has a character count of only 12. This number is different from 21, yet still falls between 9 and 70; that is, between a trivially-simple string and a random string, which is the key point.

Another issue is that human perception plays an important role in describing complexity, even though this has nothing to do with the actual algorithm. Informationally-minimal strings do indeed look trivially simple to us. They are of no interest, as they do not communicate a message. Thus, extreme simplicity is not useful, and we require complexity above some threshold to transmit a message. Humans pay attention to ordered information, but too much information overwhelms our cognitive system. And complex strings of very different complexity may at first appear to be equally complex. It is only after we "see" a pattern that we can recognize the relative complexity of an informational string.

The Kolmogorov-Chaitin complexity has obvious limitations because it ignores the degree of internal symmetries, such as repetition and nesting, yet those directly affect our cognition. For example, the string referred to above is generated by the two "words" 10001 and 10101, each one of which is internally symmetric. This important feature is not measured by the character count. In fact, the algorithm in terms of the symmetric words gives us a higher character count (21) than the simpler repetition that takes no symmetry into account (12), which is counter-intuitive. We are most interested in, and wish to investigate, systems of information with complex ordered patterns.

Another important phenomenon not described by this model is when a very simple algorithm generates complexity iteratively throughout the entire system. Here, a simple rule such as a cellular automaton computes configurations in which every entry is constantly changing. With very few exceptions, there is no final state for a cellular automaton: the generated pattern continues to grow and transform. Any one of these states could be irreducibly complex, i.e. not compressible. When we try to find a description of such a system in terms of an algorithm that generates the information string sequentially (not in the way it was actually generated), it appears highly complex.
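As a concrete (and standard) example of this phenomenon, the sketch below runs an elementary cellular automaton -- Rule 30, chosen here purely as an illustration -- whose entire rule fits in a single byte, yet whose successive rows quickly come to look irregular and hard to compress.

# An elementary cellular automaton: the whole rule is the single number 30,
# yet the rows it generates soon look complex and show no obvious repetition.
RULE  = 30
WIDTH = 71
STEPS = 12

def step(cells):
    """Compute the next row: each cell looks at its left and right neighbours."""
    nxt = []
    for i in range(len(cells)):
        left   = cells[i - 1]                         # index -1 wraps to the far end
        centre = cells[i]
        right  = cells[(i + 1) % len(cells)]
        code   = (left << 2) | (centre << 1) | right  # neighbourhood as a number 0-7
        nxt.append((RULE >> code) & 1)                # look up that bit of the rule
    return nxt

row = [0] * WIDTH
row[WIDTH // 2] = 1                                   # start from a single 1 in the middle
for _ in range(STEPS):
    print("".join(str(c) for c in row))
    row = step(row)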

My interest in this topic arises from the description of design in general, and architectural styles, or "form languages", in particular. Each form language is a description of what a building looks like and also how it is constructed, and I wish to measure the relative complexity of different form languages. Traditional form languages adapt to climate, local materials, human needs and uses, and culture, and are therefore irreducibly complex. They have many forces and constraints to adapt to, and have evolved step-by-step over time. Even though the buildings that distinct form languages generate obviously look very different, all of them (languages and resulting designs) have an equivalent measure of complexity.

A word count of a form language's verbal description is a very rough measure of its Kolmogorov-Chaitin complexity. I ask my architecture students to fill out a rather standard checklist for a form language, and use their word processor to obtain a word count. Then students are able to compare their assigned form languages by comparing these numbers. This exercise separates minimalist from more complex buildings. Nevertheless, I don't know how to solve the problem of distinguishing evolved form languages with high complexity from randomly-generated and thus non-adaptive form languages.

Some architects introduce deliberate randomness for the sake of an innovative "look". Nothing has evolved here, since no adaptation is involved. The design method is performed as an instantaneous artistic act. Those architects often use a "shortcut" to design their building, which, while not generating any adaptations, could be very complex indeed. This case is analogous to a simple rule (that we don't know) generating a random string. It would be useful to understand when a form language is simple, and to distinguish the two cases when it is complex due to adaptation, or complex because it is generated randomly.

Perhaps a future essay could discuss these matters.

REFERENCES.

"Kolmogorov complexity", entry from Wikipedia.

A. Klinger & N. A. Salingaros (2000) "A Pattern Measure", Environment and Planning B: Planning and Design, volume 27, pages 537-547. Republished by the Cornell University Library Arxiv, 28 August 2011.



And there you have it. Thanks, Nikos, for this introduction. Now I'd like to elaborate on some of the points raised in this essay with a few thoughts of my own.

First, the numerical value of the Kolmogorov-Chaitin (KC) complexity metric depends on the programming language used to describe the generating algorithm. In the original three volumes of The Art of Computer Programming, computer scientist Don Knuth illustrates algorithms using an assembly language he created that is representative of computer capabilities around 1970. In volume four, written decades later, he introduced a revised assembly language representative of computers from the present millennium. If one translates algorithms from the old language into the new, they often become shorter, because the new language is in some sense more powerful. It's not that older computers couldn't perform the same calculations that newer ones do -- they just took a lot longer and needed a lot more instructions.

For example, older computers were often limited to working with 8 or 16 bit integer quantities. To compute the area of a circle using the floating point value of pi (3.1415926...) to 16 decimal digits of accuracy took a large number of instructions, just as if you had to do the multiplication long hand with paper and pencil, digit by digit. In contrast, modern computers typically work with 64 or even 128 bits at a time, and have built-in instructions for working with floating point numbers, which means we can code such calculations with many fewer instructions, typically just one per operation.

Second, computing the KC metric requires knowing the algorithm that generated the string. Often, this algorithm is not available to the end user. Strictly speaking, we want the length of the shortest program capable of producing the given string. There is always the trivial program

print "sss"

where the actual string is put inside the double quotes in place of "sss". This program is only 8 characters longer than the actual string, so the KC complexity of a string of N letters is at most N+8, but is often less, as Nikos illustrated. A human looking at the string can often see patterns that enable a shorter description, again as illustrated by Nikos, but we wonder: can a computer find such patterns, and can we know we have reached the smallest possible description?
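In code, the trivial upper bound looks something like the following sketch (using the same toy "print" language as above, and assuming the string itself contains no quote characters):

def trivial_program(s):
    """Build the do-nothing-clever program that simply prints s verbatim."""
    return 'print "' + s + '"'          # 7 characters before s, 1 after: overhead of 8

s = "0010001111010101100001011100101111110101111111101001110001101000100101"
program = trivial_program(s)
print(len(s), len(program))             # 70 and 78: KC complexity is at most N + 8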

For any one specific string, we may be able to convince ourselves that we have the shortest possible algorithm for generating it. But the task in general is impossible: one cannot write a computer program (in any language) that is capable of computing the KC complexity of an arbitrary input string. The Wikipedia article cited above has a proof. Essentially, it is because computing KC complexity is equivalent to the halting problem (deciding whether an arbitrary program will ever halt), which is also impossible. In fact, for similar reasons, mathematicians cannot prove every true fact; this is called Gödel's incompleteness theorem.

Third, KC complexity is related to the topic of data compression. When you watch a movie or (hi-def) TV show on your computer, you are able to enjoy it in real time only because the data that makes up the movie has been compressed. This is possible because real movies do not show random noise; instead, they show coherent images of people and buildings that change only slightly from one frame to the next. The data for the movie is a sequence of many thousands of whole numbers specifying the amount of red, green and blue light in every pixel of the screen in every frame. The TV displays many frames per second to give the illusion of motion. Much like in the examples earlier, the similarities from one frame to the next allow compression. One method, called "run length compression", replaces repeating symbols with a count, much like when Nikos writes something like '1 x 7' to mean 1111111. Other more sophisticated methods exist that take into account the space-time structure of the data in movies -- the fixed size two dimensional grid repeated with minor changes over time -- to achieve even greater amounts of compression.
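Run length compression is simple enough to sketch in a few lines of Python; the version below (built on the standard itertools.groupby helper) turns each run of identical symbols into a (symbol, count) pair, which is exactly the '1 x 7' idea in miniature.

from itertools import groupby

def run_length_encode(s):
    """Replace every run of identical symbols with a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(s)]

print(run_length_encode("1111111"))               # [('1', 7)]  -- i.e. "1 x 7"
print(run_length_encode("1" * 35 + "0" * 35))     # [('1', 35), ('0', 35)]
print(len(run_length_encode("10" * 35)))          # 70 runs of length 1: no gain here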

This connects back to the point that you need to know the algorithm that generated the string (or data file) before you can compute the KC complexity. If you view a movie data file as individual bytes of data, they will look very random indeed, and you may find it difficult to compress it very much. However, if you know in advance that the data is not in fact arbitrary, but represents a movie with a given screen resolution, you will expect to find certain patterns and correlations in the data, which your compression algorithm can exploit.

Fourth and finally, KC complexity is also related to the question of how we define randomness. True randomness, like radioactive decay, requires infinitely many bits. Computers, which are finite, can only compute what mathematicians call pseudo-random numbers: sequences that look random on naive inspection, even though they are actually generated by deterministic algorithms -- often quite short algorithms, in fact, with quite low KC complexity values. In particular, pseudo-random number algorithms always wind up repeating themselves after a finite number of random draws. For instance, a generator whose internal state consists of integers from 1 to N will necessarily repeat after at most N draws, and possibly much sooner depending on the algorithm. Knuth's Volume 2 (mentioned above) goes into a lot of detail on generators, and subsequent researchers have come up with even better ones, but they are all still finite.
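To see how short such a deterministic recipe can be, here is a sketch of a classic linear congruential generator (the constants are the traditional ANSI C ones): the whole "random" sequence is fixed by one line of arithmetic plus the seed, and because the internal state can take at most m distinct values, the output must eventually repeat.

def lcg_bits(seed, a=1103515245, c=12345, m=2**31):
    """Yield pseudo-random bits from the classic linear congruential recurrence."""
    state = seed
    while True:
        state = (a * state + c) % m     # one line of arithmetic is the whole algorithm
        yield (state >> 16) & 1         # take a middle bit of the state

gen = lcg_bits(seed=2012)
print("".join(str(next(gen)) for _ in range(70)))   # looks irregular, yet is fully determined by the seed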

One can pretend that a computer has access to an unbounded amount of memory space and cpu time, and discuss algorithms like cellular automata whose state can grow arbitrarily large over time. While such algorithms might seem to escape from the tyranny of having a finite number of bits, in practice they cannot really solve the problem: as the size of their internal state grows, so too will either the time to access it, the memory to store it, or both, to the point where any real computer will eventually have no resources left.

Moreover, randomness (or pseudo-randomness) is a property of the generating algorithm, not of the generated sequence. Suppose I have a source for generating random bits (either real random bits from a radioactive source, or pseudo-random ones from an algorithm with a very large internal state and very few correlations). Suppose too that 0 and 1 are equally likely. Then, if I draw a sequence of 70 bits, the strings

1111111111111111111111111111111111111111111111111111111111111111111111

and

0010001111010101100001011100101111110101111111101001110001101000100101

are in fact equally likely to emerge from the generator, just like all the other strings in Nikos' examples above.

Note that I am speaking here of the specific sequence of bits. It is true that if you count the number of zeros and ones, it is more likely that you will get something close to 35 of each, rather than 70 vs 0. However, that does not make the string of 1x70 any less likely than the more "random" looking example: each has a chance of 2^(-70) of emerging from the generator, just like all the other 2^70 possible bit strings of length 70. However, it is the only one with no zeros, whereas there are many with 35 zeros and 35 ones.
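The counting argument can be made concrete with a few lines of arithmetic (assuming a fair generator, so that every 70-bit string has probability 2^(-70)):

from math import comb

total          = 2 ** 70          # number of possible 70-bit strings
p_one_string   = 1 / total        # probability of any single specific string
balanced_count = comb(70, 35)     # how many strings have exactly 35 ones and 35 zeros

print(f"P(one specific string)       = {p_one_string:.3e}")             # ~8.5e-22, same for 1x70 and the random example
print(f"strings with 35 ones         = {balanced_count:.3e}")
print(f"P(exactly 35 ones, in total) = {balanced_count / total:.4f}")   # ~0.095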

Thus a string that emerges from a random process need not actually look random to a human: any specific string may have internal symmetries or regularities, with 1x70 being an extreme example. That is why randomness is a property of the generating algorithm, not the resulting string. And even there, we have to be careful: if you know the generating algorithm (and its current internal state), suddenly you can see through the randomness -- you can calculate successive bits before they are drawn. So the most one can hope for from a pseudo-random number generator is that someone who does not know the algorithm will be unable to detect any statistically significant difference between the output of the generator and a true (e.g. radioactive) random source, when they are allowed to examine a "long" sequence of bits. Even "long" is subtle: not so long as to approach the period of the generator (after which it would start to repeat itself), but otherwise long enough to have a lot of statistical power (i.e. 70 bits is not very many for testing). Good modern pseudo-random generators can pass a series of stringent statistical tests known as Diehard that examine millions of bits for a wide variety of correlation patterns.
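Statistical tests of this sort can be elaborate, but the simplest possible one -- a "monobit" frequency check -- already illustrates the idea: count the ones and see how far the count strays from half the length of the string. (This toy check is my own illustration; Diehard and its successors apply far more stringent tests to far longer sequences.)

import math

def monobit_z(bits):
    """How many standard deviations the count of ones lies from the expected n/2."""
    n = len(bits)
    ones = bits.count("1")
    return (ones - n / 2) / math.sqrt(n / 4)    # variance of the count is n/4 for fair bits

print(monobit_z("1" * 70))    # about 8.4: wildly unlikely for a fair source
print(monobit_z("0010001111010101100001011100101111110101111111101001110001101000100101"))
                              # about 1: consistent with a fair source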

For readers curious to learn more, an excellent starting point is the book Information Theory, Inference, and Learning Algorithms by David MacKay, which is free for download on the web, but which is also well worth purchasing, in part for its beautiful typography (courtesy of Knuth's mathematical typesetting system "TeX" -- yes, the same Knuth -- it is a very small world).

If you enjoyed this article, you might also like Algorithmic Art, from which comes the picture at the top of today's post. It is a visualization of algorithmic complexity and randomness all mixed together with (hopefully) some aesthetic interest.

Also, as you may recall, Nikos wrote two previous Guest articles, Why Monotonous Repetition is Unsatisfying, and Applications of the Golden Mean to Architecture. He has also written The "Life" of a Carpet: an Application of the Alexander Rules, which is the best (most "mathematically" oriented) article I have yet seen on the web on the topic of what characteristics of a work of art make it aesthetically pleasing.

As usual, please post questions, comments and other suggestions using the box below, or G-mail me directly at the address mentioned in the Welcome post. You can sign up for email alerts about new posts by entering your address in the widget on the sidebar, or follow @ingThruMath on Twitter, for a tweet about each new post. The Contents page has a complete list of previous articles in historical order. You may also want to use the 'Topic' and 'Search' widgets in the side-bar to find other articles of related interest.

If you enjoyed this article, click the "M" button below to email it to a friend, or the "t" button to Tweet the link, or the "f" button to share it on Facebook.
